UMS getting stuck, some hard to spot bug
Hi there,

I am running UMS 24/7 on Linux. It runs for days, then sometimes it gets stuck. I have been experiencing this behaviour since the 7.0.0 RC.

The symptom is simple: the whole server gets stuck, unable to spawn new processes.

The logs seem normal except for one suspicious line:

Code:
DEBUG 2018-10-07 18:25:40.528 [HTTPv2 Request Worker 5] Caught exception: unable to create new native thread

The rest is all "Sending ALIVE" and "File watcher".

One more clue: while the problem happens, I am unable to log in on that box; but if I leave an SSH session open and I am lucky enough to have it crash while the session is open, here are more symptoms:
- "lsof" hangs forever, even when using the "-b" switch
- I cannot sudo into the ums user, because of "Cannot execute /bin/bash: Resource temporarily unavailable"
- Killing the java ums process with SIGKILL (SIGTERM is ignored) restores the system to normal behavior
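For what it's worth, "unable to create new native thread" is the JVM failing at thread creation, which usually points at a process/thread limit rather than heap. One way to catch a slow thread leak before the box locks up is to sample the thread count from /proc periodically. This is just a sketch; the pgrep pattern is an assumption about how the UMS process appears in the process list on your system:

```shell
#!/bin/sh
# count_threads PID -- print the number of threads of a process by
# counting the entries under /proc/PID/task (Linux-specific).
count_threads() {
    ls "/proc/$1/task" 2>/dev/null | wc -l
}

# The pgrep pattern 'java.*ums' is an assumption; adjust it to match
# how UMS is actually launched on your box.
pid=$(pgrep -f 'java.*ums' | head -n 1)
if [ -n "$pid" ]; then
    echo "$(date '+%F %T') UMS pid $pid has $(count_threads "$pid") threads"
else
    echo "no UMS java process found"
fi
```

Run it from cron every few minutes and log the output; a steadily climbing thread count would confirm the leak long before the system freezes.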
Re: UMS getting stuck, some hard to spot bug
You are correct about it "leaking resources". I run UMS on a Linux server 24/7 as well. With mine it does not affect the server; it is just that UMS crashes.

I have researched the error I get when it crashes, which is a "Java Heap Space" error. All indications are that it is just bad Java code.

The only thing I have been able to do is add a cron job to restart UMS every 2 days.

Did you run UMS in "Trace" mode, which gives you much more information in the log files?
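For reference, a watchdog restart like the one described could be sketched as a cron.d entry. The unit name "ums" and the use of systemctl are assumptions about this setup, not something confirmed in the thread:

```text
# /etc/cron.d/restart-ums -- restart UMS every 2 days at 04:00.
# "ums" as the systemd unit name is an assumption; adjust to match
# however UMS is actually launched on your system.
0 4 */2 * * root /usr/bin/systemctl restart ums
```

Note that `*/2` in the day-of-month field fires on odd days and resets at each month boundary, so the interval is only approximately 2 days, which is fine for a watchdog.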
Re: UMS getting stuck, some hard to spot bug
My crash usually occurs every 15+ days.

I have now tried setting a 1000-process limit for the ums user in "/etc/security/limits.conf"… let's see if this alters the crash frequency and prevents the whole system from freezing. If it does, it will also confirm that the problem is in the process/thread count.

I have now also enabled trace mode, thank you for the hint.
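For anyone wanting to try the same mitigation, the relevant limits.conf lines might look like the following; the username "ums" is an assumption about the account the server runs under:

```text
# /etc/security/limits.conf -- cap the number of processes for the ums user.
# On Linux, nproc counts kernel tasks, so threads count against it too,
# which is what lets this cap contain a thread leak.
ums    soft    nproc    1000
ums    hard    nproc    1000
```

One caveat: pam_limits applies these at session start, and services started by systemd do not normally go through PAM, so for a systemd-managed UMS the equivalent would be `LimitNPROC=` (or `TasksMax=`) in the unit file instead.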
Re: UMS getting stuck, some hard to spot bug
Yes, this does sound similar to the bug that boss has been experiencing for a long time, but his was happening before v7. However, the problem is likely different to that one, which is caused by boss' TV spamming browse requests in a loop, making UMS eat up a lot more memory than it normally would.

Like boss said, it would be good to have TRACE logs. The fact that it appeared in v7 is good news because it means we can probably fix it. It would help to really isolate the version; have you definitely verified that it was not happening in the v7 betas and first appeared in the RC?
Re: UMS getting stuck, some hard to spot bug
I had no issues in v6, but I used it for quite a short period since I am a new user, so I can't really exclude it.

I update UMS when I can; I had this issue on 7.0.0, 7.0.0-rc1, 7.0.0-rc2, and 7.0.1. My last post is based on 7.3.1.

I have now upgraded to 7.4.0 and am waiting for the issue to show up.
Re: UMS getting stuck, some hard to spot bug
Ok great, thanks. Please make sure you're running in TRACE logging mode so that we have the logs if it happens again.
Re: UMS getting stuck, some hard to spot bug
It happened again after about 2 months of uptime. Looks like it is improving.

Trace logs show nothing unusual (UPNPHelper receiving Notify and UPNP-AliveMessageSender sending alive the whole time).

The only slightly odd thing is File watcher logging an ENTRY_MODIFY line for every single file in the collection, every day around 6 AM:

Code:
DEBUG 2018-12-22 06:25:02.149 [File watcher] net.pms.util.FileWatcher ENTRY_MODIFY (ct=1): <filename>

I am now updating to 7.7.0 and restarting it.
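A daily ENTRY_MODIFY event for every file suggests something on the box is touching the collection's mtimes around 6 AM (a backup job, an indexer, or similar). One way to check is to list what was actually modified in the last day; a sketch using GNU find, with the media path as a placeholder:

```shell
#!/bin/sh
# recently_modified DIR -- list files under DIR modified in the last
# 24 hours, oldest first, with their epoch mtimes, to see whether the
# 6 AM ENTRY_MODIFY burst reflects real mtime changes.
recently_modified() {
    find "$1" -type f -mmin -1440 -printf '%T@ %p\n' | sort -n | tail
}

# Example (the media path is a placeholder for your collection root):
# recently_modified /path/to/media
```

If the mtimes really do change nightly, the File watcher events are a symptom of whatever job is rewriting them, not of UMS itself.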
Re: UMS getting stuck, some hard to spot bug
Hi bodom,
We have made some big improvements to long-term stability in 8.0.0, so you might be interested in testing that too. Two months is a nice run; maybe you can get even longer in v8.
Re: UMS getting stuck, some hard to spot bug
Thank you!
I'll try the 8.x next
Re: UMS getting stuck, some hard to spot bug
Not a real update, but just to let you know that it got stuck again and I am now about to try 8.0.0-b1.