UMS getting stuck, some hard to spot bug

bodom
Posts: 24
Joined: Wed Nov 29, 2017 9:18 am

Post by bodom »

Hi there,

I am running UMS 24/7 on Linux. It runs for days, then sometimes it gets stuck. I have been experiencing this behaviour since 7.0.0 RC.

The symptom is simple: the whole server gets stuck, unable to spawn new processes.

The logs seem normal except for one suspicious line:

Code:

DEBUG 2018-10-07 18:25:40.528 [HTTPv2 Request Worker 5] Caught exception: unable to create new native thread

The rest is all "Sending ALIVE" and "File watcher".

One more clue: while the problem is happening, I am unable to log in to that box. But if I leave an SSH session open and am lucky enough to have it hang while the session is open, I see more symptoms:
  • "lsof" hangs forever, even when using the "-b" switch
  • I cannot sudo into the ums user, because "Cannot execute /bin/bash: Resource temporarily unavailable"
  • Killing the java ums process with SIGKILL (SIGTERM is ignored) restores the system back to normal behavior
My guess is that UMS somehow leaks resources: when left running for a long time, it keeps creating new "something" (file handles? threads?) until it reaches the OS limits. When this happens, it keeps running but is unable to answer new requests, just like every other process on the system.
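
One way to tell threads from file handles is to sample both counts for the UMS java process over time and see which one keeps growing. Below is a minimal Python sketch, assuming a Linux box with /proc mounted and that it runs as the ums user or root (otherwise /proc/<pid>/fd is not readable). The function names `parse_thread_count`, `sample` and `monitor` are made up for this example and are not part of UMS.

```python
import os
import time
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no 'Threads:' line found")


def sample(pid: int) -> tuple[int, int]:
    """Return (thread_count, open_fd_count) for a process on Linux."""
    proc = Path("/proc") / str(pid)
    threads = parse_thread_count((proc / "status").read_text())
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    fds = len(os.listdir(proc / "fd"))
    return threads, fds


def monitor(pid: int, interval_s: float = 60.0, samples=None) -> None:
    """Print one timestamped sample per interval; runs forever if samples is None."""
    taken = 0
    while samples is None or taken < samples:
        threads, fds = sample(pid)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} threads={threads} fds={fds}", flush=True)
        taken += 1
        time.sleep(interval_s)
```

Calling `monitor(pid)` with the PID of the java process and redirecting the output to a file leaves a record that is still readable after the box has been recovered. A count that climbs steadily over days would point at the leaking resource.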
boss
Posts: 343
Joined: Thu Jun 30, 2016 1:07 pm

Re: UMS getting stuck, some hard to spot bug

Post by boss »

You are correct about it "leaking resources". I run UMS on a Linux server 24/7 as well. In my case it does not affect the server; UMS itself just crashes.
I have researched the error I get when it crashes, which is a "Java Heap Space" error. All indications are that it is just bad Java code.

The only thing I have been able to do is add a cron job to restart UMS every 2 days.

Did you run UMS in "Trace" mode, which gives you much more information in the log files?
bodom
Posts: 24
Joined: Wed Nov 29, 2017 9:18 am

Re: UMS getting stuck, some hard to spot bug

Post by bodom »

My crash usually occurs every 15+ days.

I have now tried setting a limit of 1000 processes for the ums user in "/etc/security/limits.conf". Let's see if this alters the crash frequency and prevents the whole system from freezing. If it does, it will also confirm that the problem is the number of processes/threads.
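
Since limits.conf is applied by pam_limits when a session is opened, a server started some other way (for example by an init system) may not pick the new limit up. The limit the running java process actually has can be read from /proc/<pid>/limits. This is a minimal Python sketch, assuming Linux; `parse_max_processes` and `max_processes` are names invented for this example.

```python
from pathlib import Path


def parse_max_processes(limits_text: str) -> tuple[str, str]:
    """Return the (soft, hard) 'Max processes' values from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            # Columns: "Max", "processes", soft limit, hard limit, units
            fields = line.split()
            return fields[2], fields[3]
    raise ValueError("no 'Max processes' line found")


def max_processes(pid: int) -> tuple[str, str]:
    """Read the effective process/thread limit of a running process."""
    return parse_max_processes((Path("/proc") / str(pid) / "limits").read_text())
```

If `max_processes(pid)` for the UMS process returns `("1000", "1000")`, the limit is in effect. On Linux this limit counts threads as well as processes for the user, so it should cap a thread leak too.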

I have now also enabled trace mode, thank you for the hint.
SubJunk
Lead Developer
Posts: 3705
Joined: Sun May 27, 2012 4:12 pm

Re: UMS getting stuck, some hard to spot bug

Post by SubJunk »

Yes, this does sound similar to the bug that boss has been experiencing for a long time, but his was happening before v7. However, your problem is likely different from his, which is caused by boss' TV spamming browse requests in a loop, making UMS eat up a lot more memory than it normally would.

Like boss said, it would be good to have TRACE logs. The fact that it appeared in v7 is good news because it means we can probably fix it. It would be good to really isolate the version: have you definitely verified that it was not happening in the v7 betas, and first appeared in RC?
bodom
Posts: 24
Joined: Wed Nov 29, 2017 9:18 am

Re: UMS getting stuck, some hard to spot bug

Post by bodom »

I had no issues in v6, but I used it for quite a short period since I am a new user, so I can't really rule it out.

I update UMS when I can. I had this issue on 7.0.0, 7.0.0-rc1, 7.0.0-rc2 and 7.0.1; my last post is based on 7.3.1.

I have now upgraded to 7.4.0 and am waiting for the issue to show up.
SubJunk
Lead Developer
Posts: 3705
Joined: Sun May 27, 2012 4:12 pm

Re: UMS getting stuck, some hard to spot bug

Post by SubJunk »

OK, great, thanks. Please make sure you're running in TRACE logging mode so that we can have the logs if it happens again.
bodom
Posts: 24
Joined: Wed Nov 29, 2017 9:18 am

Re: UMS getting stuck, some hard to spot bug

Post by bodom »

It happened again after about 2 months of uptime. Looks like it is improving.

Trace logs have nothing unusual (UPNPHelper receiving Notify and UPNP-AliveMessageSender sending alive the whole time).

The only slightly odd thing is File watcher logging an ENTRY_MODIFY line for every single file in the collection, every day around 6 AM.

Code:

DEBUG 2018-12-22 06:25:02.149 [File watcher] net.pms.util.FileWatcher ENTRY_MODIFY (ct=1): <filename>

I am now updating to 7.7.0 and restarting it.
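
To confirm the daily pattern rather than eyeballing it, the ENTRY_MODIFY lines can be bucketed per hour. This is a minimal Python sketch, assuming the log lines look exactly like the one quoted above; `modify_events_per_hour` is a name made up for this example.

```python
import re
from collections import Counter

# Matches lines such as:
# DEBUG 2018-12-22 06:25:02.149 [File watcher] net.pms.util.FileWatcher ENTRY_MODIFY (ct=1): <filename>
LINE_RE = re.compile(
    r"^\w+\s+(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2}\.\d+ \[File watcher\] .*ENTRY_MODIFY"
)


def modify_events_per_hour(lines):
    """Count File watcher ENTRY_MODIFY lines per (date, hour) bucket."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Feeding it the lines of the UMS debug log gives one count per date and hour, so a spike in the 06:00 bucket every day stands out immediately. As a guess only: 06:25 is the default time at which Debian-based systems run /etc/cron.daily, so a daily job touching the media files is one possible cause.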
SubJunk
Lead Developer
Posts: 3705
Joined: Sun May 27, 2012 4:12 pm

Re: UMS getting stuck, some hard to spot bug

Post by SubJunk »

Hi bodom,
We have made some big improvements to long-term stability in 8.0.0, so you might be interested in testing that too. Two months is a nice run; maybe you can get even longer in v8.
bodom
Posts: 24
Joined: Wed Nov 29, 2017 9:18 am

Re: UMS getting stuck, some hard to spot bug

Post by bodom »

Thank you!

I'll try 8.x next.
bodom
Posts: 24
Joined: Wed Nov 29, 2017 9:18 am

Re: UMS getting stuck, some hard to spot bug

Post by bodom »

Not a real update, but just to let you know that it got stuck again and I am now about to try 8.0.0-b1.