UMS and Special Characters in Subtitles
This is not a bug per se, more like UMS chooses poorly(?)
In the past I have left the non-Unicode subtitle encoding detection set to Auto-detect. I don't know which encoding it picks, but with auto-detection curly quotes, en and em dashes, and ellipses are not displayed correctly. For example, the curly apostrophe in it’s comes out garbled. Manually setting the encoding to Windows-1252 solves these problems. I'm not sure whether UMS needs to improve the detection or not; I'm just reporting what I have noticed.
Re: UMS and Special Characters in Subtitles
Character set detection is very tricky, especially for short texts. UMS uses ICU4J for this. I don't think we would be able to "improve" the detection: some of the character sets are very similar, and some texts simply don't use words that reveal the difference.
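To illustrate why this is hard, here is a rough Python sketch (not what UMS or ICU4J actually does). It takes the bytes a Windows-1252 editor would write for a curly apostrophe and an ellipsis and decodes them with a few single-byte code pages. The helper name `decode_candidates` and the list of code pages are just illustrative assumptions; the point is that every decode succeeds, so the bytes alone don't reveal which encoding was intended.

```python
def decode_candidates(raw, encodings=("cp1252", "latin-1", "cp437")):
    """Decode the same bytes with several single-byte encodings.

    Hypothetical helper for illustration only. The encodings listed are
    examples, not the set of charsets that ICU4J considers.
    """
    return {name: raw.decode(name) for name in encodings}


if __name__ == "__main__":
    # Bytes for: it’s… as saved by a Windows-1252 editor (0x92 and 0x85).
    raw = "it\u2019s\u2026".encode("cp1252")

    # None of these decodes raises an error; only the resulting text differs.
    for name, text in decode_candidates(raw).items():
        print(f"{name:8} -> {text!r}")
```

A detector has to guess from statistics such as typical letter patterns for a language, and a short subtitle file with mostly ASCII text gives it very little to work with.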
What you should do instead is use UTF-8. Converting the subtitle files to UTF-8 is easy, for example with Notepad++.
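If you have many files, the conversion can also be scripted. Below is a minimal Python sketch, assuming the source files really are Windows-1252 (as the manual setting in the first post suggests). The function name `convert_to_utf8` and the file names in the usage note are made up for the example.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode a subtitle file as UTF-8.

    Assumes the file is in `source_encoding` (Windows-1252 by default).
    Working on bytes keeps the original line endings untouched.
    """
    raw = Path(src).read_bytes()
    # Raises UnicodeDecodeError if the file contains bytes that are
    # undefined in the assumed encoding, rather than guessing silently.
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `convert_to_utf8("movie.srt", "movie.utf8.srt")` would write a UTF-8 copy next to the original. Keeping the original file is a deliberate choice here, so a wrong guess about the source encoding can be undone.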
Re: UMS and Special Characters in Subtitles
Thank you for your explanation! Much appreciated.