MobileSheets Forums

Full Version: Encoding problem
I cannot get the attached file to import correctly. The file is properly encoded (UTF-8), but MSPro interprets it as ISO-8859-1 (so the ’ shows as â€™).

I have many UTF-8 files that are interpreted correctly but I cannot find out why this one is not.

[attachment=798]
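For anyone wanting to reproduce the symptom outside the app, here is a minimal Python sketch (not MobileSheetsPro code). It assumes the garbled characters come from a Windows-1252-style single-byte decode, which is how text labelled ISO-8859-1 is usually rendered in practice; the variable names are only illustrative.

```python
# The right single quotation mark U+2019 takes three bytes in UTF-8.
original = "\u2019"
utf8_bytes = original.encode("utf-8")   # b'\xe2\x80\x99'

# A single-byte code page maps each of those bytes to its own character,
# which produces the three odd characters instead of one apostrophe.
misread = utf8_bytes.decode("cp1252")

# Decoding the same bytes as UTF-8 restores the original character.
restored = utf8_bytes.decode("utf-8")

print(repr(misread), repr(restored))
```

The bytes in the file are identical in both cases; only the charset chosen for decoding differs.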
I'll try to look into it, but I'm using the CharsetDetector from the ICU library, so I don't really have a lot of insight into how it determines what encoding to use. It's possible that this is just one file where you will have to manually change the encoding in MobileSheetsPro.
I don't know if it matters, but the file doesn't have a BOM (Byte Order Mark) identifying it as UTF-8. If I load the file into Notepad++, change the encoding to UTF-8-BOM, save, and load the new file into MSPro, I don't get the odd characters you show.

I just checked with your original file and I do get the odd characters. So adding the BOM improves/changes things.
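As a rough illustration of what the Notepad++ "UTF-8-BOM" conversion does at the byte level, here is a small Python sketch. The helper name `add_utf8_bom` and the sample text are assumptions made for the example, not anything taken from MobileSheetsPro or Notepad++.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM (bytes EF BB BF) unless it is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


# Hypothetical sample standing in for the contents of the attached file.
sample = "It\u2019s".encode("utf-8")
with_bom = add_utf8_bom(sample)

# The "utf-8-sig" codec drops a leading BOM if one is present and
# otherwise behaves like plain UTF-8, so it reads both variants.
text_with_bom = with_bom.decode("utf-8-sig")
text_without_bom = sample.decode("utf-8-sig")
```

The BOM only adds three bytes at the start of the file; the rest of the content is unchanged, which is why a detector can use it as an unambiguous hint.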

HTH

Andy
Yes, adding a BOM helps, although a BOM should not be necessary for UTF-8 encoded data. And many apps do not support UTF-8 BOMs.
I agree the BOM shouldn't be necessary, but it might help Mike track down the problem. Really you should be able to treat everything as UTF-8 unless you find a marker to the contrary. It's the Unicode files without a BOM that are "wrong". (Although apps that don't accept a BOM are suspect :-) ) Text handling is a minefield best left to the experts.
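The "UTF-8 unless there is a marker to the contrary" idea could be sketched roughly as below. This is only an illustration in Python, not how MobileSheetsPro or the ICU CharsetDetector works; the function name `decode_text` and the choice of `cp1252` as the fallback are assumptions.

```python
import codecs
from typing import Tuple


def decode_text(data: bytes, fallback: str = "cp1252") -> Tuple[str, str]:
    """Return (text, charset_used) for raw file bytes.

    Order of checks: explicit BOM markers first, then strict UTF-8,
    then a single-byte fallback charset (an assumption of this sketch).
    """
    # A UTF-8 BOM is an explicit marker; strip it and decode the rest.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    # A UTF-16 BOM is a marker to the contrary; the "utf-16" codec
    # consumes the BOM and picks the byte order from it.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16"), "utf-16"
    try:
        # No marker: assume UTF-8, but decode strictly so that invalid
        # byte sequences are noticed instead of silently accepted.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: use the fallback, replacing undefined bytes.
        return data.decode(fallback, errors="replace"), fallback
```

Strict UTF-8 decoding does most of the work here: text in a single-byte charset that contains accented characters is very rarely valid UTF-8 by accident, so a file that decodes cleanly is almost certainly UTF-8 even without a BOM.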

Andy
(08-26-2017, 05:32 AM) AndyL wrote: Really you should be able to treat everything as UTF-8 unless you find a marker to the contrary. It's the Unicode files without a BOM that are "wrong".

These two sentences confirm your conclusion that "Text handling is a minefield best left to the experts." :-)

The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use. (emphasis mine)