I cannot get the attached file to import correctly. The file is properly encoded (UTF-8), but MSPro interprets it as ISO-8859-1 (so the ’ shows as â€™).
I have many UTF-8 files that are interpreted correctly but I cannot find out why this one is not.
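For anyone who wants to reproduce the symptom outside the app, here is a small Python sketch. It is only an illustration, not MobileSheetsPro's code. It decodes the UTF-8 bytes of ’ with a single-byte encoding. Windows-1252 is included because it makes all three bytes visible; strict ISO-8859-1 maps two of them to invisible control characters.

```python
# Illustration only: decode the UTF-8 bytes of a curly apostrophe with the wrong codec.
original = "\u2019"                 # RIGHT SINGLE QUOTATION MARK (the curly apostrophe)
raw = original.encode("utf-8")      # three bytes: E2 80 99

as_cp1252 = raw.decode("cp1252")    # each byte is turned into its own character
as_latin1 = raw.decode("latin-1")   # same idea, but 0x80 and 0x99 become control characters
as_utf8 = raw.decode("utf-8")       # the intended reading: one character

print(repr(raw), repr(as_cp1252), repr(as_latin1), repr(as_utf8))
```

One character turning into three is the usual sign that UTF-8 text was read with a one-byte-per-character encoding.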
I'll try to look into it, but I'm using the CharsetDetector from the ICU library, so I don't really have a lot of insight into how it determines what encoding to use. It's possible that this is just one file where you will have to manually change the encoding in MobileSheetsPro.
08-25-2017, 07:34 PM (This post was last modified: 08-25-2017, 08:50 PM by AndyL.)
I don't know if it matters, but the file doesn't have a BOM (Byte Order Mark) identifying it as UTF-8. If I load the file into Notepad++, change the encoding to UTF-8-BOM, save, and load the new file into MSPro, I don't get the odd characters you show.
I just checked with your original file and I do get the odd characters. So adding the BOM improves/changes things.
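The Notepad++ experiment can also be scripted. The sketch below, in Python, checks whether a byte string starts with the UTF-8 BOM (the bytes EF BB BF) and prepends it if it is missing. The helper names `has_utf8_bom` and `add_utf8_bom` are made up for this example, and the sample text is a placeholder, not the contents of the attached file.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM unless it is already present."""
    return data if has_utf8_bom(data) else codecs.BOM_UTF8 + data

# Placeholder content containing a curly apostrophe, encoded without a BOM.
sample = "Don\u2019t stop".encode("utf-8")
with_bom = add_utf8_bom(sample)

# The "utf-8-sig" codec strips a leading BOM when decoding, if one is there.
decoded = with_bom.decode("utf-8-sig")
print(has_utf8_bom(sample), has_utf8_bom(with_bom), decoded)
```

To apply this to a real file, read it in binary mode, pass the bytes through `add_utf8_bom`, and write the result back in binary mode.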
I agree the BOM shouldn't be necessary, but it might help Mike track down the problem. Really you should be able to treat everything as UTF-8 unless you find a marker to the contrary. It's the unicode files without a BOM that are "wrong". (Although apps that don't accept a BOM are suspect :-) ) Text handling is a minefield best left to the experts.
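The "treat everything as UTF-8 unless there is evidence to the contrary" idea could be sketched like this in Python. This is an assumption about one possible approach, not how MobileSheetsPro or ICU's CharsetDetector actually works. The function name `decode_utf8_first` and the choice of Latin-1 as the fallback are hypothetical.

```python
from typing import Tuple

def decode_utf8_first(data: bytes, fallback: str = "latin-1") -> Tuple[str, str]:
    """Decode as UTF-8 (with or without BOM); use the fallback only if that fails.

    Returns the decoded text and the name of the encoding that was used.
    """
    try:
        # "utf-8-sig" accepts a leading BOM and strips it, but does not require one.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Bytes that are not valid UTF-8: assume a legacy single-byte encoding.
        # Latin-1 maps every byte to a character, so this step cannot fail.
        return data.decode(fallback), fallback

print(decode_utf8_first("Don\u2019t".encode("utf-8")))   # valid UTF-8, no BOM
print(decode_utf8_first(b"caf\xe9"))                     # not valid UTF-8
```

The reasoning behind trying strict UTF-8 first is that text in a legacy single-byte encoding with accented characters is unlikely to form valid UTF-8 sequences by accident, so a successful strict decode is fairly strong evidence even without a BOM.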
08-26-2017, 06:04 AM (This post was last modified: 08-26-2017, 06:04 AM by sciurius.)
(08-26-2017, 05:32 AM)AndyL Wrote: Really you should be able to treat everything as UTF-8 unless you find a marker to the contrary. It's the unicode files without a BOM that are "wrong".
These two sentences confirm your conclusion that "Text handling is a minefield best left to the experts."
The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use. (emphasis mine)