MobileSheets Forums

Full Version: Small problem in searching
Pages: 1 2
Searching for songs with non-English characters in the title (e.g., with umlauts) is working fine mostly, but the Norwegian o with a slash (ø) isn't showing up if you only enter the unmodified character. For example, Søreide doesn't get found if you enter Soreide. This is true in Companion as well as on the tablet.
Thanks!
Jeanne
Hi Jeanne,

I've enclosed two screenshots that seem to indicate that search
works as expected at least for the Title and Composer fields in
the current MobileSheetsPro.

Since I no longer have a tablet with the (original) MobileSheets,
I'm unable to test that.


Ketil
Hi,
It's searching correctly when you put the special character in the search box, which is what your screenshots show. It should also find those words if you only put a regular character (makes it quicker to search on a standard English keyboard, and also makes behavior consistent with Google and other search engines). So: if I search for "soreide" it should retrieve both "Søreide" and "Soreide". It works correctly for ö but not ø.
Thanks,
Jeanne
I'm guessing Mike doesn't have (hasn't been told) the equivalent combinations to include.

If you want

æ  also = a or ae
ø also = o or oe
å also = a or aa

you probably just have to tell Mike about it.

Personally, I'm not crazy about the "single-letter" equivalents because it leads to too
many false results, but go ahead and suggest it to Mike.

Ketil
This is going to get a bit technical, so forgive me, but here is how words are transformed before a search. The words are normalized using Normalization Form D (Canonical Decomposition), which is defined as follows: characters are decomposed by canonical equivalence, and multiple combining characters are arranged in a specific order. This splits a special character such as å into a plus a diacritical mark, which I then remove, leaving just 'a'. The transformation is applied both to the search term entered in the box and to all of the song values, so that no special characters are used in the comparisons.

This handles most cases, but it would appear that ø is not just an o with a diacritical mark. It's an entirely separate character, so in order to handle it I would have to add specific character substitutions during my search. I'll have to consider how I want to handle that going forward. If this is going to be important to a lot of users, I can consider replacing ø with o before comparisons as well, but I don't want to add the overhead unless it's actually useful to the majority of people.
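
For anyone who wants to see the behavior for themselves, here is a minimal sketch of the approach described above. It is written in Python for brevity and is not the app's actual code; the names strip_diacritics, EXTRA_FOLDS and fold are made up for illustration, and the case folding at the end is an assumption.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose with NFD, then drop the combining marks that result."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Characters with no canonical decomposition need an explicit substitution.
# This table is a hypothetical example, not a complete list.
EXTRA_FOLDS = str.maketrans({"ø": "o", "Ø": "O"})


def fold(text: str) -> str:
    """Strip diacritics, apply the extra substitutions, ignore case (assumed)."""
    return strip_diacritics(text).translate(EXTRA_FOLDS).casefold()


print(strip_diacritics("å"))        # a decomposes, so the mark is removed
print(strip_diacritics("Søreide"))  # ø does not decompose, so it is left alone
print(fold("Søreide") == fold("Soreide"))
```

The second call is the case from the original report: NFD leaves ø untouched, so only the extra substitution step makes "Soreide" and "Søreide" compare equal.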

Mike
Fiddler Jeanne's point is that Google's normalization works "both ways":
it's possible to enter "oe" and get results containing "ø", and it's even
possible to enter "o" and get results containing "ø".

So all three searches on the composer field should give results that
include "Løvland" (see attachments).
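
One possible way to get that "both ways" behavior is sketched below in Python. It is only an illustration: the two substitution tables are taken from the equivalents listed earlier in this thread, the function names are hypothetical, and nothing here is claimed to be what Google or MobileSheets actually does.

```python
import unicodedata

# Single-letter and two-letter equivalents for the Norwegian letters.
SINGLE = str.maketrans({"æ": "a", "ø": "o", "å": "a"})
DIGRAPH = str.maketrans({"æ": "ae", "ø": "oe", "å": "aa"})


def _fold(text, table):
    # Compose first so that å is one character when the table is applied,
    # then lower the case, substitute, and strip any remaining diacritics.
    composed = unicodedata.normalize("NFC", text).casefold()
    substituted = composed.translate(table)
    decomposed = unicodedata.normalize("NFD", substituted)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_keys(text):
    """Return both folded spellings of a string."""
    return {_fold(text, SINGLE), _fold(text, DIGRAPH)}


def matches(query, value):
    """True if any folded form of the query occurs in any folded form of the value."""
    return any(q in v for q in search_keys(query) for v in search_keys(value))


for query in ("Lovland", "Loevland", "Løvland"):
    print(query, matches(query, "Løvland"))
```

The single-letter table is the one that risks extra false hits, since a plain "o" or "a" then matches more titles; leaving SINGLE out of search_keys would keep only the two-letter equivalents.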

Ketil
Well, if someone knows of an implementation (i.e. a link to a code snippet) that can handle normalizing across all languages so that standard characters can be used, please let me know. Google may be doing a lot of language-specific processing on the search phrase and results. In theory I can do the same, but I'm not familiar enough with the rules and characters of each language to know what replacements are proper without considerable research.
I'm considering adding a language check so that if the app language is set to Norwegian, it will perform the replacement of ø. Would that be sufficient, or do you also want it to work when the app language is set to English? I'm just trying to avoid having to perform multiple string substitutions/operations while comparing the search string to every field in the library (I basically have to normalize and substitute not only the search term but everything I compare the search term against as special characters cannot be present in either). I'm not sure what the performance impact will be, but unless it has a significant benefit, I'm looking to avoid it altogether.
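
On the overhead question, one option is to fold each library value once when it is stored and fold only the search term at query time. The Python sketch below shows the idea under that assumption; SongIndex, fold and EXTRA_FOLDS are hypothetical names, and whether the app can cache values this way is not something this thread establishes.

```python
import unicodedata

# Hypothetical substitution table for characters NFD does not decompose.
EXTRA_FOLDS = str.maketrans({"ø": "o", "Ø": "O"})


def fold(text):
    """NFD-decompose, drop combining marks, substitute, ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.translate(EXTRA_FOLDS).casefold()


class SongIndex:
    """Keeps a folded copy of each title so searches fold only the query."""

    def __init__(self):
        self._entries = []  # list of (original title, folded title)

    def add(self, title):
        # The folding cost is paid once here, not on every search.
        self._entries.append((title, fold(title)))

    def search(self, query):
        key = fold(query)
        return [title for title, folded in self._entries if key in folded]


index = SongIndex()
for title in ("Søreide", "Soreide", "Sonata"):
    index.add(title)
print(index.search("soreide"))
```

With this layout each search costs one fold of the query plus plain substring checks, however many substitutions the fold itself performs.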

Thanks,
Mike
I'm afraid non-Norwegians may have Norwegian songs in their songbooks...
Also, my language is set to English even though I'm Dutch.
I agree with you, sciurius,

this is of most interest when you don't have the keyboard for the national
characters you're searching for. 

Ketil
I'll just go ahead and make the replacement always then. I'll test with a library of 5000 songs and if I don't notice a performance issue, I'll consider this particular issue resolved.

Thanks,
Mike
(02-08-2016, 03:10 AM)Ketil Wrote: [ -> ]this is of most interest when you don't have the keyboard for the national characters you're searching for.

Is that really an issue? This is not 1980 anymore. I haven't seen a computer system that didn't have the possibility to enter virtually any character using selected input methods (e.g., dead keys, compose keys). Tablets and phones as well.
Great, you learn something new every day:

Please tell me what combinations of keys/long holds I use to produce
"à", "â", "Ç", "Ê", "ë", "î", "ô", "ù", "Û" using SwiftKey on my tablet?

Ketil
On the standard Android keyboard, you long press "a" (and several other keys) and an extension panel appears where you can directly choose the accented variants.
I use the Hackers keyboard, which does the same.
No doubt SwiftKey has a similar function, but I don't use it so I cannot tell you. Interesting that you install a replacement input method without knowing what the standard tool does...
BTW, wrt the original question, a long press on "o" also includes "ø" and "œ".
yes,

although different keyboards have different panels (my Archos FamilyPad has another variant). 
I've been using SwiftKey for the last five years because of its strong prediction engine.

Ketil