Can anyone help me with a JavaScript regular expression that I can use to treat strings as equal, taking into account their non-umlauted versions?
For example, in German the word Grüße can also be written Gruesse. These two strings are to be considered identical. The mappings (ignoring case for the moment) are:
ä = ae, ö = oe, ü = ue, ß = ss
As there are not many "couplets" to consider, I could do a replace for each variation, but I'm wondering if there is a more elegant way, especially as this use case might need to be extended in future to include e.g. Scandinavian characters.
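Something like this:
// Maps each special character to its ASCII transliteration
var tr = {"ä": "ae", "ü": "ue", "ö": "oe", "ß": "ss"};
// Replace every mapped character with its transliteration
var replaceUmlauts = function(s) {
    return s.replace(/[äöüß]/g, function($0) { return tr[$0]; });
};
// Two strings are equal if their transliterated forms match
var compare = function(a, b) {
    return replaceUmlauts(a) === replaceUmlauts(b);
};
alert(compare("grüße", "gruesse")); // true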
You can easily extend this by adding more entries to "tr" (and the same characters to the character class in the regular expression).
Not quite elegant, but it works.
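If the table is expected to grow, one possible variation is to build the character class from the keys of the table, so the two cannot drift apart. The sketch below is only an illustration under a few assumptions: the name fold is made up here, the Scandinavian entries (å, ø, æ) are example transliterations rather than anything from the question, and the lowercasing and NFC normalization steps are additions to handle case and decomposed input.

```javascript
// Transliteration table. The Scandinavian entries are illustrative
// assumptions, not part of the original question.
const tr = {
  "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
  "å": "aa", "ø": "oe", "æ": "ae"
};

// Build the character class from the table keys, so adding an entry
// to tr is the only change needed. None of the keys are regex
// metacharacters, so no escaping is required here.
const pattern = new RegExp("[" + Object.keys(tr).join("") + "]", "g");

// fold is a hypothetical helper name: compose decomposed characters
// (NFC), lowercase, then replace each mapped character.
function fold(s) {
  return s
    .normalize("NFC")
    .toLowerCase()
    .replace(pattern, (ch) => tr[ch]);
}

// Two strings compare equal if their folded forms are identical.
function compare(a, b) {
  return fold(a) === fold(b);
}
```

With this version, compare("Grüße", "GRUESSE") is also true, because both sides are lowercased before the replacement.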