I have a large number of names, mostly using a German character set, i.e., ASCII plus ä,ö,ü,ß. Some names use special characters (e.g. ğ) which I would like to transliterate into the German version. So, "Özoğuz" should become "Özoguz".
I have tried
stri_trans_general("Özoğuz", "de-ASCII")
but that will result in "Oezoguz" not the desired "Özoguz".
The de-ASCII rule set translates Ö to Oe. If you want to deviate from this rule but otherwise maintain the German ASCII rule set, the stringi docs state that custom rule-based transliteration is also supported.
We can define rules which translate (upper and lower case) Ö to a third character, apply the de-ASCII rules to everything else, then translate the third character back to Ö:
id <- "
Ö > \u2135;
ö > \u2136;
:: de-ASCII;
\u2135 > Ö;
\u2136 > ö
"
stringi::stri_trans_general("Özoğuz", id, rules = TRUE)
# [1] "Özoguz"
I have used "ℵ" (U+2135) and "ℶ" (U+2136) for upper and lower case Ö respectively, but any Unicode characters you are sure will not be in your strings should work.
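The same placeholder trick can be extended to the other characters the question wants to keep (ä, ü, ß). This is a minimal sketch, not a tested recipe: it assumes the private-use code points U+E000 to U+E006 never occur in your data, and the names keep, placeholder, rules and keep_german are made up for illustration.

```r
library(stringi)

# Characters to leave untouched (assumed to be the full set you want to keep).
keep <- c("Ä", "ä", "Ö", "ö", "Ü", "ü", "ß")

# Hypothetical placeholders: private-use code points U+E000..U+E006,
# assumed never to appear in the input.
placeholder <- intToUtf8(0xE000 + seq_along(keep) - 1, multiple = TRUE)

# Pass 1 hides the German characters, pass 2 applies de-ASCII,
# pass 3 restores the hidden characters.
rules <- paste(
  paste0(keep, " > ", placeholder, ";", collapse = "\n"),
  ":: de-ASCII;",
  paste0(placeholder, " > ", keep, ";", collapse = "\n"),
  sep = "\n"
)

keep_german <- function(x) {
  # Guard against inputs that already contain one of the placeholders.
  clash <- stri_detect_regex(x, paste0("[", paste(placeholder, collapse = ""), "]"))
  stopifnot(!any(clash, na.rm = TRUE))
  stri_trans_general(x, rules, rules = TRUE)
}

keep_german(c("Özoğuz", "Gündoğan", "Weiß"))
```

Since stri_trans_general is vectorised over its first argument, the whole column of names can be passed in one call. If some names arrive in decomposed form (a base letter followed by a combining diaeresis), the literal rules above may not match; normalising with stri_trans_nfc first is probably enough, but I have not checked that case.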