I'm trying to do a regex operation in Java, but I'm having trouble when I search Turkish text. For example:
Search Text = "Ahmet Yıldırım" or "Esin AYDEMİR"
// The regex string is taken from part of the e-mail address (e.g. yildirim@example.com), and I try to find it in the name.
Regex Strings = "yildirim" or "aydemir".
The searched text changes dynamically. How can I solve this with a Java regex pattern? Or how do I convert Turkish characters (e.g. AYDEMİR -> AYDEMIR or Yıldırım -> Yildirim)?
Sorry about my grammar mistakes!
Use the Pattern.CASE_INSENSITIVE and Pattern.UNICODE_CASE flags:
Pattern p = Pattern.compile("yildirim", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
By default, Pattern.CASE_INSENSITIVE only matches case-insensitively for characters in the US-ASCII character set. Pattern.UNICODE_CASE modifies this behavior so that matching is case-insensitive for all Unicode characters.
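To make the difference between the two flag combinations concrete, here is a minimal sketch using the names from the question. The class name UnicodeCaseDemo and the helper found() are made up for illustration; the Turkish letters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the sample names come from the question.
final class UnicodeCaseDemo {
    // Returns true if `needle` occurs anywhere in `text` under the given regex flags.
    static boolean found(String needle, String text, int flags) {
        return Pattern.compile(needle, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        // "Ahmet Yildirim" spelled with dotless i (U+0131) and
        // "Esin AYDEMIR" spelled with dotted capital I (U+0130).
        String first = "Ahmet Y\u0131ld\u0131r\u0131m";
        String second = "Esin AYDEM\u0130R";

        int asciiOnly = Pattern.CASE_INSENSITIVE;
        int unicode = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        System.out.println(found("yildirim", first, asciiOnly)); // false
        System.out.println(found("yildirim", first, unicode));   // true
        System.out.println(found("aydemir", second, asciiOnly)); // false
        System.out.println(found("aydemir", second, unicode));   // true
    }
}
```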
Note that Unicode case-insensitive matching in Java regex is done in a culture-insensitive manner. Therefore ı, i, I and İ are all considered the same character.
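A small sketch of that equivalence: with both flags set, each of the four characters matches every other one. TurkishIDemo and sameUnderRegex() are illustrative names, and code points are printed instead of the letters themselves to avoid console encoding issues.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing that the four "i" variants are interchangeable.
final class TurkishIDemo {
    static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

    // True if the single-character pattern `x` matches the whole string `y`.
    static boolean sameUnderRegex(String x, String y) {
        return Pattern.compile(x, FLAGS).matcher(y).matches();
    }

    public static void main(String[] args) {
        // i, I, dotless i (U+0131), dotted capital I (U+0130)
        String[] variants = {"i", "I", "\u0131", "\u0130"};
        for (String pattern : variants) {
            for (String input : variants) {
                System.out.printf("U+%04X vs U+%04X: %b%n",
                        (int) pattern.charAt(0), (int) input.charAt(0),
                        sameUnderRegex(pattern, input));
            }
        }
    }
}
```

This is convenient for the e-mail-to-name lookup in the question, but it also means the regex cannot tell the Turkish dotted and dotless letters apart.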
Depending on your use case, you might want to use Pattern.LITERAL to disable all metacharacters in the pattern, or escape only the literal parts of the pattern with Pattern.quote().
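Since the search text is dynamic, it may contain regex metacharacters such as a dot. The sketch below shows both options side by side; LiteralSearchDemo and its two helpers are hypothetical names, and the inputs are invented for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: two ways to treat a dynamic search string literally.
final class LiteralSearchDemo {
    static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

    // Option 1: escape the dynamic part with Pattern.quote().
    static boolean findQuoted(String needle, String text) {
        return Pattern.compile(Pattern.quote(needle), FLAGS).matcher(text).find();
    }

    // Option 2: treat the whole pattern as a literal with Pattern.LITERAL.
    // The case-insensitivity flags still apply in combination with LITERAL.
    static boolean findLiteral(String needle, String text) {
        return Pattern.compile(needle, FLAGS | Pattern.LITERAL).matcher(text).find();
    }

    public static void main(String[] args) {
        String name = "Ahmet Y\u0131ld\u0131r\u0131m"; // dotless i is U+0131

        System.out.println(findQuoted("yildirim", name));  // true
        System.out.println(findLiteral("yildirim", name)); // true

        // Unescaped, each '.' would match any character; escaped, it must be a real dot.
        System.out.println(Pattern.compile("y.ld.r.m", FLAGS).matcher(name).find()); // true
        System.out.println(findQuoted("y.ld.r.m", name));  // false
        System.out.println(findLiteral("y.ld.r.m", name)); // false
    }
}
```

Pattern.quote() is the better fit when the dynamic text is only one piece of a larger pattern; Pattern.LITERAL is simpler when the whole pattern is the search string.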