mysql, regex, unicode, rlike

Same queries using 'regexp' give different results in MySQL


Basically, what I want is to understand why

select 'aa' regexp '[h]' returns 0 and

select 'აა' regexp '[ჰ]' returns 1 ?

check FIDDLE


Solution

  • I think MySQL regex does not support UTF-8 yet. See bug 30241 and 12.5.2 Regular Expressions.

    Warning

    The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multibyte safe and may produce unexpected results with multibyte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.

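    In UTF-8, 'ჰ' is the three bytes E1 83 B0 and 'ა' is E1 83 90, so the class [ჰ] is read as "any one of the bytes E1, 83, B0", and 'აა' contains the first two of them. You could match the byte sequence without a character class: SELECT 'აა' REGEXP 'ჰ' returns 0.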
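
The byte-wise behaviour described in the warning can be reproduced outside MySQL. The following is a minimal sketch in Python that only emulates byte-wise matching with the `re` module on UTF-8 encoded bytes; it is not MySQL's regex engine, and the variable names are illustrative.

```python
import re

# Encode both strings the way a UTF-8 connection would send them.
haystack = "აა".encode("utf-8")  # bytes E1 83 90 E1 83 90
needle = "ჰ".encode("utf-8")     # bytes E1 83 B0

# Byte-wise character class: matches if ANY single byte of the needle occurs.
class_pattern = b"[" + needle + b"]"
class_match = re.search(class_pattern, haystack) is not None

# Byte-wise literal: the whole three-byte sequence must occur in order.
literal_match = re.search(needle, haystack) is not None

# ASCII case from the question: one byte per character, so no surprise.
ascii_match = re.search(b"[h]", b"aa") is not None

print(haystack.hex(), needle.hex())
print(class_match, literal_match, ascii_match)
```

The character class matches because 'ა' and 'ჰ' share their leading bytes, while the literal pattern does not match because the third byte differs. As far as I know, MySQL 8.0.4 and later use an ICU-based regex implementation that works on characters rather than bytes, so the original query should behave as expected there.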