php regex unicode cyrillic

Regexp with Russian characters


I can't solve a problem with a regexp.

OK, when I write:

$string = preg_replace("#\[name=([a-zA-Z0-9 .-]+)*]#","$name_start $1 $name_end",$string);

everything is fine, except with Russian text.

So I tried rewriting the regexp:

$string = preg_replace("#\[name=([a-zA-Z0-9а-яА-Я .-]+)*]#","$name_start $1 $name_end",$string);

but this doesn't work.

One idea I know is to list every letter:

$string = preg_replace("#\[name=([a-zA-Z0-9йцукенгшщзхъфывапролджэячсмитьбю .-]+)*]#","$name_start $1 $name_end",$string);

but this is crazy :D

Please suggest a simpler variant.


Solution

  • Try a Unicode range:

    '/[\x{0410}-\x{042F}]/u'  // matches one capital Cyrillic letter in the range А to Я
    

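    Don't forget the /u modifier: it makes PCRE treat both the pattern and the subject as UTF-8. Without it, the multi-byte Cyrillic letters are read as separate bytes, so a range like а-я does not mean what it looks like.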

    In your case:

    "#\[name=([a-zA-Z0-9\x{0430}-\x{044F}\x{0410}-\x{042F} .-]+)*]#u"
    

    Note that the STAR (the * after the group) in your regex is redundant. Everything already gets "eaten" by the PLUS. Apart from no longer matching an empty [name=], this would do the same:

    "#\[name=([a-zA-Z0-9\x{0430}-\x{044F}\x{0410}-\x{042F} .-]+)]#u"
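Putting the pieces together, here is a minimal sketch of the full replacement. The values of $name_start and $name_end and the sample strings are assumptions made up for illustration, since the question does not show them. The second pattern is an alternative, not part of the answer above: the ranges U+0410 to U+044F do not include ё (U+0451) or Ё (U+0401), so if those letters can occur, the Unicode script property \p{Cyrillic} (which also needs /u) may be a simpler choice.

```php
<?php
// Hypothetical wrapper markup; the question does not show the real values.
$name_start = '<b>';
$name_end   = '</b>';

// Single quotes keep \x{...} and \p{...} intact for PCRE.
// The trailing "u" makes pattern and subject UTF-8.
$rangePattern  = '#\[name=([a-zA-Z0-9\x{0430}-\x{044F}\x{0410}-\x{042F} .-]+)]#u';
$scriptPattern = '#\[name=([a-zA-Z0-9\p{Cyrillic} .-]+)]#u';

// \$1 is escaped only for clarity; it is the backreference to the captured name.
$replacement = "$name_start \$1 $name_end";

// Plain а-я / А-Я letters: the explicit ranges are enough.
$plain = preg_replace($rangePattern, $replacement, 'Hello [name=Иван Petrov], welcome!');

// A name containing ё: the ranges miss it, so the string comes back unchanged...
$missed = preg_replace($rangePattern, $replacement, '[name=Фёдор]');

// ...while the script property covers it.
$caught = preg_replace($scriptPattern, $replacement, '[name=Фёдор]');

echo $plain, PHP_EOL;
echo $missed, PHP_EOL;
echo $caught, PHP_EOL;
```

If you would rather keep explicit ranges, adding \x{0401} and \x{0451} to the character class should cover ё and Ё as well.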