mysqlregexcpu-wordword-boundaries

Is space considered a word boundary when using [[:<:]][[:>:]]?


I'm trying to create a pretty simple search query which matches whatever the search term but on full words so partial matches are out of the question e.g.

The string to look for is:

" (This is) my test string "

and the matching queries could be:

"This is"
"(This is)"
"(This is"
"This is)"
"my test"
"string"

etc. but not:

"This is my"
"(thi"
"my tes"

etc. as these are only partial matches. LIKE is of no use here so I'm trying with REGEXP and word boundaries:

1. SELECT " (This is) my test string " REGEXP "(^|[[:<:]])this is([[:>:]]|$)"; -> 1
2. SELECT " (This is) my test string " REGEXP "(^|[[:<:]])\\(this is\\)([[:>:]]|$)"; -> 0
3. SELECT " This is my test string " REGEXP "(^|[[:<:]])this is([[:>:]]|$)"; -> 1

The question is now why don't boundaries in pt.2 cover spaces whereas they seem to in pt.3 (round brackets removed thos show the difference)? What am I missing here ?

Thanks


Solution

  • As documented under Regular Expressions:

    These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).

    Since ) is not a word character, the pattern \)[[:>:]] can never be matched; [[:>:]]\), on the other hand, would be matched.