I'm quite confused. I have a source string in the database; some HTML:
"body": "<html><head>\r\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"></head><body><div dir=\"auto\">Nope no attachment</div><br><div class=\"gmail_quote\"><div dir=\"ltr\" class=\"gmail_attr\">
Here's the SELECT:
SELECT * FROM table1 WHERE column_details REGEXP '(nope.no).attach';
When I select this with REGEXP of '(nope.no).attach' it fails to match. When I uppercase the Nope, it matches. When I LCASE(column_details) and go back to lower-case nope, it matches. What's going on here? My understanding is REGEXP is case insensitive, and as far as I know I'm not enforcing binary mode... or is it binary by default? If so, how do I make it case insensitive and disable binary mode matching?
Thanks!
It's not that REGEXP is case-insensitive. It depends on the collation of the column. REGEXP can be case-sensitive or insensitive.
mysql> set @h = '<html><head>\r\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"></head><body><div dir=\"auto\">Nope no attachment</div><br><div class=\"gmail_quote\"><div dir=\"ltr\" class=\"gmail_attr\">';
Query OK, 0 rows affected (0.00 sec)
mysql> select @h regexp '(nope.no).attach';
+------------------------------+
| @h regexp '(nope.no).attach' |
+------------------------------+
| 1 |
+------------------------------+
1 row in set (0.00 sec)
mysql> select @h collate utf8mb4_bin regexp '(nope.no).attach';
+--------------------------------------------------+
| @h collate utf8mb4_bin regexp '(nope.no).attach' |
+--------------------------------------------------+
| 0 |
+--------------------------------------------------+
I'd check the collation of your column:
SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'table1' AND COLUMN_NAME = 'column_details';