Requirement
I want to make it in one query. How to make an efficient query?
The followings are querys that i made with REGEXP and FULLTEXT.
REGEXP
(SELECT * FROM board
WHERE
title rlike 'first' AND title rlike 'second'
ORDER BY board_id DESC LIMIT 1000)
UNION
(SELECT * FROM board
WHERE
content rlike 'first' AND content rlike 'second'
ORDER BY board_id DESC LIMIT 1000)
UNION
(SELECT * FROM board
WHERE
tag rlike 'first' AND tag rlike 'second'
ORDER BY board_id DESC LIMIT 1000)
LIMIT 1000;
FULLTEXT
(SELECT * FROM board
WHERE
match(title) AGAINST('+"first" +"second"' in boolean mode)
ORDER BY board_id DESC LIMIT 1000)
UNION
(SELECT * FROM board
WHERE
match(content) AGAINST('+"first" +"second"' in boolean mode)
ORDER BY board_id DESC LIMIT 1000)
UNION
(SELECT * FROM board
WHERE
match(tag) AGAINST('+"first" +"second"' in boolean mode)
ORDER BY board_id DESC LIMIT 1000)
LIMIT 1000;
As i now, REGEXP doesn't use index but it is faster than fulltext. I don't understand why it happens too.
CREATE STATEMENT
CREATE TABLE `board` (
`board_id` bigint NOT NULL AUTO_INCREMENT,
`user_id` bigint NOT NULL,
`nickname` varchar(255) NOT NULL,
`category` int NOT NULL,
`title` varchar(255) NOT NULL,
`content` text NOT NULL,
`likes` int NOT NULL DEFAULT '0',
`hits` int NOT NULL DEFAULT '0',
`tag` varchar(255) DEFAULT NULL,
`create_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`modify_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`board_id`),
KEY `popular` (`create_date`,`likes`),
FULLTEXT KEY `fttitle` (`title`),
FULLTEXT KEY `ftcontent` (`content`),
FULLTEXT KEY `fttag` (`tag`)
) ENGINE=InnoDB AUTO_INCREMENT=60027 DEFAULT CHARSET=utf8;
The regexp and Union won't be in any particular order.
In most formulations, it would be better to fetch the ids in the unions first, then JOIN to get the rest of the columns from the few rows that result.
Be aware that UNION
means UNION DISTINCT
and has a dedupping pass. (This is probably what you want, even though it is slower than UNION ALL
.)
Most of the ORDER BY
you have shown are useless -- they will do nothing, except (maybe) waste time. ORDER BY
and LIMIT
need to be paired up for each subquery. And then again after doing the UNION
. A related topic: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or
nnichols's answer fails to dedup. That can be remedied by an extra (outer) Select that Groups the answers and picks MIN(col_sort)
with GROUP BY id
before doing the order by.
More
SELECT c.*
FROM (
SELECT id, MIN(col_sort) AS col_sort2
FROM (
SELECT id, 1 AS col_sort
FROM ...
WHERE MATCH ...
LIMIT 100
UNION ALL
SELECT id, 2 AS col_sort
FROM ...
WHERE MATCH ...
LIMIT 100
UNION ALL
SELECT id, 3 AS col_sort
FROM ...
WHERE MATCH ...
LIMIT 100
) AS a
GROUP BY id
) AS b
JOIN board AS c USING(id)
ORDER BY b.col_sort2
LIMIT 100;
Don't use a huge number like 1000 for the LIMIT. It does get rather messy to maintain the 'relevance' through the queries.