I have a website connected to a MySQL database of songs, and am working to improve full-text search on the site. A simple query with MATCH() works as expected, but as soon as I add an aggregate function to the SELECT statement, MATCH() no longer returns relevant results.
The query I'm using is somewhat complicated, but I was able to boil it down to a minimal example:
CREATE TABLE `Song` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(300) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `FTS_Songs` (`title`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO Song (`title`)
VALUES ('The Morning Breaks'), ('The Spirit of God'), ('Now Let Us Rejoice');
SELECT id, title,
MATCH(title) AGAINST ('"the spirit of god"' IN BOOLEAN MODE) AS exactTitleRelevance,
MATCH(title) AGAINST ('+spirit* +god*' IN BOOLEAN MODE) AS titleRelevance
FROM Song
GROUP BY id
ORDER BY exactTitleRelevance DESC, titleRelevance DESC;
Correct result: The row for "The Spirit of God" has a high relevance score, and other rows have a relevance of 0.
SELECT id, title,
MATCH(title) AGAINST ('"the spirit of god"' IN BOOLEAN MODE) AS exactTitleRelevance,
MATCH(title) AGAINST ('+spirit* +god*' IN BOOLEAN MODE) AS titleRelevance,
COUNT(*) AS aggregateColumn
FROM Song
GROUP BY id
ORDER BY exactTitleRelevance DESC, titleRelevance DESC;
Incorrect result: The row for "The Spirit of God" has a relevance score of 0, and a different row that doesn't match the query has a high relevance score.
What am I doing wrong? Is it possible to use MATCH() and aggregate functions like COUNT() or GROUP_CONCAT() together in the same query?
I’ve tried and gotten the same results on two different servers – one with MySQL 8.0.37, and the other with MySQL 8.3.0.
You can use EXPLAIN
to check how MySQL executes the query:
EXPLAIN SELECT id, title,
MATCH(title) AGAINST ('"the spirit of god"' IN BOOLEAN MODE) AS exactTitleRelevance,
MATCH(title) AGAINST ('+spirit* +god*' IN BOOLEAN MODE) AS titleRelevance,
COUNT(*) AS aggregateColumn
FROM Song
GROUP BY id
ORDER BY exactTitleRelevance DESC, titleRelevance DESC;
This will give:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | Song | NULL | index | PRIMARY,FTS_Songs | PRIMARY | 4 | NULL | 3 | 100.00 | Using temporary; Using filesort |
The aggregate function COUNT()
along with GROUP BY
changes how MySQL processes the query. MySQL uses "id" (PRIMARY) as an index instead of FTS_Songs (FULLTEXT) causing MATCH()
to not work as expected.
To make sure MATCH()
calculates relevance accurately:
e.g.
WITH lookup AS (
SELECT
id,
title,
MATCH(title) AGAINST ('"the spirit of god"' IN BOOLEAN MODE) AS exactTitleRelevance,
MATCH(title) AGAINST ('+spirit* +god*' IN BOOLEAN MODE) AS titleRelevance
FROM Song
WHERE MATCH(title) AGAINST ('"the spirit of god"' IN BOOLEAN MODE)
)
SELECT
l.id,
l.title,
l.exactTitleRelevance,
l.titleRelevance,
COUNT(*) AS aggregateColumn
FROM lookup l
GROUP BY l.id, l.title, l.exactTitleRelevance, l.titleRelevance;