I have created the following table fruits
-
CREATE TABLE `fruits` (
`id` tinyint unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(200) NOT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `ft_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Then I entered the following values in table fruits
-
SELECT * FROM fruits;
+----+---------------+
| id | name |
+----+---------------+
| 1 | apple, orange |
| 2 | apple, mango |
| 3 | mango, kiwi |
| 4 | mango, guava |
| 5 | apple, banana |
+----+---------------+
Now I run the following three SQL queries -
Query 1:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+apple' IN BOOLEAN MODE);
+----+---------------+
| id | name |
+----+---------------+
| 1 | apple, orange |
| 2 | apple, mango |
| 5 | apple, banana |
+----+---------------+
Query 2:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+apple -orange' IN BOOLEAN MODE);
+----+---------------+
| id | name |
+----+---------------+
| 2 | apple, mango |
| 5 | apple, banana |
+----+---------------+
Query 3:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+apple ~orange' IN BOOLEAN MODE);
+----+---------------+
| id | name |
+----+---------------+
| 1 | apple, orange |
| 2 | apple, mango |
| 5 | apple, banana |
+----+---------------+
As per MySQL developer website following is the function of ~
(tilde) operator in 'Boolean Full-Text Searches'
https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html
- '+apple ~macintosh'
Find rows that contain the word “apple”, but if the row also contains the word “macintosh”, rate it lower than if row does not. This is “softer” than a search for '+apple -macintosh', for which the presence of “macintosh” causes the row not to be returned at all.
I have tried the ~
(tilde) operator in 'Query 3' but the output is certainly not what is expected. Here, the expected behavior is row with id = 1
coming at last.
P.S. I am using MySQL version - 8.0.26-0ubuntu0.20.04.2 for Linux on x86_64 ((Ubuntu))
Though there is no documentation regarding my answer anywhere, after thorough experimentation I have arrived at this most logical conclusion -
Presence of '+' operator nullifies any effect of '~' operator
I have updated my table fruits
with the following values -
SELECT * FROM fruits;
+----+-------------------------+
| id | name |
+----+-------------------------+
| 1 | apple orange watermelon |
| 2 | apple mango pomegranate |
| 3 | apple mango banana |
| 4 | mango kiwi pomegranate |
| 5 | mango guava watermelon |
| 6 | apple banana kiwi |
+----+-------------------------+
Query 1:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('apple mango ~pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 3 | apple mango banana |
| 1 | apple orange watermelon |
| 5 | mango guava watermelon |
| 6 | apple banana kiwi |
| 2 | apple mango pomegranate |
| 4 | mango kiwi pomegranate |
+----+-------------------------+
Query 2:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('apple ~pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 1 | apple orange watermelon |
| 3 | apple mango banana |
| 6 | apple banana kiwi |
| 2 | apple mango pomegranate |
+----+-------------------------+
Query 3:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('mango ~pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 3 | apple mango banana |
| 5 | mango guava watermelon |
| 2 | apple mango pomegranate |
| 4 | mango kiwi pomegranate |
+----+-------------------------+
Here, in the queries 1, 2 & 3 no operator precedes the values apple
and mango
and ~
operator precedes the value pomegranate
. This makes sure that the rows having the word pomegranate
are ranked lower than others.
Query 4:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+apple +mango ~pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 2 | apple mango pomegranate |
| 3 | apple mango banana |
+----+-------------------------+
Query 5:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+apple ~pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 1 | apple orange watermelon |
| 2 | apple mango pomegranate |
| 3 | apple mango banana |
| 6 | apple banana kiwi |
+----+-------------------------+
Query 6:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+mango ~pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 2 | apple mango pomegranate |
| 3 | apple mango banana |
| 4 | mango kiwi pomegranate |
| 5 | mango guava watermelon |
+----+-------------------------+
Here, in the queries 4, 5 & 6 +
operator precedes the values apple
and mango
and ~
operator precedes the value pomegranate
. Clearly the presence of +
operator nullifies any effect of ~
operator.
Query 7:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+apple +mango <pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 3 | apple mango banana |
| 2 | apple mango pomegranate |
+----+-------------------------+
Query 8:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+apple <pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 1 | apple orange watermelon |
| 3 | apple mango banana |
| 6 | apple banana kiwi |
| 2 | apple mango pomegranate |
+----+-------------------------+
Query 9:
SELECT id, name FROM fruits
-> WHERE MATCH(name) AGAINST
-> ('+mango <pomegranate'
-> IN BOOLEAN MODE);
+----+-------------------------+
| id | name |
+----+-------------------------+
| 3 | apple mango banana |
| 5 | mango guava watermelon |
| 2 | apple mango pomegranate |
| 4 | mango kiwi pomegranate |
+----+-------------------------+
Here, in the queries 7, 8 & 9 +
operator precedes the values apple
and mango
and <
operator precedes the value pomegranate
. This makes sure that the rows having the word pomegranate
are ranked lower than others.
Thus, what can be deduced from here is that -
if +
operator is present, use <
operator instead of ~
operator
UPDATE
Upon extensive calculation I have created the table fruits_score_count
which shows the score
of each fruit
when done Boolean FULLTEXT search.
SELECT * FROM fruits_score_count;
+----+-------------+---------------------+----------------------+
| id | fruit_name | row_numbers_matched | score |
+----+-------------+---------------------+----------------------+
| 1 | apple | 4 | 0.031008131802082062 |
| 2 | banana | 2 | 0.22764469683170319 |
| 3 | guava | 1 | 0.6055193543434143 |
| 4 | kiwi | 2 | 0.22764469683170319 |
| 5 | mango | 4 | 0.031008131802082062 |
| 6 | orange | 1 | 0.6055193543434143 |
| 7 | pomegranate | 2 | 0.22764469683170319 |
| 8 | watermelon | 2 | 0.22764469683170319 |
+----+-------------+---------------------+----------------------+
Query 1:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('apple mango ~pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 3 | apple mango banana | 0.062016263604164124 |
| 1 | apple orange watermelon | 0.031008131802082062 |
| 5 | mango guava watermelon | 0.031008131802082062 |
| 6 | apple banana kiwi | 0.031008131802082062 |
| 2 | apple mango pomegranate | -0.7103390693664551 |
| 4 | mango kiwi pomegranate | -0.7413471937179565 |
+----+-------------------------+----------------------+
Query 2:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('apple ~pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 1 | apple orange watermelon | 0.031008131802082062 |
| 3 | apple mango banana | 0.031008131802082062 |
| 6 | apple banana kiwi | 0.031008131802082062 |
| 2 | apple mango pomegranate | -0.7413471937179565 |
+----+-------------------------+----------------------+
Query 3:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('mango ~pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 3 | apple mango banana | 0.031008131802082062 |
| 5 | mango guava watermelon | 0.031008131802082062 |
| 2 | apple mango pomegranate | -0.7413471937179565 |
| 4 | mango kiwi pomegranate | -0.7413471937179565 |
+----+-------------------------+----------------------+
Query 4:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+apple +mango ~pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 2 | apple mango pomegranate | 0.062016263604164124 |
| 3 | apple mango banana | 0.062016263604164124 |
+----+-------------------------+----------------------+
Query 5:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+apple ~pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 1 | apple orange watermelon | 0.031008131802082062 |
| 2 | apple mango pomegranate | 0.031008131802082062 |
| 3 | apple mango banana | 0.031008131802082062 |
| 6 | apple banana kiwi | 0.031008131802082062 |
+----+-------------------------+----------------------+
Query 6:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+mango ~pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 2 | apple mango pomegranate | 0.031008131802082062 |
| 3 | apple mango banana | 0.031008131802082062 |
| 4 | mango kiwi pomegranate | 0.031008131802082062 |
| 5 | mango guava watermelon | 0.031008131802082062 |
+----+-------------------------+----------------------+
Query 7:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+apple +mango <pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 3 | apple mango banana | 0.062016263604164124 |
| 2 | apple mango pomegranate | -0.7103390693664551 |
+----+-------------------------+----------------------+
Query 8:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+apple <pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 1 | apple orange watermelon | 0.031008131802082062 |
| 3 | apple mango banana | 0.031008131802082062 |
| 6 | apple banana kiwi | 0.031008131802082062 |
| 2 | apple mango pomegranate | -0.7413471937179565 |
+----+-------------------------+----------------------+
Query 9:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+mango <pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 3 | apple mango banana | 0.031008131802082062 |
| 5 | mango guava watermelon | 0.031008131802082062 |
| 2 | apple mango pomegranate | -0.7413471937179565 |
| 4 | mango kiwi pomegranate | -0.7413471937179565 |
+----+-------------------------+----------------------+
Here, Query 1, Query 2, Query 3, Query 7, Query 8, Query 9 behaves as expected.
But from Query 4, Query 5, Query 6 it is clear that -
In the presence of +
operator preceding a value with the ~
operator basically makes the value invisible.
Also careful observation reveals that -
x ~y
and +x <y
are equivalent
FURTHER EXPERIMENTATION
Query 1:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+mango apple ~pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 3 | apple mango banana | 0.062016263604164124 |
| 4 | mango kiwi pomegranate | 0.031008131802082062 |
| 5 | mango guava watermelon | 0.031008131802082062 |
| 2 | apple mango pomegranate | -0.7103390693664551 |
+----+-------------------------+----------------------+
id = 3
gets maximum score which is the sum of scores of mango
and apple
.id = 4
gets second maximum score which is the score of mango
. Presence of +
operator in front of mango
makes ~pomegranate
of search phrase irrelevant.id = 5
gets the same score as that of Row 2. But it is placed lower than Row 2 because when scores are same, rows are ranked in increasing order of primary key
, here id
is primary key
.id = 2
gets lowest score and hence comes last. Here since the word apple
is present and in the search phrase there is no +
operator preceding apple
, hence ~pomegranate
in search phrase is taken into consideration, which lowers the score significantly.Query 2:
SELECT id, name, score FROM
-> (SELECT id, name, MATCH(name) AGAINST
-> ('+mango apple <pomegranate' IN BOOLEAN MODE)
-> AS score FROM fruits ORDER BY score DESC)
-> AS temp WHERE score != 0;
+----+-------------------------+----------------------+
| id | name | score |
+----+-------------------------+----------------------+
| 3 | apple mango banana | 0.062016263604164124 |
| 5 | mango guava watermelon | 0.031008131802082062 |
| 2 | apple mango pomegranate | -0.7103390693664551 |
| 4 | mango kiwi pomegranate | -0.7413471937179565 |
+----+-------------------------+----------------------+
This again illustrates that <
operator takes effect even in the presence of +
operator.
This further reinforces my earlier observation that -
if +
operator is present, use <
operator instead of ~
operator