mysqlfull-text-search

Why mysql fulltext-search so slowly?


The table xx_user_detail, user_id is int primary index, site1-site9 is combined fulltext index, the total row count is 5 million

SELECT * FROM xx_user_detail WHERE (user_id=14) AND MATCH (site1,site2,site3,site4,site5,site6,site7,site8,site9) AGAINST ('苏娟的食品店');

this SQL1 use 0.3s

SELECT * FROM xx_user_detail WHERE (user_id=14 or user_id=15) AND MATCH (site1,site2,site3,site4,site5,site6,site7,site8,site9) AGAINST ('苏娟的食品店');

this SQL2 use 2.5s

1 Why SQL2 so slowly than SQL1 by just add one row? How to improve SQL2?
2 The SQL1, use 0.3s for just scan one row, is it normal? If not, how to improve it?

the table structure is:

CREATE TABLE xx_user_detail (
  user_id int(11) NOT NULL,
  site1 varchar(20) NOT NULL,
  site2 varchar(20) NOT NULL,
  site3 varchar(20) NOT NULL,
  site4 varchar(20) NOT NULL,
  site5 varchar(20) NOT NULL,
  site6 varchar(20) NOT NULL,
  site7 varchar(20) NOT NULL,
  site8 varchar(20) NOT NULL,
  site9 varchar(20) NOT NULL,
  PRIMARY KEY (user_id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

ALTER TABLE xx_user_detail ADD FULLTEXT INDEX the_site_index(site1,site2,site3,site4,site5,site6,site7,site8,site9) WITH PARSER ngram;

Solution

  • The first query statement took 0.3 seconds. It mainly depends on the size of each row of data and the performance of your disk. If your disk performance is average and each row of data is large, then 0.3 seconds may be a normal value. I have checked the execution plans for your two statements. The first statement used the Primary Key. After matching to a specific row of data, it directly searched for the keyword "苏娟的食品店" within this row of data. It didn't use the full-text index, so compared to the second statement, its performance will be better.

    mysql>
    mysql> explain SELECT * FROM xx_user_detail WHERE  (user_id=14) AND MATCH (site1,site2,site3,site4,site5,site6,site7,site8,site9)  AGAINST ('苏娟的食品店');
    +----+-------------+----------------+------------+-------+------------------------+---------+---------+-------+------+----------+-------------+
    | id | select_type | table          | partitions | type  | possible_keys          | key     | key_len | ref   | rows | filtered | Extra       |
    +----+-------------+----------------+------------+-------+------------------------+---------+---------+-------+------+----------+-------------+
    |  1 | SIMPLE      | xx_user_detail | NULL       | const | PRIMARY,the_site_index | PRIMARY | 4       | const |    1 |   100.00 | Using where |
    +----+-------------+----------------+------------+-------+------------------------+---------+---------+-------+------+----------+-------------+
    1 row in set, 1 warning (0.00 sec)
    
    mysql>
    mysql> explain SELECT * FROM xx_user_detail WHERE  (user_id=14 or user_id=15) AND MATCH (site1,site2,site3,site4,site5,site6,site7,site8,site9)  AGAINST ('苏娟的食品店');
    +----+-------------+----------------+------------+----------+------------------------+----------------+---------+-------+------+----------+-------------+
    | id | select_type | table          | partitions | type     | possible_keys          | key            | key_len | ref   | rows | filtered | Extra       |
    +----+-------------+----------------+------------+----------+------------------------+----------------+---------+-------+------+----------+-------------+
    |  1 | SIMPLE      | xx_user_detail | NULL       | fulltext | PRIMARY,the_site_index | the_site_index | 0       | const |    1 |   100.00 | Using where |
    +----+-------------+----------------+------------+----------+------------------------+----------------+---------+-------+------+----------+-------------+
    1 row in set, 1 warning (0.00 sec)
    

    The second statement, as seen from the execution plan, uses a full - text index. Given the large amount of data in the table, 2.5 (seconds) might be a normal value. You can optimize it through the following two methods:

    Since there was no data volume when I was doing the testing, the execution plan might be different from yours. You could post your execution plan and the cost time after optimization.