mysqlquery-optimizationperconamysql-5.6

MySql optimizer doesn't use the index on varchar


I have a table defined as

CREATE TABLE `article` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `field1` varchar(1024) NOT NULL,
  `priority` int(11) NOT NULL,
  `prodcode` varchar(64) NOT NULL,
  `status` int(8) NOT NULL,
  `error` varchar(1024) DEFAULT NULL,
  `ctime` datetime DEFAULT CURRENT_TIMESTAMP,
  `mtime` datetime DEFAULT NULL,
  `event` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `article_prodcode_idx` (`prodcode`),
  KEY `article_status_idx` (`status`),
  KEY `article_priority_idx` (`priority`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

This table contains about 40 million of record. When I run a query like

SELECT * FROM article WHERE prodcode='a-4536-x-bef45-green';

the optimizer "decides" that such query doesn't have to use any index (an "EXPLAIN SELECT..." results in a

+----+-------------+----------+------+---------------+------+---------+------+----------+-------------+
| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | Extra       |
+----+-------------+----------+------+---------------+------+---------+------+----------+-------------+
|  1 | SIMPLE      | article  | ALL  | NULL          | NULL | NULL    | NULL | 39415251 | Using where |
+----+-------------+----------+------+---------------+------+---------+------+----------+-------------+

If I add e.g. another field like priority or status, the optimizer uses just such indexes, but NOT the article_prodcode_idx one. The problem is that the query scans the whole 40 million records, and the result comes after 100 seconds. Why the index is not used there?

I checked this answer too: MySQL partial indexes on varchar fields and group by optimization but I didn't find any answer to my problem. What should I do to let the query return results... quickly?

Thank you


Solution

  • After a deep check, I solved the problem. In some ways, I did a mistake in describing the problem. In fact, in my script, I had something like

    SET @pc='a-4536-x-bef45-green'; 
    ...
    SELECT * FROM article WHERE prodcode=@pc;
    

    while in some other points I had

    SELECT * FROM article WHERE prodcode='a-4536-x-bef45-green';
    

    In the first case, no indexes are used. In the second one, everything works as expected. This is the explain of the first query:

    EXPLAIN FORMAT=JSON SELECT * FROM article WHERE prodcode=@pc;
    {
      "query_block": {
      "select_id": 1,
      "table": {
        "table_name": "article",
        "access_type": "ALL",
        "rows": 39498773,
        "filtered": 100,
        "attached_condition": "(convert(`shop`.`article`.`prodcode` using utf8mb4) = (@`pc`))"
        }
      }
    }
    

    And this is the second one

    EXPLAIN FORMAT=JSON SELECT * FROM article WHERE prodcode='a-4536-x-bef45-green';
    
    {
    "query_block": {
      "select_id": 1,
      "table": {
        "table_name": "article",
        "access_type": "ref",
        "possible_keys": [
          "article_prodcode_idx"
        ],  
        "key": "article_prodcode_idx",
        "used_key_parts": [
          "prodcode"
        ],
        "key_length": "194",
        "ref": [
          "const"
        ],
        "rows": 2,
        "filtered": 100,
        "index_condition": "(`shop`.`article`.`prodcode` = 'a-4536-x-bef45-green')"
        }
      }
    }
    

    I tried a lot of times, using e.g.

    SELECT * FROM article WHERE prodcode in (@pc, 'another code');
    

    but the result is always the same: if a variable is involved, NO indexes are used.

    Knowing, that, I changed my scripts in order to not use SET anymore, and everything started running fine.

    I still would like to know why... but at least the problem is solved.