mysqlsqlperformance

MySQL slow query when combining two very fast queries


I have two MySQL queries that run very fast, but when I combine them the new query is very slow.

Fast (<1 second, 15 results):

SELECT DISTINCT 
Id, Name, Company_Id
FROM people
where Company_Id in (5295, 1834)
and match(Locations) against('austin')

Fast (<1 second, 2970 results):

select distinct Company_Id from technologies
    where match(Name) against('elastic')
      and Company_Id is not null

When I combine the two like this:

SELECT DISTINCT Id, Name, Company_Id
FROM people
where Company_Id in
    ( select Company_Id from technologies
        where match(Name) against('elastic')
          and Company_Id is not null
    )
  and match(Locations) against('austin')

The result query takes over 2 minutes to complete. It has 278 rows hit.

I've tried rewriting the slow query a few ways. One other example is like this:

SELECT DISTINCT 
`Extent1`.`Id`, `Extent1`.`Name`, `Extent1`.`Company_Id`
FROM `people` AS `Extent1` 
INNER JOIN `technologies` AS `Extent2`
     ON (`Extent1`.`Company_Id` = `Extent2`.`Company_Id`)
WHERE (`Extent1`.`Company_Id` IS NOT NULL) 
 AND ((match(`Extent1`.`Locations`) against('austin')) 
 AND  (match(`Extent2`.`Name`) against('elastic')))

I'm using MySQL 5.7 on Windows. I have full text index on the Name and Location columns. My InnoDB Buffer Usage never goes above 40%. I tried to use MySQL workbench to look at the execution plan, but it shows "Explain data not available for statement"

Please let me know if you see anything I could improve or try. Thank you.


Solution

  • IN ( SELECT ... ) is poorly optimized, at least in older versions of MySQL. What version are you using?

    When using a FULLTEXT index (MATCH...), that part is performed first, if possible. This is because nearly always the FT lookup is faster than whatever else is going on.

    But when using two fulltext queries, it picks one, then can't use fulltext on the other.

    Here's one possible workaround:

    If necessary, AND that with something to verify that it is not, for example, finding a person named 'Austin'.

    Another possibility:

    5.7 (or 5.6?) might be able to optimize this by creating indexes on the subqueries:

    SELECT ...
        FROM ( SELECT Company_Id FROM ... MATCH(Name) ... ) AS x
        JOIN ( SELECT Company_Id FROM ... MATCH(Locations) ... ) AS y
            USING(Company_id);
    

    Provide the EXPLAIN; I am hoping to see <auto-key>.

    Test that. If it is 'fast', then you may need to add on another JOIN and/or WHERE. (I am unclear what your ultimate query needs to be.)