Say you have 5 tables, with each their columns:
And say all 'id' columns and column ending with '_id' already have indexes.
(Actually my query has nothing to do with houses or streets. They're for arguments sake.)
Now say you would like to get each street into a column, and if that street or its houses has photo's, you would like the photos in the next column...
The tricky bit here is that all houses are in one table. And all photos in another table. But to link the 2, we need to access all five tables.
I came up with the following query, containing 4 JOINs:
SELECT
street.name
,group_concat(distinct photos.name SEPARATOR '\n') as photos
FROM
house
INNER JOIN street ON
house.street_id = street.id
LEFT JOIN house_photos ON
house.id = house_photos.house_id
LEFT JOIN street_photos ON
street.id = street_photos.street_id
LEFT JOIN photos ON
photos.id = house_photos.photo_id
OR photos.id = street_photos.photo_id
GROUP BY
street.name
The distinct is there to filter out doubles, as they will be generated when you have more than 1 photo for a house, and more than 1 photo for that house's street. (Carthesian product) But this has nothing to do with my problem.
The problem I have is the query is SUPER slow. (Taking over 1 minute or even longer to finish)
When I ask MySQL to analyze the query ('explain extended'), I see that it won't use the available indexes when processing that last JOIN (which has an OR in the ON clause).
If I split the last JOIN into 2 JOINS, (thus ADDING a fifth JOIN) , the query becomes very fast again.(Taking a split second to finish.)
SELECT
street.name
,concat(
group_concat(distinct photos_from_house.name SEPARATOR '\n')
,'\n'
,group_concat(distinct photos_from_street.name SEPARATOR '\n')
) as photos
FROM
house
INNER JOIN street ON
house.street_id = street.id
LEFT JOIN house_photos ON
house.id = house_photos.house_id
LEFT JOIN street_photos ON
street.id = street_photos.street_id
LEFT JOIN photos photos_from_house ON
photos_from_house.id = house_photos.photo_id
LEFT JOIN photos photos_from_street ON
photos_from_street.id = street_photos.photo_id
GROUP BY
street.name
My question now is: why is introducing an OR in the ON clause, making MySQL not use the available indexes/keys for that JOIN ?
I have ALREADY tried using USE INDEX and FORCE INDEX, but it won't budge.
Any explanation / help is welcome.
The MySQL docs have this to say:
Minimize the OR keywords in your WHERE clauses. If there is no index that helps to locate the values on both sides of the OR, any row could potentially be part of the result set, so all rows must be tested, and that requires a full table scan. If you have one index that helps to optimize one side of an OR query, and a different index that helps to optimize the other side, use a UNION operator to run separate fast queries and merge the results afterward.
Unfortunately that doesn't really answer your question, as you said that you have indexes on all relevant columns.