I have to select a list of Catalogs from one table, and perform counts in two other tables: Stores and Categories. The counters should show how many Stores and Categories are linked to each Catalog. I have managed to get the functionality I need using this SQL query:
SELECT `catalog`.`id` AS `id`,
`catalog`.`name` AS `name`,
(
SELECT COUNT(*)
FROM `category`
WHERE `category`.`catalog_id` = `catalog`.`id`
AND `category`.`is_archive` = 0
AND `category`.`company_id` = 2
) AS `category_count`,
(
SELECT COUNT(*)
FROM `store`
WHERE `store`.`catalog_id` = `catalog`.`id`
AND `store`.`is_archive` = 0
AND `store`.`company_id` = 2
) AS `store_count`
FROM `catalog`
WHERE `catalog`.`company_id` = 2
AND `catalog`.`is_archive` = 0
ORDER BY `catalog`.`id` ASC;
This works as expected. But I don't like to perform sub-queries, as they are slow and this query may perform badly on LARGE lists.. Is there any method of optimizing this SQL using JOINs? Thanks in advance.
You can make this a lot faster by refactoring the dependent subqueries in your SELECT
clause into, as you mention, JOIN
ed aggregate subqueries.
The first subquery you can write this way.
SELECT COUNT(*) num, catalog_id, company_id
FROM category
WHERE is_archive = 0
GROUP BY catalog_id, company_id
The second one like this.
SELECT COUNT(*) num, catalog_id, company_id
FROM store
WHERE is_archive = 0
GROUP BY catalog_id, company_id
Then, use those in your main query aas if they were tables containing the counts you want.
SELECT catalog.id,
catalog.name,
category.num category_count,
store.num store_count
FROM catalog
LEFT JOIN (
SELECT COUNT(*) num, catalog_id, company_id
FROM category
WHERE is_archive = 0
GROUP BY catalog_id, company_id
) category ON catalog.id = category.catalog_id
AND catalog.company_id = category.company_id
LEFT JOIN (
SELECT COUNT(*) num, catalog_id, company_id
FROM store
WHERE is_archive = 0
GROUP BY catalog_id, company_id
) store ON catalog.id = store.catalog_id
AND catalog.company_id = store.company_id
WHERE catalog.is_archive = 0
AND catalog.company_id = 2
ORDER BY catalog.id ASC;
This is faster than your example because each subquery need only run once, rather than once per catalog entry. It also has the nice feature that you only need say WHERE catalog.company_id = 2
once. The MySQL optimizer knows what to do with that.
I suggest LEFT JOIN
operations so you'll still see catalog entries even if they're not mentioned in your category or store tables.