mysqloptimizationquery-optimizationmysql-dependent-subquery

Optimize MySQL query with dependent sub-query


I need to find a way to eliminate a dependent sub-query.

I have a table of articles that can have multiple languages. The simplified table structure is as follows:

id, title, language, translation_set_id

1 A    en 0
2 B    en 2
3 B_ru ru 2
4 C    en 4
5 C_ru ru 4
6 D    en 6
7 D_fr fr 6

The translation_set_id is 0 when an article doesn't have translations, or is set to the id of the base translation. So B is the original English article, and B_ru is the Russian translation of the article.

I need a query that would allow me to return all Russian articles, or if they don't exist the original language articles. So it would return.

1 A    en 0
3 B_ru ru 2
5 C_ru ru 4
6 D    en 6

So far I have this:

SELECT id, title, language, translation_set_id
FROM articles a
WHERE 
  a.translation_set_id = 0
  OR (a.language = 'ru')
  OR (a.id = a.translation_set_id AND
       0 = (SELECT COUNT(ac.id)
            FROM articles ac
            WHERE ac.translation_set_id = a.translation_set_id 
            AND ac.language = 'ru')
     )

But this executes the sub-query for each row, creating a dependent query. Is there a way to eliminate the dependent query?

UPDATE: It seems that Neels's solution works, thanks!

But I was wondering if there is a way to generalize the solution to multiple language fallbacks? First try to get French, if that's not present, try Russian, and if that's not present, show base translation (English, or any other, depending on the original creation language)?

UPDATE2: I've built the query I needed for the updated question using Neel's solution and DRapp's solution. It can be found here http://www.sqlfiddle.com/#!2/28ca8/18 but I'll also past the queries here, for completeness sake.

Revised Data:

CREATE TABLE articles (
  id INT,
  title VARCHAR(20),
  language VARCHAR(20),
  translation_set_id INT);

INSERT INTO articles values
  (1,'A','en',0),
  (2,'B','en',2),
  (3,'B_ru','ru',2),
  (4,'C','en',4),
  (5,'C_ru','ru',4),
  (6,'D','en',6),
  (7,'D_fr','fr',6),
  (8,'E_ru','ru', 0),
  (9,'F_fr','fr', 0),
  (10,'G_ru','ru', 10),
  (11,'G_fr','fr', 10),
  (12,'G_en','en', 10);

Original query with 2 correlated sub-queries:

SELECT id, title, language, translation_set_id
FROM articles a
WHERE
  a.translation_set_id = 0
  OR (a.language = 'fr')
  OR (a.language = 'ru' AND
       0 = (SELECT COUNT(ac.id)
            FROM articles ac
            WHERE ac.translation_set_id = a.translation_set_id
            AND ac.language = 'fr'))
  OR (a.id = a.translation_set_id AND
       0 = (SELECT COUNT(ac.id)
            FROM articles ac
            WHERE ac.translation_set_id = a.translation_set_id
            AND (ac.language = 'fr' OR ac.language = 'ru'))
     );

Revised query:

SELECT  a.*
FROM articles a
LEFT JOIN articles ac ON ac.translation_set_id = a.id
  AND ac.language = 'fr'
LEFT JOIN articles ac2 ON ac2.translation_set_id = a.id
  AND ac2.language = 'ru'
WHERE a.translation_set_id = 0
  OR a.language = 'fr'
  OR (a.language = 'ru' AND ac.id IS NULL)
  OR (a.id = a.translation_set_id AND ac2.id IS NULL AND ac.id IS NULL);

Solution

  • Check out this SQL Fiddle:

    http://www.sqlfiddle.com/#!2/c05d0/15

    You can use this simple query to achieve your result.

    SELECT  a.*
    FROM articles a
    LEFT OUTER JOIN articles ac ON ac.translation_set_id = a.translation_set_id 
    AND ac.language = 'ru'
    WHERE a.translation_set_id = 0
    OR a.language = 'ru'
    OR (a.id = a.translation_set_id AND ac.id IS NULL);