mysqlmyisam

Is using SELECT DISTINCT without UNION more efficient? Is there a way to optimize the following query?


I have 3 tables. Each of them contain (among others) the columns UserID and Period.

In order to get all the Periods associated with a specific user, I'm using unions like this:

# Original Query
SELECT Period FROM table_1 WHERE UserID = :user
UNION SELECT Period FROM table_2 WHERE UserID = :user
UNION SELECT Period FROM table_3 WHERE UserID = :user
ORDER BY Period ASC;

The results I get are what I expect. Since I'm not using UNION ALL, all duplicate results get automatically filtered out.

The debugger this project is using thinks the query MIGHT be slow. This is the query plan for it-

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY table_1 ref UserID UserID 4 const 29 NULL
2 UNION table_2 ref UserID UserID 4 const 5 NULL
3 UNION table_3 ref UserID UserID 4 const 4 NULL
NULL UNION RESULT <union1,2,3> ALL NULL NULL NULL NULL NULL Using temporary

I thought maybe using SELECT DISTINCT instead of SELECT would help since each SELECT result set would get smaller before the UNION.

# Updated Query
SELECT DISTINCT Period FROM table_1 WHERE UserID = :user
UNION SELECT DISTINCT Period FROM table_2 WHERE UserID = :user
UNION SELECT DISTINCT Period FROM table_3 WHERE UserID = :user
ORDER BY Period ASC;

However this only seems to make the query plan use more temp tables.

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY table_1 ref UserID UserID 4 const 29 Using where; Using temporary
2 UNION table_2 ref UserID UserID 4 const 5 Using where; Using temporary
3 UNION table_3 ref UserID UserID 4 const 4 Using where; Using temporary
NULL UNION RESULT <union1,2,3> ALL NULL NULL NULL NULL NULL Using temporary

I've also tried running this same query repeating lines 2-3 (the UNIONs) across 500 lines to see if there were noticeable differences. Time-wise, the results were pretty similar.

Analyzing the queries, using DISTINCT seemed to make the queries take less time returning the results. However, the query now had to clean up all temp tables, which ended up making the final time for both queries very similar.


My question(s) to all MySQL gurus, is there a way to make this query faster or more optimized without changing the table structures?

Maybe with some indexes? (On that note, Period is a VARCHAR and the tables are all MyISAM)

Do the order of the tables matter in the query? I purposedly tried to place the biggest table first.


Solution

  • # Original Query
    SELECT Period FROM table_1 WHERE UserID = :user
    UNION SELECT Period FROM table_2 WHERE UserID = :user
    UNION SELECT Period FROM table_3 WHERE UserID = :user
    ORDER BY Period ASC;
    

    It is processed as:

    1. Execute subquery 1 (output A)
    2. Execute subquery 2 (output B)
    3. Combine outputs A and B (output С)
    4. Sort combined output C and remove duplicates (output D)
    5. Execute subquery 3 (output E)
    6. Combine outputs C and E (output F)
    7. Sort combined output E and remove duplicates (output G)
    8. Return output G
    # Updated Query
    SELECT DISTINCT Period FROM table_1 WHERE UserID = :user
    UNION SELECT DISTINCT Period FROM table_2 WHERE UserID = :user
    UNION SELECT DISTINCT Period FROM table_3 WHERE UserID = :user
    ORDER BY Period ASC;
    

    It is processed as:

    1. Execute subquery 1 (output A)
    2. Sort output A and remove duplicates (output B)
    3. Execute subquery 2 (output C)
    4. Sort output C and remove duplicates (output D)
    5. Combine outputs B and D (output E)
    6. Sort output E and remove duplicates (output F)
    7. Execute subquery 3 (output G)
    8. Sort output G and remove duplicates (output H)
    9. Combine outputs F and H (output I)
    10. Sort output I and remove duplicates (output J)
    11. Return output J

    # Recommended Query
    SELECT DISTINCT Period
    FROM ( SELECT Period FROM table_1 WHERE UserID = :user
           UNION ALL
           SELECT Period FROM table_2 WHERE UserID = :user
           UNION ALL
           SELECT Period FROM table_3 WHERE UserID = :user ) AS total
    ORDER BY Period ASC;
    
    1. Execute subquery 1 (output A)
    2. Execute subquery 2 (output B)
    3. Combine outputs A and B (output С)
    4. Execute subquery 3 (output D)
    5. Combine outputs C and D (output E)
    6. Sort combined output E and remove duplicates (output F)
    7. Return output F