I've been learning about optimizer hints in MySQL (my version is 8.0.36), and I've been able to experiment with them in very simplified scenarios and it works. However, as soon as I add a nested level, it all breaks down.
For example, assume I have 4 tables:
activities
: id
, rubric_id
, group_id
rubric_templates
: id
, title
rubric_sessions
: id
, template_id
, name
rubric_session_elements
: id
, rubric_session_id
, value
All ids have foreign keys established. There are many rubric session elements for a given rubric session, and many rubric sessions for a given rubric template. Many activities can use the same rubric, and there are many activities in a group. So no unique keys anywhere. Everything is a many-to-one.
If I instruct the optimizer to use DUPSWEEDOUT
or MATERIALIZATION
where there is only two tables, it works:
-- Get all rubric sessions for a group
EXPLAIN FORMAT=TREE
SELECT /*+ SEMIJOIN(@subq1 MATERIALIZATION) */ rs.*
FROM rubric_sessions rs
WHERE rs.template_id IN (
SELECT /*+ QB_NAME(subq1) */ a.rubric_template_id
FROM activities a
WHERE group_id = 123
);
/*
-> Nested loop inner join (cost=24.6 rows=48)
-> Filter: (`<subquery2>`.rubric_template_id is not null) (cost=0.00417..0.00417 rows=1)
-> Table scan on <subquery2> (cost=3.46..3.46 rows=1)
-> Materialize with deduplication (cost=0.951..0.951 rows=1)
-> Filter: (a.rubric_template_id is not null) (cost=0.851 rows=1)
-> Index lookup on a using group_id (group_id=123) (cost=0.851 rows=1)
-> Index lookup on rs using template_id (template_id=`<subquery2>`.rubric_template_id), with index condition: (rs.template_id = `<subquery2>`.rubric_template_id) (cost=24.6 rows=48)
*/
-- Replace MATERIALIZATION with DUPSWEEDOUT changes the plan to this:
/*
-> Remove duplicate rs rows using temporary table (weedout) (cost=25.4 rows=48)
-> Nested loop inner join (cost=25.4 rows=48)
-> Filter: (a.rubric_template_id is not null) (cost=0.851 rows=1)
-> Index lookup on a using group_id (group_id=123) (cost=0.851 rows=1)
-> Index lookup on rs using template_id (template_id=a.rubric_template_id), with index condition: (rs.template_id = a.rubric_template_id) (cost=24.5 rows=48)
*/
However if I want to take it a step further, it will only ever do weedout. It refuses to do materialization.
-- Get all rubric session elements for a group
EXPLAIN FORMAT=TREE
SELECT /*+ SEMIJOIN(@subq1 MATERIALIZATION) */ rse.*
FROM rubric_session_elements rse
WHERE rse.rubric_session_id IN (
SELECT rs.id
FROM rubric_sessions rs
WHERE rs.template_id IN (
SELECT /*+ QB_NAME(subq1) */ a.rubric_template_id
FROM activities a
WHERE group_id = 123
)
);
/*
-> Remove duplicate (rs, rse) rows using temporary table (weedout) (cost=527 rows=572)
-> Nested loop inner join (cost=527 rows=572)
-> Nested loop inner join (cost=5.92 rows=48)
-> Filter: (a.rubric_template_id is not null) (cost=0.851 rows=1)
-> Index lookup on a using group_id (group_id=123) (cost=0.851 rows=1)
-> Filter: (rs.template_id = a.rubric_template_id) (cost=5.07 rows=48)
-> Covering index lookup on rs using template_id (template_id=a.rubric_template_id) (cost=5.07 rows=48)
-> Index lookup on rse using rubric_session_elements_rubric_session_id_foreign (rubric_session_id=rs.id) (cost=9.68 rows=11.9)
*/
No matter how I change it, I cannot convince the optimizer to use MATERIALIZATION. I've put the semijoin hint on both outer and middle selects. I've put QB_NAME on both middle and inner selects. I even tried:
SET SESSION optimizer_switch = 'duplicateweedout=off';
but it still comes out as "weedout". If I specify NO_SEMIJOIN(DUPSWEEDOUT)
then it seems to revert to hash join, but that's not MATERIALIZATION. It doesn't look like I violated in the rules specified in https://dev.mysql.com/doc/refman/8.4/en/subquery-materialization.html, so what am I doing wrong?
I found that if I put the subquery into a view with ALGORITHM=TEMPTABLE, then it would materialize like I expect. That got me thinking that maybe I can achieve the same using Common Table Expressions. Which lead me to the NO_MERGE optimization:
The MERGE and NO_MERGE optimizer hints can be applied to CTEs. Each CTE reference in the top-level statement can have its own hint, permitting CTE references to be selectively merged or materialized. The following statement uses hints to indicate that cte1 should be merged and cte2 should be materialized:
WITH cte1 AS (SELECT a, b FROM table1), cte2 AS (SELECT c, d FROM table2) SELECT /*+ MERGE(cte1) NO_MERGE(cte2) */ cte1.b, cte2.d FROM cte1 JOIN cte2 WHERE cte1.a = cte2.c;
By extracting the subquery into a CTE, then adding the NO_MERGE() hint, I was able to get the optimizer to materialize the subquery, but not do a full table scan on rse:
EXPLAIN FORMAT=TREE
WITH cte1 AS (
SELECT rs.id
FROM rubric_sessions rs
WHERE rs.template_id IN (
SELECT a.rubric_template_id
FROM activities a
WHERE group_id = 123
)
)
SELECT /*+ NO_MERGE(@subq1) */ rse.*
FROM rubric_session_elements rse
WHERE rse.rubric_session_id IN (
SELECT /*+ QB_NAME(subq1) */ rs.id
FROM cte1 as rs
);
/*
-> Nested loop inner join (cost=222 rows=572)
-> Table scan on <subquery2> (cost=18.6..21.7 rows=48)
-> Materialize with deduplication (cost=18.6..18.6 rows=48)
-> Table scan on rs (cost=10.7..13.8 rows=48)
-> Materialize CTE cte1 (cost=10.7..10.7 rows=48)
-> Nested loop inner join (cost=5.86 rows=48)
-> Filter: (`<subquery4>`.rubric_template_id is not null) (cost=0.00417..0.00417 rows=1)
-> Table scan on <subquery4> (cost=3.45..3.45 rows=1)
-> Materialize with deduplication (cost=0.939..0.939 rows=1)
-> Filter: (a.rubric_template_id is not null) (cost=0.839 rows=1)
-> Index lookup on a using group_id (group_id=123) (cost=0.839 rows=1)
-> Filter: (rs.template_id = `<subquery4>`.rubric_template_id) (cost=5.86 rows=48)
-> Covering index lookup on rs using template_id (template_id=`<subquery4>`.rubric_template_id) (cost=5.86 rows=48)
-> Index lookup on rse using rubric_session_elements_rubric_session_id_foreign (rubric_session_id=`<subquery2>`.id) (cost=144 rows=11.9)
*/
Which in my scenario, where there is a lot of duplicate rubric template ids, greatly sped up the results.
However, despite figuring how to to use CTEs, I found that making views are an easier, less brittle, solution. They don't rely on obscure optimizer hints, but a straightforward ALGORITHM
can enforce how you believe the optimizer should function.