Oracle 18c:
I have a query that joins from a view to a subquery (related post: Join between XML queries).
with
    subtype as (select * from sub_lc_events_asset_class_activity_vw),
    domain as (
        select
            cast(rownum as number(38,0)) as rownum_,
            x.code,
            x.description,
            i.name as domain_name
        from
            sde.gdb_items_vw i
            cross apply xmltable(
                '/GPCodedValueDomain2/CodedValues/CodedValue'
                passing xmltype(i.definition)
                columns
                    code        varchar2(255) path './Code',
                    description varchar2(255) path './Name'
            ) x
        where
            i.name in('ACTIVITY_ATN','ACTIVITY_GCSM','ACTIVITY_MS','ACTIVITY_RD','ACTIVITY_SS_SL','ACTIVITY_WAT_RES','ACTIVITY_SWM_FACILITY','ACTIVITY_UND_SERV','ACTIVITY_BARRIER','ACTIVITY_WM_SAN')
            and i.name is not null)
select
    d.code as domain_code,
    d.description as domain_description,
    d.domain_name,
    s.subtype_code,
    s.subtype_description,
    s.subtype_field,
    s.subtype_field_domain,
    s.table_name as table_name
from
    subtype s
left join
    domain d
    on s.subtype_field_domain = d.domain_name
Normally, I'd include the explain plan as text, not as a screenshot. But this particular explain plan is a nightmare because of the XML extraction and the underlying data source (a view called GDB_ITEMS_VW), which is beyond my control. Provided as text, it would be unreadable and would make the post too long, so I've included a screenshot link from SQL Developer instead.
DOMAIN_COD DOMAIN_DESCRIPTION                       DOMAIN_NAME   SUBTYPE_CODE  SUBTYPE_DESCRIPTION  SUBTYPE_FIELD SUBTYPE_FIELD_DOMAIN TABLE_NAME
---------- ---------------------------------------- ------------- ------------- -------------------- ------------- -------------------- ------------------
RECON_CL   RECONSTRUCT CYCLING LANE                 ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_CRS  RECONSTRUCT CYCLING ROUTE - SIGNED ONLY  ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_CRU  RECONSTRUCT CYCLING ROUTE - URBAN PAVED  ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_FW   RECONSTRUCT FACILITY WALKWAY             ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_MUP  RECONSTRUCT MULTI-USE PATH               ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_NT   RECONSTRUCT NATURE TRAIL                 ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_PP   RECONSTRUCT PARK PATHWAY                 ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_R    RECONSTRUCT RAMP                         ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_S    RECONSTRUCT SIDEWALK                     ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_STWY RECONSTRUCT STAIRWAY                     ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
...
121 rows selected.
The query is very fast: 0.09 seconds.
However, that query is hardcoded to certain domain names:
where
i.name in('ACTIVITY_ATN','ACTIVITY_GCSM','ACTIVITY_MS','ACTIVITY_RD','ACTIVITY_SS_SL','ACTIVITY_WAT_RES','ACTIVITY_SWM_FACILITY','ACTIVITY_UND_SERV','ACTIVITY_BARRIER','ACTIVITY_WM_SAN')
and i.name is not null)
I want to avoid hardcoding the domain name if I can help it.
If I remove the hardcoded domain names from the WHERE clause, the query produces the same result, which is expected because of the left join.
...
where
--I removed the domain names from the WHERE clause.
i.name is not null)
...
DOMAIN_COD DOMAIN_DESCRIPTION                       DOMAIN_NAME   SUBTYPE_CODE  SUBTYPE_DESCRIPTION  SUBTYPE_FIELD SUBTYPE_FIELD_DOMAIN TABLE_NAME
---------- ---------------------------------------- ------------- ------------- -------------------- ------------- -------------------- ------------------
RECON_CL   RECONSTRUCT CYCLING LANE                 ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_CRS  RECONSTRUCT CYCLING ROUTE - SIGNED ONLY  ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_CRU  RECONSTRUCT CYCLING ROUTE - URBAN PAVED  ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_FW   RECONSTRUCT FACILITY WALKWAY             ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_MUP  RECONSTRUCT MULTI-USE PATH               ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_NT   RECONSTRUCT NATURE TRAIL                 ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_PP   RECONSTRUCT PARK PATHWAY                 ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_R    RECONSTRUCT RAMP                         ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_S    RECONSTRUCT SIDEWALK                     ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
RECON_STWY RECONSTRUCT STAIRWAY                     ACTIVITY_ATN  0             ACTIVE TRANSPORTATIO ACTIVITY      ACTIVITY_ATN         INFRASTR.LC_EVENTS
...
121 rows selected.
Unfortunately, the query is now extremely slow: 35 seconds.
I don't understand why the query is so slow now. I wouldn't have expected the domain names in the WHERE clause to serve much of a purpose, since the left join achieves the same thing; the join omits the same rows.
Question:
Why is the query only performant when the domains are hardcoded in the WHERE clause?
Apologies if I haven't included enough information. But I felt this post was long enough already. It's hard to know what amount of information would be right in the "sweet spot" (not too much, not too little). If more information is needed, then feel free to let me know.
There is a combination of factors:
1. ROWNUM: Oracle must materialise the sub-query to generate the row numbering, and it generates all the rows from the XMLTABLE at that point (see 1, 2, 3 for examples of problems which require materialising the sub-query).
2. With the i.name in(...) filter, the explain plan shows that the SQL engine can use an index on the column; without the filter it does not use the index and performs a full table scan (which is much slower).
If you remove ROWNUM from the SELECT clause of the sub-query then:
- The sub-query no longer has to be materialised, so the SQL engine can apply the LEFT JOIN before it generates the XMLTABLE.
- The ON condition can be used to provide the values to use the index, similar to using i.name in(...), and the full table scan can be eliminated again and index scans used.
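As a minimal sketch of that change, here is the query from the question with the ROWNUM column dropped from the domain sub-query and no hardcoded domain names. It relies on the outer SELECT not referencing rownum_, which is the case in the query as posted. Whether the optimizer then picks the index-based plan on a given system is an assumption to confirm with the explain plan.

```sql
with
    subtype as (select * from sub_lc_events_asset_class_activity_vw),
    domain as (
        select
            -- rownum_ removed: without ROWNUM, Oracle is not forced
            -- to materialise this sub-query before the join
            x.code,
            x.description,
            i.name as domain_name
        from
            sde.gdb_items_vw i
            cross apply xmltable(
                '/GPCodedValueDomain2/CodedValues/CodedValue'
                passing xmltype(i.definition)
                columns
                    code        varchar2(255) path './Code',
                    description varchar2(255) path './Name'
            ) x
        where
            -- no hardcoded domain names; the join condition below
            -- is what restricts the domains
            i.name is not null)
select
    d.code as domain_code,
    d.description as domain_description,
    d.domain_name,
    s.subtype_code,
    s.subtype_description,
    s.subtype_field,
    s.subtype_field_domain,
    s.table_name as table_name
from
    subtype s
left join
    domain d
    on s.subtype_field_domain = d.domain_name
```

If a row number is still needed in the output, it could be generated in the outermost SELECT rather than inside the sub-query, so that it does not block the join from being applied first.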