I am trying to fetch data from remote table. The data is expanded from seed set of data in local table using recursive CTE. The query is very slow (300 seed rows to 800 final rows takes 7 minutes).
For other "tiny local, huge remote"-cases with no recursive query the DRIVING_SITE
hint works excellently. I also tried to export seed set from local table into auxiliary table on remotedb
with same structure and - being logged in remotedb
- ran query as pure local query (my_table
as p
, my_table_seed_copy
as i
). It took 4s, which encouraged me to believe forcing query to remote site would make query fast.
What's the correct way to force Oracle to execute recursive query on the remote site?
with s (id, data) as (
select p.id, p.data
from my_table@remotedb p
where p.id in (select i.id from my_table i)
union all
select p.id, p.data
from s
join my_table@remotedb p on ...
)
select /*+DRIVING_SITE(p)*/ s.*
from s;
In the query above, I tried
select /*+DRIVING_SITE(p)*/ s.*
in main selectselect /*+DRIVING_SITE(s)*/ s.*
in main selectDRIVING_SITE
in whole queryselect /*+DRIVING_SITE(x)*/ s.* from s, dual@remotedb x
as main selectselect /*+DRIVING_SITE(p)*/ p.id, p.data
in first inner selectselect /*+DRIVING_SITE(p)*/ p.id, p.data
in both inner selectsselect /*+DRIVING_SITE(p) MATERIALIZE*/ p.id, p.data
in both inner selectsconnect by
is not applicable for this case - actually the query is more complex and uses constructs which cannot be expressed by connect by
)All without success (i.e. data returned after 7 minutes).
Recursive query actually performs breadth-first search - seed rows represent 0-th level and recursive part finds element on n-th level from elements on (n-1)-th level. Original query was intended to be part of merge ... using ...
clause.
Hence I rewrote query to PLSQL loop. Every cycle generates one level. Merge prevents insertion of duplicates so finally no new row is added and loop exits (transitive closure is constructed). Pseudocode:
loop
merge into my_table using (
select /*+DRIVING_SITE(r)*/ distinct r.* /*###BULKCOLLECT###*/
from my_table l
join my_table@remotedb r on ... -- same condition as s and p in original question are joined on
) ...
exit when rows_inserted = 0;
end loop;
Actual code is not so simple since DRIVING_SITE
actually does not directly work with merge
so we have to transfer data via work collection but that's different story. Also the count of inserted rows cannot be easily determined, it must be computed as difference between row count after and before merge.
The solution is not ideal. Anyway it's much faster than recursive CTE (30s, 13 cycles) because queries are provably utilizing the DRIVING_SITE
hint.
I will leave question open for some time to wait if somebody finds answer how to make recursive query working or proving it is not possible.