I was tuning a query which used lots of IO (LIO is of the order of millions). The query was using a index on a huge table. So to experiment, I forced a full scan on that table using FULL hint, and parallelized the scans using PARALLEL hint. My experiment was successful in that the query was orders of magnitude(about 3x) faster. But I am not able to see the parallel query's buffer gets/physical reads on the huge table in both tkprof's report and dbms_xplan.display_cursor. They show up as 0.
Based on what I have read from several sites, this is probably due to parallel query using a different IO(direct path IO) which bypasses buffer cache and directly reads from disk to PGA. However this would mean the direct path IO wait event should show up in the tkprof's report(10046 trace event, level 8), except that it didn't! So my question - How do I find and measure the waits caused by parallel queries?
When you execute parallel query, each parallel slave is a different session. The main session is the query coordinator, which coordinates the work, but doesn't do any of it. So, you need to trace the parallel query slaves to see those waits.
If you're on at least 11g, you can alter session set events 'sql_trace level 12';
You may also want to set tracefile_identifier
, to help you find related trace files.
More details are available here: https://blogs.oracle.com/db/entry/how_to_get_a_10046_trace_for_a_parallel_query