sqloracle-databaseinline-viewquery-plannersubquery-factoring

Oracle 12c Subquery Factoring Inline View now has bad plan?


Update 11/2

After some additional troubleshooting, my team was able to tie this Oracle bug directly to a parameter change that was made on the 12c database the night before the query stopped working. After experiencing some performance issues from an application tied to this database, my team had our DBA change the OPTIMIZER_FEATURES_ENABLE parameter from 12.1.02 to 11.2.0.4. This fixed the performance issue for the problem application but caused the bug I have described above. To verify, I've been able to replicate this same issue in a separate environment by changing this parameter. My DBA has filed a ticket with Oracle to have this looked at.

As a workaround, I was able to make a slight change to my query in order to retrieve the expected results. Specifically, I combined Subquery1 with Subquery2 and I moved a few predicates in Subquery1 from the WHERE clause to the JOIN (where they more properly belonged). This change edited my execution plan (it is slightly less efficient than what was listed before) but was enough to address the original issue.


Original Post

Firstly, let me apologize for any vagueness in this question but I'm dealing with a confidential financial system so I am forced to hide certain implementation details.

Background

I have an Oracle query that I put into production a long time ago that has recently stopped producing expected results coincidentally after an upgrade from 11g to 12c. To my (and my production support team's) knowledge this query had been working fine for well over a year before that.

Details

The query is overly complicated and not very efficient but this is in large part because I am dealing with non-normalized tables (historically modeled after a Mainframe) and poor data input from upstream systems. In order to deal with a complicated business situation I leveraged multiple levels of Subquery Factoring (the WITH statement) and then my final statement joins together two Inline Views. The basic structure of the query without all of the complicated predicates is as follows:

I have 3 tables Table1, Table2, Table3. Table1 is a processing table made up of records from Table2.

--This grabs a subset from Table1
WITH Subquery1 as (
   SELECT FROM Table1),

--This eliminates certain records from the first subset based on sister records 
--from the original source table 
Subquery2 as (
   SELECT FROM Subquery1
   WHERE NOT EXISTS FROM (SELECT from Table2)),

--This ties the records from Subquery2 to Table3
Subquery3 as (
   SELECT FROM Table3
   JOIN (SELECT Max(Date) FROM Table3)
   JOIN Subquery2)

--This final query evaluates subquery3 in two different ways and 
--only takes those records which fit the criteria items from both sets
SELECT FROM 
(SELECT FROM Subquery3)             -- Call this Inline View A
JOIN (SELECT FROM Subquery3)        -- Call this Inline View B

The final query is pretty basic:

   SELECT A.Group_No, B.Sub_Group, B.Key, B.Lob               
   FROM   (SELECT Group_No, Lob, COUNT(Sub_Group) 
           FROM   Subquery3 
           GROUP BY Group_No, Lob
           HAVING COUNT(Sub_Group) = 1) A 
   JOIN (SELECT Group_No, Sub_Group, Key, Lob
         FROM   Subquery3 
         WHERE  Sub_Group LIKE '0000%') B 
   ON A.Group_No = B.Group_No
   AND A.Lob = B.Lob

Problem

If I edit the final query to remove the second Inline View and evaluate the output of the A inline view, I come away with 0 returned rows. I've manually evaluated the records for each individual subquery and can confirm this is an expected result.

Likewise, if I edit the final query to produce the output of only the 'B' inline view, I come away with 6 returned rows. Again, I've manually evaluated the data and this is exactly as expected.

Now when joining these two subsets (Inline View A and Inline View B) together, I would expect that the final query result would be 0 rows (since an inner join between a full set and an empty set produces no matches). However, when I run the entire query with the inner join as described above, I am getting back 1158 rows!

I have reviewed the Execution Plan but nothing jumps out at me:

Execution Plan 1 Execution Plan 2

Questions

Clearly I have done something to confuse the Oracle Optimizer and the updated query plan is pulling back a much different query than the one I have submitted. My best guess is that with all of these temporary views floating around within the same query, I have confused Oracle into evaluating some set before one that it depends upon.

To this day I've been unable to locate the official Oracle documentation around the WITH statement so I've never been completely confident about the order that subqueries are evaluated. I did notice in searching SO (can't find it now) someone mentioned that a factored subquery cannot refer to another factored query. I've never known this to be true before but the bizarre output above is making me wonder if I had only been lucky before with this query?

Can anyone explain the behavior I am seeing? Am I attempting to do something obviously incorrect with this query plan? Or alternatively, is there any chance that something changed between 11g and 12c that could explain why the behavior of this query might have changed?


Solution

  • This sounds like a "wrong results" bug in Oracle. These bugs are usually extremely specific to the version and the features you are using. There's nothing obviously wrong with the queries or execution plan you posted.

    You have two ways of handling this:

    1. Try to find the precise bug. What you're doing with common table expressions looks fine. There are some rare times when your query is technically invalid, you get "lucky" in one version and it works, and when you upgrade it fails. But when that happens the new version usually throws an error, not return wrong results. There's probably some extremely weird, specific combination of features you're using that's causing the issue. To find the real issue you need to massively simplify the query until you can make the smallest possible change and see the problem appear and disappear. You'll also want to remove all objects and only use DUAL. This process can take hours. At the end, when you're left with only a few lines of code, either post them here, look on Oracle Support, or create a Service Request.
    2. Avoid the bug. Even if you go through the above steps there may not be a fix anyway. Sometimes the best work-around is to do something differently. It's nice to get to the bottom of every problem but you don't always have time. Instead, try re-writing the query in syntactically different but logically equivalent ways. Remove some or all of the common table expressions, maybe even repeat some SQL. But be sure to leave a comment warning future programmers of why you're doing things in a weird way.