oracle-databasejoinindexingpredicatecost-based-optimizer

Oracle Avoid Wasteful Join Back?


Suppose we have three tables (A, B, and C) as defined in the contrived example below where A and B are related to (have foreign keys in) C. Suppose, we want values from all three tables and predicate on A and B. Oracle can only join two rowsets together at any one time. We get a join order something like ((A -> C) -> B). This means we spend I/O getting rows from C, that we end up just discarded when we join back to B (and B's predicate).

How can we avoid this "wasteful" I/O on table C?

Star Transformation is great, but only kicks in if the optimizer determines the cost justifies the star transformation. That is, we are not guaranteed to get a star transformation. This might seem like what one wants, but the optimizer is obtaining poor estimated rows (see example below - off by a factor of 10). Thus, the optimize is electing to not use star transformation when it otherwise would prove beneficial.

We cannot manually author the query in star transformation like from as the SQL is generated by a BI reporting tool.

Perhaps my question is how to "force" the optimizer to use star transformation without manually writing the query in that form? Or, perhaps, my question is how to get the estimated rows to be better, so we can be more assured that the optimizer will invoke star transformation? Or, perhaps (quite likely), there is some other cool Oracle feature of which I am not yet aware that might provide a solution.

Oracle 12.1 Enterprise Edition (but moving to 19.1 in a few months) Thanks in advance.

drop table join_back_c;
drop table join_back_a;
drop table join_back_b;

create table join_back_a
  as
  with "D" as (select 1 from dual connect by level <= 1000)
    select rownum                  a_code
           , rpad('x',100)         a_name
      from "D"
;

create unique index IX_join_back_a_code on join_back_a(a_code); 
alter table join_back_a add constraint PK_dan_join_back_a primary key (a_code);

create table join_back_b
  as
  with "D" as (select /*+ materialize */ 1 from dual connect by level <= 320)
    select  rownum                b_id
          , mod(rownum, 10)       b_group
    from "D", "D"
   where rownum <= 100000 --100k
;

create unique index IX_join_back_b_id on join_back_b(b_id);   
create index IX_join_back_b_group on join_back_b(b_group); 
alter table join_back_b add constraint PK_dan_join_back_b primary key (b_id);

create table join_back_c
  as
  with "D" as (select /*+ materialize */ level from dual connect by level <= 3200)
    select  rownum                              c_id
          , trunc(dbms_random.value(1, 1000))   a_code     --table a FK
          , trunc(dbms_random.value(1, 100000)) b_id       --table b FK
    from "D", "D"
   where rownum <= 1000000 -- 1M
;

create index IR_join_back_c_a_code on join_back_c(a_code);
create index IR_join_back_c_b_id on join_back_c(b_id);  

exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_C');
exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_A');
exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_B');
select *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;
--------------------------------------------------------------------------------------------------------
| id  | Operation                      | name                  | rows  | Bytes | cost (%CPU)| time     |
--------------------------------------------------------------------------------------------------------
|   0 | select statement               |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|*  1 |  hash join                     |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|   2 |   nested LOOPS                 |                       |       |       |            |          |
|   3 |    nested LOOPS                |                       |  1001 |   116K|   839   (1)| 00:00:01 |
|   4 |     table access by index ROWID| JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|*  5 |      index range scan          | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     index range scan           | IR_JOIN_BACK_C_A_CODE |  1001 |       |     4   (0)| 00:00:01 |
|   7 |    table access by index ROWID | JOIN_BACK_C           |  1001 | 14014 |   837   (1)| 00:00:01 |
|*  8 |   table access full            | JOIN_BACK_B           | 10000 | 80000 |   143   (5)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."B_ID"="C"."B_ID")
   5 - access("A"."A_CODE"=1)
   6 - access("C"."A_CODE"=1)
   8 - filter("B"."B_GROUP"=1)
select count(*) 
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;  -- about 100 rows

Join Order: ((A -> C) -> B)

A -> C (step 3) has an accurate estimated rows of about 1k.

step 8's estimate is accurate too.

However, this join back to B (step 1) will only serve to further reduce the 1k rowset from step 3. In this case B's predicate reduces the (A -> C) rowset by a factor of 1/10.
This means we accessed 1000 rows from C, just to throw away 900 of those rows.

select /*+ star_transformation */
       *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;
--------------------------------------------------------------------------------------------------------
| id  | Operation                      | name                  | rows  | Bytes | cost (%CPU)| time     |
--------------------------------------------------------------------------------------------------------
|   0 | select statement               |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|*  1 |  hash join                     |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|   2 |   nested LOOPS                 |                       |       |       |            |          |
|   3 |    nested LOOPS                |                       |  1001 |   116K|   839   (1)| 00:00:01 |
|   4 |     table access by index ROWID| JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|*  5 |      index range scan          | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     index range scan           | IR_JOIN_BACK_C_A_CODE |  1001 |       |     4   (0)| 00:00:01 |
|   7 |    table access by index ROWID | JOIN_BACK_C           |  1001 | 14014 |   837   (1)| 00:00:01 |
|*  8 |   table access full            | JOIN_BACK_B           | 10000 | 80000 |   143   (5)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."B_ID"="C"."B_ID")
   5 - access("A"."A_CODE"=1)
   6 - access("C"."A_CODE"=1)
   8 - filter("B"."B_GROUP"=1)

I'm looking for an execution path that is something more like the following. Despite the 10M estimated rows below, the row count of this query remains around 100. However, we cannot control the SQL generated to this degree. This is what was referred to above as a manually writing the query in star transformation like from.

select *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where C.rowid in ( select C1.rowid 
                      from join_back_C "C1"
                           join join_back_a "A1"
                                on C1.a_code = A1.a_code
                     where A1.a_code = 1
                    intersect
                    select C2.rowid 
                      from join_back_C "C2"
                           join join_back_b "B1"
                                on C2.b_id = B1.b_id
                     where B1.b_group = 1                  
                  )
;
---------------------------------------------------------------------------------------------------------------
| id  | Operation                     | name                  | rows  | Bytes |TempSpc| cost (%CPU)| time     |
---------------------------------------------------------------------------------------------------------------
|   0 | select statement              |                       |  9928K|  1316M|       |  4649  (17)| 00:00:01 |
|*  1 |  hash join                    |                       |  9928K|  1316M|       |  4649  (17)| 00:00:01 |
|   2 |   table access full           | JOIN_BACK_A           |  1000 |   102K|       |    16   (0)| 00:00:01 |
|*  3 |   hash join                   |                       |  9928K|   321M|       |  4320  (11)| 00:00:01 |
|   4 |    table access full          | JOIN_BACK_B           |   100K|   781K|       |   142   (5)| 00:00:01 |
|   5 |    nested LOOPS               |                       |    10M|   248M|       |  3858   (3)| 00:00:01 |
|   6 |     view                      | VW_NSO_1              |  1001 | 12012 |       |  2855   (4)| 00:00:01 |
|   7 |      INTERSECTION             |                       |       |       |       |            |          |
|   8 |       SORT UNIQUE             |                       |  1001 | 18018 |       |            |          |
|   9 |        NESTED LOOPS           |                       |  1001 | 18018 |       |     5   (0)| 00:00:01 |
|* 10 |         INDEX RANGE SCAN      | IX_JOIN_BACK_A_CODE   |     1 |     4 |       |     1   (0)| 00:00:01 |
|* 11 |         INDEX RANGE SCAN      | IR_JOIN_BACK_C_A_CODE |  1001 | 14014 |       |     4   (0)| 00:00:01 |
|  12 |       SORT UNIQUE             |                       | 99191 |  2131K|  3120K|            |          |
|* 13 |        HASH JOIN              |                       | 99191 |  2131K|       |  1789   (5)| 00:00:01 |
|* 14 |         TABLE ACCESS FULL     | JOIN_BACK_B           | 10000 | 80000 |       |   143   (5)| 00:00:01 |
|  15 |         INDEX FAST FULL SCAN  | IR_JOIN_BACK_C_B_ID   |  1000K|    13M|       |  1614   (3)| 00:00:01 |
|  16 |     TABLE ACCESS BY USER ROWID| JOIN_BACK_C           | 10000 |   136K|       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."A_CODE"="C"."A_CODE")
   3 - access("B"."B_ID"="C"."B_ID")
  10 - access("A1"."A_CODE"=1)
  11 - access("C1"."A_CODE"=1)
  13 - access("C2"."B_ID"="B1"."B_ID")
  14 - filter("B1"."B_GROUP"=1)

Tried making the two foreign key indices on table C into bitmap indices - no luck. Also, tried a composite index on table C(a_code, b_id) - again, no luck. Also, the composite index is not preferable as our table C really has many foreign keys (some surrogates and some natural keys).


Solution

  • In addition to Jon Heller's answer:

    On the other hand, the predicate b.b_group = 1 will select 10% of the rows, which is in full table scan territory, and is not an operation you want to run repeatedly in a subquery.

    mod(rownum, 10) b_group not only gives 10% selectivity, but also means that in your test case each table block contains a few dozens of such rows:

    SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_b;
    
    COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
    ---------------------------------------------------
                                                    177
    SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_b where b_group=1;
    
    COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
    ---------------------------------------------------
                                                    177
    
    SQL> select min(cnt), max(cnt), avg(cnt)
      2  from (
      3    select dbms_rowid.rowid_block_number(rowid) block_n, count(*) cnt
      4    from join_back_b
      5    where b_group=1
      6    group by dbms_rowid.rowid_block_number(rowid)
      7  );
    
      MIN(CNT)   MAX(CNT)   AVG(CNT)
    ---------- ---------- ----------
            49         62 56.4971751
    

    It gives us 10000 rows from B and b.b_id=c.b_id predicate gives us ~10% selectivity from JOIN_BACK_C and also means that each block of JOIN_BACK_C contains few dozens of needed rows:

    SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_c;
    
    COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
    ---------------------------------------------------
                                                   2597
    
    select min(cnt), max(cnt), avg(cnt), count(distinct block_n)
    from (
      select dbms_rowid.rowid_block_number(c.rowid) block_n, count(*) cnt 
      from join_back_b b
           join join_back_c c
           on b.b_id=c.b_id
      where b_group=1 
      group by dbms_rowid.rowid_block_number(c.rowid)
    );
    
      MIN(CNT)   MAX(CNT)   AVG(CNT) COUNT(DISTINCTBLOCK_N)
    ---------- ---------- ---------- ----------------------
             8         57 38.6334232                   2597
    

    Moreover join_back_c.a_code=1 also gives bad selectivity ~ 1/1000 = 1000 rows in random blocks, while this table contains just ~2500 blocks. So you need to scan 1/2.5 =~ 40% of table blocks. Obviously, it's better to do it using multiblock reads.

    But if we return to the main problem: yes, I understand your problem - sometimes it's better to split one row source to 2 different access paths and CBO often can't do this. There is a standard approach for such cases - to rewrite the query and duplicate row source twice, for example:

    a bit modified test data for better selectivity/reduced IO:

    create table join_back_b
      as
      with "D" as (select /*+ materialize */ 1 from dual connect by level <= 320)
        select  rownum                b_id
              , mod(rownum, 1000)     b_group
        from "D", "D"
       where rownum <= 100000 --100k
       order by b_group
    ;
    

    and +padding (to make rows bigger):

    create table join_back_c
      as
      with "D" as (select /*+ materialize */ level from dual connect by level <= 3200)
        select  rownum                              c_id
              , trunc(dbms_random.value(1, 1000))   a_code     --table a FK
              , trunc(dbms_random.value(1, 100000)) b_id       --table b FK
              , rpad('x',100,'x') padding
        from "D", "D"
       where rownum <= 1000000 -- 1M
    ;
    

    Example:

    with
     ac as (
      select c.rowid rid
            ,a.*
      from join_back_a A
           join join_back_c C
             on A.a_code = C.a_code
      where a.a_code = 1
     )
    ,bc as (
      select c.rowid rid
            ,b.*
      from join_back_b B
           join join_back_c C
             on b.b_id = c.b_id
      where b.b_group = 1
    )
    select--+ no_adaptive_plan NO_ELIMINATE_JOIN(c) no_merge(ac) no_merge(bc) 
       *
      from ac 
           join bc on ac.rid=bc.rid
           join join_back_c C
             on bc.rid = c.rowid;
    

    Plan:

    Plan hash value: 3065703407
    
    -----------------------------------------------------------------------------------------------------------------
    | Id  | Operation                               | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                        |                       |     1 |   230 |   209   (0)| 00:00:01 |
    |   1 |  NESTED LOOPS                           |                       |     1 |   230 |   209   (0)| 00:00:01 |
    |*  2 |   HASH JOIN                             |                       |     1 |   115 |   208   (0)| 00:00:01 |
    |   3 |    VIEW                                 |                       |   992 | 37696 |   202   (0)| 00:00:01 |
    |   4 |     NESTED LOOPS                        |                       |   992 | 25792 |   202   (0)| 00:00:01 |
    |   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| JOIN_BACK_B           |   100 |   900 |     2   (0)| 00:00:01 |
    |*  6 |       INDEX RANGE SCAN                  | IX_JOIN_BACK_B_GROUP  |   100 |       |     1   (0)| 00:00:01 |
    |*  7 |      INDEX RANGE SCAN                   | IR_JOIN_BACK_C_B_ID   |    10 |   170 |     2   (0)| 00:00:01 |
    |   8 |    VIEW                                 |                       |  1001 | 77077 |     6   (0)| 00:00:01 |
    |   9 |     NESTED LOOPS                        |                       |  1001 |   118K|     6   (0)| 00:00:01 |
    |  10 |      TABLE ACCESS BY INDEX ROWID        | JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
    |* 11 |       INDEX UNIQUE SCAN                 | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
    |* 12 |      INDEX RANGE SCAN                   | IR_JOIN_BACK_C_A_CODE |  1001 | 16016 |     4   (0)| 00:00:01 |
    |  13 |   TABLE ACCESS BY USER ROWID            | JOIN_BACK_C           |     1 |   115 |     1   (0)| 00:00:01 |
    -----------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("AC"."RID"="BC"."RID")
       6 - access("B"."B_GROUP"=1)
       7 - access("B"."B_ID"="C"."B_ID")
      11 - access("A"."A_CODE"=1)
      12 - access("C"."A_CODE"=1)