Mohamed Houri’s Oracle Notes

June 6, 2015

SUBQ INTO VIEW FOR COMPLEX UNNEST

Filed under: Oracle — hourim @ 8:42 am

If you are a regular reader of Jonathan Lewis blog you will have probably came across this article in which the author explains why an “OR subquery” pre-empts the optimizer from unnesting the subquery and merging it with its parent query for a possible optimal join path. This unnesting impossibility is so that the “OR subquery” is executed as a FILTER predicate which when applied on a huge data row set penalizes dramatically the performance of the whole query. In the same article, you will have hopefully also learned how by re-writing the query using a UNION ALL (and taking care of the always threatening NULL via the LNNVL() function) you can open a new path for the CBO allowing an unnest of the subquery.

Unfortunately, nowadays there is a massive expansion of third party software where changing SQL code is not possible so that I hoped that the optimizer was capable to automatically re-factor a disjunctive subquery and consider unnesting it using the UNION ALL workaround.

I was under that impression that this hope is never exhausted by the optimizer until last week when I have received from my friend Ahmed Aangour an e-mail showing a particular disjunctive subquery which has been unnested by the optimizer without any rewrite of the original query by the developer. I have found the case very interesting so that I decided to model it and to share it with you. Take a look to the query and the execution plan first in 11.2.0.2 (the table script is supplied at the end of the article)

SQL> alter session set statistics_level=all;

SQL> alter session set optimizer_features_enable='11.2.0.2';

SQL> select
 a.id1
 ,a.n1
 ,a.start_date
from t1 a
where (a.id1 in
 (select
 b.id
 from t2 b
 where
 b.status = 'COM'
 )
 OR
 a.id1 in
 (select
 c.id1
 from t2 c
 where
 c.status = 'ERR'
 )
 );

SQL> select * from table(dbms_xplan.display_cursor(null,null, ‘allstats last’));

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |   9890 |00:00:02.23 |     742K|<--
|*  1 |  FILTER            |      |      1 |        |   9890 |00:00:02.23 |     742K|
|   2 |   TABLE ACCESS FULL| T1   |      1 |  10000 |  10000 |00:00:00.01 |    1686 |
|*  3 |   TABLE ACCESS FULL| T2   |  10000 |      1 |   9890 |00:00:02.16 |     725K|
|*  4 |   TABLE ACCESS FULL| T2   |    110 |      1 |      0 |00:00:00.05 |   15400 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(( IS NOT NULL OR IS NOT NULL))
 3 - filter(("B"."ID"=:B1 AND "B"."STATUS"='COM'))
 4 - filter(("C"."ID1"=:B1 AND "C"."STATUS"='ERR'))

The double full access to table t2 plus the FILTER operation indicate clearly that the OR clause has not been combined with the parent query. If you want to know what is behind the filter predicate n°1 above then the “not so famous” explain plan for command will help in this case:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   975 | 15600 |   462   (0)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   156K|   462   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |     8 |    42   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL| T2   |     1 |     7 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT 0 FROM "T2" "B" WHERE "B"."ID"=:B1 AND
              "B"."STATUS"='COM') OR  EXISTS (SELECT 0 FROM "T2" "C" WHERE
              "C"."ID1"=:B2 AND "C"."STATUS"='ERR'))
   3 - filter("B"."ID"=:B1 AND "B"."STATUS"='COM')
   4 - filter("C"."ID1"=:B1 AND "C"."STATUS"='ERR')

Notice how the subquery has been executed as a FILTER operation which sometimes (if not often) represents a real performance threat. 

However, when I‘ve executed the same query under optimizer 11.2.0.3 I got the following interesting execution plan

SQL> alter session set optimizer_features_enable='11.2.0.3';

--------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |      1 |        |   9890 |00:00:00.03 |    1953 |<--
|*  1 |  HASH JOIN            |          |      1 |   5000 |   9890 |00:00:00.03 |    1953 |
|   2 |   VIEW                | VW_NSO_1 |      1 |   5000 |   9890 |00:00:00.01 |     282 |
|   3 |    HASH UNIQUE        |          |      1 |   5000 |   9890 |00:00:00.01 |     282 |
|   4 |     UNION-ALL         |          |      1 |        |   9900 |00:00:00.01 |     282 |
|*  5 |      TABLE ACCESS FULL| T2       |      1 |   2500 |     10 |00:00:00.01 |     141 |
|*  6 |      TABLE ACCESS FULL| T2       |      1 |   2500 |   9890 |00:00:00.01 |     141 |
|   7 |   TABLE ACCESS FULL   | T1       |      1 |  10000 |  10000 |00:00:00.01 |    1671 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("A"."ID1"="ID1")
   5 - filter("C"."STATUS"='ERR')
   6 - filter("B"."STATUS"='COM')

Notice now how the new plan is showing a HASH JOIN operation between an internal view( VW_NSO_1) and table t1 coming from the parent query block. Notice as well the HASH JOIN condition (access(“A”.”ID1″=”ID1″)) that appears in filter n°1. The optimizer has done a double transformation:

  • created an internal view VW_NSO_1  representing a UNION-ALL between the two subqueries present in the where clause
  • joined the newly online created view with table t1 present in the parent query block

Looking at the corresponding 10053 trace file I have found how the CBO has transformed the initial query:

select a.id1 id1,
  a.n1 n1,
  a.start_date start_date
from (
  (select c.id1 id1 from c##mhouri.t2 c where c.status='ERR')
union
  (select b.id id from c##mhouri.t2 b where b.status='COM')
     ) vw_nso_1,
  c##mhouri.t1 a
where a.id1= vw_nso_1.id1;

In fact the optimizer has first combined the two subqueries into a VIEW and finished by UNNESTING them with the parent query. This is a transformation which Oracle optimizer seems to name : SUBQ INTO VIEW FOR COMPLEX UNNEST

In the same 10053 trace file we can spot the following lines:

*****************************
Cost-Based Subquery Unnesting
*****************************
Query after disj subq unnesting:******* UNPARSED QUERY IS *******

SU:   Transform an ANY subquery to semi-join or distinct.
Registered qb: SET$7FD77EFD 0x15b5d4d0 (SUBQ INTO VIEW FOR COMPLEX UNNEST SET$E74BECDC)

SU: Will unnest subquery SEL$3 (#2)
SU: Will unnest subquery SEL$2 (#3)
SU: Reconstructing original query from best state.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#3)
SU:   Checking validity of unnesting subquery SEL$3 (#2)
Query after disj subq unnesting:******* UNPARSED QUERY IS *******

SU:   Checking validity of unnesting subquery SET$E74BECDC (#6)
SU:   Passed validity checks.

This is a clear enhancement made in the optimizer query transformation that will help improving performance of disjunctive subqueries automatically without any external intervention.

Unfortunately, I was going to end this article until I’ve realized that although I am testing this case under 12.1.0.1.0 database release I still have not executed the same query under optimizer feature 12.1.0.1.0

SQL> alter session set optimizer_features_enable='12.1.0.1.1';
SQL > execute query
-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |   9890 |00:00:03.84 |     716K|
|*  1 |  FILTER            |      |      1 |        |   9890 |00:00:03.84 |     716K|
|   2 |   TABLE ACCESS FULL| T1   |      1 |  10000 |  10000 |00:00:00.01 |    1686 |
|*  3 |   TABLE ACCESS FULL| T2   |  10000 |      2 |   9890 |00:00:03.81 |     715K|
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)
   3 - filter((("B"."ID1"=:B1 AND "B"."STATUS"='ERR') OR ("B"."ID"=:B2 AND
              "B"."STATUS"='COM')))

The automatic unnesting of the disjunctive subquery has been removed in the 12.1.0.1.1 optimizer model.

If you want to reproduce and test this case here below is the model (I would be interested to see if the disjunctive subquery is unnested or not in the 12.1.0.1.2 release )

create table t1
   as select
    rownum                id1,
    trunc((rownum-1/3))   n1,
    date '2012-06-07' + mod((level-1)*2,5) start_date,
    lpad(rownum,10,'0')   small_vc,
    rpad('x',1000)        padding
from dual
connect by level <= 1e4;   

create table t2
as select
    rownum id
    ,mod(rownum,5) + mod(rownum,10)* 10  as id1
    ,case
       when mod(rownum, 1000) = 7 then 'ERR'
       when rownum <= 9900 then 'COM'
       when mod(rownum,10) between 1 and 5 then 'PRP'
     else
       'UNK'
     end status
     ,lpad(rownum,10,'0')    as small_vc
     ,rpad('x',70)           as padding
from dual
connect by level <= 1e4;

alter table t1 add constraint t1_pk primary key (id1);

1 Comment »

  1. Hi Mohamed,

    The reason for the different behavior under OFE=12.1.0.1 (and .2 too) is the fix for 10216738 that “Toggels subquery coalescing for ANY and ALL subqueries” (from the description of the fix_control).
    It looks like a new feature of the coalescing that was introduced directly in 12c but this is “closing the door” to the unnesting you get in 11.2.0.3/4 because it makes such SU invalid

    SU: Unnesting query blocks in query block SEL$1 (#1) that are valid to unnest.
    Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
    SU: Considering subquery unnest on query block SEL$1 (#1).
    SU: Checking validity of unnesting subquery SEL$ACD206C8 (#2)
    SU: SU bypassed: Invalid correlated predicates.
    SU: Validity checks failed.

    where here SEL$ACD206C8 is SUBQUERY COALESCE SEL$7C83FF7B; SEL$3 (and SEL$7C83FF7B is the coalesce from SEL$2, looks like it uses an “intermediate” qb).

    The transformation is heuristic based so it’s applied any time it’s valid to do so regardless of the benefit (hence the cost for the plan with the transformation on vs off).
    A bug has been filed already in the past (bug 14722317) but it has been closed as “not feasible to fix”

    I hope it helps and I hope I didn’t miss the elephant in the room🙂

    Cheers,
    Mauro

    Comment by Mauro Pagano — June 7, 2015 @ 4:04 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud Oracle's blog

Just another Oracle blog : Database topics and techniques

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

EU Careers info

Your career in the European Union

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

%d bloggers like this: