Mohamed Houri’s Oracle Notes

November 28, 2013

Why?

Filed under: Oracle — hourim @ 1:01 pm

Why? This is a fundamental question. And in the context of my work (or actually my hobby to say so) I often say ”Why”. Yesterday I have been handled a query to tune. This query is honored with the following sub-optimal execution plan

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                         |      1 |        |     10 |00:02:13.09 |
|   1 |  SORT ORDER BY                      |                         |      1 |      8 |     10 |00:02:13.09 |
|   2 |   NESTED LOOPS                      |                         |      1 |        |     10 |00:02:13.06 |
|   3 |    NESTED LOOPS                     |                         |      1 |      8 |     10 |00:02:13.06 |
|   4 |     NESTED LOOPS                    |                         |      1 |      8 |     10 |00:02:13.06 |
|*  5 |      HASH JOIN SEMI                 |                         |      1 |      8 |     10 |00:02:13.06 |
|   6 |       JOIN FILTER CREATE            | :BF0000                 |      1 |        |   1469 |00:00:02.06 |
|   7 |        NESTED LOOPS                 |                         |      1 |        |   1469 |00:00:00.17 |
|   8 |         NESTED LOOPS                |                         |      1 |    307 |   5522 |00:00:00.11 |
|*  9 |          TABLE ACCESS BY INDEX ROWID| T2                      |      1 |    316 |   5522 |00:00:00.07 |
|* 10 |           INDEX RANGE SCAN          | T2_OOST_START_DATE_I    |      1 |   1033 |   8543 |00:00:00.03 |
|* 11 |          INDEX RANGE SCAN           | T1_OBST_OOST_DK_I       |   5522 |      1 |   5522 |00:00:00.08 |
|* 12 |         TABLE ACCESS BY INDEX ROWID | T1                      |   5522 |      1 |   1469 |00:00:00.13 |
|  13 |       VIEW                          | VW_SQ_1                 |      1 |  64027 |   1405 |00:00:07.82 |
|* 14 |        FILTER                       |                         |      1 |        |   1405 |00:00:07.82 |
|  15 |         JOIN FILTER USE             | :BF0000                 |      1 |  64027 |   1405 |00:00:07.82 |
|  16 |          PARTITION REFERENCE ALL    |                         |      1 |  64027 |  64027 |00:01:48.22 |
|* 17 |           HASH JOIN                 |                         |     52 |  64027 |  64027 |00:02:03.37 | --> spot this
|  18 |            TABLE ACCESS FULL        | T4                      |     52 |  64027 |  64027 |00:00:00.34 |
|* 19 |            TABLE ACCESS FULL        | T5                      |     41 |    569K|   5555K|00:02:08.32 | --> spot this
|  20 |      TABLE ACCESS BY INDEX ROWID    | T3                      |     10 |      1 |     10 |00:00:00.01 |
|* 21 |       INDEX UNIQUE SCAN             | T3_CHP_PK               |     10 |      1 |     10 |00:00:00.01 |
|* 22 |     INDEX UNIQUE SCAN               | T3_CHP_PK               |     10 |      1 |     10 |00:00:00.01 |
|  23 |    TABLE ACCESS BY INDEX ROWID      | T3                      |     10 |      1 |     10 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("ITEM_1"="OBST"."OBST_ID")
       filter("ITEM_2">"OBST"."START_DATE")
   9 - filter(("OOST"."CHP_ID_START" IS NOT NULL AND "OOST"."CHP_ID_END" IS NOT NULL))
  10 - access("OOST"."START_DATE">=:LD_FROM_DATE AND "OOST"."START_DATE"<=:LD_TO_DATE)
  11 - access("OBST"."OOST_ID"="OOST"."OOST_ID")
  12 - filter(("OBST"."OBS_STAT"=2 AND ADD_MONTHS(INTERNAL_FUNCTION("OBST"."MOD_DATE"),13)>=:LD_CURR_DATE))
  14 - filter((:LN_MIN_ac_number<=:ln_max_number AND TO_DATE(:LD_FROM_DATE)<=TO_DATE(:LD_TO_DATE)))
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  19 - filter(((:LN_OPR_ID IS NULL OR "VEH"."OPR_ID"=:LN_OPR_ID) AND "VEH"."ac_number">=:LN_MIN_ac_number AND
              "VEH"."ac_number"<=:ln_max_number))
  21 - access("OOST"."CHP_ID_START"="CHP1"."CHP_ID")
  22 - access("OOST"."CHP_ID_END"="CHP2"."CHP_ID")

This query takes 2 minutes and 13 seconds to complete. Whereas a single operation (operation 19) takes 2 minutes and 8 seconds before feeding back its parent operation 17. At least I know where to focus my attention.

But Why? Why this full table scan?

Just before having this execution plan I have asked to re-gather statistics on tables and columns without histogram. Statistics seems to be Okay. And, by the way, it is normal that a full table scan is chosen by the CBO to generate 5,5 millions rows. Isn’t it?

So where is the problem?

An expert eye would say “look you start manipulating 5,5 million of rows to end up finally only with 10 rows; you need to start small and keep small”
So it is now time to investigate the query

SELECT  
                obst.obst_id obstructionId
               ,oost.comment_clob CommentClob
               ,chp1.ptcar_no StartPtcar
               ,chp2.ptcar_no EndPtcar
               ,oost.track_code Track
               ,oost.start_date StartPeriod
               ,oost.end_date EndPeriod
               ,oost.doc_no RelaasId
               ,obst.status_code Status
        FROM   T1 obst
             , T2 oost
             , T3 chp1
             , T3 chp2
          where obst.oost_id      = oost.oost_id
          and oost.chp_id_start = chp1.chp_id
          and oost.chp_id_end   = chp2.chp_id
          and obst.obs_stat     = 2 
		  and add_months(obst.mod_date,13) >= :ld_curr_date
		  and oost.start_date between :ld_from_date and :ld_to_date          
          and exists (select 1
                        from T4  obsv
                           , T5  veh
                        where  obsv.veh_id = veh.veh_id
		       and (:ln_opr_id is null
                               OR veh.opr_id = :ln_opr_id
                            )
                          and  obsv.obst_id = obst.obst_id
                          and  veh.ac_number between :ln_min_number and :ln_max_number
                          and  obsv.start_date > obst.start_date
                      )          
         order by obst.obst_id;		

As far as the most time consuming operation (table T5 full access) is in the EXISTS part I commented it and run the query which, unsurprisingly, came up instantaneously with about thousand of rows (1469). All in all the commented part when uncommented would normally keep from those 1469 rows only rows that fulfill the EXISTS clause and throw away the remaining rows. We need then to manipulate at least 1469 records. So why this enormous 5,5 millions rows?

Back to the execution plan where I decided to focus my attention only on the following operations:

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------
|* 17 |           HASH JOIN                 |                         |     52 |  64027 |  64027 |00:02:03.37 | 
|  18 |            TABLE ACCESS FULL        | T4                      |     52 |  64027 |  64027 |00:00:00.34 |
|* 19 |            TABLE ACCESS FULL        | T5                      |     41 |    569K|   5555K|00:02:08.32 | 
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  19 - filter(((:LN_OPR_ID IS NULL OR "VEH"."OPR_ID"=:LN_OPR_ID) AND "VEH"."ac_number">=:LN_MIN_ac_number AND
              "VEH"."ac_number"<=:ln_max_number))
 

The CBO started by generating 5555K worth of rows using the filter number 19 and pass this enormous amount of data to the HASH JOIN operation number 17 which throw away the majority of those rows from which only 64027 survived the join condition indicated by predicate number 17. It is clear that the ideal situation would have been to (a) join first and (b) filter in a second step. So why (again) the CBO opted for a reversed situation (a) filter first and (b) join after? The filtering bind variables (:LN_MIN_AC_NUMBER,:LN_MAX_NUMBER) are not filtering anything because they represent the real min (veh.ac_number) and max (veh.ac_number)

Why the CBO has opted for this sub-optimal execution plan? What can I do to make the CBO joining first and filtering after? Statistics seems Okay as I said above. I spent a couple of hours searching on how to make this possible without changing the query but failed to succeed. This is why I asked a question on the otn forum where Jonathan Lewis gave me a help and suggested to use the following hint in the subquery

          and exists (select /*+ no_unnest push_subq */ 1
                        from T4  obsv
                           , T5  veh
                        where  obsv.veh_id = veh.veh_id
		       and (:ln_opr_id is null
                               OR veh.opr_id = :ln_opr_id
                            )
                          and  obsv.obst_id = obst.obst_id
                          and  veh.ac_number between :ln_min_number and :ln_max_number
                          and  obsv.start_date > obst.start_date
                      )                   	

Which effectively gave the following optimal execution plan:

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                         |      1 |        |      6 |00:00:00.56 |
|   1 |  SORT ORDER BY                            |                         |      1 |    254 |      6 |00:00:00.56 |
|*  2 |   HASH JOIN                               |                         |      1 |    254 |      6 |00:00:00.11 |
|   3 |    TABLE ACCESS FULL                      | T3                      |      1 |   2849 |   2849 |00:00:00.01 |
|*  4 |    HASH JOIN                              |                         |      1 |    254 |      6 |00:00:00.11 |
|   5 |     TABLE ACCESS FULL                     | T3                      |      1 |   2849 |   2849 |00:00:00.01 |
|   6 |     NESTED LOOPS                          |                         |      1 |        |      6 |00:00:00.10 |
|   7 |      NESTED LOOPS                         |                         |      1 |    254 |   5012 |00:00:00.09 |
|*  8 |       TABLE ACCESS BY INDEX ROWID         | T2                      |      1 |    262 |   5012 |00:00:00.06 |
|*  9 |        INDEX RANGE SCAN                   | T2_OOST_START_DATE_I    |      1 |    857 |   7722 |00:00:00.01 |
|* 10 |       INDEX RANGE SCAN                    | T1_OBST_OOST_DK_I       |   5012 |      1 |   5012 |00:00:00.03 |
|* 11 |      TABLE ACCESS BY INDEX ROWID          | T1                      |   5012 |      1 |      6 |00:00:00.48 |
|  12 |       NESTED LOOPS                        |                         |   1277 |        |      6 |00:00:00.46 |
|  13 |        NESTED LOOPS                       |                         |   1277 |      2 |      6 |00:00:00.46 |
|  14 |         PARTITION REFERENCE ALL           |                         |   1277 |      4 |      6 |00:00:00.46 |
|* 15 |          TABLE ACCESS BY LOCAL INDEX ROWID| T4                      |  66380 |      4 |      6 |00:00:00.43 |
|* 16 |           INDEX RANGE SCAN                | T4_OBSV_OBST_FK_I       |  66380 |     86 |      6 |00:00:00.28 |
|* 17 |         INDEX UNIQUE SCAN                 | T5_VEH_PK               |      6 |      1 |      6 |00:00:00.01 |
|* 18 |        TABLE ACCESS BY GLOBAL INDEX ROWID | T5                      |      6 |      1 |      6 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OOST"."CHP_ID_END"="CHP2"."CHP_ID")
   4 - access("OOST"."CHP_ID_START"="CHP1"."CHP_ID")
   8 - filter(("OOST"."CHP_ID_START" IS NOT NULL AND "OOST"."CHP_ID_END" IS NOT NULL))
   9 - access("OOST"."START_DATE">=TO_DATE(' 2013-11-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "OOST"."START_DATE"
      <=TO_DATE(' 2013-11-27 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  10 - access("OBST"."OOST_ID"="OOST"."OOST_ID")
  11 - filter(("OBST"."OBS_STAT"=2 AND ADD_MONTHS(INTERNAL_FUNCTION("OBST"."MOD_DATE"),13)>=SYSDATE@! AND  IS NOT NULL))
  15 - filter("OBSV"."START_DATE">:B1)
  16 - access("OBSV"."OBST_ID"=:B1)
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  18 - filter(("VEH"."ac_number">=1 AND "VEH"."ac_number"<=99999))

This plan is doing exactly what I wanted it to do

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                    | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------------
|* 17 |         INDEX UNIQUE SCAN                 | T5_VEH_PK               |      6 |      1 |      6 |00:00:00.01 |
|* 18 |        TABLE ACCESS BY GLOBAL INDEX ROWID | T5                      |      6 |      1 |      6 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
  18 - filter(("VEH"."ac_number"&gt;=1 AND "VEH"."ac_number"&lt;=99999))

Joining first using the T5 primary key index and filtering in a second step. The Bloom filter and the Oracle internal view (VW_SQ_1) in the first plan disappeared from the new one. You know what the no_unnest hint does and the effect of push_pred hint has got on the query.

I will be very happy if someone could let me know how to obtain the desired execution plan without any hint and without re-writing the query

About these ads

5 Comments »

  1. Did you try gathering extended stats for T5 ?

    Comment by Jagjeet S — November 28, 2013 @ 2:32 pm | Reply

  2. Jagjeet,

    This is how I have collected statistics on T5

    BEGIN
       dbms_stats.gather_table_stats  (ownname          => user,-- owner of the table
                                       tabname          => ' T5',
                                       estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
                                       cascade          => true,
                                       method_opt       => 'FOR ALL COLUMNS SIZE 1'                          
                                      );
    END;
    /
    

    What do you suggest else?

    Thanks

    Comment by hourim — November 28, 2013 @ 2:35 pm | Reply

  3. Couple of points -

    if you are looking to fix the query without changing it/passing hints, that can be done by creating a new profile or oracle patch so that Oracle should use the optimum plan.

    However, if you do not want to use sql profile, I would like to give a try by gathering extended stats so that CBO understand cardinality better. though, I am not sure here it would work given the hint you used to get good plan but still no harm in trying.

    Extended Stats – Oracle finds the columns which are used together and creates a virtual columns to gather stats only so that it know cardinality better.

    please use below link.

    https://blogs.oracle.com/optimizer/entry/how_do_i_know_what_extended_statistics_are_needed_for_a_given_workload

    Let us know how it goes.

    Thanks

    Comment by Jagjeet S — November 28, 2013 @ 3:50 pm | Reply

  4. Did you get chance to play with extended stats ?

    Comment by Jagjeet Singh — November 29, 2013 @ 9:59 am | Reply

  5. Jagjeet,

    I don’t think I will have again the opportunity to work on this issue as far as the client has already considered it to be solved. However, I will try to model the issue and If I succeed to reproduce it then of course I will try to play with the extended statistics. Thanks by the way for the link from the optimizer blog.

    Best regards

    Comment by hourim — November 29, 2013 @ 2:23 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Database 11g

Oracle Database

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

EU Careers info

Your career in the European Union

Oracle SQL Tuning Tools and Tips

SQL Tuning, eAdam, eDB360, SQLT, SQLHC

Oracle Scratchpad

Just another Oracle weblog

Tanel Poder's blog: Responsible data management

Linux, Oracle, Exadata and Hadoop.

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

Follow

Get every new post delivered to your Inbox.

Join 78 other followers

%d bloggers like this: