Mohamed Houri’s Oracle Notes

November 22, 2013

On the importance of collecting statistics adequately

Filed under: Oracle — hourim @ 3:22 pm

Very recently I was handed the following sub-optimal execution plan of a classical MERGE statement between two tables, t1 and t2:

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                        |           |   379 | 24635 |  1107  (19)| 00:00:05 |       |       |
|   1 |  MERGE                                 | T1        |       |       |            |          |       |       |
|   2 |   VIEW                                 |           |       |       |            |          |       |       |
|   3 |    SEQUENCE                            | SEQ_SEQ   |       |       |            |          |       |       |
|*  4 |     HASH JOIN OUTER                    |           |   379 | 23119 |  1107  (19)| 00:00:05 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T2        |   372 |  8184 |     5   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | AUDIT_IND |   383 |       |     3   (0)| 00:00:01 |       |       |
|   7 |      PARTITION RANGE ALL               |           |  5637K|   209M|  1046  (15)| 00:00:05 |     1 |     2 |
|   8 |       TABLE ACCESS FULL                | T1        |  5637K|   209M|  1046  (15)| 00:00:05 |     1 |     2 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("L"."COL_SK"(+)="COL_SK")
6 - access("AUDIT_ID"=246395)

According to this execution plan there are only 379 rows to be merged. But the client was complaining about the 12 seconds this MERGE statement was taking and asked for an improvement. It is quite clear from the supplied information that the CBO is hash joining the 5,637,000 rows of the probed table t1 (operation 8) with the 372 rows of the built table t2 (operation 5 and predicate 6), burning a lot of CPU at operation 4 and finally throwing away the majority of those 5,637,000 rows from table t1 because they do not fulfil the join condition (predicate 4).
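As a side note, a plan like the one above, including its predicate section, can be displayed from the cursor cache with dbms_xplan; a minimal sketch, where &sql_id is a placeholder for the sql_id of the MERGE statement:

-- display the cached execution plan (estimates and predicates) of a given cursor
select * from table(dbms_xplan.display_cursor('&sql_id', null, 'TYPICAL'));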

Here is the MERGE statement:

MERGE INTO T1 L USING
  (SELECT COL_SK, LOAD_DTS
   FROM   T2
   WHERE  AUDIT_ID = 246395) H
ON (L.COL_SK = H.COL_SK)
WHEN NOT MATCHED THEN
  INSERT (COL1, COL_SK, COL2, LOAD_DTS, ORIG, AUDIT_ID)
  VALUES (SEQ_SEQ.NEXTVAL, H.COL_SK, -1, H.LOAD_DTS, 'MHO', 246395);

Looking carefully at the above statement, I realized that the 372 rows come from this subquery:

(SELECT
   COL_SK,
   LOAD_DTS
FROM T2
WHERE AUDIT_ID = 246395
) H

There are 372 values of the join column, COL_SK, to be merged. So why has the CBO not started by getting those 372 values from table t2 and then looked up the second table t1 via the join column to satisfy the join condition

 ON (L.COL_SK = H.COL_SK)

This kind of execution path becomes even more appropriate when you know that there is, in addition, a unique index on table t1 starting with the join column:

create unique index IND_T1_UK on t1 (col_sk, other_column);

When I hinted the MERGE statement to use this unique index

MERGE /*+ index(L IND_T1_UK) */ INTO T1 L USING
  (SELECT COL_SK, LOAD_DTS
   FROM   T2
   WHERE  AUDIT_ID = 246395) H
ON (L.COL_SK = H.COL_SK)
WHEN NOT MATCHED THEN
  INSERT (COL1, COL_SK, COL2, LOAD_DTS, ORIG, AUDIT_ID)
  VALUES (SEQ_SEQ.NEXTVAL, H.COL_SK, -1, H.LOAD_DTS, 'MHO', 246395);

I got the optimal plan, the one I wanted the CBO to use without any hint:

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                        |             |   379 | 24635 |  1116   (1)| 00:00:05 |       |       |
|   1 |  MERGE                                 | T1          |       |       |            |          |       |       |
|   2 |   VIEW                                 |             |       |       |            |          |       |       |
|   3 |    SEQUENCE                            | SEQ_SEQ     |       |       |            |          |       |       |
|   4 |     NESTED LOOPS OUTER                 |             |   379 | 23119 |  1116   (1)| 00:00:05 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T2          |   372 |  8184 |     5   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | AUDIT_IND   |   383 |       |     3   (0)| 00:00:01 |       |       |
|   7 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T1          |     1 |    39 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  8 |       INDEX RANGE SCAN                 | IND_T1_UK   |     1 |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("AUDIT_ID"=246395)
8 - access("L"."COL_SK"(+)="COL_SK")

You can easily see from the above two execution plans that the cost of the NESTED LOOPS join (1116) is greater than the cost of the HASH JOIN (1107), and this is the main reason why the CBO prefers the sub-optimal HASH JOIN plan. Because a HASH JOIN cannot perform index lookups on the probed table t1 using the join column col_sk, the CBO condemned the MERGE statement to a costly full table scan.

Finally, solving my problem comes down to understanding why the cost of the NESTED LOOPS (NL) join is greater than the cost of the HASH JOIN.

The cost of a nested loop join is given by

Cost of acquiring data from first table +
               Cardinality of result from first table * Cost of single visit to second table
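
Plugging the figures from the plans above into this formula gives a rough back-of-the-envelope check (not the CBO's exact arithmetic): taking 5 as the cost of acquiring the 372 rows from t2 and 3 as the cost of a single indexed visit to t1,

               cost of NL join ≈ 5 + 372 * 3 = 1121

which is in the same region as the 1116 reported for the hinted NESTED LOOPS plan.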

So, in your opinion, which piece of statistical information could play an important role in the decision the CBO makes for the NL join method? Think a little before scrolling down the page.




















It is the density of the join column (col_sk).

SQL> select
          num_distinct
         ,density
         ,num_nulls
         ,last_analyzed
         ,sample_size
         ,global_stats
         ,histogram
  from   all_tab_col_statistics
  where table_name ='T1'
  and   column_name='COL_SK';

NUM_DISTINCT     DENSITY  NUM_NULLS LAST_ANALYZED       SAMPLE_SIZE GLO HISTOGRAM
------------ ----------- ---------- ------------------- ----------- --- ---------
   5,546,458 0.000000180          0 2013/11/07 22:00:31   5,546,458 YES NONE      --> spot the join column last_analyzed date
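
Note that, since col_sk carries no histogram, the density shown here is simply the inverse of num_distinct:

   density = 1 / num_distinct = 1 / 5,546,458 ≈ 0.000000180

so the density the CBO uses for the join comes straight from num_distinct, and a stale num_distinct feeds directly into the join selectivity and cost estimates.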

select
    table_name
    ,last_analyzed
from all_tables
where table_name = 'T1';

TABLE_NAME    LAST_ANALYZED
------------- ---------------------
T1            2013/11/21 05:00:31     --> spot the table last_analyzed date

select count(1) from (select distinct col_sk from t1);   --> 5,447,251

Two important points are to be emphasized here:

1. The table t1 was last analyzed on 21/11/2013 while its join column was last analyzed on 07/11/2013 (a single query comparing the two dates is sketched right after this list)
2. There is a difference of about 100K between the real count of distinct col_sk values and the num_distinct of col_sk as reported by all_tab_col_statistics
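
A quick way to spot this kind of drift is to compare the two last_analyzed dates, together with the NDV, in a single dictionary query; a simple sketch:

select t.table_name,
       t.last_analyzed  tab_last_analyzed,
       c.last_analyzed  col_last_analyzed,
       c.num_distinct,
       c.density
from   all_tables             t,
       all_tab_col_statistics c
where  t.owner       = c.owner
and    t.table_name  = c.table_name
and    t.table_name  = 'T1'
and    c.column_name = 'COL_SK';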

A deeper look at the parameters used to compute the table statistics shows that they are gathered with two inadequate parameter values:

  method_opt        => null
  estimate_percent  => a given value

The first setting, when used, translates to: collect statistics on the table and ignore statistics on its columns. This explains the difference between the last analyzed date of the table and that of its join column. It also explains the discrepancy between the num_distinct of the join column and its real count when taken directly from table t1. My preferred value for the method_opt parameter is:

  method_opt        => 'for all columns size 1'

which collects statistics for all columns without collecting histograms.

The second parameter, estimate_percent, indicates the sample size (precision) with which statistics should be gathered. From 11g onwards, my preferred value for this parameter is

  estimate_percent  => dbms_stats.auto_sample_size

particularly when the approximate_ndv preference is set to true:

SELECT DBMS_STATS.get_prefs('approximate_ndv') FROM dual;

DBMS_STATS.GET_PREFS('APPROXIMATE_NDV')
--------------------------------------------------------------------------------
TRUE
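
Had this preference not already been set, it could be enabled, for example at the global level; a minimal sketch:

-- enable the approximate NDV algorithm used by dbms_stats (11g and above)
exec dbms_stats.set_global_prefs('APPROXIMATE_NDV', 'TRUE');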

Back to my MERGE problem. When the statistics were gathered again on table t1 using the above adequate dbms_stats parameters

BEGIN
   dbms_stats.gather_table_stats  (ownname          => user,
                                   tabname          => 'T1',
                                   estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
                                   cascade          => true,
                                   method_opt       => 'FOR ALL COLUMNS SIZE 1'                          
                                  );
END;
/

the CBO automatically started selecting the optimal plan using the NESTED LOOPS join, as shown by the following execution plan taken from memory:


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                        |             |       |       |  1078 (100)|          |       |       |
|   1 |  MERGE                                 | T1          |       |       |            |          |       |       |
|   2 |   VIEW                                 |             |       |       |            |          |       |       |
|   3 |    SEQUENCE                            | SEQ_SEQ     |       |       |            |          |       |       |
|   4 |     NESTED LOOPS OUTER                 |             |   357 | 21777 |  1078   (1)| 00:00:05 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T2          |   357 |  7854 |     5   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | AUDIT_IND   |   367 |       |     3   (0)| 00:00:01 |       |       |
|   7 |      TABLE ACCESS BY GLOBAL INDEX ROWID| T1          |     1 |    39 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  8 |       INDEX RANGE SCAN                 | IND_T1_UK   |     1 |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("AUDIT_ID"=:B1)
8 - access("L"."COL_SK"="COL_SK")

Now, before ending this article, I would like to show two things:

The predicate part of the sub-optimal plan:

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("L"."COL_SK"(+)="COL_SK")
6 - access("AUDIT_ID"=:B1)

And the predicate part of the optimal plan:

Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("AUDIT_ID"=:B1)
8 - access("L"."COL_SK"="COL_SK")

In the optimal plan we start by taking the 357 (or 379) rows to be merged from the outer table using predicate 6, and then we probe the inner table with these 357 join column values via an index range scan. The majority of the rows are eliminated much earlier.

In the sub-optimal plan it is the last operation, the HASH JOIN OUTER, that eliminates the majority of the rows, which means in this case that we started big and only finished small. To have a performant query we need to start small and try to stay small.

Bottom line: when you meet a sub-optimal execution plan using a HASH JOIN driven by a costly full table scan on the probed table, while you know that this probed table could very easily become the inner table of a NESTED LOOPS join accessed via a precise index on the join column, then verify the statistics on that join column, because they play an important role in the cost of the NESTED LOOPS join.
