Mohamed Houri’s Oracle Notes

September 2, 2015

CBO decision: unique or non-unique index?

Filed under: Oracle — hourim @ 5:43 pm

I have been asked to look at one of those few particular frustrating situations that only running systems can procure. It is an update of a single table using a complete set of its primary key columns in order to locate and update a unique row. This update looks like:

UPDATE T1
SET 
  {list of columns}
WHERE 
    T1_DATE    = :B9
AND T1_I_E_ID  = :B8
AND T1_TYPE    = :B7
AND DATE_TYPE  = :B6
AND T1_AG_ID   = :B5
AND T1_ACC_ID  = :B4
AND T1_SEC_ID  = :B3
AND T1_B_ID    = :B2
AND T1_FG_ID   = :B1;

The 9 columns in the above where clause represent the primary key of the T1 table. You might be surprised to know that this update didn’t used the primary key index and preferred instead a range scan of an existing 3 columns index plus a table access by index rowid to locate and update a unique row:

----------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT             |        |       |       |     1 (100)|
|   1 |  UPDATE                      | T1     |       |       |            |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |     1 |   122 |     1   (0)|
|   3 |    INDEX RANGE SCAN          | IDX_T1 |     1 |       |     1   (0)|
----------------------------------------------------------------------------
   2 - filter("T1_TYPE"=:B7 AND "DATE_TYPE"=:B6 AND "T1_I_E_ID"=:B8 AND 
              "T1_AG_ID"=TO_NUMBER(:B5) AND "T1_DATE"=TO_TIMESTAMP(:B9) AND 
              "T1_FG_ID"=TO_NUMBER(:B1))
   3 - access("T1_SEC_ID"=TO_NUMBER(:B3) AND "T1_B_ID"=TO_NUMBER(:B2) AND 
              "T1_ACC_ID"=TO_NUMBER(:B4)) 

The same update, when hinted with the primary key index, uses the following much desirable execution plan:

------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |        |     1 |   126 |     1   (0)|
|   1 |  UPDATE            | T1     |       |       |            |
|*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1   (0)|
------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1_I_E_ID"=:B8 AND "T1_TYPE"=:B7 AND "DATE_TYPE"=:B6 
              AND "T1_AG_ID"=TO_NUMBER(:B5) AND "T1_ACC_ID"=TO_NUMBER(:B4) AND 
              "T1_SEC_ID"=TO_NUMBER(:B3) AND "T1_B_ID"=TO_NUMBER(:B2) AND 
              "T1_FG_ID"=TO_NUMBER(:B1) AND "T1_DATE"=TO_TIMESTAMP(:B9))

I don’t know how Oracle manage to get the same cost (Cost = 1) for two completely different indexes one with 9 columns and one(which is a subset of the first index) with only 3 columns (not necessarily of the same starting order though)?

So why Oracle has not selected the primary key unique index?

First here are below the available statistics on the primary key columns:

SQL> select
      column_name
     ,num_distinct
     ,num_nulls
     ,histogram
    from
      all_tab_col_statistics
    where
     table_name = 'T1'
   and
    column_name in ('T1_DATE'
                   ,'T1_I_E_ID'
                   ,'T1_TYPE'
                   ,'DATE_TYPE'
                   ,'T1_AG_ID'
                   ,'T1_ACC_ID'
                   ,'T1_SEC_ID'
                   ,'T1_B_ID'
                   ,'T1_FG_ID' );

COLUMN_NAME         NUM_DISTINCT  NUM_NULLS HISTOGRAM
------------------- ------------ ---------- ---------------
T1_I_E_ID              2          0 		FREQUENCY
T1_TYPE                5          0 		FREQUENCY
DATE_TYPE              5          0 		FREQUENCY
T1_AG_ID               106        0 		FREQUENCY
T1_ACC_ID             182         0 		FREQUENCY
T1_DATE               2861        0 		HEIGHT BALANCED
T1_SEC_ID             3092480     0 		NONE
T1_B_ID               1452        0 		HEIGHT BALANCED
T1_FG_ID              1           0 		FREQUENCY

And here’s the corresponding “update” 10053 trace file restricted only to the important part related to my investigations

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 387027527  #Blks:  6778908  AvgRowLen:  126.00  ChainCnt:  0.00

Index Stats::
Index: IDX_T1  Col#: 7 8 5
 LVLS: 3  #LB: 1568007  #DK: 3314835  LB/K: 1.00   DB/K: 25.00  CLUF: 84758374.00

Index: PK_T19  Col#: 1 2 3 4 5 7 8 37 6
 LVLS: 4  #LB: 4137117  #DK: 377310281  LB/K: 1.00  DB/K: 1.00  CLUF: 375821219.00

And the part of same trace file where the index choice is done

Access Path: index (UniqueScan)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647 ----------------------> spot this
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000 
    Cost: 1.00  Resp: 1.00  Degree: 1
    ColGroup Usage:: PredCnt: 3  Matches Full:  Partial: 
    ColGroup Usage:: PredCnt: 3  Matches Full:  Partial:

Access Path: index (AllEqRange)
    Index: IDX_T1
    resc_io: 5.00  resc_cpu: 36797 ----------------------> spot this
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000 
    Cost: 1.00  Resp: 1.00  Degree: 1

Best:: AccessPath: IndexRange
  Index: IDX_T1
    Cost: 1.00  Degree: 1  Resp: 1.00  Card: 0.00  Bytes: 0

Looking closely to the above trace file I didn’t find any difference in the index costing information (ix_sel_with_filters, resc_io, cost) which favours the non-unique IDX_T1 index over the PK_T19 primary key unique index except the resc_cpu value which equals 36797 for the former and 37647 for the latter index. I haven’t considered the clustering factor information because the index I wanted the CBO to use is a unique index. The two indexes have the same cost in this case. What extra information the CBO is using in this case to prefer the non-unique index over the unique one?

This issue remembered me an old otn thread in which the Original Poster says that, under the default CPU costing model, when two indexes have the same cost, Oracle will consider using the less CPU expensive index.

As far as I have a practical case of two different indexes with the same cost, I have decided to check this assumption by changing the costing model from CPU to I/O

SQL> alter session set "_optimizer_cost_model"=io;

SQL> explain plan for 
UPDATE T1
SET 
  {list of columns}
WHERE 
    T1_DATE   = :B9
AND T1_I_E_ID = :B8
AND T1_TYPE   = :B7
AND DATE_TYPE = :B6
AND T1_AG_ID  = :B5
AND T1_ACC_ID = :B4
AND T1_SEC_ID = :B3
AND T1_B_ID   = :B2
AND T1_FG_ID  = :B1;

SQL> select * from table(dbms_xplan.display);

Plan hash value: 704748203
-------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | UPDATE STATEMENT   |        |     1 |   126 |     1 |
|   1 |  UPDATE            | T1     |       |       |       |
|*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1 |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1_I_E_ID"=:B8 AND "T1_TYPE"=:B7 AND
              "DATE_TYPE"=:B6 AND "T1_AG_ID"=TO_NUMBER(:B5) AND
              "T1_ACC_ID"=TO_NUMBER(:B4) AND "T1_SEC_ID"=TO_NUMBER(:B3) AND
              "T1_B_ID"=TO_NUMBER(:B2) AND "T1_FG_ID"=TO_NUMBER(:B1) AND
              "T1_DATE"=TO_TIMESTAMP(:B9))

Note
-----
   - cpu costing is off (consider enabling it)

Spot on. We get the desired primary key index without any help.

However changing the default costing model is not acceptable in the client PRODUCTION database. Continuing my root causes investigations I was getting the feeling that histograms are messing up this CBO index choice. This is why I decided to give it a try and get rid of histograms and analyze the corresponding 10053 trace file:

SQL> exec dbms_stats.gather_table_stats (user, 'T1', method_opt => 'for all columns size 1');


***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 387029216  #Blks:  6778908  AvgRowLen:  126.00  ChainCnt:  0.00

Index Stats::
  Index: IDX_T1  Col#: 7 8 5
  LVLS: 3  #LB: 1625338  #DK: 3270443  LB/K: 1.00   DB/K: 26.00  CLUF: 87831324.00
  
  Index: PK_T19  Col#: 1 2 3 4 5 7 8 37 6
  LVLS: 4  #LB: 4335908  #DK: 395902926  LB/K: 1.00  DB/K: 1.00  CLUF: 394578898.00

Access Path: index (UniqueScan)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000 
    Cost: 1.00  Resp: 1.00  Degree: 1
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.0000
  ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.0000

  Access Path: index (AllEqRange)
    Index: IDX_T1
    resc_io: 31.00  resc_cpu: 370081
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000 
    Cost: 6.20  Resp: 6.20  Degree: 1 ------> spot the cost

Access Path: index (AllEqUnique)
    Index: PK_T19
    resc_io: 5.00  resc_cpu: 37647
    ix_sel: 0.000000  ix_sel_with_filters: 0.000000 
    Cost: 1.00  Resp: 1.00  Degree: 1
 One row Card: 1.000000

  Best:: AccessPath: IndexUnique
  Index: PK_T19
    Cost: 1.00  Degree: 1  Resp: 1.00  Card: 1.00  Bytes: 0

And now that the cost of accessing the non-unique index became 6 times (Cost = 6,2) greater than that of the unique index (Cost =1) Oracle preferred the primary key index without any help:

============
Plan Table
============
-------------------------------------------------+----------------------+
| Id  | Operation           | Name  | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------+----------------------+
| 0   | UPDATE STATEMENT    |       |       |       |     1 |           |
| 1   |  UPDATE             | T1    |       |       |       |           |
| 2   |   INDEX UNIQUE SCAN | PK_T19|     1 |   126 |     1 |  00:00:01 |
-------------------------------------------------+-----------------------+

The final acceptable decision to solve this issue was to hint an instance of the same query to use the primary key index and attach an SQL profile to the original packaged query using the plan_hash_value of the primary key index execution plan.

Bottom Line: under the default CPU costing model, when two (or more) indexes have the same cost Oracle will prefer using the index that it is going to consume the less amount of CPU (resc_cpu). And, before jumping on collecting histograms (particularly the Height Balanced ones) by default, be informed that they do participate in the perception Oracle has on the amount of CPU the different indexes will need and ultimately on the CBO index desirability.

Advertisements

8 Comments »

  1. Nice post Mohamed.
    I’m glad you remembered my post on OTN 😉

    Comment by ahmed aangour — September 2, 2015 @ 8:16 pm | Reply

  2. Since IDX_T1 is not a unique index, the expected cardinality is important. In your first pass, Oracle came up with an expected single row to fetch. If you run an explain plan with an index hint for IDX_T1 in the second case (no histogram), you could see the expected cardinality being different. The cost depends on the number of blocks it thinks it needs to retrieve. The number of blocks depends on the number of rows, the height of the index and the clustering factor. (for expected row count of 1, you could ignore the clustering factor).

    Comment by Hemant K Chitale — September 3, 2015 @ 3:09 am | Reply

    • Herman

      Here’s what the CBO came up with when hinted to use the IDX_T1 index in absence of histogram
      and when not hinted respectively (predicate part omitted)

      ----------------------------------------------------------------------------
      | Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)|
      ----------------------------------------------------------------------------
      |   0 | UPDATE STATEMENT             |        |     1 |   126 |     6   (0)|
      |   1 |  UPDATE                      | T1     |       |       |            |
      |*  2 |   TABLE ACCESS BY INDEX ROWID| T1     |     1 |   126 |     6   (0)|
      |*  3 |    INDEX RANGE SCAN          | IDX_T1 |   118 |       |     1   (0)|
      ----------------------------------------------------------------------------
      
      ------------------------------------------------------------------
      | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)|
      ------------------------------------------------------------------
      |   0 | UPDATE STATEMENT   |        |     1 |   126 |     1   (0)|
      |   1 |  UPDATE            | T1     |       |       |            |
      |*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1   (0)|
      ------------------------------------------------------------------
      

      You’are right. When the CBO expects to get 118 rows from the IDX_T1 (Cost =6) it falls back to the PK_T19 primary key index from where it expects to get a unique row (Cost = 1)

      The aim of this article is to investigate what extra information the CBO will use in its index choice when multiple indexes have the same cost. This identical cost can be due to imprecise histogram
      or to the absence of statistics as it was the case in the otn thread I refered to it above, or to any other reason.
      And it seems in this case, that Oracle considers the resc_cpu (amount of CPU) needed by the index before making its definitif choice

      Best regards

      Comment by hourim — September 3, 2015 @ 7:09 am | Reply

  3. Mohamed,
    did you try to change the index names? Jonathan Lewis mentioned in https://jonathanlewis.wordpress.com/2012/02/06/index-naming/ a similar case: “The two indexes produce the same cost (and you can check this in more detail in the 10053 trace) – so why does Oracle use the “wrong” index ? Because when you sort the index names alphabetically, T1_I1 comes before T1_PK. If you doubt me at this point, by the way, just run the test and then repeat it using the name T1_XX instead of T1_I1.”

    Regards
    Martin

    Comment by Martin Preiss — September 21, 2015 @ 11:50 am | Reply

    • Martin,

      I shoudl have mentioned that I played with this renaming indexes when the issue occured. I think that the alphabetic index name
      matters only when both cost and resc_cpu are identical

      SQL> alter index IDX_T1 rename to ZIDX_T1;
      
      -----------------------------------------------------------------------------
      | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)|
      -----------------------------------------------------------------------------
      |   0 | UPDATE STATEMENT             |         |     1 |   126 |     1   (0)|
      |   1 |  UPDATE                      | T1      |       |       |            |
      |*  2 |   TABLE ACCESS BY INDEX ROWID| T1      |     1 |   126 |     1   (0)|
      |*  3 |    INDEX RANGE SCAN          | ZIDX_T1 |     1 |       |     1   (0)|
      -----------------------------------------------------------------------------
      

      Back to the original index name

      			  
      SQL> alter index ZIDX_T1  rename to IDX_T1;
      
      ----------------------------------------------------------------------------
      | Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)|
      ----------------------------------------------------------------------------
      |   0 | UPDATE STATEMENT             |        |     1 |   126 |     1   (0)|
      |   1 |  UPDATE                      | T1     |       |       |            |
      |*  2 |   TABLE ACCESS BY INDEX ROWID| T1     |     1 |   126 |     1   (0)|
      |*  3 |    INDEX RANGE SCAN          | IDX_T1 |     1 |       |     1   (0)|
      ----------------------------------------------------------------------------
      

      Create a SQL Profile to use the PK index

       
      SQL> @create_1_hint_sql_profile
      Enter value for sql_id: bftt2ghkddthn
      Enter value for profile_name (PROFILE_sqlid_MANUAL):
      Enter value for category (DEFAULT):
      Enter value for force_matching (false): TRUE
      Enter value for hint_text: index(T1@UPD$1 PK_T19)
      Profile PROFILE_bftt2ghkddthn_MANUAL created. 
      
      ------------------------------------------------------------------
      | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)|
      ------------------------------------------------------------------
      |   0 | UPDATE STATEMENT   |        |     1 |   126 |     1   (0)|
      |   1 |  UPDATE            | T1     |       |       |            |
      |*  2 |   INDEX UNIQUE SCAN| PK_T19 |     1 |   126 |     1   (0)|
      ------------------------------------------------------------------
       
      Note
      -----
         - SQL profile "PROFILE_bftt2ghkddthn_MANUAL" used for this statement   
      

      Comment by hourim — September 21, 2015 @ 1:40 pm | Reply

      • thank you for the addition: I didn’t consider resc_cpu in this context till now.

        Comment by Martin Preiss — September 21, 2015 @ 2:26 pm

  4. Hey can i have ur email or some think where we can talk in privet x3 my facebook is kiyoshi farouknaimi

    Comment by farouk naimi — December 24, 2015 @ 1:14 pm | Reply

  5. That is really attention-grabbing, You are aan excesswively professional blogger.

    I’ve joined your rss feeed and sit up ffor seeking extra of your wonderful post.
    Additionally, I have shared yohr web site in my socxial
    networks

    Comment by fantastic video games — February 14, 2016 @ 12:18 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud Oracle's blog

Just another Oracle blog : Database topics and techniques

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

EU Careers info

Your career in the European Union

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

%d bloggers like this: