Mohamed Houri’s Oracle Notes

February 28, 2014

Why my query is not using my unique index

Filed under: Oracle — hourim @ 8:35 am

Here’s a point that I have been asked to explain and that I failed to answer. It is a simple select on a table with an equality predicate on two columns. These two columns are indexed using a two columns unique index. So it seems quite obvious that when selecting from this table using these two columns to (a) select a unique row and (b) use that unique index for that purpose. However, the CBO seems making an illogic decision per regards to the choice of the index access when there is no statistics on the table (num_rows and last_analyzed are null) while there is 0 statistics in the indexes. Things will become clear with a concrete example.

As very often, I will use one of the Jonathan Lewis table scripts creation and population.

 create table t1
   (id number,
    n_1000 number,
    n_5000 number,
    n_10000 number,
    small_vc varchar2(20),
    padding  varchar2(100)
   );

create unique index a_unq_ind on t1(id, n_1000);
create index b_non_unq_ind on t1(n_1000, n_5000);

Spot how I have managed to name my indexes(unique and non-unique) in an alphabetic order so that when this specific order matters the CBO will choose the unique index as far as it starts with the ‘a’ letter.

Now that I have created my table and my two indexes I can populate the table with data without transmitting statistics to the indexes.

 insert into t1
  with generator as (
   select   --+ materialize
    rownum id
   from dual
  connect by
  rownum <= 10000
)
select
    rownum                    id,
    mod(rownum,1000) n_1000,
    mod(rownum,5000) n_5000,
    mod(rownum,10000) n_10000,
    lpad(rownum,10,'0')       small_vc,
    rpad('x',100)             padding
from
    generator        v1,
    generator        v2
where
rownum <= 100000
;

commit;

Before starting my select, let me show you the actual table and index statistics


SQL> select table_name, num_rows, last_analyzed from user_tables where table_name = 'T1';

TABLE_NAME   NUM_ROWS LAST_ANALYZED
---------- ---------- -----------------
T1

SQL> select index_name, num_rows, clustering_factor from user_indexes where table_name = 'T1';

INDEX_NAME        NUM_ROWS CLUSTERING_FACTOR
--------------- ---------- -----------------
B_NON_UNQ_IND            0                 0
A_UNQ_IND                0                 0

And now my desired select and its corresponding execution plan (statistics_level have been set to all before)

SQL> select * from t1
     where id = 1
     and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------- ---------- ---------- ---------- -------------------- -----------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 0
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3790258116

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |               |      1 |        |      1 |00:00:00.01 |     103 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1            |      1 |     15 |      1 |00:00:00.01 |     103 |
|*  2 |   INDEX RANGE SCAN                  | B_NON_UNQ_IND |      1 |    615 |    100 |00:00:00.01 |       3 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=1)
2 - access("N_1000"=1)

It is clear by comparing the E-Rows and A-Rows that the statistics are not reflecting the reality.

And when I force the use of the unique index the CBO seems doing good estimations and generation less buffer gets

SQL> select /*+ index( t1 A_UNQ_IND ) */ * from t1  where id = 1  and   n_1000= 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------- ---------- ---------- ---------- --------------------
1          1          1          1 0000000001           x

SQL_ID  bt11jwur90xg0, child number 0
-------------------------------------
select /*+ index( t1 A_UNQ_IND ) */ * from t1  where id = 1  and n_1000= 1;

Plan hash value: 3585360496
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

While this unique index execution plan seems optimal with correct estimations and less Buffers gets it is nevertheless not the index that the CBO desires to use.

Anyway, you are going to object and to say we are not going to run a real life application without having correct statistics gathered. Believe me that if I am writing this case here it is because it has happened in a real life production application. Having said that let’s collect statistics and see the new reaction of the CBO

SQL> exec dbms_stats.gather_table_stats(user ,'t1');

PL/SQL procedure successfully completed.

SQL> select * from t1
where id = 1
and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
---------------------------------------------------------------------------------------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 1
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3585360496

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

Note
-----
- statistics feedback used for this statement

After having collected statistics the CBO is now choosing the right index.

By the way I am doing this test case on a 12c data base (you might have already guessed that through the presence of the word BATCHED in the execution plan) and seeing this statistics feedback used note, pushed me to flush the shared pool and re-run my query in order to be sure that the use of the unique index is not due to this statistics feedback feature

SQL> alter system flush shared_pool;

System altered.

SQL> select * from t1
where id = 1
and   n_1000 = 1;

ID     N_1000     N_5000    N_10000 SMALL_VC             PADDING
--------- ---------- ---------- ---------- -------------------- ---------
1          1          1          1 0000000001           x

SQL_ID  29tnq7b69swdr, child number 0
-------------------------------------
select * from t1  where id = 1  and   n_1000 = 1

Plan hash value: 3585360496
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX UNIQUE SCAN         | A_UNQ_IND |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1 AND "N_1000"=1)

That is the whole story. If you want to learn more about this particular situation then you have to read this related otn thread.

Bottom line: always be sure to have adequate statistics so that you will give the CBO all chances to produce an adequate plan

PS : I started by trying to explain why the CBO didn’t choose the unique index and ended up while writing this article to two questions

1 : why dynamic sampling didn’t occur : Martin Press gave a clue that needs only to be tested

2 : when the CBO uses the index range scan the table is accessed via this new 12c  TABLE ACCESS BY INDEX ROWID BATCHED  that has been explained by Timur Akhmadeed but when it uses the unique index scan the CBO opted for the classical TABLE ACCESS BY INDEX ROWID

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud Oracle's blog

Just another Oracle blog : Database topics and techniques

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

EU Careers info

Your career in the European Union

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

%d bloggers like this: