Mohamed Houri’s Oracle Notes

June 24, 2016

Index Skip Scan : how many columns are skippable?

Filed under: Index — hourim @ 10:49 am

If you want to know what an index skip scan operation is then Richard Foote article and Jonathan Lewis one you can find here and here respectively are two reliable references.

This article deals about a particular query using an index skip scan with the following Real Time SQL Monitoring (RTSM) report

Global Information
------------------------------
 Status              :  EXECUTING  
 Instance ID         :  1                   
 Session             :  C##MHOURI (965:52281)  
 SQL ID              :  c2mah5hxau88v       
 SQL Execution ID    :  16777216            
 Execution Started   :  06/15/2016 13:03:15 
 First Refresh Time  :  06/15/2016 13:03:21 
 Last Refresh Time   :  06/15/2016 14:49:32 
 Duration            :  6378s                  
 Module/Action       :  SQL*Plus/-          
 Service             :  orcl                
 Program             :  sqlplus.exe         
 Fetch Calls         :  2280           


Global Stats
================================================================
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
================================================================
|    6385 |    193 |     6192 |   2280 |   2M   | 2M   |  14GB |
================================================================
  
SQL Plan Monitoring Details (Plan Hash Value=517922901)
======================================================================================
| Id |               Operation               | Name  |   Time    | Start  | Activity |
|    |                                       |       | Active(s) | Active |   (%)    |
======================================================================================
|  0 | SELECT STATEMENT                      |       |      6297 |     +67|          |
|  1 |   TABLE ACCESS BY INDEX ROWID BATCHED | T     |      6297 |     +67|      0.52|
|->2 |    INDEX SKIP SCAN                    | T_IDX |      6378 |     +2 |     99.48|
======================================================================================

As you can see after 6385 seconds the query is still running. The Start Active column when correlated with the Time Active one indicates that 99% of this time is spent on the index skip scan operation. We all know that an index skip scan is an indicator of a missing index or a pointer to a wrongly designed one but, what has retained my attention in this case is that this skipped index is a 7 columns composed index of which the first 6 columns have been skipped.

Now the question turns to: are those 6 leading columns very repetitive(contain few distinct values) so that they can be skipped? Let’s first see the size of this index and how its columns values are distributed respectively:

SQL> @sizeBysegName
Enter value for segment_name : IDX

SEGMENT_TYPE       TABLESPACE_NAME       SEGMENT_NAME     PARTITION    GB
------------------ --------------------- --------------   ----------  -------
INDEX              IDX_TBS                T_IDX                        28.44
                                                                       -------
Total Segment Size                                                     28.44

SQL> col column_name format a20
SQL> select
  2         i.column_name
  3        ,i.column_position
  4       ,s.num_distinct
  5      from
  6         user_ind_columns i
  7       , user_tab_col_statistics s
  8       where
  9        i.column_name = s.column_name
 10       and   index_name = 'T_IDX'
 11       order by column_position asc;

COLUMN_NAME          COLUMN_POSITION NUM_DISTINCT
-------------------- --------------- ------------
BKR_ID                             1         1337
PRD_ID                             2      3342592
PRD_QTY                            3       197008
IS_PROCESED                        4            2
OPR_PRD_QTY                        5       170576
PRD_FMLY                           6            9
PROCESS_ID                         7            1

7 rows selected.

Interestingly, except the first column which is relatively repetitive when compared to the index size, neither the second column which contains more than 3 million of distinct values nor the third and the fifth which count for about 2 hundreds of thousands of distinct values, can be considered as repetitive. So why they have been skipped?

Additionally, skipping an index,say idx(a,b,c) on its column c is somehow rewriting the original query by adding two predicates on the two first index columns. Something as follows:

From

select
    {lits of columns}
from table
where
   c = :1;

To

select
    {lits of columns}
from table
where
    a in {all distinct value of a}
and b in {all distinct value of b}
and c = :1;

The two added predicates explain why the number of distinct values of the index leading edge columns should be very small for the index skip scan path to be considered.

If you want to reproduce such a kind of index operation then here’s the model I have engineered

create table t
as select
      rownum   prd_id
	 ,rownum-1 sec_prd_id
	 ,trunc((rownum-1)/3) prd_qty
	 ,trunc((rownum-1)/3) opr_prd_qty
	 ,mod(rownum,100)     bkr_id
	 ,case mod(rownum,1000000) when  0  then 'x'
	                           when  1  then 'y'
							   when  2  then 'z'
							   when  3  then 'a'
							   when  4  then 'b'
							   when  5  then 'c'
							   else 'd' end  prd_fmly
     ,case rownum when 1 then 0 else 1 end is_procesed
     ,0                                    process_id
from dual
connect by level  'for all columns size 1');
end;

SQL> select
  2         i.column_name
  3        ,i.column_position
  4       ,s.num_distinct
  5       ,s.histogram
  6      from
  7         user_ind_columns i
  8       , user_tab_col_statistics s
  9       where
 10        i.column_name = s.column_name
 11       and   index_name = 'T_IDX'
 12       order by column_position asc;

COLUMN_NAME          COLUMN_POSITION NUM_DISTINCT HISTOGRAM
-------------------- --------------- ------------ ------------
BKR_ID                             1          100 NONE
PRD_ID                             2      1000000 NONE
PRD_QTY                            3       337344 NONE
IS_PROCESED                        4            2 NONE
OPR_PRD_QTY                        5       337344 NONE
PRD_FMLY                           6            7 NONE
PROCESS_ID                         7            1 NONE

7 rows selected.
SQL> set autotrace on explain stat
SQL>  select /*+ opt_param ('optimizer_index_cost_adj' 20) */
  2             t.*
  3          from t
  4          where
  5              t.prd_id in (5507203,78400086,359798,4565189,9495,12215)
  6          and t.process_id = 0
  7          and t.opr_prd_qty  0;

    PRD_ID SEC_PRD_ID    PRD_QTY OPR_PRD_QTY     BKR_ID P IS_PROCESED PROCESS_ID
---------- ---------- ---------- ----------- ---------- - ----------- ----------
     12215      12214       4071        4071         15 d           1          0
      9495       9494       3164        3164         95 d           1          0
    359798     359797     119932      119932         98 d           1          0

---------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |     4 |   120 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T     |     4 |   120 |
|*  2 |   INDEX SKIP SCAN                   | T_IDX |     4 |       |
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T"."PROCESS_ID"=0)
       filter(("T"."PRD_ID"=9495 OR "T"."PRD_ID"=12215 OR "T"."PRD_ID"=359798 OR
              "T"."PRD_ID"=4565189 OR "T"."PRD_ID"=5507203 OR "T"."PRD_ID"=78400086) AND
              "T"."OPR_PRD_QTY"0 AND "T"."PROCESS_ID"=0)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5026  consistent gets
          0  physical reads
          0  redo size
       1167  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          3  rows processed

Not having a clear answer for why Oracle has considered this index operation, doesn’t pre-empt us from solving this issue. And as explained above, an index skip scan operation being an indicationg of a missing index I have then created the following index (bear in mind that the newly created index takes into account that the original text of query can’t be changed):

SQL> create index idx_t1_usr1 on t(process_id, prd_id, prd_qty) compress 3;

SQL> select /*+ opt_param ('optimizer_index_cost_adj' 20) */
  2         t.*
  3      from t
  4      where
  5          t.prd_id in (5507203,78400086,359798,4565189,9495,12215)
  6      and t.process_id = 0
  7      and t.opr_prd_qty  0;

    PRD_ID SEC_PRD_ID    PRD_QTY OPR_PRD_QTY     BKR_ID P IS_PROCESED PROCESS_ID
---------- ---------- ---------- ----------- ---------- - ----------- ----------
      9495       9494       3164        3164         95 d           1          0
     12215      12214       4071        4071         15 d           1          0
    359798     359797     119932      119932         98 d           1          0

----------------------------------------------------------------------------
| Id  | Operation                            | Name        | Rows  | Bytes |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |             |     4 |   120 |
|   1 |  INLIST ITERATOR                     |             |       |       |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T           |     4 |   120 |
|*  3 |    INDEX RANGE SCAN                  | IDX_T1_USR1 |     5 |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T"."OPR_PRD_QTY"0)
   3 - access("T"."PROCESS_ID"=0 AND ("T"."PRD_ID"=9495 OR "T"."PRD_ID"=12215 OR
              "T"."PRD_ID"=359798 OR "T"."PRD_ID"=4565189 OR "T"."PRD_ID"=5507203 OR
              "T"."PRD_ID"=78400086))

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         18  consistent gets
          7  physical reads
          0  redo size
       1167  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          3  rows processed

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud's blog

Just another blog : Databases, Linux and other stuffs

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)