Mohamed Houri’s Oracle Notes

May 25, 2018

Indexing Intelligibly

Filed under: Index — hourim @ 2:10 pm

Here’s again a simple but very practical tuning case taken from a real life running system. It started by a client complaint about a product inserting process that started to take a considerable amount of time (60 seconds per product). According to the client, it was not performing so badly a couple of days ago. And, this is sufficiently rare to be worthy of a note, the client supplied the sql_id of the query he found to be the root cause of this delay. In the next section I will explain, step by step, what I did to make my client very happy.

Get SQL monitoring report of the sql_id

And to make my task easy the sql_id was still in v$sql_monitor so that I immediately got its corresponding report (RTSM) shown below:

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)                
 Instance ID         :  1                    
 Session             :  xxxx (15:32901) 
 SQL ID              :  7bm6m1r2xsj5f        
 SQL Execution ID    :  16777216             
 Execution Started   :  04/05/2018 14:08:32  
 First Refresh Time  :  04/05/2018 14:08:38
 Last Refresh Time   :  04/05/2018 14:09:13
 Duration            :  41s                  
 Module/Action       :  JDBC Thin Client/-           
 Service             :  xxxxx            
 Program             :  JDBC Thin Client/-

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    42   |    5.72 |     36   |    1  |  1M    |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=4071256796)
========================================================================
| Id |               Operation |   Name   |  Rows   | Execs |   Rows   |
|    |                         |          | (Estim) |       | (Actual) |
========================================================================
|  0 | SELECT STATEMENT        |          |         |     1 |        1 |
|  1 |   NESTED LOOPS OUTER    |          |       1 |     1 |        1 |
|  2 |    NESTED LOOPS OUTER   |          |       1 |     1 |        1 |
|  3 |     NESTED LOOPS OUTER  |          |       1 |     1 |        1 |
.../...
| 12 |      TABLE ACCESS FULL  | T1       |       1 |    42 |        1 |
.../...
| 33 |   INDEX RANGE SCAN      | IDX-XXX  |       1 |     2 |          |
========================================================================

I didn’t have to search a lot. The query completes in 42 seconds of which 42 are due to the TABLE ACCESS FULL operation at line n°12.

But as interesting as this RTSM report is, it has a key limitation. It doesn’t report the predicate part. And as always, the predicate information is of a crucial importance to check what filters are applied on the above full table scan.

select * from table(dbms_xplan.display_cursor('7bm6m1r2xsj5f'));

Predicate Information (identified by operation id):
---------------------------------------------------
   12 – filter("XXX”.”COL1" = :3 AND "XXX”.”END_DATE" >=:5
               AND "XXX”.”START_DATE" <=:4 )
       

You don’t like speculation? Neither do I. This is why I immediately checked the following points:

  • There is no index starting by COL1 in table T1
  • And I executed the following query to check the pertinence of indexing COL1 column
SQL> select /*+ parallel(4) */ 
         col1, count(1)
     from t1
     group by col1
     order by 2 desc;

      COL1   COUNT(1)
---------- ----------
             60594499
    WTCLZ5         49
     LCOXS         47
    WTCLK1         47
     ../..

As you can see there is manifestly a design issue here as more than 99% of COL1 values are null. After I have got the confirmation from the client that the null value for COL1 is never used, I was going to create a single column index on COL1 when one of my DBA colleagues asked me the following question:

Why don’t you create a composite index on (COL1, END_DATE, START_DATE)?

And that was precisely the question that has motivated this blog post. I provided the following answer:

  • If we exclude null values, at maximum, we can filter down 49 rows from table T1 using COL1 column
  • While an inequality is applied on END_DATE and START_DATE columns an equality predicate is used against COL1
  • If we opt for the composite index (COL1, END_DATE, START_DATE) we will lose the formidable opportunity to have a very small and attractive index on COL1 since null values of COL1 will be added into the composite index.

Having sad that here’s below what happened next:

SQL> create index idx_t1_col1 on t1(COL1) parallel 4;

SQL> alter index idx_t1_col1 noparallel ;

With the new index having only 0,39GB of size out of table of 8,7GB as shown below:

SQL> @sizeBySegName
Enter value for segment_name: T1
Enter value for owner: xxxx

SEGMENT_TYPE       TABLESPACE_NAME   SEGMENT_NAME  PARTITION_NAME   GB
------------------ ----------------- ------------- --------------- -------
TABLE              XXXX               T1                            8.7012
                                                                   -------
Total Segment Size                                                  8.7012


SQL> @sizeBySegName
Enter value for segment_name: idx_t1_col1
Enter value for owner: xxxx

SEGMENT_TYPE       TABLESPACE_NAME   SEGMENT_NAME  PARTITION_NAME   GB
------------------ ----------------- ------------- --------------- ------
INDEX              XXXX               IDX_T1_COL1                  0.387
                                                                   ------
Total Segment Size                                                 0.387

Naturally the new execution plan uses the new index and resulted into a drastically improved query performance as the followings show via the new execution plan and the new elapsed time:

SQL_ID  7bm6m1r2xsj5f, child number 0
-------------------------------------
Plan hash value: 874603108

------------------------------------------------------------------------
| Id  | Operation                               | Name      | Rows     |
------------------------------------------------------------------------
|  0 | SELECT STATEMENT                         |            |         |
|  1 |   NESTED LOOPS OUTER                     |            |       1 |
|  2 |    NESTED LOOPS OUTER                    |            |       1 |
|  3 |     NESTED LOOPS OUTER                   |            |       1 |
.../...
| 12 |      TABLE ACCESS BY INDEX ROWID BATCHED | T1         |       1 |
| 13 |       INDEX RANGE SCAN                   | IDX_T1_COL1|       1 | 
|.../...                                                               |
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   12 – filter( "XXX”.”END_DATE" >=:6 AND "XXX”.”START_DATE" <=:5 )
   13 - access("XXX”.”COL1" = :4)
SQL> @sqlstats
Enter value for sql_id: 7bm6m1r2xsj5f

     CHILD  PLAN_HASH_VALUE   GETS      ELAPS   EXECUTIONS
----------  --------------- ---------- ------  -----------
         0       874603108        11      0      897

Thanks to the new index the execution time of the client critical query dropped from 42 to 0 second and the logical I/O consumption from 1 million to only 11 buffer gets per execution.

Bottom Line

Through this simple real life example I wanted to show how crucial is the first column of an index when used in an equality predicate. I wanted also to emphasize that, sometimes, and all things being equal, it might be better to prefer a single column index to cover a multi-column where clause provided this single column is very selective and contains a bunch of null values making the single index very small and thereby very attractive to the CBO.

4 Comments »

  1. If it was ok before, I wonder what changed?

    Comment by Dom Brooks — May 25, 2018 @ 2:20 pm | Reply

  2. Dom

    Good question.

    Because the underlying table has got a sudden increase of volume. The number of executions of the query has also been increased.

    Best regards
    Mohamed

    Comment by hourim — May 25, 2018 @ 3:28 pm | Reply

  3. Hi Mohamed,

    how / with which sql stmt do you get the first block of info about the given SQL_ID ?
    “Get SQL monitoring report of the sql_id”

    thx
    Rainer

    Comment by Rainer Hartwig — May 26, 2018 @ 4:36 am | Reply

  4. Rainer

    I use this

    SELECT DBMS_SQLTUNE.report_sql_monitor(
      sql_id       => '7bm6m1r2xsj5f',
      type         => 'TEXT',
      report_level => 'ALL') AS report
    FROM dual;
    

    Best regards

    Comment by hourim — May 26, 2018 @ 9:36 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a reply to hourim Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Create a free website or blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud's blog

Just another blog : Databases, Linux and other stuffs

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)