Indexing Intelligibly

May 25, 2018

Indexing Intelligibly

Filed under: Index — hourim @ 2:10 pm

Here’s again a simple but very practical tuning case taken from a real life running system. It started by a client complaint about a product inserting process that started to take a considerable amount of time (60 seconds per product). According to the client, it was not performing so badly a couple of days ago. And, this is sufficiently rare to be worthy of a note, the client supplied the sql_id of the query he found to be the root cause of this delay. In the next section I will explain, step by step, what I did to make my client very happy.

Get SQL monitoring report of the sql_id

And to make my task easy the sql_id was still in v$sql_monitor so that I immediately got its corresponding report (RTSM) shown below:

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)                
 Instance ID         :  1                    
 Session             :  xxxx (15:32901) 
 SQL ID              :  7bm6m1r2xsj5f        
 SQL Execution ID    :  16777216             
 Execution Started   :  04/05/2018 14:08:32  
 First Refresh Time  :  04/05/2018 14:08:38
 Last Refresh Time   :  04/05/2018 14:09:13
 Duration            :  41s                  
 Module/Action       :  JDBC Thin Client/-           
 Service             :  xxxxx            
 Program             :  JDBC Thin Client/-

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    42   |    5.72 |     36   |    1  |  1M    |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=4071256796)
========================================================================
| Id |               Operation |   Name   |  Rows   | Execs |   Rows   |
|    |                         |          | (Estim) |       | (Actual) |
========================================================================
|  0 | SELECT STATEMENT        |          |         |     1 |        1 |
|  1 |   NESTED LOOPS OUTER    |          |       1 |     1 |        1 |
|  2 |    NESTED LOOPS OUTER   |          |       1 |     1 |        1 |
|  3 |     NESTED LOOPS OUTER  |          |       1 |     1 |        1 |
.../...
| 12 |      TABLE ACCESS FULL  | T1       |       1 |    42 |        1 |
.../...
| 33 |   INDEX RANGE SCAN      | IDX-XXX  |       1 |     2 |          |
========================================================================

I didn’t have to search a lot. The query completes in 42 seconds of which 42 are due to the TABLE ACCESS FULL operation at line n°12.

But as interesting as this RTSM report is, it has a key limitation. It doesn’t report the predicate part. And as always, the predicate information is of a crucial importance to check what filters are applied on the above full table scan.

select * from table(dbms_xplan.display_cursor('7bm6m1r2xsj5f'));

Predicate Information (identified by operation id):
---------------------------------------------------
   12 – filter("XXX”.”COL1" = :3 AND "XXX”.”END_DATE" >=:5
               AND "XXX”.”START_DATE" <=:4 )

You don’t like speculation? Neither do I. This is why I immediately checked the following points:

There is no index starting by COL1 in table T1
And I executed the following query to check the pertinence of indexing COL1 column

SQL> select /*+ parallel(4) */ 
         col1, count(1)
     from t1
     group by col1
     order by 2 desc;

      COL1   COUNT(1)
---------- ----------
             60594499
    WTCLZ5         49
     LCOXS         47
    WTCLK1         47
     ../..

As you can see there is manifestly a design issue here as more than 99% of COL1 values are null. After I have got the confirmation from the client that the null value for COL1 is never used, I was going to create a single column index on COL1 when one of my DBA colleagues asked me the following question:

Why don’t you create a composite index on (COL1, END_DATE, START_DATE)?

And that was precisely the question that has motivated this blog post. I provided the following answer:

If we exclude null values, at maximum, we can filter down 49 rows from table T1 using COL1 column
While an inequality is applied on END_DATE and START_DATE columns an equality predicate is used against COL1
If we opt for the composite index (COL1, END_DATE, START_DATE) we will lose the formidable opportunity to have a very small and attractive index on COL1 since null values of COL1 will be added into the composite index.

Having sad that here’s below what happened next:

SQL> create index idx_t1_col1 on t1(COL1) parallel 4;

SQL> alter index idx_t1_col1 noparallel ;

With the new index having only 0,39GB of size out of table of 8,7GB as shown below:

SQL> @sizeBySegName
Enter value for segment_name: T1
Enter value for owner: xxxx

SEGMENT_TYPE       TABLESPACE_NAME   SEGMENT_NAME  PARTITION_NAME   GB
------------------ ----------------- ------------- --------------- -------
TABLE              XXXX               T1                            8.7012
                                                                   -------
Total Segment Size                                                  8.7012


SQL> @sizeBySegName
Enter value for segment_name: idx_t1_col1
Enter value for owner: xxxx

SEGMENT_TYPE       TABLESPACE_NAME   SEGMENT_NAME  PARTITION_NAME   GB
------------------ ----------------- ------------- --------------- ------
INDEX              XXXX               IDX_T1_COL1                  0.387
                                                                   ------
Total Segment Size                                                 0.387

Naturally the new execution plan uses the new index and resulted into a drastically improved query performance as the followings show via the new execution plan and the new elapsed time:

SQL_ID  7bm6m1r2xsj5f, child number 0
-------------------------------------
Plan hash value: 874603108

------------------------------------------------------------------------
| Id  | Operation                               | Name      | Rows     |
------------------------------------------------------------------------
|  0 | SELECT STATEMENT                         |            |         |
|  1 |   NESTED LOOPS OUTER                     |            |       1 |
|  2 |    NESTED LOOPS OUTER                    |            |       1 |
|  3 |     NESTED LOOPS OUTER                   |            |       1 |
.../...
| 12 |      TABLE ACCESS BY INDEX ROWID BATCHED | T1         |       1 |
| 13 |       INDEX RANGE SCAN                   | IDX_T1_COL1|       1 | 
|.../...                                                               |
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   12 – filter( "XXX”.”END_DATE" >=:6 AND "XXX”.”START_DATE" <=:5 )
   13 - access("XXX”.”COL1" = :4)

SQL> @sqlstats
Enter value for sql_id: 7bm6m1r2xsj5f

     CHILD  PLAN_HASH_VALUE   GETS      ELAPS   EXECUTIONS
----------  --------------- ---------- ------  -----------
         0       874603108        11      0      897

Thanks to the new index the execution time of the client critical query dropped from 42 to 0 second and the logical I/O consumption from 1 million to only 11 buffer gets per execution.

Bottom Line

Through this simple real life example I wanted to show how crucial is the first column of an index when used in an equality predicate. I wanted also to emphasize that, sometimes, and all things being equal, it might be better to prefer a single column index to cover a multi-column where clause provided this single column is very selective and contains a bunch of null values making the single index very small and thereby very attractive to the CBO.

Comments (4)

4 Comments »

If it was ok before, I wonder what changed?

Comment by Dom Brooks — May 25, 2018 @ 2:20 pm | Reply
Dom

Good question.

Because the underlying table has got a sudden increase of volume. The number of executions of the query has also been increased.

Best regards
Mohamed

Comment by hourim — May 25, 2018 @ 3:28 pm | Reply
Hi Mohamed,

how / with which sql stmt do you get the first block of info about the given SQL_ID ?
“Get SQL monitoring report of the sql_id”

thx
Rainer

Comment by Rainer Hartwig — May 26, 2018 @ 4:36 am | Reply

Rainer

I use this

SELECT DBMS_SQLTUNE.report_sql_monitor(
  sql_id       => '7bm6m1r2xsj5f',
  type         => 'TEXT',
  report_level => 'ALL') AS report
FROM dual;

Best regards

Comment by hourim — May 26, 2018 @ 9:36 am | Reply

RSS feed for comments on this post. TrackBack URI

Mohamed Houri’s Oracle Notes

May 25, 2018