Mohamed Houri’s Oracle Notes

November 14, 2019

Incremental and ruin

Filed under: Oracle — hourim @ 8:02 pm

It’s hardly a secret that many people are not fans of histograms. The main problem with them is that they are not very well documented, and they drive, under the hood, many other features like Adaptive Cursor Sharing. They also interact with statistics gathering on partitioned tables when the incremental preference is set to TRUE.

As you might know, if you have read my two preceding articles here and here, I have been struggling with recurrent PeopleSoft performance issues caused by a wrong cardinality estimate when visiting a partitioned table. I couldn’t figure out the exact root cause of this cardinality misestimate, so I opened an Oracle Service Request (SR). As an answer to my request, Oracle support sent me a reference to my own two articles mentioned above 😊. While I was still waiting for a viable answer from Oracle support, the same performance issue kicked in again and I was asked to solve it as quickly as possible.

After a few minutes of thinking, I decided to change the incremental preference of that partitioned table and regather its statistics, as follows:

SQL> exec dbms_stats.set_table_prefs('SYSADM', 'PS_XYZ', 'incremental', 'false');
SQL> exec dbms_stats.gather_table_stats('SYSADM', 'PS_XYZ',degree => 4);
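
To double-check that the preference change took effect before regathering, dbms_stats.get_prefs can be queried; a quick sanity check, reusing the owner and table names from above:

SQL> -- should now return FALSE
SQL> select dbms_stats.get_prefs('INCREMENTAL', 'SYSADM', 'PS_XYZ') incremental from dual;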

This made the culprit query, identified by SQL_ID 147gz7h1x7jkq, go from a never-ending 42,886 seconds

Global Information
------------------------------
 Status              :  EXECUTING                         
 Instance ID         :  1                                 
 Session             :  SYSADM (23:47269)                 
 SQL ID              :  147gz7h1x7jkq                     
 SQL Execution ID    :  16777219                          
 Execution Started   :  08/28/2019 21:19:19               
 First Refresh Time  :  08/28/2019 21:19:25               
 Last Refresh Time   :  08/29/2019 09:14:05               
 Duration            :  42886s                            
 

to only about 5 seconds in the worst case:

SQL> @sqlstats
Enter value for sql_id: 147gz7h1x7jkq

     CHILD    INST_ID PLAN_HASH_VALUE   AVG_GETS LAST_ACTI            AVG_ETIME      EXECS
---------- ---------- --------------- ---------- ------------------- ---------- ----------
         0          1      3376574399      23452 29/08/2019 17:09:08   5,357016          1
         0          2       430435019       2961 29/08/2019 17:09:36   1,366556          1

The goal of this article is to explain the negative effect the incremental mode can have on the effective use of histograms during cardinality estimation.

By the way, if you are wondering why the same query got two different execution plans (3376574399, 430435019) at almost the same time, bear in mind that this is the effect of a GTT having SESSION private statistics.

1. A few words about the setup

This is PeopleSoft software running Oracle 12cR1 on an Exadata machine. The cardinality misestimate occurs when Oracle visits a range (or range/list) partitioned table of 49 partitions. The partition key is of varchar2(11) data type and has a uniform distribution in certain partitions and a skewed distribution in others. The incremental preference of the partitioned table is set to TRUE.
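
To fix ideas, here is a minimal, hypothetical sketch of such a layout; the columns other than EMPLID, the list subpartition key and the partition boundaries are all invented for illustration, with only the varchar2(11) range key and the incremental preference reflecting the real table:

-- hypothetical model: payload columns, list key and boundaries are invented
create table PS_XYZ
( emplid    varchar2(11) not null  -- range partition key, as in the real table
, status_cd varchar2(2)            -- invented list subpartition key
, payload   varchar2(100)
)
partition by range (emplid)
subpartition by list (status_cd)
subpartition template
( subpartition sp_a     values ('A')
, subpartition sp_d     values ('D')
, subpartition sp_other values (default)
)
( partition PS_XYZ_T_1 values less than ('XX000002000')
, partition PS_XYZ_T_2 values less than ('XX000004000')
  -- partitions 3 to 48 elided
, partition PS_XYZ_T_49 values less than (maxvalue)
);

SQL> exec dbms_stats.set_table_prefs('SYSADM', 'PS_XYZ', 'incremental', 'true');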

Here’s an image of the partition key column (EMPLID) statistics when the performance issue occurred (incremental = TRUE) and after it was fixed (incremental = FALSE), respectively, at global and partition level:

-- incremental = TRUE
SELECT
   table_name
  ,column_name
  ,histogram
  ,notes
FROM
  all_tab_col_statistics
WHERE
    owner       = 'SYSADM'
AND table_name  = 'PS_XYZ'
AND column_name = 'EMPLID';
  
TABLE_NAME  COLUMN_NAME   HISTOGRAM       NOTES   
----------- ------------- --------------- ----------------------------------
PS_XYZ      EMPLID        HYBRID          INCREMENTAL HIST_FOR_INCREM_STATS        
SQL> select
         partition_name
        ,num_distinct
        ,density
        ,histogram
        ,notes
    from
        all_part_col_statistics
    where owner = 'SYSADM'
    and table_name = 'PS_XYZ'
    and column_name = 'EMPLID';


PARTITION_NAME  NUM_DISTINCT    DENSITY HISTOGRAM       NOTES                 
--------------- ------------ ---------- --------------- ----------------------
PS_XYZ_T_1             19888     ,00005 HYBRID                                
PS_XYZ_T_10             1191    ,000831 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_11             1153    ,000861 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_12             1187    ,000835 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_13             1390    ,000707 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_14             1733    ,000552 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_15             1950    ,000479 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_16             1842    ,000516 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_17             1584    ,000611 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_18             1714     ,00056 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_19             1410    ,000697 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_2              1059    ,000941 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_20             1460    ,000672 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_21             1899    ,000526 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_22             1916     ,00049 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_23             1568    ,000621 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_24             1550    ,000625 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_25             1697    ,000568 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_26             1637     ,00059 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_27             1528    ,000639 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_28             1629    ,000596 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_29             1479    ,000663 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_3               960     ,00104 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_30             1480    ,000662 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_31             1839     ,00052 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_32             3007    ,000284 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_33             7693     ,00013 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_34             2776     ,00036 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_35             3653    ,000215 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_36             7940    ,000126 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_37             3277    ,000305 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_38             8461    ,000118 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_39             7855    ,000127 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_4              1764    ,000567 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_40             4878    ,000205 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_41             6361    ,000157 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_42             6064    ,000165 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_43             9641    ,000104 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_44             9142    ,000109 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_45             8043    ,000124 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_46             5664    ,000078 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_47            21980    ,000045 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_48             9550    ,000105 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_49           270752    ,000004 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_5              1379    ,000725 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_6              1050    ,000952 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_7              1016    ,000984 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_8              1032    ,000969 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_9              1232    ,000811 HYBRID          HIST_FOR_INCREM_STATS 

49 rows selected.
-- incremental = FALSE

TABLE_NAME  COLUMN_NAME  HISTOGRAM       NOTES                                
----------- ------------ --------------- ------
PS_XYZ      EMPLID       HYBRID

PARTITION_NAME   NUM_DISTINCT    DENSITY HISTOGRAM       NOTES   
--------------- ------------ ---------- --------------- ---------
PS_XYZ_T_1             19888     ,00005 HYBRID                   
PS_XYZ_T_10             1191 ,000839631 NONE                     
PS_XYZ_T_11             1153    ,000861 HYBRID                   
PS_XYZ_T_12             1187  ,00084246 NONE                     
PS_XYZ_T_13             1390 ,000719424 NONE                     
PS_XYZ_T_14             1733 ,000577034 NONE                     
PS_XYZ_T_15             1950    ,000476 HYBRID                   
PS_XYZ_T_16             1842    ,000515 HYBRID                   
PS_XYZ_T_17             1584 ,000631313 NONE                     
PS_XYZ_T_18             1714     ,00056 HYBRID                   
PS_XYZ_T_19             1410  ,00070922 NONE                     
PS_XYZ_T_2              1059 ,000944287 NONE                     
PS_XYZ_T_20             1460    ,000671 HYBRID                   
PS_XYZ_T_21             1899    ,000526 HYBRID                   
PS_XYZ_T_22             1916    ,000491 HYBRID                   
PS_XYZ_T_23             1568     ,00062 HYBRID                   
PS_XYZ_T_24             1550    ,000626 HYBRID                   
PS_XYZ_T_25             1697    ,000562 HYBRID                   
PS_XYZ_T_26             1637    ,000593 HYBRID                   
PS_XYZ_T_27             1528     ,00064 HYBRID                   
PS_XYZ_T_28             1629 ,000613874 NONE                     
PS_XYZ_T_29             1479    ,000663 HYBRID                   
PS_XYZ_T_3               960 ,001041667 NONE                     
PS_XYZ_T_30             1480    ,000661 HYBRID                   
PS_XYZ_T_31             1839    ,000518 HYBRID                   
PS_XYZ_T_32             3007    ,000288 HYBRID                   
PS_XYZ_T_33             7693     ,00013 HYBRID                   
PS_XYZ_T_34             2776 ,000360231 NONE                     
PS_XYZ_T_35             3653    ,000215 HYBRID                   
PS_XYZ_T_36             7940    ,000126 HYBRID                   
PS_XYZ_T_37             3277 ,000305157 NONE                     
PS_XYZ_T_38             8461    ,000118 HYBRID                   
PS_XYZ_T_39             7855    ,000127 HYBRID                   
PS_XYZ_T_4              1764    ,000567 HYBRID                   
PS_XYZ_T_40             4878 ,000205002 NONE                     
PS_XYZ_T_41             6361    ,000157 HYBRID                   
PS_XYZ_T_42             6064    ,000165 HYBRID                   
PS_XYZ_T_43             9641    ,000104 HYBRID                   
PS_XYZ_T_44             9142    ,000109 HYBRID                   
PS_XYZ_T_45             8043 ,000124332 NONE                     
PS_XYZ_T_46             5664    ,000076 HYBRID                   
PS_XYZ_T_47            21980    ,000045 HYBRID                   
PS_XYZ_T_48             9550    ,000105 HYBRID                   
PS_XYZ_T_49           270944    ,000004 HYBRID                   
PS_XYZ_T_5              1379 ,000725163 NONE                     
PS_XYZ_T_6              1050 ,000952381 NONE                     
PS_XYZ_T_7              1016    ,000984 HYBRID                   
PS_XYZ_T_8              1032 ,000968992 NONE                     
PS_XYZ_T_9              1232 ,000811688 NONE                     

49 rows selected.

As you can see, this is a weird, uncommon situation. There seems to be a clear contradiction here: how could switching the incremental mode from true to false turn the same data distribution, for the same column in the same partition (PS_XYZ_T_10 for example), from skewed (HYBRID) to non-skewed (NONE)?

In fact, if you look closer you will realize that, as far as query optimization is concerned, there is no contradiction in the following situation:

HYBRID  +  HIST_FOR_INCREM_STATS = NONE

Because, when the CBO sees that a HYBRID histogram is accompanied by the HIST_FOR_INCREM_STATS note, it knows that this histogram must be ignored at query optimization time, which is equivalent to having no histogram at all. This is harmless for partitions like PS_XYZ_T_10, PS_XYZ_T_12 and PS_XYZ_T_13 where the column values are evenly distributed. But the real threat resides in partitions like PS_XYZ_T_11, PS_XYZ_T_15 and PS_XYZ_T_16, for example, where we have the following situation:

HYBRID  +  HIST_FOR_INCREM_STATS != HYBRID

In this case, the column values are skewed, but their histograms are ignored by the CBO during query optimization because of the HIST_FOR_INCREM_STATS note. This is exactly the root cause of my performance issue:

  • Regardless of whether the partition key is skewed or not, it will have a histogram.
  • But this histogram will be ignored, both at partition level (except partition n°1 in this case) and at global level, during query optimization.

Therefore, it became clear to me that fixing my performance issue resided in making the CBO stop ignoring the histograms at both partition and global level. Disabling the incremental mode is what I did to achieve this goal.
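
Before going that far, the partitions whose histograms the CBO will ignore can be enumerated with a simple dictionary query; a minimal sketch against the same views used above:

SQL> select partition_name, histogram, notes
     from   all_part_col_statistics
     where  owner       = 'SYSADM'
     and    table_name  = 'PS_XYZ'
     and    column_name = 'EMPLID'
     and    notes like '%HIST_FOR_INCREM_STATS%'
     order  by partition_name;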

2. Why are histograms ignored?

In the SR I opened with Oracle support, I supplied the trace file of a dbms_stats call made under incremental mode, something resembling this (partition n°11 in the TEST database contains non-skewed values for the EMPLID column):

SQL> exec dbms_stats.set_global_prefs ('TRACE', to_char (1+16));

SQL> begin
        dbms_stats.gather_table_stats('sysadm'
                                      ,'PS_XYZ'
                                      ,partname  =>'PS_XYZ_T_11'
                                      ,granularity =>'partition');
    end;
   /
SQL> exec dbms_stats.set_global_prefs('TRACE', null);

The resulting trace file contains the following lines for the EMPLID column:
DBMS_STATS: Building Histogram for EMPLID                                                                          
DBMS_STATS:  bktnum=-1, nnv=50819, snnv=5511,604748383025713834989746016721880424, sndv=1191, est_ndv=1191, mnb=254
DBMS_STATS:  Trying hybrid histogram                                                                               
DBMS_STATS:  > cdn 5592, popFreq 0, popCnt 0, bktSize 22,0750988142292490118577075098814229249
             , bktSzFrc ,0750988142292490118577075098814229249     
                                                
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 254, mnb 254, ssize 5592, min_ssize 2500, appr_ndv  TRUE
             ,ndv 1191, selNdv 0, selFreq 0, 
             pct 10,84555923647262975232686543618867329232, avg_bktsize 22, csr.hreq FALSE, normalize TRUE   
DBMS_STATS:   Checking range skew:                                                                                 
DBMS_STATS:      >> NOT SKEWED                           ----------------->                                                   
DBMS_STATS:  Not Discarding histogram (Incremental mode) ----------------->                                        
DBMS_STATS:   Histogram gathering flags: 4623     

As you can see, Oracle is aware that the EMPLID column has an even distribution in partition n°11, but it decides not to discard the histogram because of the incremental mode. Oracle nevertheless updates the corresponding Notes column to indicate that this histogram will not be used during cardinality estimation:

PARTITION_NAME   NUM_DISTINCT    DENSITY HISTOGRAM       NOTES
---------------- ------------ ---------- --------------- ----------------------
PS_XYZ_T_11              1191    ,000833 HYBRID          HIST_FOR_INCREM_STATS

If I regather the same statistics for partition n°12, where EMPLID is skewed (in TEST), I get the following trace snippet:

DBMS_STATS: Building Histogram for EMPLID                                                                          
DBMS_STATS:  bktnum=-1, nnv=47961, snnv=5501,261730969760166840458811261730969758, sndv=1153, est_ndv=1153, mnb=254
DBMS_STATS:  Trying hybrid histogram                                                                               
DBMS_STATS:  > cdn 5443, popFreq 0, popCnt 0, bktSize 21,49407114624505928853754940711462450593, 
              bktSzFrc ,49407114624505928853754940711462450593                                    
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 254, mnb 254, ssize 5443, min_ssize 2500, appr_ndv  TRUE
            ,ndv 1153, selNdv 0, selFreq 0
            ,pct 11,47028154327424400417101147028154327424, avg_bktsize 21, csr.hreq FALSE, normalize TRUE   
DBMS_STATS:   Checking range skew:                                                                                 
DBMS_STATS:      >> SKEWED                   ------------------------>                                                       
DBMS_STATS:   Histogram gathering flags: 527                                                                       
DBMS_STATS:  Accepting histogram             ------------------------>             

DBMS_STATS:          Need Actual Values (DSC_EAVS)                                                                 
DBMS_STATS:          Partition: PS_XYZ_T_12                                                                        
DBMS_STATS:          Histogram Type: HYBRID Data Type: 2                                                           
DBMS_STATS:          Histogram Flags: 4 Histogram Gathering Flags: 10                                              
DBMS_STATS:          Incremental: TRUE Fix Control 13583722: 1      

This time, Oracle has correctly created the histogram and left the Notes column empty, indicating that the histogram will be used by the CBO when computing cardinality estimates for partition n°12:

PARTITION_NAME    NUM_DISTINCT DENSITY HISTOGRAM          NOTES
---------------- ------------ -------- ----------------- ----------------------
PS_XYZ_T_11       1191        ,000833  HYBRID             HIST_FOR_INCREM_STATS
PS_XYZ_T_12       1153        ,000861  HYBRID

At global level, however, Oracle will still ignore the histogram, as the following shows:

SQL> SELECT
       table_name
       ,column_name
       ,histogram
       ,notes
    FROM
      all_tab_col_statistics
    WHERE
     owner = 'SYSADM'
     and table_name ='PS_XYZ'
     and column_name = 'EMPLID';

TABLE_NAME  COLUMN_NAME   HISTOGRAM       NOTES   
----------- ------------- --------------- ----------------------------------
PS_XYZ      EMPLID        HYBRID          INCREMENTAL HIST_FOR_INCREM_STATS        

I think it suffices to have a single partition with that note value (HIST_FOR_INCREM_STATS) for Oracle to ignore the histogram at global level as well.

In passing, if you have exercised eyes you will already have noticed the fix control in the dbms_stats trace file:

DBMS_STATS: Incremental: TRUE Fix Control 13583722: 1

SQL> select description from V$SYSTEM_FIX_CONTROL where bugno = 13583722;

DESCRIPTION
----------------------------------------------------------------
preserve partition level histograms & actual values for increment

For the sake of simplicity, I will postpone publishing my investigations of this fix control to a separate article.
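
For the impatient, fix controls of this kind can, in principle, be toggled at session level through the undocumented _fix_control parameter. I have not tested this against the present issue, so treat the lines below as purely illustrative:

SQL> -- illustrative only: disable, then restore, fix 13583722 for this session
SQL> alter session set "_fix_control" = '13583722:0';
SQL> alter session set "_fix_control" = '13583722:1';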

3. Possible reasons

The following references might help in understanding the situation explained in this article and the possible reasons pushing Oracle to keep creating histograms while ignoring them during the query optimization process:

4. Bottom-line

As we have seen above, partition-level histograms will not be discarded for tables with the incremental mode set to TRUE. This happens regardless of whether the column values are skewed or evenly distributed. I think the reason behind this is that, under incremental mode, global histograms are aggregated from partition-level ones, so a missing partition-level histogram would have to be regathered even for partitions with an even distribution. Oracle probably decided to keep these histograms at partition level to avoid doing the same work twice under incremental mode, but decided, at the same time, to prevent the CBO from using them at both global and partition level during query optimization.

All in all, I think that disabling the incremental mode is something one should envisage when confronted with a similar performance issue.

July 30, 2019

Global Temporary Table private statistics

Filed under: CBO — hourim @ 6:27 pm

Abstract

There is a running Oracle system that simultaneously launches 49 sessions doing the same business processing for distinct partitions and distinct ranges of employees, so that there is never any collision in this process. However, this application uses bind variables and is therefore confronted with the classic performance threat of sharing the same execution plan between sets of bind variable values that do not necessarily deal with the same volume of data. This application was being upgraded from 11gR2 to 12cR1, and I was asked to find a solution so that execution plans would not be shared between those 49 sessions.

I know that Adaptive Cursor Sharing was implemented for exactly this kind of situation. But I also know that the transition from a bind-sensitive to a bind-aware cursor depends on a combination of executions done at different bind variable values, which makes it very difficult to guarantee that each execution will get its own optimized execution plan. So I immediately ruled out this option. However, when I looked at one of those shared execution plans, I realized that the 49 sessions were extensively using global temporary tables (GTT). This observation gave me an idea which later proved to be very useful: use the cursor invalidation introduced by the new 12c GTT SESSION private statistics to force the CBO to hard parse a new execution plan for each of the 49 sessions.

Session PRIVATE statistics

Very often the Cost Based Optimizer makes a wrong cardinality estimate when visiting Global Temporary Tables (GTT). This is particularly true when multiple sessions are visiting the same GTT but with a different number of rows per session. In this case, sharing GTT statistics between sessions generally leads to performance pain. Dynamic sampling at its default value might also prove insufficient to help the CBO in this situation. The application locked the statistics of the 11gR2 temporary tables used by the 49 sessions, so that it was impossible for one session to propagate its temporary table statistics to all the other sessions. But this was before 12c. Starting from 12c, Oracle introduced a global statistics preference which makes it possible to have session-private statistics for temporary tables:

SQL> @getPrefs
Enter value for preference: GLOBAL_TEMP_TABLE_STATS 
Enter value for tablename: 
PREFS 
-------- 
SESSION
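
The @getPrefs script is just a local helper; an equivalent call using the documented dbms_stats.get_prefs function might look like this (a sketch; called without owner and table, it returns the global preference):

SQL> select dbms_stats.get_prefs('GLOBAL_TEMP_TABLE_STATS') prefs from dual;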

Therefore, I decided to:

• Let each session have its own private SESSION GTT statistics
• Use the cursor invalidation caused by these GTT private statistics to avoid sharing execution plans between sessions

I started by unlocking the statistics of a couple of GTT tables involved in the 49 sessions’ DML queries, as follows:

  SQL> execute dbms_stats.unlock_table_stats('SYSADM','T1_GTT_TEMP');
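
Had the global preference not already been SESSION, the same behaviour could have been requested per table; a sketch, reusing the T1_GTT_TEMP table from above:

  SQL> exec dbms_stats.set_table_prefs('SYSADM', 'T1_GTT_TEMP', 'GLOBAL_TEMP_TABLE_STATS', 'SESSION');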

Parsing issue

Once I had implemented this change (unlocking the GTT statistics and relying on the default SESSION global preference), I immediately realized that the 49 sessions were no longer suffering from the performance issue due to execution plan sharing. But, as you might suspect, this doesn’t come free of charge. Indeed, I spotted new wait events appearing near the TOP 10 foreground wait events:

• cursor: pin S wait on X 
• library cache lock

These two wait events are symptoms of a parsing problem.

While the 49 sessions were still running, I issued the following query to get the SQL_IDs suffering from this parsing issue:

SQL> select
         sql_id
        ,count(1)
     from
         gv$active_session_history
     where sample_time between to_date('05022019 15:29:00', 'ddmmyyyy hh24:mi:ss')
                           and to_date('05022019 16:00:00', 'ddmmyyyy hh24:mi:ss')
     and event = 'cursor: pin S wait on X'
     group by sql_id order by 2 desc fetch first 5 rows only;
	
SQL_ID          COUNT(1)
------------- ----------
5wgr71p4aj10v      28204
6fas07f6nptcu       6415
                    5748
fy64nh1g4ucxp       3063
9rhtk5vh3qkjg       2394

I then tried to check why the top SQL_ID above was being hard parsed so many times:

SQL> @gv$sql
     Enter value for sql_id: 5wgr71p4aj10v

SQL_ID        CHILD_NUMBER TO_CHAR(P.LAST_ACTI EXECUTIONS END_FETCH INVALIDATIONS OBJECT_STATUS
------------- ------------ ------------------- ---------- --------- ------------- --------------
5wgr71p4aj10v            0 05/02/2019 15:39:30          0         0             8 VALID
5wgr71p4aj10v            1 05/02/2019 15:36:49          1         1             9 INVALID_UNAUTH
5wgr71p4aj10v            2 05/02/2019 15:39:30          0         0             6 VALID
5wgr71p4aj10v            3 05/02/2019 15:39:30          0         0             6 VALID
5wgr71p4aj10v            4 05/02/2019 15:39:30          0         0             7 VALID
5wgr71p4aj10v            5 05/02/2019 15:39:30          0         0             6 VALID
5wgr71p4aj10v            6 05/02/2019 15:39:30          0         0             4 VALID
5wgr71p4aj10v            7 05/02/2019 15:39:30          0         0             5 VALID
5wgr71p4aj10v            8 05/02/2019 15:39:30          0         0             4 VALID
5wgr71p4aj10v            9 05/02/2019 15:39:30          0         0             5 VALID
5wgr71p4aj10v           10 05/02/2019 15:36:10          1         1             6 INVALID_UNAUTH
5wgr71p4aj10v           11 05/02/2019 15:39:30          0         0             4 VALID
5wgr71p4aj10v           12 05/02/2019 15:39:30          0         0             4 VALID
5wgr71p4aj10v           13 05/02/2019 15:39:30          0         0             5 VALID
5wgr71p4aj10v           14 05/02/2019 15:39:30          0         0             4 VALID
5wgr71p4aj10v           15 05/02/2019 15:37:18          1         1             5 INVALID_UNAUTH
5wgr71p4aj10v           17 05/02/2019 15:39:29          0         0             3 VALID
5wgr71p4aj10v           18 05/02/2019 15:39:30          0         0             3 VALID
5wgr71p4aj10v           19 05/02/2019 15:39:30          0         0             2 VALID
5wgr71p4aj10v           20 05/02/2019 15:39:30          0         0             2 VALID
5wgr71p4aj10v           21 05/02/2019 15:39:30          0         0             2 VALID

Notice the high number of invalidated cursors (several in about one minute). The INVALID_UNAUTH status means that the underlying child cursor will not be shared next time and will be aged out when there is stress on the library cache.

The reason for this frequent invalidation is shown below:

SQL> @nonshared
     Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)...

     Enter value for 1: 5wgr71p4aj10v
     SQL_ID           : 5wgr71p4aj10v
     ADDRESS          : 0000000308AB25B8
     CHILD_ADDRESS    : 00000002FCDABB40
     CHILD_NUMBER     : 0
     REASON           : <reason>Session Specific Cursor Session Mismatch(1)</reason>
     CON_ID           : 0
     -----------------
     SQL_ID           : 5wgr71p4aj10v
     ADDRESS          : 0000000308AB25B8
     CHILD_ADDRESS    : 00000002FC680A48
     CHILD_NUMBER     : 1
     REASON           : <reason>Session Specific Cursor Session Mismatch(1)</reason>
     CON_ID           : 0
     -----------------
     ../..
     SQL_ID           : 5wgr71p4aj10v
     ADDRESS          : 0000000308AB25B8
     CHILD_ADDRESS    : 0000000317274F10
     CHILD_NUMBER     : 22
     REASON           : <reason>Session Specific Cursor Session Mismatch(1)</reason>
     CON_ID           : 0
     -----------------

The Session Specific Cursor Session Mismatch reason indicates that Oracle decided to invalidate the current cursor and optimize a new execution plan because the current session is not allowed to use the GTT statistics of another session. It is crystal clear that, when using SESSION private statistics for a GTT, Oracle will not share a child cursor that was optimized by another session. Each session wants its own execution plan: it tries to pin the parent cursor, finds that it has already been pinned by a preceding session, and then starts waiting on the famous wait event cursor: pin S wait on X.
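
If you don’t have the @nonshared script at hand, a plain query against the same underlying view gives the essential information (a minimal sketch; the REASON column of gv$sql_shared_cursor is populated from 11.2.0.2 onwards):

SQL> select sql_id, child_number, reason
     from   gv$sql_shared_cursor
     where  sql_id = '5wgr71p4aj10v'
     and    reason is not null;
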
It is also important to remember that Oracle tells us, via the Note at the bottom of the execution plan, when it is using GTT session-private statistics, as shown below:

Note 
----- 
-- Global temporary table session private statistics used

Conclusion
This brief note shows, via a practical example taken from a running system, that we can use the 12c SESSION private statistics to achieve at least two goals:

• Give each session its own GTT statistics without propagating them to other sessions
• Use the cursor invalidation side effect induced by the underlying code of the GTT SESSION private statistics to avoid sharing execution plans between sessions

All things being equal, you must balance the performance improvement brought by this new GTT feature against the parsing side effect it introduces because of the underlying cursor invalidation. In my client’s case, the performance penalty incurred when the 49 streams shared the same GTT statistics largely outweighed the library cache lock and cursor: pin S wait on X events introduced by the SESSION private statistics.

July 28, 2019

Hybrid-Height Balanced corrupted histogram – Part II

Filed under: CBO — hourim @ 7:58 am

Setting the Scene

At one client site, running PeopleSoft software for employee payroll, I was asked to look at a weird performance issue occurring in an overnight batch. This batch launches 49 different Oracle sessions, each starting a bunch of DML queries. The data model has been engineered so that each session treats a different range of employees. To accomplish this workload distribution, the main tables have been range/list partitioned: 49 range partitions with 58 list subpartitions per partition. The range partition key identifies the employee id and is of varchar2(11) datatype. There is skew both in the distribution of the partition key values inside a single partition and in the number of rows per partition across the whole table. The partition key has a HYBRID histogram, as we are on 12cR1 on an Exadata machine.

The last time I intervened in this batch, it was to find a solution so that execution plans would not be shared between the 49 sessions. Indeed, the skew in the number of rows (num_rows) across the range and list subpartitions makes sharing the same execution plan a real performance threat. I will blog about how I achieved this goal in a couple of weeks or so.

In the current case, the performance problem turned out to be due to a wrong cardinality estimate caused by a corrupted Height Balanced-HYBRID histogram, as explained in Part I of this mini-series on corrupted histograms. When visiting a single partition, the CBO underestimates the number of rows it will scan and ends up using a disastrous NESTED LOOPS operation. But this was the case only for the single partitions having a corrupted histogram on the corresponding partition key. In this article I will show how I fixed this HYBRID histogram corruption simply by deleting the statistics of the concerned partitions and regathering them.

The Scene in details

The part of the complex query where Oracle underestimates the cardinality of the single range partition resembles this:

SQL> select /*+ full(a) */
        count(1)
     from
        PS_XX_YY_ZZZ_STAT a
     where EMPLID >= 'XX999501004'
     and EMPLID   <= 'XX999562000';
 
COUNT(1)
---------
546461
 
Plan hash value: 2020349843
-------------------------------------------------------------------------------
| Id | Operation                   | Name              | Rows | Pstart| Pstop |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT             |                   |      |       |       |
| 1 |  SORT AGGREGATE              |                   |    1 |       |       |
| 2 |   PARTITION RANGE SINGLE     |                   |   91 |    41 | 41    |
| 3 |    PARTITION LIST ALL        |                   |   91 |     1 | 58    |
|*4 |    TABLE ACCESS STORAGE FULL | PS_XX_YY_ZZZ_STAT |   91 |  2321 | 2378  |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))
    filter(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))

We can see that the cardinality estimate is wrong: Oracle estimates 91 rows while the actual number of rows is 546,461. Let’s try to figure out why Oracle is underestimating this full table scan cardinality.
Since Oracle has identified a single partition (41) to visit, it has necessarily used the statistics of this partition when estimating the single column table selectivity. Below are the statistics (num_rows) at global and partition level for the range/list partitioned table PS_XX_YY_ZZZ_STAT:

SQL> @getPartTab
Enter value for owner: sysadm
Enter value for table_name: PS_XX_YY_ZZZ_STAT

TABLE_NAME            PNAME                   PPOS    SUBPARTITION_COUNT NUM_ROWS
-------------------- ------------------------ ------ ------------------ ----------
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_1    1               58         5811088
../..                                                         
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_41   41              58         568804      
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_42   42              58         527248
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_43   43              58         854141
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_44   44              58         809086
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_45   45              58         697071
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_46   46              58         484041
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_47   47              58         887009
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_48   48              58         259714
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_49   49              58         1025587
                                                                        ----------
Total num_rows                                                           38434347

As you can see, the 41st partition has 568K rows while the whole table contains 38.4M rows. At this stage of the investigation we have the first piece of the puzzle: num_rows (partition 41) = 568804.
The second step is to check the statistics of the EMPLID predicate column at the 41st partition level:


col column_name    format a10
col partition_name format a35
col low_value      format a15
col high_value     format a15

SQL> select
       partition_name
      ,num_distinct
      ,utl_raw.cast_to_varchar2(low_value) low_value
      ,utl_raw.cast_to_varchar2(high_value) high_value
      ,density
      ,histogram
    from
      all_part_col_statistics
    where owner = 'SYSADM'
    and table_name = 'PS_XX_YY_ZZZ_STAT'
    and column_name = 'EMPLID'
    and partition_name in ('PS_XX_YY_ZZZ_STAT_T_41');

PARTITION_NAME             NUM_DISTINCT LOW_VALUE HIGH_VALUE            DENSITY    HISTOGRAM
------------------------- ------------ --------------- --------------- ---------- -----------
PS_XX_YY_ZZZ_STAT_T_41      6278         XX999498003    XX999561990     ,000159    HYBRID

We see that this column has a HYBRID histogram which, as you know, obeys a different cardinality estimation formula depending on whether the value is:
• A popular value
• A non-popular value with an endpoint number
• A non-popular value without an endpoint number

Let’s check which of the above three situations we are in:

SQL> select
         partition_name
        ,bucket_number
        ,endpoint_actual_value
        ,endpoint_repeat_count
        ,bucket_size
        ,case when popularity > 0 then 'Pop'
              else 'Non-Pop'
         end popularity
     from
        (select
            uth.partition_name
           ,uth.bucket_number
           ,uth.endpoint_actual_value
           ,uth.endpoint_repeat_count
           ,ucs.sample_size/ucs.num_buckets bucket_size
           ,(uth.endpoint_repeat_count - ucs.sample_size/ucs.num_buckets) popularity
         from
            all_part_histograms uth
           ,all_part_col_statistics ucs
         where
             uth.table_name     = ucs.table_name
         and uth.column_name    = ucs.column_name
         and uth.owner          = ucs.owner
         and uth.partition_name = ucs.partition_name   -- join at partition level too
         and uth.owner          = 'SYSADM'
         and uth.table_name     = 'PS_XX_YY_ZZZ_STAT'
         and uth.partition_name in ('PS_XX_YY_ZZZ_STAT_T_41')
         and uth.column_name    = 'EMPLID'
         and uth.endpoint_actual_value in ('XX999501004','XX999562000')
        )
     order by
        partition_name, bucket_number;
	
no rows selected

It is a non-popular value without an endpoint number, which normally obeys the following cardinality estimation formula:

   Card = num_rows * NewDensity = 568804 * .000159 = 90.439836
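
The same arithmetic can be cross-checked straight from the dictionary; a minimal sketch, assuming the stored DENSITY is a good stand-in for NewDensity (which the figures above suggest it is):

SQL> select round(ts.num_rows * cs.density, 6) est_card
     from   all_tab_statistics      ts
           ,all_part_col_statistics cs
     where  ts.owner          = cs.owner
     and    ts.table_name     = cs.table_name
     and    ts.partition_name = cs.partition_name
     and    ts.object_type    = 'PARTITION'
     and    ts.owner          = 'SYSADM'
     and    ts.table_name     = 'PS_XX_YY_ZZZ_STAT'
     and    ts.partition_name = 'PS_XX_YY_ZZZ_STAT_T_41'
     and    cs.column_name    = 'EMPLID';

     -- 568804 * 0.000159 = 90.439836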

Given the above computed cardinality, you might think that Oracle used the HYBRID histogram to arrive at the wrong 90.43 ~ 91 estimate. But in fact Oracle completely ignored the histogram, as proved by the corresponding 10053 trace file:

Column (#1): EMPLID(VARCHAR2) Part#: 0
AvgLen: 12 NDV: 267112 Nulls: 0 Density: 0.000004 Min: 0.000000 Max: 0.000000
Histogram: HtBal #Bkts: 5866 UncompBkts: 5866 EndPtVals: 0 ActualVal: yes

NOTE: ignoring histogram of column (PS_XX_YY_ZZZ_STAT.EMPLID) --->
used only for incremental stats maintenance -->

#IMCUs: 0 IMCRowCnt: 0 IMCJournalRowCnt: 0 #IMCBlocks: 0 IMCQuotient: 0.000000
Column (#1): EMPLID(VARCHAR2) Part#: 0
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159 Min: 0.000000 Max: 0.000000
Histogram: HtBal #Bkts: 5308 UncompBkts: 5308 EndPtVals: 0 ActualVal: yes

NOTE: ignoring histogram of column (PS_XX_YY_ZZZ_STAT.EMPLID)
      used only for incremental stats maintenance

Column (#1): EMPLID(VARCHAR2) Part#: 40
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159
Column (#1): EMPLID(VARCHAR2)
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159
Table: PS_XX_YY_ZZZ_STAT Alias: A Card: Original: 568804.000000 Rounded: 91
       Computed: 90.602740 Non Adjusted: 90.602740

Oracle clearly shows that it has ignored the histogram of the EMPLID column and that it will use it only for incremental statistics maintenance.

Moreover, there is another oddity in the above 10053 trace file:

Histogram: HtBal #Bkts: 5308 UncompBkts: 5308 EndPtVals: 0 ActualVal: yes

See how the CBO trace reports a Height Balanced histogram while the statistics clearly show that the EMPLID column has a HYBRID histogram?

I think that the histogram of this column at this partition is corrupted; Oracle has therefore ignored it and fallen back to the cardinality estimation of a column without histogram:

Card = num_rows/NDV = 568804/6278 = 90.60273972 --> exactly the value computed by Oracle

By the way, we can tell beforehand that Oracle will ignore the histogram, simply by selecting the NOTES column from the all_part_col_statistics view, as shown below:

SQL> select
	 partition_name
	,column_name
	,histogram
	,notes
    from
	all_part_col_statistics
    where owner = 'SYSADM'
     and table_name = 'PS_XX_YY_ZZZ_STAT'
     and column_name = 'EMPLID'
     and partition_name in ('PS_XX_YY_ZZZ_STAT_T_41');
		
PARTITION_NAME           COLUMN_NAME  HISTOGRAM       NOTES
------------------------ ------------ --------------- ---------------------
PS_XX_YY_ZZZ_STAT_T_41   EMPLID       HYBRID          HIST_FOR_INCREM_STATS

The note indicates clearly that the HYBRID histogram on column EMPLID for partition n°41 will be used only for incremental statistics maintenance.

Working around the Scene

To get rid of this Height Balanced-HYBRID histogram cohabitation within the same column, we need to delete and regather the statistics of the concerned partition:

BEGIN
      dbms_stats.delete_table_stats(ownname  =>'SYSADM',tabname=>'PS_XX_YY_ZZZ_STAT',
                                    partname =>'PS_XX_YY_ZZZ_STAT_T_41');
									
      dbms_stats.gather_table_stats(ownname  =>'SYSADM',tabname=>'PS_XX_YY_ZZZ_STAT',
                                    partname =>'PS_XX_YY_ZZZ_STAT_T_41',
                                    granularity=>'PARTITION');
END ;
/

Querying again gives a perfect cardinality estimate:

SQL> select /*+ full(a) */
        count(1)
     from
        PS_XX_YY_ZZZ_STAT a
     where EMPLID >= 'XX999501004'
     and EMPLID   <= 'XX999562000';
	
COUNT(1)
----------
546461

--------------------------------------------------------------------------------
| Id | Operation                   | Name               | Rows | Pstart| Pstop |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT             |                    |      |       |       |
| 1 |  SORT AGGREGATE              |                    |    1 |       |       |
| 2 |   PARTITION RANGE SINGLE     |                    | 545K |     41|    41 |
| 3 |    PARTITION LIST ALL        |                    | 545K |      1|    58 |
|*4 |    TABLE ACCESS STORAGE FULL | PS_XX_YY_ZZZ_STAT  | 545K |   2321|  2378 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))
    filter(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))

And this time Oracle has clearly used the HYBRID histogram, as proved by the corresponding 10053 trace file:

Column (#1):
NewDensity:0.000159, OldDensity:0.000159 BktCnt:5560.000000, PopBktCnt:0.000000, PopValCnt:0, NDV:6278
Column (#1): EMPLID(VARCHAR2) Part#: 40
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159 Histogram: Hybrid #Bkts: 254 UncompBkts: 5560 EndPtVals: 254 ActualVal: yes
Column (#1): EMPLID(VARCHAR2)
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159
Histogram: Hybrid #Bkts: 254 UncompBkts: 5560 EndPtVals: 254 ActualVal: yes
Table: PS_XX_YY_ZZZ_STAT Alias: A
Card: Original: 568804 Rounded: 545382 Computed: 545381.630747 Non Adjusted: 545381.630747

This HYBRID histogram utilization is also confirmed by the new (null) value of the Notes column:

SQL> select
		 partition_name
		,column_name
		,histogram
		,notes
     from
        all_part_col_statistics
    where owner = 'SYSADM'
    and table_name = 'PS_XX_YY_ZZZ_STAT'
    and column_name = 'EMPLID'
    and partition_name in ('PS_XX_YY_ZZZ_STAT_T_41');
	
PARTITION_NAME            COLUMN_NAM HISTOGRAM     NOTES
------------------------- ---------- ------------ ------
PS_XX_YY_ZZZ_STAT_T_41      EMPLID    HYBRID

The Scene Bottom-line

Oracle might ignore the presence of a histogram when calculating the cardinality estimate for a table/partition/subpartition. This happens with corrupted histograms. You can check whether the CBO will use the histogram of a column at table/partition/subpartition level simply by selecting the NOTES column from the all_tab/part/subpart_col_statistics views.

July 26, 2019

Hybrid-Height Balanced corrupted histogram – Part I

Filed under: CBO — hourim @ 6:13 pm

Bad cardinality estimates lead to bad execution plans. If you understand why Oracle is misestimating object and operation cardinalities, you can address the performance problem properly. If you’re doubtful of that claim, this article will show you how I solved a performance issue by fixing a wrong single-partition cardinality estimate due to a corrupted HEIGHT BALANCED-HYBRID histogram on the partition key column. For the sake of simplicity, I will write a two-part article. Part I explains how I discovered this HEIGHT BALANCED-HYBRID histogram corruption. It also outlines how easy it is to know, beforehand, that Oracle will ignore such a histogram, via a simple query. The second part examines the bad effect of this histogram corruption on the performance of a critical query and how I fixed it.

The Notes column

The column named NOTES in the following tables (database version is 12cR1):

  • all_tab_col_statistics
  • all_part_col_statistics
  • all_subpart_col_statistics

is an easily overlooked but important piece of information when trying to understand why Oracle has messed up its cardinality estimates.
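
Before looking at a concrete example, note that all three levels can be swept in one go; a minimal sketch (adapt the owner filter to your own schema):

select 'TABLE' level_type, table_name, cast(null as varchar2(128)) object_name
      ,column_name, histogram, notes
from   all_tab_col_statistics
where  owner = 'SYSADM' and notes like '%HIST_FOR_INCREM_STATS%'
union all
select 'PARTITION', table_name, partition_name, column_name, histogram, notes
from   all_part_col_statistics
where  owner = 'SYSADM' and notes like '%HIST_FOR_INCREM_STATS%'
union all
select 'SUBPARTITION', table_name, subpartition_name, column_name, histogram, notes
from   all_subpart_col_statistics
where  owner = 'SYSADM' and notes like '%HIST_FOR_INCREM_STATS%';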

Here is a list of partitions of a range/list subpartitioned table:


SELECT
   partition_name
   ,num_distinct
   ,sample_size
   ,density
   ,histogram
   ,notes
FROM
    all_part_col_statistics
WHERE
    owner = 'SYSADM'
AND table_name = 'RANGE_LIST_TAB'
AND column_name = 'EMPLID';

PARTITION_NAME       NUM_DISTINCT SAMPLE_SIZE    DENSITY HISTOGRAM NOTES
-------------------- ------------ ----------- ---------- --------- ---------------------
RANGE_LIST_TAB_T_1          18240        5396    ,000055 HYBRID
RANGE_LIST_TAB_T_10          1180        5472    ,000839 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_11          1145        5391    ,000865 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_12          1167        5572    ,000849 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_13          1373        5564    ,000715 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_14          1720        5520    ,000555 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_15          1919        5532    ,000486 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_16          1821        5454    ,000517 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_17          1565        5535    ,000618 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_18          1690        5377    ,000563 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_19          1393        5619    ,000704 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_2           1045        5560    ,000951 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_20          1445        5589    ,000676 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_21          1874        5401    ,000498 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_22          1895        5332    ,000489 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_23          1546        5539    ,000627 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_24          1532        5449    ,000633 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_25          1680        5422    ,000568 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_26          1623        5511    ,000593 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_27          1518        5428    ,000639 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_28          1624        5674    ,000595 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_29          1463        5412    ,000665 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_3            949        5497     ,00105 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_30          1476        5371    ,000658 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_31          1824        5519    ,000517 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_32          3039        5445     ,00024 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_33          7647        5372    ,000131 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_34          2749        5516    ,000287 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_35          3622        5398    ,000161 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_36          7833        5462    ,000128 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_37          3233        5622    ,000221 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_38          8323        5471     ,00012 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_39          7733        5286    ,000129 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_4           1745        5588    ,000546 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_40          4798        5491    ,000052 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_41          6278        5560    ,000159 HYBRID
RANGE_LIST_TAB_T_42          5995        5542    ,000167 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_43          9502        5538    ,000105 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_44          9018        5499    ,000111 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_45          7941        5544    ,000126 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_46          5587        5516    ,000179 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_47         21670        5484    ,000046 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_48          9486        5473    ,000105 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_49         78008        5385    ,000013 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_5           1360        5503    ,000722 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_6           1040        5583    ,000958 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_7            998        5450    ,000997 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_8           1020        5418    ,000975 HYBRID    HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_9           1219        5441    ,000811 HYBRID    HIST_FOR_INCREM_STATS

49 rows selected.

What does this mean?

When Notes equals HIST_FOR_INCREM_STATS, it means that the HYBRID histogram on the EMPLID column will be ignored by the CBO during its cardinality estimation whenever it hits any of the 49 partitions, except partitions 1 and 41, where the Notes column is null.

HYBRID-Height Balanced corrupted histogram

Here is an illustration of the above claim; first for partition 10, where the histogram is ignored, then for partition 41, where the HYBRID histogram is used:

SQL> select /*+ full(a) */ count(1) from RANGE_LIST_TAB a where a.emplid = 'XY361125999';

COUNT(1)
----------
1414

----------------------------------------------------------------------------------
| Id  | Operation                       | Name           | Rows | Pstart| Pstop  |
----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                 |                |      |       |        |
|  1 |   SORT AGGREGATE                 |                |    1 |       |        |
|  2 |    PARTITION RANGE SINGLE        |                |  534 |     10|     10 |
|  3 |      PARTITION LIST ALL          |                |  534 |      1|     58 |
|* 4 |       TABLE ACCESS STORAGE FULL  | RANGE_LIST_TAB |  534 |    523|    580 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage("A"."EMPLID"='XY361125999')
    filter("A"."EMPLID"='XY361125999')

And here below is the important piece of the corresponding 10053 trace file:

SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for RANGE_LIST_TAB[A]
  
Column (#1): EMPLID(VARCHAR2) Part#: 0
AvgLen: 12 NDV: 1180 Nulls: 0 Density: 0.000839 Min: 0.000000 Max: 0.000000 

Histogram: HtBal #Bkts: 5472 UncompBkts: 5472 EndPtVals: 0 ActualVal: yes <------

NOTE: ignoring histogram of column (RANGE_LIST_TAB.EMPLID)   <------
      used only for incremental stats maintenance            <------

Column (#1): EMPLID(VARCHAR2) Part#: 9
AvgLen: 12 NDV: 1180 Nulls: 0 Density: 0.000847
Column (#1): EMPLID(VARCHAR2)
AvgLen: 12 NDV: 1180 Nulls: 0 Density: 0.000847
Table: RANGE_LIST_TAB Alias: A Card: Original: 630005 Rounded: 534 Computed: 533.902542 Non Adjusted: 533.902542

There are two important pieces of information in the above 10053 snippet. The first one is:

Column (#1): EMPLID(VARCHAR2) Part#: 0
AvgLen: 12 NDV: 1180 Nulls: 0 Density: 0.000839 Min: 0.000000 Max: 0.000000 
Histogram: HtBal #Bkts: 5472 UncompBkts: 5472 EndPtVals: 0 ActualVal: yes

What the heck is this Height Balanced (HtBal) information? I clearly showed above that the EMPLID column has a HYBRID histogram. Something is wrong with the histogram of this column, which makes Oracle ignore it, as confirmed by the second piece of information in the same 10053 trace file:

NOTE: ignoring histogram of column (RANGE_LIST_TAB.EMPLID)
      used only for incremental stats maintenance

Since the histogram of the EMPLID column has been ignored, Oracle falls back to the classical single table selectivity without histogram:

  E-Rows = num_rows/num_distinct = 630005/1180 = 533.902542 --> exactly the value in the 10053 trace file

Let’s now consider partition 41 where the Notes column is NULL:


SQL> select /*+ full(a) */ count(1) from RANGE_LIST_TAB a where a.emplid = 'XY178963531';

COUNT(1)
----------
1022
----------------------------------------------------------------------------------
| Id  | Operation                       | Name           | Rows | Pstart| Pstop  |
----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                 |                |      |       |        |
|  1 |   SORT AGGREGATE                 |                |    1 |       |        |
|  2 |    PARTITION RANGE SINGLE        |                | 1330 |     41|     41 |
|  3 |      PARTITION LIST ALL          |                | 1330 |      1|     58 |
|* 4 |       TABLE ACCESS STORAGE FULL  | RANGE_LIST_TAB | 1330 |   2321|   2378 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage("A"."EMPLID"='XY178963531')
    filter("A"."EMPLID"='XY178963531')

SINGLE TABLE ACCESS PATH
  Column (#1):
   NewDensity:0.000159, OldDensity:0.000159 BktCnt:5560.000000, PopBktCnt:0.000000, PopValCnt:0, 
  NDV:6278
Column (#1): EMPLID(VARCHAR2) Part#: 40
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159 Histogram: Hybrid #Bkts: 254 UncompBkts: 5560 EndPtVals: 254 ActualVal: yes

Column (#1): EMPLID(VARCHAR2)
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159 Histogram: Hybrid #Bkts: 254 UncompBkts: 5560 EndPtVals: 254 ActualVal: yes

Table: RANGE_LIST_TAB Alias: A Card: Original: 568804 Rounded: 1330 Computed: 1329.937410 
   Non Adjusted: 1329.937410 <-----

This time, Oracle is using the HYBRID histogram for the EMPLID value, as shown in the corresponding 10053 trace file above. Let's see whether the partition key value 'XY178963531' is popular or not:

SELECT
   partition_name
  ,bucket_number
  ,endpoint_actual_value
  ,endpoint_repeat_count
  ,sample_size
  ,case when popularity > 0 then 'Pop'
        else 'Non-Pop'
   end popularity
FROM
   (select
       uth.partition_name
      ,uth.bucket_number
      ,uth.endpoint_actual_value
      ,uth.endpoint_repeat_count
      ,ucs.sample_size
      ,(uth.endpoint_repeat_count - ucs.sample_size/ucs.num_buckets) popularity
    from
       all_part_histograms uth
      ,all_part_col_statistics ucs
    where
        uth.table_name     = ucs.table_name
    and uth.column_name    = ucs.column_name
    and uth.owner          = ucs.owner
    and uth.partition_name = ucs.partition_name   -- join at partition level too
    and uth.owner          = 'SYSADM'
    and uth.table_name     = 'RANGE_LIST_TAB'
    and uth.partition_name in ('RANGE_LIST_TAB_T_41')
    and uth.column_name    = 'EMPLID'
    and uth.endpoint_actual_value = 'XY178963531'
   )
ORDER BY
   partition_name, bucket_number;
  
PARTITION_NAME 			 BUCKET_NUMBER 	ENDPOINT_ACTUAL ENDPOINT_REPEAT_COUNT SAMPLE_SIZE POPULAR
------------------------ ------------- --------------- --------------------- ----------- -------
RANGE_LIST_TAB_T_41       242           XY178963531     13                    5560        Non-Pop

‘XY178963531’ is a non-popular value, whose endpoint repeat count resolves to the following cardinality estimation formula:

E-Rows = num_rows * greatest(NewDensity, ENDPOINT_REPEAT_COUNT/sample_size)
E-Rows = 568804 * greatest(0.000159, 13/5560) = 1329.937410
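
As a quick sanity check, this arithmetic can be reproduced directly in SQL*Plus (my own verification, not part of the trace file):

SQL> select round(568804 * greatest(0.000159, 13/5560), 6) e_rows
     from dual; -- returns 1329.93741, matching the trace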

This is exactly the cardinality computed by Oracle (visible in the 10053 trace file), which proves, again, that for partition 41 the HYBRID histogram has really been used.

How to proactively check for histogram corruption

Oracle, via MOS Note 2483224.1, proposes the following query (which I have slightly updated) to proactively check for histogram corruption:

SQL> select
		c1.table_name,
		c1.column_name
    from
		all_histograms c1,
		all_histograms c2
    where 
	    c1.table_name = c2.table_name
		and c1.column_name = c2.column_name
		and c1.endpoint_number < c2.endpoint_number
		and c1.endpoint_value > c2.endpoint_value
		and c1.owner = c2.owner
		and c1.owner = 'SYSADM'
	group by
	  c1.table_name,
	  c1.column_name;

no rows selected

As you can see, this query is not correct, since it reports no histogram corruption in my application.
If, instead, I use the following query, it clearly shows that there are corrupted histograms, or at least histograms that will be ignored by the CBO during its cardinality estimations:

SQL> SELECT
		 table_name
		,column_name
		,notes
	 FROM
		all_tab_col_statistics
	WHERE owner = 'SYSADM'
	  and histogram != 'NONE'
	  and trim(notes) = 'INCREMENTAL HIST_FOR_INCREM_STATS'
	GROUP BY
      table_name, column_name, notes
  ORDER BY
     table_name, column_name;
	 
TABLE_NAME           COLUMN_NAME    NOTES
-------------------- -------------- ----------------------------------
RANGE_LIST_TAB       EMPLID         INCREMENTAL HIST_FOR_INCREM_STATS
RANGE_LIST_TAB       PRC_ORD_TS     INCREMENTAL HIST_FOR_INCREM_STATS

Here we see that table RANGE_LIST_TAB has two columns whose histograms will be ignored. In addition, since RANGE_LIST_TAB is a partitioned table, I can show this corrupted-histogram information per partition, using the following query as I did above in the first paragraph of this article:

SQL> SELECT
		 partition_name
		,column_name
		,histogram
		,notes
      FROM
		all_part_col_statistics
	WHERE owner = 'SYSADM'
		and table_name = 'RANGE_LIST_TAB'
		and column_name = 'EMPLID'
		and histogram != 'NONE'
		and trim(notes) = 'HIST_FOR_INCREM_STATS'
	ORDER BY
		partition_name, column_name;
		
PARTITION_NAME             COLUMN_NAME     HISTOGRAM       NOTES
-------------------------- --------------- --------------- ----------------------
RANGE_LIST_TAB_T_10        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_11        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_12        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_13        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_14        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_15        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_16        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_17        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_18        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_19        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_2         EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_20        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_21        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_22        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_23        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_24        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_25        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_26        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_27        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_28        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_29        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_3         EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_30        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_31        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_32        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_33        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_34        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_35        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_36        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_37        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_38        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_39        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_4         EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_40        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_42        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_43        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_44        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_45        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_46        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_47        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_48        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_49        EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_5         EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_6         EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_7         EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_8         EMPLID          HYBRID          HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_9         EMPLID          HYBRID          HIST_FOR_INCREM_STATS

47 rows selected.

There are 47 partitions out of 49 in this RANGE_LIST_TAB table for which the HYBRID histogram on the EMPLID column is corrupted. Partitions 1 and 41 are not affected by this corruption.
To work around this issue, you must delete and regather the statistics of each affected partition. Let's run the experiment on partition 42:

BEGIN
  dbms_stats.delete_table_stats(ownname  => 'SYSADM'
                               ,tabname  => 'RANGE_LIST_TAB'
                               ,partname => 'RANGE_LIST_TAB_T_42');
  dbms_stats.gather_table_stats(ownname     => 'SYSADM'
                               ,tabname     => 'RANGE_LIST_TAB'
                               ,partname    => 'RANGE_LIST_TAB_T_42'
                               ,granularity => 'PARTITION');
END;
/

SQL> SELECT
		 partition_name
		,column_name
		,histogram
		,notes
     FROM
	     all_part_col_statistics
	WHERE owner = 'SYSADM'
	 and table_name = 'RANGE_LIST_TAB'
	 and column_name = 'EMPLID'
	 and histogram != 'NONE'
	 and trim(notes) = 'HIST_FOR_INCREM_STATS'
	ORDER BY
	   partition_name, column_name;
	   
PARTITION_NAME              COLUMN_NAME    HISTOGRAM        NOTES
--------------------------- -------------- ---------------- ------------------------
RANGE_LIST_TAB_T_10         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_11         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_12         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_13         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_14         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_15         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_16         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_17         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_18         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_19         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_2          EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_20         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_21         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_22         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_23         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_24         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_25         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_26         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_27         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_28         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_29         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_3          EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_30         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_31         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_32         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_33         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_34         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_35         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_36         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_37         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_38         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_39         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_4          EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_40         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_43         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_44         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_45         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_46         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_47         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_48         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_49         EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_5          EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_6          EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_7          EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_8          EMPLID         HYBRID           HIST_FOR_INCREM_STATS
RANGE_LIST_TAB_T_9          EMPLID         HYBRID           HIST_FOR_INCREM_STATS

46 rows selected.

The number of partitions affected by the corrupted histogram is now 46, and partition 42 is no longer in the list.
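
Rather than repeating this delete/regather pair by hand for each of the 46 remaining partitions, a small PL/SQL sketch along the following lines (reusing the dictionary filter from above; adapt and test it before running) would loop over every affected partition:

BEGIN
  FOR p IN (SELECT partition_name
            FROM   all_part_col_statistics
            WHERE  owner       = 'SYSADM'
            AND    table_name  = 'RANGE_LIST_TAB'
            AND    column_name = 'EMPLID'
            AND    histogram  != 'NONE'
            AND    trim(notes) = 'HIST_FOR_INCREM_STATS')
  LOOP
    -- same delete/regather workaround as demonstrated for partition 42
    dbms_stats.delete_table_stats(ownname  => 'SYSADM'
                                 ,tabname  => 'RANGE_LIST_TAB'
                                 ,partname => p.partition_name);
    dbms_stats.gather_table_stats(ownname     => 'SYSADM'
                                 ,tabname     => 'RANGE_LIST_TAB'
                                 ,partname    => p.partition_name
                                 ,granularity => 'PARTITION');
  END LOOP;
END;
/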

Summary
You should always show tenacity in trying to understand why Oracle has messed up its cardinality estimation. You may end up finding the root cause of the sub-optimal execution plan. In this case, I ended up finding that the 12c HYBRID histograms can be ignored by the CBO when they are corrupted (presence of a Height Balanced histogram in the 10053 trace file while the data dictionary shows a HYBRID histogram for the column).
I've also found that we can know whether the histogram will be used or not by Oracle during its cardinality estimation, simply by looking at the value of the NOTES column of the all_tab_col_statistics/all_part_col_statistics views.

October 28, 2018

Diagnosing the past using AWR: hard parse headache

Filed under: AWR — hourim @ 8:22 am

This is a document I initially wrote, almost two years ago, for one of my clients, to help developers interpret AWR reports. I believe it is interesting to share it worldwide.

Diagnosing the past using AWR-hard parse headache

I have two other documents covering two other situations where the AWR report has been very helpful, and which I hope to share with you once I make them publishable.

September 12, 2018

Troubleshooting via indexing

Filed under: Oracle — hourim @ 6:27 pm

The damage that can be inflicted by a filter predicate applied to a big table is difficult to quantify. But it is clear that it is almost always a tragic mistake to filter out a huge number of rows from a very large table. This article outlines one of those practical cases where the creation of a surgical index eliminated the time and energy wasted at the table level by a very critical third-party software query.

To begin with here’s a SQL monitoring report of this critical query taken from a 12cR2 release:

SELECT
    h.conRef,
    h.vamisin
FROM 
  TABLEHIST h
WHERE h.typeisin <> 4
AND h.optionv NOT IN
  (SELECT  vwx.ident 
   FROM AGGREGATE_VW vwx 
   WHERE (vwx.I1 = 2298535)
  )
AND h.backclient NOT IN (106, 28387)
AND h.vamisin        IN (91071113)

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)       
 Instance ID         :  1                     
 Session             :  XXXXXX (2198:9848) 
 SQL ID              :  7v8109fkuusw3         
 SQL Execution ID    :  17529954              
 Execution Started   :  08/01/2018 04:58:31   
 First Refresh Time  :  08/01/2018 04:58:35   
 Last Refresh Time   :  08/01/2018 04:59:16   
 Duration            :  45s                   
 Module/Action       :  JDBC Thin Client/-    
 Service             :  sphpre1               
 Program             :  JDBC Thin Client      
 Fetch Calls         :  6                     

Global Stats
=================================================================
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read  | Read  |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs  | Bytes |
=================================================================
|      44 |    2.35 |       42 |     6 |  81930 | 39157 | 311MB |
=================================================================

SQL Plan Monitoring Details (Plan Hash Value=4122899908)
========================================================================================================
| Id |               Operation                |          Name          |   Time    | Start  |   Rows   |
|    |                                        |                        | Active(s) | Active | (Actual) |
========================================================================================================
|  0 | SELECT STATEMENT                       |                        |        42 |     +4 |    50549 |
|  1 |   HASH JOIN ANTI                       |                        |        42 |     +4 |    50549 |
|  2 |    TABLE ACCESS BY INDEX ROWID BATCHED |TABLEHIST               |        42 |     +4 |    50549 |
|  3 |     INDEX RANGE SCAN                   |TABLEHIST_IND4          |        42 |     +4 |     281K |
|  4 |    VIEW                                |                        |         1 |    +45 |    12883 |
|  5 |     CONNECT BY WITH FILTERING (UNIQUE) |                        |         1 |    +45 |    21028 |
|  6 |      INDEX RANGE SCAN                  |IDX_AGGREGAT_VIEW_IDENT |         1 |    +45 |       40 |
|  7 |      HASH JOIN                         |                        |         1 |    +45 |        0 |
|  8 |       NESTED LOOPS                     |                        |         1 |    +45 |    20988 |
|  9 |        STATISTICS COLLECTOR            |                        |         1 |    +45 |        9 |
| 10 |         CONNECT BY PUMP                |                        |         1 |    +45 |    21028 |
| 11 |        INDEX RANGE SCAN                |IDX_AGGREGAT_VIEW_IDENT |         1 |    +45 |    20988 |
| 12 |       INDEX FAST FULL SCAN             |IDX_AGGREGAT_VIEW_IDENT |           |        |          |
========================================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("H"."OPTIONV"="IDENT")
   2 - filter(("H"."BACKCLIENT"<>:SYS_B_3 AND "H"."BACKCLIENT"<>:SYS_B_2 
           AND "H"."TYPEISIN"<>:SYS_B_0))
   3 - access("H"."VAMISIN"=:1)
   4 - filter((TO_NUMBER(REPLACE(REGEXP_SUBSTR 
                  ("IDENT_PATH",'@@(.+?)(@@|$)',1,1,HEXTORAW('5824EBC2000')),'@@',NULL))=:SYS_B_1 
              AND "IDENT">0))
   5 - access("F"."MGR"=PRIOR NULL)
   6 - access("F"."MGR"=1)
   9 - access("connect$_by$_pump$_005"."prior f.ident "="F"."MGR")
 
Note
-----
   - this is an adaptive plan 

In their attempt to fix this performance pain, DBAs and developers were focusing their efforts on the part of the execution plan that involves the hierarchical query (operations n°4 to 12).

But whoever knows how to correctly interpret the multitude of information provided by the SQL monitor report would have immediately pointed out where most of the query time is spent. Look at the Start Active column of operations n°0 to 3: they all show +4 seconds. This means that those operations became active 4 seconds after the beginning of the query. And whoever knows how to interpret an execution plan would have pointed out that the parent-child pair of operations n°2-3 is the first to be started by Oracle. Look now at the Time Active(s) column of these two operations: they both indicate 42 seconds. This means that the table/index operations 2-3 took 42 seconds. Since the total duration of this query is 45 seconds, we know clearly which part of the query we have to address.
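
Both columns come straight from gv$sql_plan_monitor; a sketch like the following (using the SQL_ID and SQL Execution ID from the report above) derives them per plan line:

select plan_line_id,
       plan_operation,
       round((first_refresh_time - sql_exec_start) * 86400)    start_active_s,
       round((last_refresh_time - first_refresh_time) * 86400) time_active_s,
       output_rows
from   gv$sql_plan_monitor
where  sql_id      = '7v8109fkuusw3'
and    sql_exec_id = 17529954
order  by plan_line_id;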

The index at operation n°3 supplied its parent table at line n°2 with 281K rowids, of which all but 50,549 were thrown away by the filter predicate at line n°2:

   2 - filter(("H"."BACKCLIENT"<>:SYS_B_3 AND "H"."BACKCLIENT"<>:SYS_B_2 
           AND "H"."TYPEISIN"<>:SYS_B_0))

Most of the 81,930 buffer gets come from the TABLEHIST table at operation line n°2. We therefore need to engineer an index starting with the column on which an equality predicate is applied (VAMISIN), followed by the other three columns on which inequality predicates are applied (OPTIONV, BACKCLIENT, TYPEISIN):

SQL> create index TABLEHIST_IND_MHO on TABLEHIST 
       (VAMISIN, OPTIONV, BACKCLIENT, TYPEISIN);

Since the select part of this critical query involves only two columns, of which one (VAMISIN) is already included in the new index, why not add the second selected column to this index and avoid visiting the table at all?

SQL> create index TABLEHIST_IND_MHO on TABLEHIST 
       (VAMISIN, OPTIONV, BACKCLIENT, TYPEISIN, CONREF);

Although this will increase the size of the index, it has the potential to eliminate about 96% of the logical I/O.
Indeed, following the creation of the new index, Oracle used the following execution plan:

-------------------------------------------------------------------------------
| Id  | Operation                            | Name                   | Rows  |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                        |       |
|*  1 |  HASH JOIN ANTI                      |                        |     1 |
|*  2 |   INDEX RANGE SCAN                   |TABLEHIST_IND_MHO       |    49 |
|*  3 |   VIEW                               |                        |    89 |
|*  4 |    CONNECT BY WITH FILTERING (UNIQUE)|                        |       |
|*  5 |     INDEX RANGE SCAN                 |IDX_AGGREGAT_VIEW_IDENT |     5 |
|   6 |     NESTED LOOPS                     |                        |    84 |
|   7 |      CONNECT BY PUMP                 |                        |       |
|*  8 |      INDEX RANGE SCAN                |IDX_AGGREGAT_VIEW_IDENT |    17 |
-------------------------------------------------------------------------------


SQL> @sqlstats
Enter value for sql_id: 7v8109fkuusw3

     CHILD SQL_PROFILE          PLAN_HASH_VALUE   AVG_GETS    EXECS
---------- -------------------- --------------- ---------- --------
         1                            860019752       3651      204

As you can see, Oracle no longer visits the TABLEHIST table, and the reduction in logical I/O consumption is remarkable: it went from 81,930 down to 3,651 buffer gets per execution, a saving of about 96% as predicted.

September 11, 2018

Exploded index causes a wrong execution plan

Filed under: Oracle — hourim @ 6:57 pm

A couple of days ago I was asked to look at a performance issue which occurred in a very simple delete statement. Below is this delete with its good and bad execution plans respectively:

SQL_ID gq3khfnctny4h
--------------------
DELETE FROM table_t1 WHERE evt_id = :1  AND eng_id = :2
 
Plan hash value: 1362912592 
-----------------------------------------------------------------
| Id  | Operation          | Name          | Rows  | Cost (%CPU)|
-----------------------------------------------------------------
|   0 | DELETE STATEMENT   |               |       |     3 (100)|
|   1 |  DELETE            | TABLE_T1      |       |            |
|   2 |   INDEX UNIQUE SCAN| PK_TABLE_T117 |     1 |     3   (0)| -- cost = 3
-----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
    2 - access("EVT_ID"=:1 AND "ENG_ID"=:2)

SQL_ID gq3khfnctny4h
--------------------
DELETE FROM table_t1 WHERE evt_id = :1  AND eng_id = :2
 
Plan hash value: 3057265474 
-------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  |Cost (%CPU)|
-------------------------------------------------------------------
|   0 | DELETE STATEMENT      |               |       |    2 (100)|
|   1 |  DELETE               | TABLE_T1      |       |           |
|   2 |   INDEX FAST FULL SCAN| PK_TABLE_T117 |     1 |    2   (0)| -- cost = 2
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
    2 - access("EVT_ID"=:1 AND "ENG_ID"=:2)

The statistics of the different executions of this delete over the last couple of weeks are:

SQL> @HistStats
Enter value for sql_id: gq3khfnctny4h
Enter value for from_date: 01092018

SNAP_BEGIN                PLAN_HASH_VALUE    EXECS END_OF_FETCH  AVG_ETIME  AVG_LIO   AVG_ROWS
------------------------- --------------- -------- ------------ ---------- -------- ----------
04-SEP-18 01.00.23.666 PM      1362912592     5109         5109          0     19.3          1 -- good execution plan
04-SEP-18 02.00.32.471 PM      1362912592      574          574          0       20          1
04-SEP-18 03.00.48.319 PM      1362912592     1472         1472          0       20          1
05-SEP-18 08.00.10.847 AM      1362912592     3378         3378          0     19.4          1
05-SEP-18 05.00.04.444 PM      1362912592    17759        17759          0     19.7          1
05-SEP-18 08.00.43.235 PM      1362912592   201040       201040          0     18.6          1
05-SEP-18 09.00.30.871 PM      1362912592    59826        59826          0     19.7          1
05-SEP-18 10.01.28.933 PM      1362912592   140850       140850          0     17.7          1
06-SEP-18 12.01.11.664 AM      1362912592    17951        17951          0     22.9          1
06-SEP-18 09.00.31.336 AM      3057265474    22899        22899         19   2136.5          1 -- bad plan switch 
06-SEP-18 10.00.44.721 AM      3057265474    45736        45736         14     1315          1
06-SEP-18 11.00.48.929 AM      3057265474    56729        56729         11  37105.8          1
06-SEP-18 05.00.43.993 PM      3057265474    48568        48568         12    11275          1

As you can see, when the delete statement switched from an execution plan using an INDEX UNIQUE SCAN (1362912592) to one using an INDEX FAST FULL SCAN (3057265474) of the same primary key index, the average elapsed time of the delete went from 0 to 19 seconds.

The AVG_ROWS column above clearly shows that we are always deleting a single row. This is because the two columns of the predicate part are the two primary key columns (evt_id, eng_id). So why on earth would Oracle FAST FULL scan an entire index only to end up locating a single row and deleting it?

If you look back at the above execution plans, you will realize that Oracle has completely messed up its estimations: it considered that FAST FULL scanning the PK_TABLE_T117 primary key index would cost only 2, while the cost of unique scanning the same index would be 3. This is why it switched to the bad execution plan, causing a serious performance issue.

The statistics of this unique index are:

SQL> select
        index_name
      , blevel
      , leaf_blocks
      , distinct_keys
      , num_rows
      , clustering_factor
      , sample_size    
    from user_indexes
    where index_name = 'PK_TABLE_T117';

INDEX_NAME     BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR SAMPLE_SIZE 
-------------- ------ ----------- ------------- ---------- ----------------- ----------- 
PK_TABLE_T117       3       28860        524234     524234            348959      524234   

Since I have at my disposal a PRE-PRODUCTION copy of this database, I decided to get the same index statistics from that database:

SQL> select
        index_name
      , blevel
      , leaf_blocks
      , distinct_keys
      , num_rows
      , clustering_factor
      , sample_size    
    from user_indexes
    where index_name = 'PK_TABLE_T117';

INDEX_NAME      BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR SAMPLE_SIZE 
-------------- ------- ----------- ------------- ---------- ----------------- ----------- 
PK_TABLE_T117        2       21470        617240     617240            190479      617240 

How could the BLEVEL of an index be 3 in PRODUCTION while the same index, with many more rows, has a BLEVEL of only 2 in another database?

That's probably an indication of an index that has exploded because of the delete and its pattern. Let's check this using a slightly enhanced version of Jonathan Lewis' script:

SQL> @IndexEfficiency2
Enter value for m_schemaname: xxxx
../..
PK_TABLE_T117
      Current Index Size GB: 1.44         Expected Index size GB: .07

This index certainly needs a rebuild
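
Before rebuilding, you can also cross-check the bloat with a validate structure pass (be careful: it takes a lock that blocks concurrent DML, so think twice on a busy production system):

SQL> analyze index PK_TABLE_T117 validate structure;
SQL> select height, lf_rows, lf_blks, del_lf_rows
     from   index_stats;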

SQL> alter index PK_TABLE_T117 rebuild online;
SQL> select
            index_name
          , blevel
          , leaf_blocks
          , distinct_keys
          , num_rows
          , clustering_factor
          , sample_size      
     from user_indexes
     where index_name = 'PK_TABLE_T117';

INDEX_NAME      BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR SAMPLE_SIZE 
-------------- ------- ----------- ------------- ---------- ----------------- ----------- 
PK_TABLE_T117        2        1461        524202     524202            348956      524202

SQL> @sizeBySegName
Enter value for segment_name: PK_TABLE_T117
Enter value for owner: xxxx

SEGMENT_TYPE       TABLESPACE_NAME  SEGMENT_NAME      GB
------------------ ---------------- --------------- -----
INDEX              IDX              PK_TABLE_T117   .0117
                                                    ------
Total Segment Size                                  .0117 

And here’s below the new execution plan:

SQL> explain plan for
     DELETE FROM table_t1 WHERE evt_id = :1  AND eng_id = :2;
  
SQL> select * from table(dbms_xplan.display);

--------------------------------------------------------------------
| Id  | Operation          | Name          | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | DELETE STATEMENT   |               |     1 |    31 |     2 |
|   1 |  DELETE            | TABLE_T1      |       |       |       |
|   2 |   INDEX UNIQUE SCAN| PK_TABLE_T117 |     1 |    31 |     2 | -- cost = 2
--------------------------------------------------------------------

SQL> explain plan for 
    delete /*+ index_ffs(TABLE_T1, PK_TABLE_T117) */ table_t1 
    WHERE evt_id = :1  AND eng_id = :2;

SQL> select * from table(dbms_xplan.display);

-----------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | DELETE STATEMENT      |               |     1 |    31 |   402 |
|   1 |  DELETE               | TABLE_T1      |       |       |       |
|   2 |   INDEX FAST FULL SCAN| PK_TABLE_T117 |     1 |    31 |   402 | -- cost = 402
-----------------------------------------------------------------------

CONCLUSION

I have never considered rebuilding an index to be a reliable tuning strategy. This is why I have already ousted this operation from my list of tuning steps. But, as Tom Kyte says, "always is never true, never is always false"; it is therefore interesting to see practical real-life cases where rebuilding an index turns out to fix a wrong execution plan choice.

August 17, 2018

Wrong utilisation of Result Cache

Filed under: result cache — hourim @ 5:43 pm

Still, not everyone is in love with the Oracle result cache feature. Franck Pachot has already explained why here and here. In this article I will show you a practical real-life case where the result cache has been wrongly used. A couple of weeks ago I had to look at an annoying sporadic 5 to 10 minute spike occurring randomly during the day. A simple ASH query during the spike showed high CPU consumption coupled with a latch free wait event popping up at the top. Using ASH again, I found that almost all latch free wait events were due to the following top query:

1. Real life case

SQL> select sql_id, count(1)
    from gv$active_session_history
    where
        sample_time between to_date('03082018 10:00:00', 'ddmmyyyy hh24:mi:ss')
                    and     to_date('03082018 11:00:00', 'ddmmyyyy hh24:mi:ss')
    and event = 'latch free'
    group by sql_id
    order by 2 desc;

SQL_ID            COUNT(1)
--------------- ----------
3djqkyz0taafr           77 -->
                        15
9fxqwts01uzzw            6
43sh58j74rr0a            2
b7qrux84y1hna            1
dr3s0wjmn5h2s            1
g4k4tt8uh1tv6            1

The corresponding SQL monitoring report outlines the evidence:

SQL Text
------------------------------
SELECT
  /*+ RESULT_CACHE */
  {list_of_columns}
FROM table
WHERE 
  {list_of_predicates}
ORDER BY col1;

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  1                   
 SQL ID              :  3djqkyz0taafr       
 SQL Execution ID    :  21511509            
 Execution Started   :  08/01/2018 10:48:03 
 First Refresh Time  :  08/01/2018 10:48:15 
 Last Refresh Time   :  08/01/2018 10:48:21 
 Duration            :  18s                  
 Fetch Calls         :  1                

Global Stats
===========================================================================
| Elapsed |   Cpu   |    IO    |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
===========================================================================
|      18 |    0.05 |     0.86 |       17 |     1 |   1656 |  654 |   5MB |
===========================================================================

SQL Plan Monitoring Details (Plan Hash Value=289067707)
==========================================================================================================
| Id |            Operation            |            Name            | Execs |   Rows   | Activity Detail |
|    |                                 |                            |       | (Actual) |   (# samples)   |
==========================================================================================================
|  0 | SELECT STATEMENT                |                            |     1 |        0 |                 |
|  1 |   RESULT CACHE                  | gv2j746786jamb4s329xj0ur8g |     1 |        0 | latch free (18) |
|  2 |    FILTER                       |                            |     1 |     1540 |                 |
|  3 |     TABLE ACCESS BY INDEX ROWID | HIST_TABLE                 |     1 |     1540 |                 |
|  4 |      INDEX RANGE SCAN           | HIST_IDX                   |     1 |     1540 |                 |
==========================================================================================================

As you can see, out of the 18 seconds of total query response time, all 18 were due to the latch free wait event. This latch free is itself caused by what seems to be a wrong utilisation of the result cache hint.

Of course, that was the ideal moment to use the famous Tanel Poder latchprof.sql script, which, when launched, showed the following picture:

SQL> @latchprof sid,name,sqlid % "Result" 100000

-- LatchProf 2.02 by Tanel Poder ( http://www.tanelpoder.com )

       SID NAME                     SQLID               Held       Gets  Held % 
---------- ------------------------ ------------- ---------- ---------- ------- 
      1753 Result Cache: RC Latch   3djqkyz0taafr        248         62     .25 
       298 Result Cache: RC Latch   3djqkyz0taafr        151         45     .15 
        35 Result Cache: RC Latch   3djqkyz0taafr        148         32     .15 
      3671 Result Cache: RC Latch   3djqkyz0taafr        136         35     .14 
      2681 Result Cache: RC Latch   3djqkyz0taafr        130         34     .13 
       273 Result Cache: RC Latch   3djqkyz0taafr         94         18     .09 
      1238 Result Cache: RC Latch   6hbsjju24n8d1         56         56     .06 

Not only did latchprof.sql show 7 different sessions waiting on the Result Cache: RC Latch event; it also outlined that all those sessions were executing the same query already spotted above via ASH and the SQL monitor report.

Close on the heels of the above latchprof.sql script, I executed the following query to check the content of the result cache objects view:

SQL> select
         to_char(dbms_sqltune.sqltext_to_signature(name, 1))
        , count(1)
     from
        gv$result_cache_objects
     group by
        to_char(dbms_sqltune.sqltext_to_signature(name, 1))
     having count(1) > 3
     order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
11688800737312996943                         106782 --> this is a portion of 3djqkyz0taafr
778588378770889215                              879
9760132213098432565                              62
13511637134398334555                              7
10994613278769629249                              7
13666841246362081009                              6
2234831540847838164                               5
16412641633620715561                              4

8 rows selected.

I don't think it is normal to see so many rows (106,782) for the same force-matched signature in the result cache memory. This is rather an indication of something trashing the result cache.

Another clue about this mis-utilisation of the result cache resides in the high number of invalidations for the same force-matched signature, as shown below:

select 
    status
  , count(1)
  from
    gv$result_cache_objects
where 
   to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = '11688800737312996943'
  group by status;
  
  STATUS      COUNT(1)
--------- ----------
Invalid        77583
Published      30185
New                1

At this stage of the investigation, there was enough low-level evidence to be concerned about the wrong utilisation of the result cache by this query. This is why I asked the developer to get rid of the

 /*+ result_cache */ 

hint; I then tracked the new sql_id and confirmed the disappearance of the Result Cache: RC Latch wait event.

2. General case

For the sake of completeness, I will show below how the result cache pool looks when it is very well used and when it isn't:

exec dbms_result_cache.flush();

alter session set cursor_sharing=force;

declare
    x number;
begin
    for i in 1..&1 loop
    	execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
    end loop;
end;
/		
Enter value for 1: 100

PL/SQL procedure successfully completed. 

All I did was:

  • Flush the result cache memory
  • Force cursor sharing in order to use system-generated bind variables
  • Execute an exactly identical query 100 times, using the same bind variable value at each execution

And here's how the result cache memory looks in this particular ideal case:

 select
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
   ,count(1)
 from
   gv$result_cache_objects
 where
  type != 'Dependency'
 group by
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
 having count(1) >= 1
 order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
9830716796482200263                               1

select 
    status
  , scan_count 
  , count(1)
  from
    gv$result_cache_objects
where 
   to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = '9830716796482200263'
and type != 'Dependency'
group by 
    status
   ,scan_count
order by 3 desc
    ;    

STATUS    SCAN_COUNT   COUNT(1)
--------- ---------- ----------
Published         99          1

The presence of a single row in gv$result_cache_objects is a clear indication that the result cache is being very well reused. The scan_count figure indicates that, out of 100 executions, we initiated 99 scans of the cached result. This is a perfect use of the result cache feature.

But let's now disturb this ideal setup a little by changing the above PL/SQL anonymous block so that the cached result will not always be found and reused, because of the different bind variable values used for the last 30 executions:

exec dbms_result_cache.flush();

declare
    x number;
begin
    for i in 1..&1 loop
     if i <= 70 then execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     else
       execute immediate 'select /*+ result_cache */ mgr from emp where sal >= '||2*i          
                  || ' and rownum=1' into x;	

     end if;
    end loop;
end;
/		
Enter value for 1: 100

PL/SQL procedure successfully completed. 

And here’s below how the result cache memory looks like now:

 select
     to_char(dbms_sqltune.sqltext_to_signature(name, 1))
    ,count(1)
 from
    gv$result_cache_objects
 where
  type != 'Dependency'
 group by
    to_char(dbms_sqltune.sqltext_to_signature(name, 1))
 having count(1) >= 1
 order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
9830716796482200263                              31

select 
    status
  , scan_count 
  , count(1)
  from
    gv$result_cache_objects
where 
   to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = '9830716796482200263'
and type != 'Dependency'
group by 
    status
   ,scan_count
order by 3 desc;    

STATUS    SCAN_COUNT   COUNT(1)
--------- ---------- ----------
Published          0         30
Published         69          1

Since I used 30 different bind variable values for the last 30 executions, I failed to initiate any scan of the cached results for these particular executions. However, the first 70 executions perfectly reused the result cache 69 times (the first execution is the one that stored the cached result).

You have probably already guessed that we can fail to reuse a stored result cache even in the absence of any dependent-object invalidation. It suffices, as shown by the above 30 particular query executions, to use a different bind variable value to produce a new, not-already-cached result. But let's now introduce a couple of invalidations into the above setup:

exec dbms_result_cache.flush();

declare
    x number;
begin
    for i in 1..&1 loop
     if i <= 50 then execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     elsif mod(i,3) = 0 then
      execute immediate 'alter table emp modify (sal default 42)';
      execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     else
      execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     end if;
    end loop;
end;
/	
Enter value for 1: 100

PL/SQL procedure successfully completed.	

All I did above was launch a query 100 times with:

  • The first 50 executions using the same bind variable value and thereby reusing the result cache
  • The next 50 executions flip-flopping between reusing the cached result and failing to reuse it because of a DDL on the result cache dependent table

select
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
   ,count(1)
from
   gv$result_cache_objects
where
   type != 'Dependency'
group by
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
   having count(1) >= 1
order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
9830716796482200263                              18
SQL> @getRCStatus.sql
Enter value for force_match_sign: 9830716796482200263

STATUS                    SCAN_COUNT        CNT
------------------------- ---------- ----------
Invalid                           49          1
Invalid                            2         16
Published                          1          1
                                     ----------
Total cache creation                         18 

The output of the query above shows that the 100 executions can be decomposed as follows:

  • 1 execution (the first one) + 49 hits of the result cache for the next 49 executions
  • 16 invalidations with scan_count = 2, meaning that 32 executions hit the cache before being invalidated
  • And since there are 18 rows (16 + 1 + 1) in the result cache, they inevitably correspond to 18 result cache creations: the first execution and the 17 others, one following each of the 17 invalidations

And here we are done: 1 + 49 + 32 + 18 = 100 executions.

Oracle also gives a way to retrieve statistics on the cache memory usage via the following package call:

SQL> exec dbms_result_cache.Memory_Report
R e s u l t   C a c h e   M e m o r y   R e p o r t
[Parameters]
Block Size          = 1K bytes
Maximum Cache Size  = 10496K bytes (10496 blocks)
Maximum Result Size = 524K bytes (524 blocks)
[Memory]
Total Memory = 205320 bytes [0.016% of the Shared Pool]
... Fixed Memory = 12592 bytes [0.001% of the Shared Pool]
... Dynamic Memory = 192728 bytes [0.015% of the Shared Pool]
....... Overhead = 159960 bytes
....... Cache Memory = 32K bytes (32 blocks)
........... Unused Memory = 13 blocks
........... Used Memory = 19 blocks
............... Dependencies = 1 blocks (1 count)
............... Results = 18 blocks
................... SQL     = 1 blocks (1 count)
................... Invalid = 17 blocks (17 count)

PL/SQL procedure successfully completed. 

This output doesn't say anything other than what the following query lists:

select 
   status 
  ,count(*) 
  ,round(max(scan_count)) max_scan_cnt
  ,round(sum(block_count)) tot_blk_cnt
from 
  gv$result_cache_objects
where
  to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = '9830716796482200263'
and type = 'Result'
group by 
  status; 

STATUS      COUNT(*) MAX_SCAN_CNT TOT_BLK_CNT
--------- ---------- ------------ -----------
Invalid           17           49          17
Published          1            1           1

  • There are 18 result cache creations: Results = 18 blocks
  • One result cache has been created and published: SQL = 1 blocks (1 count)
  • 17 result caches have been invalidated: Invalid = 17 blocks (17 count)

3. Summary

The SQL query result cache can introduce a serious performance threat if not correctly used. In the same vein as you want to parse once and execute many times, you would also like to store a result in the cache once and reuse it many times. To achieve that, you need to make sure that the query whose result you intend to cache will almost always return the same result (i.e. it uses the same bind variable values). You should also take care to keep the result in the result cache pool valid for as long as possible, which means avoiding DML and DDL against the query's dependent objects.
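
As a quick global health check, independent of any particular query, the cumulative result cache statistics give a feel for the create/find/invalidation balance; in a well-used cache, 'Find Count' should dwarf the create and invalidation counters:

select name, value
from   v$result_cache_statistics
where  name in ('Create Count Success'
               ,'Find Count'
               ,'Invalidation Count'
               ,'Delete Count Invalid');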

July 7, 2018

Subquery unnesting that does not require costing

Filed under: Subquery Unnesting — hourim @ 6:16 am

There are two types of subquery unnesting: in the first, the subquery is merged by the CBO with its outer query; in the second, it is turned into an inline view before being, eventually, merged with its outer query. While the latter is cost based, the former is done imperatively, provided the subquery passes the unnesting validity checks. This is why we may sometimes pay a performance penalty because of automatic subquery unnesting.
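
To make the first flavour concrete, here is a minimal sketch (hypothetical tables, not the client query shown below) of what unnesting an EXISTS subquery conceptually produces:

-- before unnesting: correlated EXISTS subquery
select a.id
from   ta a
where  exists (select null from tb b where b.id = a.id);

-- after unnesting: conceptually a join, executed as a SEMI join
-- (HASH JOIN SEMI or NESTED LOOPS SEMI) so that ta rows are not duplicated
select a.id
from   ta a, tb b
where  a.id = b.id;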

Automatic subquery unnesting that does not require costing

This is precisely what I have experienced at one of my customer sites:

insert into t1_targ
     (id)
select
     lt1.t2_id
from
     t2 lt1
where
     lt1.l_id > 0
and  lt1.status in (0,1,3)
and  lt1.class = 'XyyZgg'
and EXISTS
         (select null
          from t2 lt2
          where lt2.t1_id  = lt1.l_id
          and   lt2.status = 2
          and   lt2.class != 'XyyZgg'
          );

Global Stats
=========================================
| Elapsed |   Cpu   |  IO      | Buffer |
| Time(s) | Time(s) | Waits(s) |  Gets  |
=========================================
|    4070 |    343  |     3724 |   31M  |
=========================================

SQL Plan Monitoring Details (Plan Hash Value=1860787001)
===================================================================
| Id    |                 Operation                   |       Name |
|       |                                             |            |
===================================================================
|     0 | INSERT STATEMENT                            |            |
|     1 |   LOAD TABLE CONVENTIONAL                   | T1_TARG    |
|     2 |    HASH JOIN                                |            |
|     3 |     NESTED LOOPS                            |            |        
|     4 |      NESTED LOOPS                           |            |
|     5 |       STATISTICS COLLECTOR                  |            |
|     6 |        INLIST ITERATOR                      |            |
|     7 |         TABLE ACCESS BY INDEX ROWID BATCHED | T2         |
|     8 |          INDEX RANGE SCAN                   | IDX_T262   |
|     9 |       INDEX UNIQUE SCAN                     | PK_T250    |
|    10 |     TABLE ACCESS BY INDEX ROWID             | T2         |
|    11 |    TABLE ACCESS BY INDEX ROWID BATCHED      | T2         |
|    12 |     INDEX SKIP SCAN                         | IDX_T262   |
===================================================================

As you can see, the EXISTS subquery has been unnested and merged into the body of the outer query to form a join, as the following lines taken from the corresponding 10053 trace file prove:

*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block SEL$1 (#1) that are valid to unnest.
Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing. -- automatic
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:   Passed validity checks. -- subquery is valid for unnesting
SU:   Transforming EXISTS subquery to a join. -- unnesting is done here
Registered qb: SEL$5DA710D3 0x64c1aaf8 (SUBQUERY UNNEST SEL$1; SEL$2)

That is how automatic subquery unnesting works: provided there are no restrictions pre-empting it, the unnesting is done automatically.

If I add a restriction (rownum <= 10) to the above query, the unnesting is bypassed, as the following proves:

insert into t1_targ
     (id)
select
     lt1.t2_id
from
     t2 lt1
where
     lt1.l_id > 0
and  lt1.status in (0,1,3)
and  lt1.class = 'XyyZgg'
and EXISTS
         (select null
          from t2 lt2
          where lt2.t1_id  = lt1.l_id
          and   lt2.status = 2
          and   lt2.class != 'XyyZgg'
          and   rownum <= 10 -- restriction
          );

===================================================================
| Id    |                 Operation                   |       Name |
===================================================================
|     0 | INSERT STATEMENT                            |            |
|     1 |   LOAD TABLE CONVENTIONAL                   | T1_TARG    |
|     2 |    INLIST ITERATOR                          |            |
|     3 |     TABLE ACCESS BY INDEX ROWID BATCHED     | T2         | 
|     4 |      INDEX RANGE SCAN                       | IDX_T262   |                       
|     5 |       COUNT STOPKEY                         |            |
|     6 |        TABLE ACCESS BY INDEX ROWID          | T2         |            
|     7 |         INDEX UNIQUE SCAN                   | PK_T250    |
===================================================================

*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block SEL$1 (#1) that are valid to unnest.
Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:     SU bypassed: Query block references ROWNUM. -- restriction
SU:   Validity checks failed.-- unnesting is bypassed

At my client site, however, automatically unnesting the subquery was a dramatic choice by the CBO, as the historical statistics of the corresponding insert/select displayed below clearly show (sometimes more than 14,000 seconds without completing):

@HistStats
Enter value for sql_id:6gryp77nfgtv8
Enter value for from_date: 15052018

SNAP_BEGIN                PLAN_HASH_VALUE EXECS END_OF_FETCH  AVG_ETIME  AVG_LIO  
------------------------- --------------- ----- ------------ ----------  --------
16-MAY-18 12.00.13.600 AM      1860787001     0            0      3607    2975544     
16-MAY-18 01.00.20.220 PM      1860787001     1            1      3062    2708725
16-MAY-18 02.00.41.960 PM      1860787001     0            0      2509    3374627
16-MAY-18 03.00.48.113 PM      1860787001     0            0      4169    4970849
16-MAY-18 04.00.54.446 PM      1860787001     0            0      7659   16659725
16-MAY-18 05.00.04.833 PM      1860787001     0            0     13342   21798355
16-MAY-18 06.00.14.538 PM      1860787001     0            0     14245   35578453
../...
17-MAY-18 08.00.59.398 PM      1860787001     0            0      3544    3627666

But as soon as I cancelled the unnesting (using the /*+ no_unnest */ hint), I got a much better execution time, as shown below:

@HistStats
Enter value for sql_id:6gryp77nfgtv8
Enter value for from_date: 17052018

SNAP_BEGIN                PLAN_HASH_VALUE EXECS END_OF_FETCH  AVG_ETIME  AVG_LIO  
------------------------- --------------- ----- ------------ ----------  --------
17-MAY-18 08.00.59.398 PM      1860787001     0            0     3544    3627666
18-MAY-18 11.00.30.847 AM          530894     2            2     116      528385  -- 116 sec/exec 
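
For the record, the no_unnest hint goes inside the subquery block itself; against the statement above, it looks roughly like this:

insert into t1_targ
     (id)
select
     lt1.t2_id
from
     t2 lt1
where
     lt1.l_id > 0
and  lt1.status in (0,1,3)
and  lt1.class = 'XyyZgg'
and EXISTS
         (select /*+ no_unnest */ null
          from t2 lt2
          where lt2.t1_id  = lt1.l_id
          and   lt2.status = 2
          and   lt2.class != 'XyyZgg'
          );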

Cost based subquery unnesting

The subquery unnesting ceases, however, to be done imperatively whenever the subquery is transformed into an inline view. In this case, even when there is nothing preventing the subquery from being unnested, Oracle will evaluate the cost of unnesting it before proceeding with the transformation. Here's a typical illustration (the model is at the end of this blog post):

explain plan for
 select
     a.n1
 from
    t1 a
where
    a.n1 = 10
and exists
    (select 
	    null 
	 from
	    t1 b
	   ,t2 c
	 where
	    b.n1 = c.n1
	 and 
	    b.n2 = a.n3
    );

select * from table(dbms_xplan.display);

----------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes |
----------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     1 |     9 |
|   1 |  NESTED LOOPS SEMI           |         |     1 |     9 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1      |     1 |     7 |
|*  3 |    INDEX UNIQUE SCAN         | T1_PK   |     1 |       |
|   4 |   VIEW PUSHED PREDICATE      | VW_SQ_1 |     1 |     2 | -- subquery into view
|   5 |    NESTED LOOPS SEMI         |         |     1 |    11 |
|*  6 |     TABLE ACCESS FULL        | T1      |     3 |    21 |
|*  7 |     INDEX UNIQUE SCAN        | T2_PK   |   300 |  1200 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("A"."N1"=10)
   6 - filter("B"."N2"="A"."N3")
   7 - access("B"."N1"="C"."N1")

*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block SEL$1 (#1) that are valid to unnest.
Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:   Passed validity checks, but requires costing.  -- this is where the CBO costs the subquery unnesting
SU: Using search type: exhaustive
SU: Starting iteration 1, state space = (2) : (1)
SU:   Unnesting subquery query block SEL$2 (#2)Registered qb: SEL$683B0107 0x70409408 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
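
Should you want to reproduce these excerpts, here's a minimal sketch of how such a CBO (event 10053) trace can be generated for the model query; the tracefile_identifier value is an arbitrary name of mine:

alter session set tracefile_identifier = 'unnest_costing';
alter session set events '10053 trace name context forever, level 1';

explain plan for
select
    a.n1
from
    t1 a
where
    a.n1 = 10
and exists
    (select null from t1 b, t2 c
     where b.n1 = c.n1 and b.n2 = a.n3);

alter session set events '10053 trace name context off';

-- the generated trace file location can be found via:
select value from v$diag_info where name = 'Default Trace File';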

Summary

Don’t be surprised to see Oracle, sometimes, using a dramatic execution plan when it unnests a subquery. This might be related to the automatic unnesting which does not require costing: Oracle performs it, without any cost evaluation, provided the subquery passes the unnesting validity checks and does not have to be transformed into an inline view prior to its unnesting.

create table t1
as select
    rownum n1
   ,trunc((rownum-1)/3) n2
   ,mod(rownum,10) n3
from dual
connect by level <= 3e2;

create table t2
as select
    2*rownum n1
   ,trunc(rownum,5) n2
   ,lpad('x',5) vc
from dual
connect by level <= 3e2;

alter table t1 add constraint t1_pk primary key (n1);
alter table t2 add constraint t2_pk primary key (n1);
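
If you want the cardinality estimates of the plan shown above to be reproducible, you will probably also need fresh statistics on the two model tables; something like:

exec dbms_stats.gather_table_stats(user, 't1');
exec dbms_stats.gather_table_stats(user, 't2');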

June 28, 2018

SQL Monitor and parallel processing

Filed under: SQL Monitor — hourim @ 6:25 pm

Very often, people get confused about the actual execution time taken by a parallel query. I’ve seen this asked a few times, so I have always thought it deserved a simple blog post. This is, then, the last article of the series about the SQL monitor Duration and Elapsed time information, in which we’re going to become acquainted with how to get the wall clock time of queries executed using parallelism.

SQL monitor and Parallelism

In the first article we learned that the SQL monitor Elapsed time doesn’t include time spent outside the database by idle sessions. In the second one we outlined that the SQL monitor begins monitoring the query only when it reaches its execution phase. In this article we will see that, for a parallel query, the wall clock time very often approaches the Duration information of the SQL monitor report rather than its Elapsed time.

Having said that, here’s a query which I will run using parallelism (t1 being a simple copy of the all_objects table):

SQL> select /*+ parallel(4) */ 
        object_type
	  , count(1)
    from t1
    group by object_type;

43 rows selected.

Elapsed: 00:00:09.91

As you can see, it took almost 10 seconds of wall clock time for the above query to complete using the following parallel execution plan:

SQL_ID  08ka9ybgk43v1, child number 0
-------------------------------------
----------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |       |        |      |            |
|   1 |  PX COORDINATOR          |          |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)    | :TQ10001 |    43 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY         |          |    43 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE           |          |    43 |  Q1,01 | PCWP |            |
|   5 |      PX SEND HASH        | :TQ10000 |    43 |  Q1,00 | P->P | HASH       |
|   6 |       HASH GROUP BY      |          |    43 |  Q1,00 | PCWP |            |
|   7 |        PX BLOCK ITERATOR |          | 23131 |  Q1,00 | PCWC |            |
|*  8 |         TABLE ACCESS FULL| T1       | 23131 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 4 because of hint

If you want to, or are planning to, analyze the above parallel query through the SQL monitor report, then pay attention to the following:

SELECT DBMS_SQLTUNE.report_sql_monitor(
  sql_id       => '08ka9ybgk43v1',
  type         => 'TEXT',
  report_level => 'ALL') AS report
FROM dual; 

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  1                   
 Session             :  XXXX(296:2203)   
 SQL ID              :  08ka9ybgk43v1       
 SQL Execution ID    :  16777216            
 Execution Started   :  06/28/2018 07:30:52 
 First Refresh Time  :  06/28/2018 07:30:54 
 Last Refresh Time   :  06/28/2018 07:31:02 
 Duration            :  10s                 
 Module/Action       :  SQL*Plus/-          
 Service             :  YYYYY
 Program             :  sqlplus.exe         
 Fetch Calls         :  4                   

Global Stats
=========================================================
| Elapsed |   Cpu   |    IO    | Application |  Other   |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) |
=========================================================
|      32 |    0.91 |       29 |        1.74 |     0.08 |
=========================================================

It is, again, the Duration of the Global Information section which gives the wall clock time taken by the query executed via parallelism. The Elapsed Time of the Global Stats section represents the time taken by the Query Coordinator and all its parallel slaves, as clearly shown in the same SQL monitor report:

Parallel Execution Details (DOP=4 , Servers Allocated=8)
=================================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    | Application |
|                |       |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   |
=================================================================================
| PX Coordinator | QC    |         |    2.26 |    0.01 |     0.50 |        1.74 |
| p000           | Set 1 |       1 |    0.00 |    0.00 |          |             |
| p001           | Set 1 |       2 |    0.00 |    0.00 |          |             |
| p002           | Set 1 |       3 |    0.00 |    0.00 |          |             |
| p003           | Set 1 |       4 |    0.00 |    0.00 |          |             |
| p004           | Set 2 |       1 |    7.38 |    0.20 |     7.17 |             |
| p005           | Set 2 |       2 |    7.22 |    0.22 |     6.97 |             |
| p006           | Set 2 |       3 |    7.57 |    0.27 |     7.28 |             |
| p007           | Set 2 |       4 |    7.32 |    0.21 |     7.10 |             |
=================================================================================
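
As a side note, the Duration itself can be approximated from gv$sql_monitor as the difference between the last refresh time and the execution start time of the Query Coordinator row; a sketch, consistent with the two reports shown in this article:

select
    sql_id
   ,round((cast(last_refresh_time as date)
          -cast(sql_exec_start as date))*86400) duration_s
from gv$sql_monitor
where sql_id = '08ka9ybgk43v1'
and sql_exec_id = 16777216
and sql_text is not null; -- the QC row is the one carrying the sql_text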

Based on the information provided by the gv$sql_monitor view, I’ve developed the following scripts (listed at the end of this article) showing how the query execution time has been distributed between the Query Coordinator and its parallel slaves:

-- this gives execution time consumed by parallel slaves
SQL> @PXElaps
Enter value for sql_id: 08ka9ybgk43v1
Enter value for exec_id: 16777216

SQL_ID        SQL_EXEC_ID        SID PROCE    ELAPSED
------------- ----------- ---------- ----- ----------
08ka9ybgk43v1    16777216        254 p003           0
08ka9ybgk43v1    16777216        193 p000           0
08ka9ybgk43v1    16777216        214 p001           0
08ka9ybgk43v1    16777216        234 p002           0
08ka9ybgk43v1    16777216        299 p005        7.22
08ka9ybgk43v1    16777216          2 p007        7.32
08ka9ybgk43v1    16777216        277 p004        7.38
08ka9ybgk43v1    16777216        318 p006        7.57
                                           ----------
Total Slaves                                    29.49

8 rows selected.
-- this gives execution time consumed by Query Coordinator
SQL> @QCElaps
Enter value for sql_id: 08ka9ybgk43v1
Enter value for exec_id: 16777216

SQL_ID           QC_TIME        DOP PX_SERVERS_REQUESTED PX_SERVERS_ALLOCATED
------------- ---------- ---------- -------------------- --------------------
08ka9ybgk43v1       2.26          4                    8                    8 
-- this gives execution time consumed by Query Coordinator and its PX slaves
SQL> @QCPXElaps
Enter value for sql_id: 08ka9ybgk43v1
Enter value for sql_exec_id: 16777216

SQL_ID                           SID PROCE      ELAPS
------------------------- ---------- ----- ----------
08ka9ybgk43v1                    254 p003           0
08ka9ybgk43v1                      2 p007        7.32
08ka9ybgk43v1                    214 p001           0
08ka9ybgk43v1                    299 p005        7.22
08ka9ybgk43v1                    296 QC          2.26
08ka9ybgk43v1                    193 p000           0
08ka9ybgk43v1                    277 p004        7.38
08ka9ybgk43v1                    318 p006        7.57
08ka9ybgk43v1                    234 p002           0
                                           ----------
Elapsed time(s)                                 31.75

9 rows selected.

The devil is in the details
Before you start crying ‘yes, but what about…’, make sure you understand the following statistics of an AWR historical execution of a particular parallel query taken from a running system:

SQL> @HistStats
Enter value for sql_id: a8yny7n0x1gj1
Enter value for from_date: 01062018

SNAP_BEGIN                PLAN_HASH_VALUE    EXECS END_OF_FETCH  AVG_ETIME    AVG_PIO    AVG_LIO   AVG_ROWS
------------------------- --------------- -------- ------------ ---------- ---------- ---------- ----------
20-JUN-18 08.00.58.250 PM       765437158        1            0        665    4708199    4789407          0
25-JUN-18 08.00.43.224 PM       765437158        1            0        773    5446986    5545989          0

At first glance, many people will think that this query took 665 seconds. In fact, we should not consider the above average elapsed time without a fundamental piece of information: avg_px

avg_px = px_servers_execs_delta / decode(executions_delta, 0, 1, executions_delta)

This metric indicates whether the query ran in parallel or not. If it is not zero, then the AVG_ETIME represents the cumulated execution time taken by the Query Coordinator and all the parallel slaves used to run the query concurrently.
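
For those who want to compute it themselves, here's a minimal sketch of the avg_px calculation against the AWR tables (the column aliases are mine; this is not the full HistStats script):

select
    sn.snap_id
   ,st.executions_delta execs
   ,round(st.elapsed_time_delta/1e6
         /decode(st.executions_delta,0,1,st.executions_delta),2) avg_etime
   ,st.px_servers_execs_delta
         /decode(st.executions_delta,0,1,st.executions_delta)    avg_px
from
    dba_hist_sqlstat st
   ,dba_hist_snapshot sn
where
    st.snap_id = sn.snap_id
and st.dbid = sn.dbid
and st.instance_number = sn.instance_number
and st.sql_id = '&sql_id'
order by sn.snap_id;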

Here’s below the same AWR historical execution statistics augmented with the avg_px metric:

@HistStats
Enter value for sql_id: a8yny7n0x1gj1
Enter value for from_date: 01052018

SNAP_BEGIN                PLAN_HASH_VALUE EXECS END_OF_FETCH  AVG_ETIME AVG_PX   AVG_LIO 
------------------------- --------------- ----- ------------ ---------- ------  --------
20-JUN-18 08.00.58.250 PM      765437158      1            0      665      8     4789407    
25-JUN-18 08.00.43.224 PM      765437158      1            0      773      8     5545989

Thanks to the supplementary avg_px information, we now know that this query used parallelism during its execution, and that the AVG_ETIME of 665 seconds (or 773) corresponds to the Elapsed time taken by all the parallel servers (8) including the Query Coordinator. If you want to measure the actual wall clock time taken by this query, you then have to use the following approximate arithmetic:

 avg_wc_time = trunc(avg_etime / decode(avg_px, 0, 1, avg_px), 2)

Which, when implemented into the above HistStats script, gives this. Note that this query allocated two sets of parallel slaves (8 servers for a DOP of 4), so the displayed AVG_WC_TIME corresponds to dividing by the DOP of 4, whence 665/4 = 166.25 (a sketch of the arithmetic follows the output):

@HistStats
Enter value for sql_id: a8yny7n0x1gj1
Enter value for from_date: 01052018

SNAP_BEGIN                PLAN_HASH_VALUE EXECS END_OF_FETCH  AVG_ETIME AVG_PX AVG_WC_TIME  
------------------------- --------------- ----- ------------ ---------- ------ ----------- 
20-JUN-18 08.00.58.250 PM      765437158      1            0      665      8        166.25     
25-JUN-18 08.00.43.224 PM      765437158      1            0      773      8        193.25 
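
And here's a self-contained sketch of that arithmetic, wrapping the previous AWR query in an inline view (again, the aliases are mine). Since this query allocated two slave sets, the division is by avg_px/2; adjust it when a single slave set is used:

select
    snap_id
   ,avg_etime
   ,avg_px
   ,trunc(avg_etime/decode(avg_px,0,1,avg_px/2),2) avg_wc_time -- avg_px/2 = DOP here
from
   (select
        sn.snap_id
       ,round(st.elapsed_time_delta/1e6
             /decode(st.executions_delta,0,1,st.executions_delta),2) avg_etime
       ,st.px_servers_execs_delta
             /decode(st.executions_delta,0,1,st.executions_delta)    avg_px
    from
        dba_hist_sqlstat st
       ,dba_hist_snapshot sn
    where
        st.snap_id = sn.snap_id
    and st.dbid = sn.dbid
    and st.instance_number = sn.instance_number
    and st.sql_id = '&sql_id')
order by snap_id;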

Finally, we are able to say that this query took approximately 193 seconds of wall clock time during its execution of June the 25th at around 8PM.

Since the above query used parallelism, it has been monitored, as the following proves:

SQL> select
       sql_exec_id
       ,to_char(sql_exec_start, 'dd/mm/yyyy hh24:mi:ss') exec_date
    from
      gv$sql_monitor
    where
      sql_id = 'a8yny7n0x1gj1';

SQL_EXEC_ID EXEC_DATE
----------- -------------------
 16777225    25/06/2018 20:01:09

SELECT DBMS_SQLTUNE.report_sql_monitor(
  sql_id       => 'a8yny7n0x1gj1',
  sql_exec_id  =>  16777225,
  type         => 'TEXT',
  report_level => 'ALL') AS report
FROM dual;

Global Information
------------------------------
 Status              :  DONE (ERROR)             
 Instance ID         :  1                     
 SQL ID              :  a8yny7n0x1gj1  
 SQL Execution ID    :  16777225                   
 Execution Started   :  06/25/2018 20:01:09   
 First Refresh Time  :  06/25/2018 20:01:09  
 Last Refresh Time   :  06/25/2018 20:04:20   
 Duration            :  191s                  
                 
Global Stats
================================
| Elapsed |   Cpu   |  IO      |
| Time(s) | Time(s) | Waits(s) |
================================
|  776    |  236    |   538    | 
================================

Notice how, using the AWR historical tables (HistStats.sql), we are able to derive, almost reliably, the actual time taken by a query run in parallel.

Summary
The SQL monitor feature monitors the performance of SQL statements while they execute. Monitoring starts automatically when a SQL statement runs in parallel, or when a serial statement has consumed at least 5 seconds of CPU or I/O wait time. The parse phase of SQL statements is not monitored. To get an idea of the exact duration of a SQL statement execution, we had better rely on the Duration information rather than on the Elapsed Time of the Global Stats section of the report. This is because the Elapsed Time doesn’t include time spent outside the database, by an idle session fetching data for example. And when a SQL statement runs in parallel, the Elapsed Time includes the execution time of both the Query Coordinator and all its parallel slaves, while the Duration represents the wall clock time of the query, whether it runs serially or in parallel.
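
As a reminder, a short serial statement which would never reach the 5 seconds threshold can still be monitored on demand via the monitor hint; a minimal sketch:

select /*+ monitor */ count(*) from t1;

-- verify that it has effectively been monitored
select sql_id, status
from gv$sql_monitor
where sql_text like 'select /*+ monitor */ count(*)%';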

SQL scripts

@PXElaps
-- execution time consumed by the parallel slaves
compute sum label 'Total Slaves time' of elapsed on report
break on report
select
    sql_id
   ,sql_exec_id
   ,sid
   ,process_name
   ,round(elapsed_time/1e6,2) elapsed
 --,round(cpu_time/1e6) cpu
 --,round(concurrency_wait_time/1e6,2) conc
 --,round(user_io_wait_time/1e6) io_waits
from gv$sql_monitor
where sql_id = '&sql_id'
and sql_exec_id = '&exec_id'
and sql_text is null -- slave rows carry no sql_text
order by round(elapsed_time/1e6,2);

@QCElaps
-- execution time consumed by the Query Coordinator
select
    sql_id
   ,round(elapsed_time/1e6,2) qc_time
   ,px_maxdop dop
   ,px_servers_requested
   ,px_servers_allocated
   ,round(cpu_time/1e6) cpu
 --,round(concurrency_wait_time/1e6,2) conc
   ,round(user_io_wait_time/1e6) io_waits
from gv$sql_monitor
where sql_id = '&sql_id'
and sql_exec_id = '&exec_id'
and sql_text is not null; -- the QC row is the one carrying the sql_text

@QCPXElaps
-- execution time consumed by the Query Coordinator and its PX slaves
column label format a25
col sql_id format a25
compute sum label 'Elapsed time(s)' of elaps on report
break on report
select
    sql_id
   ,sid
   ,decode(process_name, 'ora', 'QC', process_name) process
   ,round(elapsed_time/1e6,2) elaps
from gv$sql_monitor
where sql_id = '&sql_id'
and sql_exec_id = '&sql_exec_id';