November | 2019 | Mohamed Houri’s Oracle Notes

November 28, 2019

DOP Downgrade

Filed under: Oracle — hourim @ 5:53 pm

Suppose you are confronted to a performance issue due to an initially optimized parallel execution plan that has been 100% DOP downgraded. Suppose also that this DOP downgrade is due to an Oracle Resource Manager max DOP limit directive.
Would you be able to point out this DOP downgrade root cause using gv$sql_monitor view? Would you be able as well to say, a posteriori (using AWR tables), that your issue was due to ORM max DOP limit?

This blog post aims to answer the above two questions.

1.Replaying the issue using a reproducible model

While it is not so straightforward to link a real-life performance issue to a DOP downgrade due to a resource manager maxim allowed DOP, it is nevertheless, very easy to model and reproduce this case at will.

RDEV>  create table t1 as select a.* from dba_objects a,dba_objects where rownum<= 1e6;

RDEV2> alter table t1 parallel 4;

RDEV2> select count(1) from t1 where owner >= 'A';

Parallel Execution Details (DOP=4 , Servers Allocated=4)
===================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    |
|                |       |         | Time(s) | Time(s) | Waits(s) |
===================================================================
| PX Coordinator | QC    |         |    0.12 |    0.01 |          |
| p000           | Set 1 |       1 |    0.20 |    0.03 |     0.15 |
| p001           | Set 1 |       2 |    0.20 |    0.03 |     0.15 |
| p002           | Set 1 |       3 |    0.20 |    0.04 |     0.15 |
| p003           | Set 1 |       4 |    0.20 |    0.04 |     0.12 |
===================================================================
-------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  |    TQ  |IN-OUT|
-------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |        |      |
|   1 |  SORT AGGREGATE        |          |     1 |        |      |
|   2 |   PX COORDINATOR       |          |       |        |      |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |  Q1,00 | P->S |
|   4 |     SORT AGGREGATE     |          |     1 |  Q1,00 | PCWP |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  Q1,00 | PCWC |
|*  6 |       TABLE ACCESS FULL| T1       |  1000K|  Q1,00 | PCWP |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("OWNER">='A')

Note
-----
   - Degree of Parallelism is 4 because of table property

As you can see, the Degree Of Parallelism 4 has been honored and the SQL query has been executed using Parallelism.
However, if under the hood, I create an Oracle Resource Manager which will limit the maximum DOP allowed to be used by the current user (RDEV) then the same query will be DOP downgraded causing performance pain like
in one of my real-life case (a blog post will follow):

BEGIN
-- clear pending area
  DBMS_RESOURCE_MANAGER.clear_pending_area();
  DBMS_RESOURCE_MANAGER.create_pending_area();
  
-- create resource plan
  DBMS_RESOURCE_MANAGER.create_plan(
    plan    => 'DOPDOWNGRAD_RPL',
    comment => 'Resource Plan limiting DOP');
    
-- create consumer group
  DBMS_RESOURCE_MANAGER.create_consumer_group(
    consumer_group => 'DOPDONWGRAD_RSG',
    comment        => 'LOW Parallel Priority group');

-- assign the DOP limitation to the consumer group
  DBMS_RESOURCE_MANAGER.create_plan_directive (
    plan                     => 'DOPDOWNGRAD_RPL',
    group_or_subplan         => 'DOPDONWGRAD_RSG',
    comment                  => 'LOW PDOP Priority',    
    parallel_degree_limit_p1 => 1);

-- This is mandatory to avoid ORA-29377: consumer group OTHER_GROUPS is not part of top-plan DOPDOWNGRAD_RPL
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    PLAN             => 'DOPDOWNGRAD_RPL',
    GROUP_OR_SUBPLAN => 'OTHER_GROUPS',
    MGMT_P1          => 10,
    COMMENT          => 'Directive for OTHER_GROUPS (mandatory)');
    
 -- validate the resource plan
  DBMS_RESOURCE_MANAGER.validate_pending_area;
  DBMS_RESOURCE_MANAGER.submit_pending_area();
END;
/

The preceding anonymous PL/SLQ block creates the DOPDOWNGRAD_RPL resource plan, attaches to it the DOPDONWGRAD_RSG consumer group and assigns a directive to this consumer group so that maximum DOP allowed by this consumer group would be 1.

To finish the setup, I will assign the user RDEV to this consumer group and set the DOPDOWNGRAD_RPL plan as the default resource plan:

-- Affect user RDEV to resource group
BEGIN
    DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
    DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
       ATTRIBUTE      => DBMS_RESOURCE_MANAGER.ORACLE_USER,
       VALUE          => 'RDEV',
       CONSUMER_GROUP => 'DOPDONWGRAD_RSG');
	  
   DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
   DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

We are now ready to run again the same query and check whether it has been executed parallelly or it has been pre-empted by the DOPDOWNGRAD_RPL resource plan to use parallelism:

RDEV> alter system set resource_manager_plan='DOPDOWNGRAD_RPL';

System altered.

RDEV> show parameter resource_manager_plan

NAME                    TYPE        VALUE
----------------------- ----------- ----------------
resource_manager_plan   string      DOPDOWNGRAD_RPL

Let’s execute the same query using RDEV user. Remember that this user belongs to an oracle resource manager group with a parallel degree limit directive as shown above:

SQL> SELECT /*+ monitor */ count(1) from t1 where owner >= 'A';

  COUNT(1)
----------
   1000000

RDEV> @xpsimp

SQL_ID  b60wdaqys7wf9, child number 0
-------------------------------------
SELECT /*+ monitor */ count(1) from t1 where owner >= 'A'

Plan hash value: 3110199320
---------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes |   TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |       |       |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |     4 |       |      |            |
|   2 |   PX COORDINATOR       |          |       |       |       |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     4 | Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |     4 | Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  3906K| Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| T1       |  1000K|  3906K| Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("OWNER">='A')

The absence of the Note line at the bottom of the execution plan represents, already, a first indication that the query has not been executed using parallelism. Indeed, this is confirmed via the corresponding SQL Monitoring report:

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  1                   
 Session             :  RDEV (273:24189)    
 SQL ID              :  b60wdaqys7wf9       
 SQL Execution ID    :  16777216            
 Execution Started   :  11/07/2019 09:02:53 
 First Refresh Time  :  11/07/2019 09:02:53 
 Last Refresh Time   :  11/07/2019 09:02:53 
 Duration            :  .414398s            
 Module/Action       :  SQL*Plus/-          
 Service             :  diiqazbx            
 Program             :  sqlplus.exe         
 DOP Downgrade       :  100%        --> here            
 Fetch Calls         :  1

The ACTIVE type of the same SQL Monitoring report shows the reason:

Other Plan Line Statistics

   Min DOP after downgrade 1
         downgrade reason  352

Thanks to Yasin Baskan, I knew that the description of the downgrade reason codes differs from release to release. We need to use the following query in the corresponding database (12cR2 in the current case) to get the actual significance of each code:

-- +-----------------------------------|
-- query supplied by Yasin Baskan
-- SQL> desc x$qksxa_reason
--		 Name           Type
--		 -------------- ---------------|
--		 ADDR           RAW(8)
--		 INDX           NUMBER
--		 INST_ID        NUMBER
--		 CON_ID         NUMBER
--		 QKSXAREASONS   VARCHAR2(255)
-- +-----------------------------------|
col qksxareasons format a100
col indx         format 999
SELECT
    indx
   ,qksxareasons
FROM
    x$qksxa_reason
WHERE
    qksxareasons like '%DOP downgrade%';   	


INDX QKSXAREASONS
---- -----------------------------------------------------
 351 DOP downgrade due to adaptive DOP
 352 DOP downgrade due to resource manager max DOP
 353 DOP downgrade due to insufficient number of processes
 354 DOP downgrade because slaves failed to join

We finally, have found the real root cause of the simulated performance issue: 352 DOP downgrade due to resource manager max DOP

2. Back to the aim of the article

The first questions that came to my mind when I have started troubleshooting the DOP Downgrade real-life case were:

• Can I find any Oracle Resource manager related information in the gv$sql_monitor view ?
• Can I find historical (AWR) execution statistics of any active Oracle Resource manager actions ?

2.1. v$sql_monitor

As per regards to the first question, hopefully the gv$sql_monitor view contains 4 columns which seem to be strictly related to the Oracle resource manager as shown below with column name starting by RM_:

SQL> @desc gv$sql_monitor

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 INST_ID                                            NUMBER
 KEY                                                NUMBER
 REPORT_ID                                          NUMBER
 STATUS                                             VARCHAR2(19)
 USER#                                              NUMBER
 USERNAME                                           VARCHAR2(128)
../..
 RM_LAST_ACTION                                     VARCHAR2(48)
 RM_LAST_ACTION_REASON                              VARCHAR2(128)
 RM_LAST_ACTION_TIME                                DATE
 RM_CONSUMER_GROUP                                  VARCHAR2(128)

Let’s see then what Oracle has stored in those four RM_ columns:

RDEV> SELECT
 decode(process_name, 'ora', 'QC', process_name) as process_name
 ,rm_consumer_group
 ,rm_last_action
 ,rm_last_action_reason
 ,rm_last_action_time
     FROM
        gv$sql_monitor
    WHERE 
        sql_id = 'b60wdaqys7wf9';

PROCES RM_CONSUMER_GROUP RM_LAST_ACTION RM_LAST_ACTION_REASO RM_LAST_ACTION_TIME
------ ----------------- -------------- -------------------- --------------------
QC     DOPDONWGRAD_RSG   {null}               {null}               {null}

Unfortunately, it looks like only the column RM_CONSUMER_GROUP is filled up by Oracle. The remaining RM_ columns are always null.

However, it looks like that whenever the column RM_CONSUMER_GROUP contains a value which is not equal to OTHER_GROUPS (like in the current example where RM_CONSUMER_GROUP= DOPDONWGRAD_RSG), we can infer that the current monitored query
is subject to a resource manager directive which might influence its execution.

For example, let’s execute the same query when connected with a different user (RDEV2) and see what information will be stored in the gv$sql_monitor view:

RDEV2> SELECT /*+ monitor */ count(1) from t1 where owner >= 'A'

COUNT(1)
----------
   1000000

SQL_ID  b60wdaqys7wf9, child number 2
-------------------------------------
SELECT /*+ monitor */ count(1) from t1 where owner >= 'A'
Plan hash value: 3110199320
--------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |        |      |            |
|   2 |   PX COORDINATOR       |          |       |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| T1       |  1000K|  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("OWNER">='A')

Note
-----
   - Degree of Parallelism is 4 because of table property

RDEV2> SELECT  
   username
  ,decode(process_name, 'ora', 'QC', process_name) as process_name
  ,rm_consumer_group
  ,rm_last_action
  ,rm_last_action_reason
  ,rm_last_action_time
      FROM
        gv$sql_monitor
      WHERE 
        sql_id = 'b60wdaqys7wf9'
order by last_refresh_time;

USERNAME  PROCES  RM_CONSUMER_GROUP  RM_LAST_ACTION  RM_LAST_ACTION_REASO  RM_LAST_ACTION_TIME
-------- -------- ----------------- ---------------- -------------------- --------------------
RDEV      QC     DOPDONWGRAD_RSG    {null}           {null}                {null}
RDEV2     QC     OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p002   OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p000   OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p003   OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p001   OTHER_GROUPS       {null}           {null}                {null}

When the same query is executed by user RDEV2 there is no DOP downgrade because this user doesn’t belong to any user created resource manager group. There are two information backing this claim:

• User RDEV2 has OTHER_GROUPS as a default consumer group
• There is 4 PX server associated with the QC of user RDEV2

3.2. AWR and Oracle Resource manager
The second investigation was to see whether I can, a posteriori, diagnose that my query underwent a resource manager directive; particularly those concerning parallel degree limit. For that reason, I wrote the following script

SQL> @getConsGrpHist
SELECT
   begin_time
  ,execs 
  ,avg_etime
  ,avg_px
 -- ,trunc(avg_etime/decode(avg_px,0,1,avg_px) ,2) avg_px_time
  ,plan_name
  ,cpu_managed
  ,parallel   
FROM
   (SELECT 	   
        sn.begin_interval_time      begin_time
       ,st.executions_delta         execs 
       ,st.end_of_fetch_count_delta end_of_fetch 
       ,trunc(st.elapsed_time_delta/1e6/decode(st.executions_delta, 0, 1,st.executions_delta))   avg_etime
       ,round(st.px_servers_execs_delta/decode(st.executions_delta,0,1, st.executions_delta), 1) avg_px	  
       ,sr.plan_name
       ,sr.cpu_managed
       ,sr.parallel_execution_managed parallel    
    FROM 
         dba_hist_rsrc_plan sr
        ,dba_hist_snapshot  sn
        ,dba_hist_sqlstat   st
    WHERE
	sr.snap_id = sn.snap_id
    AND sn.snap_id = st.snap_id
    AND sn.begin_interval_time between to_date('&from_date', 'ddmmyyyy hh24:mi:ss')
				      and    to_date('&to_date'  , 'ddmmyyyy hh24:mi:ss')
    AND st.sql_id ='&sql_id'
    AND plan_name != 'DEFAULT_MAINTENANCE_PLAN'	
	) 
order by begin_time asc;      

SQL> @GetConsGrpHist
Enter value for from_date: 08112019 15:07:00
Enter value for to_date: 08112019 15:09:00
Enter value for sql_id: b60wdaqys7wf9

BEGIN_TIME                 EXECS  AVG_ETIME  AVG_PX PLAN_NAME       CPU_MANAGED PARALLEL
------------------------- ------ ---------- ------- --------------- ----------- ----------
08-NOV-19 03.07.08.369 PM      1          0       0 DOPDOWNGRAD_RPL ON          FULL
08-NOV-19 03.07.08.369 PM      1          0       0 DOPDOWNGRAD_RPL ON          FULL

Indeed, I can see that my query has used a non-default resource plan (DOPDOWNGRAD_RPL) in which a parallel statement queuing is enabled as indicated by the FULL value of the PARALLEL column.

Bottom-line

In this simple Note I outlined two main points:

• Show how to identify a DOP downgrade and its reason
• Show how to link a non-default Oracle resource manager to a SQL_ID by checking the value of the column RM_CONSUMER_GROUP from the gv$sql_monitor view
• Show that the remaining 3 resource manager columns (RM_LAST_ACTION, RM_LAST_ACTION_REASON, RM_LAST_ACTION_TIME) in the gv$sql_monitor view are, unfortunately, not maintained by Oracle
• Show how to look in AWR historical execution for queries subject to user defined Oracle Resource Manager directives

Comments (1)

November 21, 2019

Incremental, histogram and fix control

Filed under: Statistics — hourim @ 5:35 pm

Due to my zeal to understand the curious comportment of the non-usage of histogram in a partitioned table under incremental mode, I decided to write this 4th part article in the series of blog posts on wrong cardinality estimation due to the non-utilization of histogram during query optimization:

Part I where I wrongly put the blame of wrong cardinality estimation on corrupted histogram
Part II where I examined via a real-life example the impact of this wrong cardinality estimation on the query performance
Part III where I showed that, in fact, the wrong cardinality estimation is due to the incremental mode which, under certain circumstances, forces Oracle to ignore the usage of histogram

In this 4th Part, I would like to share with you a couple of ways you can use to make Oracle not ignoring the usage of histogram while keeping the incremental mode up (TRUE).

1. Fix Control 13583722

In Part III I mentioned that the dbms_stats trace file shows the presence of the 13583722 fix control

DBMS_STATS:          Need Actual Values (DSC_EAVS)                    
DBMS_STATS:          Partition: XZ_JOB_T_12                           
DBMS_STATS:          Histogram Type: HYBRID Data Type: 2              
DBMS_STATS:          Histogram Flags: 4 Histogram Gathering Flags: 10 
DBMS_STATS:          Incremental: TRUE Fix Control 13583722: 1

The definition of this fix control is :

SQL> select  bugno, description from V$SYSTEM_FIX_CONTROL where bugno = '13583722';

     BUGNO DESCRIPTION
---------- ----------------------------------------------------------------
  13583722 preserve partition level histograms & actual values for incremen

Let’s see what happens if I disable this fix control?

But first, here’s below the cardinality estimation before modifying the fix control :

select /*+ full(a) */
         count(1)
     from
        XZ_JOB a
     where EMPLID>= 'XZ990501004'
     and   EMPLID<= 'XZ990562000'; 

  COUNT(1)
----------
     44724

Plan hash value: 2916242030
----------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |       |       |
|   1 |  SORT AGGREGATE             |        |     1 |       |       |
|   2 |   PARTITION RANGE SINGLE    |        |     7 |    42 |    42 |
|*  3 |    TABLE ACCESS STORAGE FULL| XZ_JOB |     7 |    42 |    42 |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - storage("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000')
       filter("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000')

This wrong cardinality estimation of 7 instead of 44K, is due to Oracle ignoring the usage of histogram when visiting partition n°42

@getPartTab
compute sum label 'Total num_rows' of num_rows on report
break   on report
set verify off
select 
    table_name
   ,partition_name     pname
   ,partition_position ppos
   ,num_rows
from    
  (select
       table_name
      ,partition_name      
      ,partition_position  	
      ,num_rows	   
   from
      all_tab_partitions
  where 
    table_owner = upper('&owner')
  and
    table_name = upper('&table_name')
);

Enter value for owner: sysadm
Enter value for table_name: XZ_JOB

Enter value for owner: sysadm
Enter value for table_name: XZ_JOB

TABLE_NAME   PNAME             PPOS   NUM_ROWS 
------------ --------------- ------ ---------- 
XZ_JOB       XZ_JOB_T_1           1     721565 
XZ_JOB       XZ_JOB_T_2           2     213356 
XZ_JOB       XZ_JOB_T_3           3      35906 
../..
XZ_JOB       XZ_JOB_T_42         42      46919  -->
XZ_JOB       XZ_JOB_T_43         43      39403 
XZ_JOB       XZ_JOB_T_48         48     110980 
XZ_JOB       XZ_JOB_T_49         49    1342747 
                                      ----------
Total num_rows                         5118455

@PartHist
SQL> select
       partition_name
      ,num_distinct
      ,density
      ,histogram
       ,notes
    from
        all_part_col_statistics
    where owner    = 'SYSADM'
    and table_name = 'XZ_JOB'
    and column_name = 'EMPLID';

PARTITION_NAME  NUM_DISTINCT    DENSITY HISTOGRAM  NOTES
--------------- ------------ ---------- ---------- ---------------------
XZ_JOB_T_1             14954    ,000067 HYBRID
XZ_JOB_T_2              5141    ,000035 HYBRID
XZ_JOB_T_3              1059    ,000941 HYBRID     HIST_FOR_INCREM_STATS
../..
XZ_JOB_T_42             6372     ,00008 HYBRID     HIST_FOR_INCREM_STATS -->
XZ_JOB_T_43             6071    ,000165 HYBRID     HIST_FOR_INCREM_STATS
XZ_JOB_T_48            21812    ,000046 HYBRID     HIST_FOR_INCREM_STATS
XZ_JOB_T_49           243840    ,000004 HYBRID     HIST_FOR_INCREM_STATS

49 rows selected.

E-Rows = partition num_rows/ NDV = 46919/6372 = 7.36~7

And here’s the new cardinality estimation after the fix control change

SQL> alter session set "_fix_control" = '13583722:0';

SQL> exec dbms_stats.delete_table_stats
          (ownname=>'SYSADM',tabname=>'XZ_JOB',partname=>'XZ_JOB_T_42');

SQL> exec dbms_stats.gather_table_stats
        (ownname=>'SYSADM',tabname=>'XZ_JOB',          
         partname=>'XZ_JOB_T_42',granularity=>'PARTITION');

@PartHist
PARTITION_NAME  NUM_DISTINCT    DENSITY HISTOGRAM NOTES
--------------- ------------ ---------- --------- ----------------------
XZ_JOB_T_1             14954    ,000067 HYBRID
XZ_JOB_T_2              5141    ,000035 HYBRID
XZ_JOB_T_3              1059    ,000941 HYBRID    HIST_FOR_INCREM_STATS
../..
XZ_JOB_T_42             6372    ,000082 HYBRID
XZ_JOB_T_43             6071    ,000165 HYBRID    HIST_FOR_INCREM_STATS
XZ_JOB_T_48            21812    ,000046 HYBRID    HIST_FOR_INCREM_STATS
XZ_JOB_T_49           243840    ,000004 HYBRID

49 rows selected.

select /*+ full(a) */
         count(1)
     from
        XZ_JOB a
     where EMPLID>= 'XZ990501004'
     and   EMPLID<= 'XZ990562000'
  COUNT(1)
----------
     44724

----------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |       |
|   1 |  SORT AGGREGATE             |        |     1 |       |       |
|   2 |   PARTITION RANGE SINGLE    |        | 44649 |    42 |    42 |
|*  3 |    TABLE ACCESS STORAGE FULL| XZ_JOB | 44649 |    42 |    42 |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - storage(("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000'))
       filter(("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000'))

As you can see the fix control 13583722 has made Oracle not ignoring the usage of histogram when visiting partition n°42 (Notes column is null) which explains why the new cardinality estimation is excellent.

2. Fix Control 16726844

As far as fix control goes I decided to check whether I can find any other fix control that might be related to this incremental-histogram issue:

SQL> select bugno, value, description
     from v$system_fix_control
     where description like '%incremen%';

     BUGNO      VALUE DESCRIPTION
---------- ---------- ----------------------------------------------------------------
  10175079          1 increment kafcoref to simulate obsolete fropqop list
  13583722          1 preserve partition level histograms & actual values for incremen
  16726844          1 consider histogram request in incremental stats gathering

The fix control 16726844 seems to be also related to the usage of histogram when incremental mode is set to TRUE. Let’s change its value and check again (before changing the fix control I have of course reproduced the wrong cardinality estimation of 7):

SQL> alter session set "_fix_control" = '16726844:0';

SQL> exec dbms_stats.delete_table_stats(ownname=>'SYSADM',tabname=>'XZ_JOB',partname=>'XZ_JOB_T_42');

SQL> exec dbms_stats.gather_table_stats(ownname=>'SYSADM',tabname=>'XZ_JOB', partname=>'XZ_JOB_T_42',granularity=>'PARTITION');

----------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |       |
|   1 |  SORT AGGREGATE             |        |     1 |       |       |
|   2 |   PARTITION RANGE SINGLE    |        | 44598 |    42 |    42 |
|*  3 |    TABLE ACCESS STORAGE FULL| XZ_JOB | 44598 |    42 |    42 |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
--------------------------------------------------
   3 - storage(("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000'))
       filter(("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000')))

Indeed, this fix control has also made Oracle using the histogram when visiting partition n°42.

3. For columns size 254 EMPLID

The last option I have checked and used to workaround this issue of not using histogram in presence of incremental mode is to force histograms to be kept by manually specifying bucket size 254:

  method_opt => 'FOR ALL COLUMNS SIZE AUTO, FOR COLUMNS SIZE 254 EMPLID'

SQL> exec dbms_stats.delete_table_stats(ownname=>'SYSADM',tabname=>'XZ_JOB');

SQL> BEGIN 
        dbms_stats.gather_table_stats
        (ownname=>'SYSADM',tabname=>'XZ_JOB'
         ,method_opt => 'for all columns size auto, for columns size 254 emplid'
	 ,degree => 4);
     END;
     /

SQL> @GlobHist

TABLE_NAME   COLUMN_NAME   HISTOGRAM       NOTES
------------ ------------- --------------- ------------
XZ_JOB       EMPLID        HYBRID          INCREMENTAL

SQL> @PartHist

PARTITION_NAME  NUM_DISTINCT    DENSITY HISTOGRAM       NOTES
--------------- ------------ ---------- --------------- -----------
XZ_JOB_T_1             14954    ,000067 HYBRID
XZ_JOB_T_10             1232    ,000805 HYBRID
XZ_JOB_T_11             1191    ,000832 HYBRID
XZ_JOB_T_12             1153    ,000861 HYBRID
XZ_JOB_T_13             1187    ,000837 HYBRID
XZ_JOB_T_14             1390    ,000703 HYBRID
XZ_JOB_T_15             1733    ,000555 HYBRID
XZ_JOB_T_16             1950    ,000476 HYBRID
XZ_JOB_T_17             1842    ,000512 HYBRID
XZ_JOB_T_18             1584    ,000611 HYBRID
XZ_JOB_T_19             1714     ,00056 HYBRID
XZ_JOB_T_2              5141    ,000031 HYBRID
XZ_JOB_T_20             1410    ,000695 HYBRID
XZ_JOB_T_21             1460    ,000671 HYBRID
XZ_JOB_T_22             1899    ,000493 HYBRID
XZ_JOB_T_23             1916    ,000489 HYBRID
XZ_JOB_T_24             1568    ,000618 HYBRID
XZ_JOB_T_25             1550    ,000627 HYBRID
XZ_JOB_T_26             1697    ,000568 HYBRID
XZ_JOB_T_27             1637    ,000593 HYBRID
XZ_JOB_T_28             1528    ,000639 HYBRID
XZ_JOB_T_29             1629    ,000595 HYBRID
XZ_JOB_T_3              1059    ,000941 HYBRID
XZ_JOB_T_30             1479    ,000662 HYBRID
XZ_JOB_T_31             1480    ,000661 HYBRID
XZ_JOB_T_32             1839    ,000514 HYBRID
XZ_JOB_T_33             3083    ,000319 HYBRID
XZ_JOB_T_34             7756    ,000028 HYBRID
XZ_JOB_T_35             2789    ,000359 HYBRID
XZ_JOB_T_36             3669    ,000273 HYBRID
XZ_JOB_T_37             7947    ,000071 HYBRID
XZ_JOB_T_38             3277    ,000305 HYBRID
XZ_JOB_T_39             8466    ,000118 HYBRID
XZ_JOB_T_4               960     ,00104 HYBRID
XZ_JOB_T_40             7836    ,000044 HYBRID
XZ_JOB_T_41             4873    ,000162 HYBRID
XZ_JOB_T_42             6372    ,000077 HYBRID
XZ_JOB_T_43             6071    ,000165 HYBRID
XZ_JOB_T_44             9642    ,000104 HYBRID
XZ_JOB_T_45             9121     ,00011 HYBRID
XZ_JOB_T_46             8028    ,000125 HYBRID
XZ_JOB_T_47             5635    ,000177 HYBRID
XZ_JOB_T_48            21812    ,000046 HYBRID
XZ_JOB_T_49           243840    ,000004 HYBRID
XZ_JOB_T_5              1764    ,000541 HYBRID
XZ_JOB_T_6              1379    ,000712 HYBRID
XZ_JOB_T_7              1050    ,000949 HYBRID
XZ_JOB_T_8              1016    ,000982 HYBRID
XZ_JOB_T_9              1032    ,000965 HYBRID
                
49 rows selected.

SQL> select /*+ full(a) */
  2           count(1)
  3       FROM
  4          XZ_JOB a
  5       where EMPLID>= 'XZ990501004'
  6       and   EMPLID<= 'XZ990562000'; 
  COUNT(1)
----------
     44724

----------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |       |
|   1 |  SORT AGGREGATE             |        |     1 |       |       |
|   2 |   PARTITION RANGE SINGLE    |        | 44645 |    42 |    42 |
|*  3 |    TABLE ACCESS STORAGE FULL| XZ_JOB | 44645 |    42 |    42 |
----------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
      DB_VERSION('12.1.0.2')
      OPT_PARAM('_optimizer_extended_cursor_sharing' 'none')
      OPT_PARAM('_gby_hash_aggregation_enabled' 'false')
      OPT_PARAM('_optimizer_extended_cursor_sharing_rel' 'none')
      OPT_PARAM('_optimizer_use_feedback' 'false')
      OPT_PARAM('_fix_control' '18302923:1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "A"@"SEL$1")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - storage(("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000'))
       filter(("EMPLID">='XZ990501004' AND "EMPLID"<='XZ990562000'))

Again, changing the method_opt parameter value so that histogram on the EMPLID column is forced has made Oracle not only create the HYBRID histogram but to use them during query optimization even when the incremental mode is set to TRUE.

4. Bottom-line

Understandably, when incremental mode is set to TRUE Oracle might ignore the usage of histogram during query optimization. To avoid wrong cardinality estimation in such a situation you can use one of the following fixes:

Set the preference incremental mode to FALSE (at table or global level)
Unset the fix control 13583722
Unset the fix control 16726844
Force Oracle to create histogram on the desired column by manually specifying the bucket size 254:
```
 method_opt => 'for all columns size auto, for columns size 254 COL1' 
```

November 14, 2019

Incremental and ruin

Filed under: Oracle — hourim @ 8:02 pm

It’s hardly a secret that many people are not fan of histograms. The main problem of them is that they are not very well documented, and they drive, under the hood, many other features like Adaptive Cursor Sharing. They also interact with gathering statistics for partitioned tables when the incremental preference is set to TRUE.

As you might know, if you have read my two preceding articles here and here, I have been struggling with recurrent PeopleSoft performance issues because of a wrong cardinality estimation when visiting a partitioned table. I couldn’t figure out exactly the root cause of this cardinality mis-estimation so that I opened an Oracle Service Request (SR). As an answer to my request, Oracle support send me a reference to my two articles mentioned above😊. While I was still waiting for a viable answer from the Oracle support, the same performance issue kicked in and I was asked to solve it as quickly as possible.

After a couple of thinking minutes, I decided to change the incremental preference of that partitioned table and regather its statistics as in the following:

SQL> exec dbms_stats.set_table_prefs('SYSADM', 'PS_XYZ', 'incremental', 'false');
SQL> exec dbms_stats.gather_table_stats('SYSADM', 'PS_XYZ',degree => 4);

This made the culprit query, identified by the SQL_ID 147gz7h1x7jkq, going from a non-ending 42,886 seconds

Global Information
------------------------------
 Status              :  EXECUTING                         
 Instance ID         :  1                                 
 Session             :  SYSADM (23:47269)                 
 SQL ID              :  147gz7h1x7jkq                     
 SQL Execution ID    :  16777219                          
 Execution Started   :  08/28/2019 21:19:19               
 First Refresh Time  :  08/28/2019 21:19:25               
 Last Refresh Time   :  08/29/2019 09:14:05               
 Duration            :  42886s

to about only 5 seconds for the worst case

SQL> @sqlstats
Enter value for sql_id: 147gz7h1x7jkq

     CHILD    INST_ID PLAN_HASH_VALUE   AVG_GETS LAST_ACTI            AVG_ETIME      EXECS
---------- ---------- --------------- ---------- ------------------- ---------- ----------
         0          1      3376574399      23452 29/08/2019 17:09:08   5,357016          1
         0          2       430435019       2961 29/08/2019 17:09:36   1,366556          1

The goal of this article is to explain the negative effect the incremental mode can have on the effective use of histogram during cardinality estimation.

By the way, if you are wondering why the same query has got two different execution plans (3376574399,430435019) at almost the same time, then bear in mind that this is the effect of a GTT table having a SESSION private statistics.

1.Few words about the Setup

This is a PeopleSoft software using Oracle 12cR1 under Exadata machine. The cardinality misestimation occurs when Oracle visits a range (or a range/list) partitioned table of 49 partitions. The partition key is of varchar2(11) data type having a uniform distribution for certain partitions, and a skewed distribution at others. The incremental preference of the partitioned table is set to TRUE.

Here’s an image of the partition key column (EMPLID) statistics when the performance issue occurred (incremental = TRUE) and after it has been fixed (incremental =FALSE) respectively (at global and partition level):

-- incremental = TRUE
SELECT
   table_name
  ,column_name
  ,notes
FROM
  all_tab_col_statistics
WHERE 
    owner       = 'SYSADM'
AND table_name  = 'PS_XYZ'
AND column_name = 'EMPLID';    
  
TABLE_NAME  COLUMN_NAME   HISTOGRAM       NOTES   
----------- ------------- --------------- ----------------------------------
PS_XYZ      EMPLID        HYBRID          INCREMENTAL HIST_FOR_INCREM_STATS

SQL> select
         partition_name
        ,num_distinct
        ,density
        ,histogram
        ,notes
    from
        all_part_col_statistics
    where owner = 'SYSADM'
    and table_name = 'PS_XYZ'
    and column_name = 'EMPLID';


PARTITION_NAME  NUM_DISTINCT    DENSITY HISTOGRAM       NOTES                 
--------------- ------------ ---------- --------------- ----------------------
PS_XYZ_T_1             19888     ,00005 HYBRID                                
PS_XYZ_T_10             1191    ,000831 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_11             1153    ,000861 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_12             1187    ,000835 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_13             1390    ,000707 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_14             1733    ,000552 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_15             1950    ,000479 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_16             1842    ,000516 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_17             1584    ,000611 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_18             1714     ,00056 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_19             1410    ,000697 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_2              1059    ,000941 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_20             1460    ,000672 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_21             1899    ,000526 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_22             1916     ,00049 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_23             1568    ,000621 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_24             1550    ,000625 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_25             1697    ,000568 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_26             1637     ,00059 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_27             1528    ,000639 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_28             1629    ,000596 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_29             1479    ,000663 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_3               960     ,00104 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_30             1480    ,000662 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_31             1839     ,00052 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_32             3007    ,000284 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_33             7693     ,00013 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_34             2776     ,00036 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_35             3653    ,000215 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_36             7940    ,000126 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_37             3277    ,000305 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_38             8461    ,000118 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_39             7855    ,000127 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_4              1764    ,000567 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_40             4878    ,000205 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_41             6361    ,000157 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_42             6064    ,000165 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_43             9641    ,000104 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_44             9142    ,000109 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_45             8043    ,000124 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_46             5664    ,000078 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_47            21980    ,000045 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_48             9550    ,000105 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_49           270752    ,000004 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_5              1379    ,000725 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_6              1050    ,000952 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_7              1016    ,000984 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_8              1032    ,000969 HYBRID          HIST_FOR_INCREM_STATS 
PS_XYZ_T_9              1232    ,000811 HYBRID          HIST_FOR_INCREM_STATS 

49 rows selected.

-- incremental = FALSE

TABLE_NAME  COLUMN_NAME  HISTOGRAM       NOTES                                
----------- ------------ --------------- ------
PS_XYZ      EMPLID       HYBRID

PARTITION_NAME   NUM_DISTINCT    DENSITY HISTOGRAM       NOTES   
--------------- ------------ ---------- --------------- ---------
PS_XYZ_T_1             19888     ,00005 HYBRID                   
PS_XYZ_T_10             1191 ,000839631 NONE                     
PS_XYZ_T_11             1153    ,000861 HYBRID                   
PS_XYZ_T_12             1187  ,00084246 NONE                     
PS_XYZ_T_13             1390 ,000719424 NONE                     
PS_XYZ_T_14             1733 ,000577034 NONE                     
PS_XYZ_T_15             1950    ,000476 HYBRID                   
PS_XYZ_T_16             1842    ,000515 HYBRID                   
PS_XYZ_T_17             1584 ,000631313 NONE                     
PS_XYZ_T_18             1714     ,00056 HYBRID                   
PS_XYZ_T_19             1410  ,00070922 NONE                     
PS_XYZ_T_2              1059 ,000944287 NONE                     
PS_XYZ_T_20             1460    ,000671 HYBRID                   
PS_XYZ_T_21             1899    ,000526 HYBRID                   
PS_XYZ_T_22             1916    ,000491 HYBRID                   
PS_XYZ_T_23             1568     ,00062 HYBRID                   
PS_XYZ_T_24             1550    ,000626 HYBRID                   
PS_XYZ_T_25             1697    ,000562 HYBRID                   
PS_XYZ_T_26             1637    ,000593 HYBRID                   
PS_XYZ_T_27             1528     ,00064 HYBRID                   
PS_XYZ_T_28             1629 ,000613874 NONE                     
PS_XYZ_T_29             1479    ,000663 HYBRID                   
PS_XYZ_T_3               960 ,001041667 NONE                     
PS_XYZ_T_30             1480    ,000661 HYBRID                   
PS_XYZ_T_31             1839    ,000518 HYBRID                   
PS_XYZ_T_32             3007    ,000288 HYBRID                   
PS_XYZ_T_33             7693     ,00013 HYBRID                   
PS_XYZ_T_34             2776 ,000360231 NONE                     
PS_XYZ_T_35             3653    ,000215 HYBRID                   
PS_XYZ_T_36             7940    ,000126 HYBRID                   
PS_XYZ_T_37             3277 ,000305157 NONE                     
PS_XYZ_T_38             8461    ,000118 HYBRID                   
PS_XYZ_T_39             7855    ,000127 HYBRID                   
PS_XYZ_T_4              1764    ,000567 HYBRID                   
PS_XYZ_T_40             4878 ,000205002 NONE                     
PS_XYZ_T_41             6361    ,000157 HYBRID                   
PS_XYZ_T_42             6064    ,000165 HYBRID                   
PS_XYZ_T_43             9641    ,000104 HYBRID                   
PS_XYZ_T_44             9142    ,000109 HYBRID                   
PS_XYZ_T_45             8043 ,000124332 NONE                     
PS_XYZ_T_46             5664    ,000076 HYBRID                   
PS_XYZ_T_47            21980    ,000045 HYBRID                   
PS_XYZ_T_48             9550    ,000105 HYBRID                   
PS_XYZ_T_49           270944    ,000004 HYBRID                   
PS_XYZ_T_5              1379 ,000725163 NONE                     
PS_XYZ_T_6              1050 ,000952381 NONE                     
PS_XYZ_T_7              1016    ,000984 HYBRID                   
PS_XYZ_T_8              1032 ,000968992 NONE                     
PS_XYZ_T_9              1232 ,000811688 NONE                     

49 rows selected.

As you can see, this is a weird non-common situation. There is a clear contradiction here. How could it be differently when we see that, switching the incremental mode from true to false, switches the same data distribution for the same column at the same partition (PS_JOB_T_10 for example) from skewed (HYBIRD) to non-skewed (NONE)?

In fact, if you look closer you will realize that, as far as query optimization is concerned, there is no contradiction in the following situation:

HYBRID  +  HIST_FOR_INCREM_STATS = NONE

Because, when the CBO sees that the HYBRID histogram is accompanied by the Notes HIST_FOR_INCREM_STATS, it knows exactly that this histogram will be ignored at query optimization time, which is equivalent to not having histogram at all. This is for example correct for partitions like PS_JOB_T_10, PS_JOB_T_12 and PS_JOB_T_13 where column values are evenly distributed. But the real threat resides in partitions like PS_JOB_T_11, PS_JOB_T_15 and PS_JOB_T_16 for example where we have the following situation:

HYBRID  +  HIST_FOR_INCREM_STATS != HYBRID

In this case, the column values are skewed but their histogram are ignored by the CBO during query optimization because of the HIST_FOR_INCREM_STATS Notes. This is exactly the root cause of my performance issue.

Regardless of whether the partition key is skewed or not it will have a histogram
But this histogram will be ignored both at partition (except partition n°1 in this case) and at global level during query optimization.

Therefore, it becomes clear for me that fixing my performance issue resides in making the CBO not ignoring the presence of histogram at both partition level and global level. And, disabling the incremental mode, is what I did to achieve this goal.

2. Why histograms are ignored?

In the SR I opened for the Oracle support I have supplied the trace file of a dbms_stats call under incremental mode. Something resembling to this (partition n°11 in TEST database contains non-skewed value for the EMPLID column):

SQL> exec dbms_stats.set_global_prefs ('TRACE', to_char (1+16));

SQL> begin
        dbms_stats.gather_table_stats('sysadm'
                                      ,'PS_XYZ'
                                      ,partname  =>'PS_XYZ_T_11'
                                      ,granularity =>'partition');
    end;
   /
SQL> exec dbms_stats.set_global_prefs('TRACE', null);

DBMS_STATS: Building Histogram for EMPLID                                                                          
DBMS_STATS:  bktnum=-1, nnv=50819, snnv=5511,604748383025713834989746016721880424, sndv=1191, est_ndv=1191, mnb=254
DBMS_STATS:  Trying hybrid histogram                                                                               
DBMS_STATS:  > cdn 5592, popFreq 0, popCnt 0, bktSize 22,0750988142292490118577075098814229249
             , bktSzFrc ,0750988142292490118577075098814229249     
                                                
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 254, mnb 254, ssize 5592, min_ssize 2500, appr_ndv  TRUE
             ,ndv 1191, selNdv 0, selFreq 0, 
             pct 10,84555923647262975232686543618867329232, avg_bktsize 22, csr.hreq FALSE, normalize TRUE   
DBMS_STATS:   Checking range skew:                                                                                 
DBMS_STATS:      >> NOT SKEWED                           ----------------->                                                   
DBMS_STATS:  Not Discarding histogram (Incremental mode) ----------------->                                        
DBMS_STATS:   Histogram gathering flags: 4623

As you can see, Oracle is aware that the EMPLID column has an even distribution in partition n°11 but it decides to do not discard the creation of histogram because of the incremental mode. Oracle, nevertheless, updates the corresponding Notes column to indicate that this histogram will not be used during cardinality estimation.

PARTITION_NAME   NUM_DISTINCT    DENSITY HISTOGRAM       NOTES
---------------- ------------ ---------- --------------- ----------------------
PS_XYZ_T_11              1191    ,000833 HYBRID          HIST_FOR_INCREM_STATS

If, I regather the same statistics for partition n°12 where the EMPLID is skewed (in TEST) I will have the following trace snippet:

DBMS_STATS: Building Histogram for EMPLID                                                                          
DBMS_STATS:  bktnum=-1, nnv=47961, snnv=5501,261730969760166840458811261730969758, sndv=1153, est_ndv=1153, mnb=254
DBMS_STATS:  Trying hybrid histogram                                                                               
DBMS_STATS:  > cdn 5443, popFreq 0, popCnt 0, bktSize 21,49407114624505928853754940711462450593, 
              bktSzFrc ,49407114624505928853754940711462450593                                    
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 254, mnb 254, ssize 5443, min_ssize 2500, appr_ndv  TRUE
            ,ndv 1153, selNdv 0, selFreq 0
            ,pct 11,47028154327424400417101147028154327424, avg_bktsize 21, csr.hreq FALSE, normalize TRUE   
DBMS_STATS:   Checking range skew:                                                                                 
DBMS_STATS:      >> SKEWED                   ------------------------>                                                       
DBMS_STATS:   Histogram gathering flags: 527                                                                       
DBMS_STATS:  Accepting histogram             ------------------------>             

DBMS_STATS:          Need Actual Values (DSC_EAVS)                                                                 
DBMS_STATS:          Partition: PS_XYZ_T_12                                                                        
DBMS_STATS:          Histogram Type: HYBRID Data Type: 2                                                           
DBMS_STATS:          Histogram Flags: 4 Histogram Gathering Flags: 10                                              
DBMS_STATS:          Incremental: TRUE Fix Control 13583722: 1

This time, Oracle has correctly created the histogram and has left the Notes column empty indicating that the histogram will be used by the CBO when computing cardinality estimation for partition n°12:

PARTITION_NAME    NUM_DISTINCT DENSITY HISTOGRAM          NOTES
---------------- ------------ -------- ----------------- ----------------------
PS_XYZ_T_11       1191        ,000833  HYBRID             HIST_FOR_INCREM_STATS
PS_XYZ_T_12       1153        ,000861  HYBRID

However, at global level, Oracle will still ignore the usage of histogram as the following proves:

SQL> SELECT
       table_name
       ,column_name
       ,histogram
       ,notes
    FROM
      all_tab_col_statistics
    WHERE
     owner = 'SYSADM'
     and table_name ='PS_XYZ'
     and column_name = 'EMPLID';

TABLE_NAME  COLUMN_NAME   HISTOGRAM       NOTES   
----------- ------------- --------------- ----------------------------------
PS_XYZ      EMPLID        HYBRID          INCREMENTAL HIST_FOR_INCREM_STATS

I think that it suffices to have a single partition with that Note value (HIST_FOR_INCREM_STATS ) to see Oracle ignoring the usage of histogram at global level as well.

In passing, if you have exercised eyes you should have already remarked the control fix in the dbms_stats trace file

DBMS_STATS: Incremental: TRUE Fix Control 13583722: 1

SQL> select description from V$SYSTEM_FIX_CONTROL where bugno = '13583722';

DESCRIPTION
----------------------------------------------------------------
preserve partition level histograms &amp; actual values for increment

For the sake of simplicity, I will postpone publishing my investigations on this fix_control to a separate article.

3. Possible reasons

The following references might help understanding the situation explained in this article and the possible reasons pushing Oracle to keep creating histograms but ignoring their usage during query optimisation process:

Bug 13583722 – Slow incremental stats gather from global histogram gathers (Doc ID 13583722.8)
HIST_FOR_INCREM_STATS has been added to notes by the fix for unpublished bug 14595152 .

4. Bottom-line

As we have seen above, partition level histogram will not be discarded for tables with incremental mode set to TRUE. This happens regardless of whether the column values are skewed or evenly distributed. I think that, the reason behind this is that, under incremental mode, when global histogram aggregations are calculated, missing partition level histograms must be regathered at partition level with even distribution. Probably Oracle decided to keep these histograms at partition level to avoid doing the same thing two times under incremental mode. But it decides, at the same time, to pre-empt the CBO from using the histograms both at global and partition level during query optimization.

All in all, I think that disabling incremental mode is something one should envisage if confronted to a similar performance issue.

Comments (3)

Mohamed Houri’s Oracle Notes

November 28, 2019

DOP Downgrade

1.Replaying the issue using a reproducible model

2. Back to the aim of the article

Bottom-line

November 21, 2019

Incremental, histogram and fix control

1. Fix Control 13583722

2. Fix Control 16726844

3. For columns size 254 EMPLID

4. Bottom-line

November 14, 2019

Incremental and ruin

1.Few words about the Setup

2. Why histograms are ignored?

3. Possible reasons

4. Bottom-line

Follow Blog via Email

Categories

Blog Stats

Blogs I Follow

Archives