Mohamed Houri’s Oracle Notes

July 28, 2019

Hybrid-Height Balanced corrupted histogram – Part II

Filed under: CBO — hourim @ 7:58 am

Setting the Scene

At one client site, using PeopleSoft software for the employee’s payroll, I was asked to look at a weird performance issue occurring in an overnight batch. This batch launches 49 different Oracle sessions. Each session starts a bunch of DML queries. The data model has been engineered so that any session will treat a different range of employees. To accomplish this workload distribution, the main tables have been range/list partitioned. 49 range partitions with 58 list sub-partitions per partition. The range partition key identifies the employee id and is of varchar2(11) datatype. There is a skew in both the partition key values distribution inside a single partition and in the number of rows per partition for the whole table. The partition key has a HYBRID histogram as we are in 12cR1 under an Exadata machine.

The last time I did intervene for this batch, it was to find a solution so that execution plans will not be shared between the 49 sessions. Indeed, the skew of the number of rows(num_rows) in the range and list sub-partitions makes sharing the same execution plan a real performance threat. I will blog about how I achieved this goal in a couple of weeks or so.

In the current case the performance problem turned to be due to a wrong cardinality estimation because of corrupted Height-Balanced-Hybrid histogram as explained in part I of this mini-series on corrupted histogram. When visiting a single partition, the CBO underestimates the number of rows it will scan and ends up by using a dramatic NESTED LOOPS operation. But this was the case only for the single partitions having a corrupted histogram at its corresponding partition key. In this article I will show how I have fixed this HYBRID histogram corruption simply by deleting the statistics of the concerned partitions and regathering them.

The Scene in details

The part of the complex query where Oracle underestimates the cardinality of the single range partition resembles to this:

SQL> select /*+ full(a) */
        count(1)
     from
        PS_XX_YY_ZZZ_STAT a
     where EMPLID >= 'XX999501004'
     and EMPLID >== 'XX999562000';
 
COUNT(1)
---------
546461
 
Plan hash value: 2020349843
-------------------------------------------------------------------------------
| Id | Operation                   | Name              | Rows | Pstart| Pstop |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT             |                   |      |       |       |
| 1 |  SORT AGGREGATE              |                   |    1 |       |       |
| 2 |   PARTITION RANGE SINGLE     |                   |   91 |    41 | 41    |
| 3 |    PARTITION LIST ALL        |                   |   91 |     1 | 58    |
|*4 |    TABLE ACCESS STORAGE FULL | PS_XX_YY_ZZZ_STAT |   91 |  2321 | 2378  |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))
    filter(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))

We can see that the cardinality estimation is wrong since Oracle estimates 91 rows while the actual generated rows are 546461. Let’s try to figure out why Oracle is underestimating this full table scan cardinality.
Since Oracle has identified a single partition (41) to visit it has then necessarily used the statistics of this partition when estimating the single column table selectivity. Below are listed the statistics (num_rows) at global and partition level for the range/list PS_XX_YY_ZZZ_STAT partitioned table:

SQL> @getPartTab
Enter value for owner: sysadm
Enter value for table_name: PS_XX_YY_ZZZ_STAT

TABLE_NAME            PNAME                   PPOS    SUBPARTITION_COUNT NUM_ROWS
-------------------- ------------------------ ------ ------------------ ----------
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_1    1               58         5811088
../..                                                         
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_41   41              58         568804      
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_42   42              58         527248
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_43   43              58         854141
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_44   44              58         809086
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_45   45              58         697071
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_46   46              58         484041
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_47   47              58         887009
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_48   48              58         259714
PS_XX_YY_ZZZ_STAT    PS_XX_YY_ZZZ_STAT_T_49   49              58         1025587
                                                                        ----------
Total num_rows                                                           38434347

As you can see the 41st partition has 568K rows while globally the whole table contains 38,4M rows. At this stage of the investigation we have got the first piece of the puzzle: num_rows (partition 41) = 568804
The second step is to check the EMPLID predicate column statistics at the 41st partition level:


col column_name    format a10
col partition_name format a35
col low_value      format a15
col high_value     format a15

SQL> select
       partition_name
      ,num_distinct
      ,utl_raw.cast_to_varchar2(low_value) low_value
      ,utl_raw.cast_to_varchar2(high_value) high_value
      ,density
      ,histogram
    from
      all_part_col_statistics
    where owner = 'SYSADM'
    and table_name = 'PS_XX_YY_ZZZ_STAT'
    and column_name = 'EMPLID'
    and partition_name in ('PS_XX_YY_ZZZ_STAT_T_41');

PARTITION_NAME             NUM_DISTINCT LOW_VALUE HIGH_VALUE            DENSITY    HISTOGRAM
------------------------- ------------ --------------- --------------- ---------- -----------
PS_XX_YY_ZZZ_STAT_T_41      6278         XX999498003    XX999561990     ,000159    HYBRID

We see that this column has a HYBRID histogram which, as you know, follows a different cardinality estimation formula depending on whether it is:
• A popular value
• A non-popular with an endpoint number
• A non-popular without an endpoint number

Let’s then check in what of the above three situations we are:

SQL> select
		partition_name
		,bucket_number
		,endpoint_actual_value
		,endpoint_repeat_count
		,bucket_size
		,case when Popularity > 0 then 'Pop'
		else 'Non-Pop'
		end Popularity
    from
		(select
			uth.partition_name
			,uth.bucket_number
			,uth.endpoint_actual_value
			,uth.endpoint_repeat_count
			,ucs.sample_size/ucs.num_buckets bucket_size
			,(uth.endpoint_repeat_count - ucs.sample_size/ucs.num_buckets) Popularity
		 from
		   all_part_histograms uth
		   ,all_part_col_statistics ucs
		 where
			uth.table_name = ucs.table_name
			and uth.column_name = ucs.column_name
			and uth.owner = ucs.owner
			and uth.owner = 'SYSADM'
			and uth.table_name = 'PS_XX_YY_ZZZ_STAT'
			and uth.partition_name in ('PS_XX_YY_ZZZ_STAT_T_41')
			and uth.column_name = 'EMPLID'
			and uth.endpoint_actual_value in ('XX999501004','XX999562000')
		)
    order by 
	partition_name, bucket_number;
	
no rows selected

It is a non-popular value without an endpoint number which then normally obeys to the following cardinality estimation formula:

   Card = num_rows * NewDensity = 568804 * .000159 = 90.439836

According to the above computed cardinality, you might think that Oracle has used the HYBRID histogram to estimate the wrong 90.43 ~ 91 cardinality. But, in fact, Oracle has completely ignored the histogram as proved by the corresponding 10053 trace file:

Column (#1): EMPLID(VARCHAR2) Part#: 0
AvgLen: 12 NDV: 267112 Nulls: 0 Density: 0.000004 Min: 0.000000 Max: 0.000000
Histogram: HtBal #Bkts: 5866 UncompBkts: 5866 EndPtVals: 0 ActualVal: yes

NOTE: ignoring histogram of column (PS_XX_YY_ZZZ_STAT.EMPLID) --->
used only for incremental stats maintenance -->

#IMCUs: 0 IMCRowCnt: 0 IMCJournalRowCnt: 0 #IMCBlocks: 0 IMCQuotient: 0.000000
Column (#1): EMPLID(VARCHAR2) Part#: 0

AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159 Min: 0.000000 Max: 0.000000 Histogram: HtBal #Bkts: 5308 
UncompBkts: 5308 EndPtVals: 0 ActualVal: yes

NOTE: ignoring histogram of column (PS_XX_YY_ZZZ_STAT.EMPLID)
used only for incremental stats maintenance Column (#1): EMPLID(VARCHAR2) Part#: 40 AvgLen: 12 
NDV: 6278 Nulls: 0 Density: 0.000159
Column (#1): EMPLID(VARCHAR2)
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159
Table: PS_XX_YY_ZZZ_STAT Alias: A Card: Original: 568804.000000 Rounded: 91 
          Computed: 90.602740 Non Adjusted: 90.602740

Oracle is clearly showing that it has ignored the histogram for the column EMPLID and that it will use it only for incremental statistics maintenance.

Moreover, there is another oddity in the above 10053 trace file:

Histogram: HtBal #Bkts: 5308 UncompBkts: 5308 EndPtVals: 0 ActualVal: yes

See how Oracle CBO trace is looking at Height Balanced histogram while the statistics clearly shows that the EMPLID column has a HYBRID histogram?

I think that the histograms of this column at this partition are corrupted and therefore Oracle has ignored them and fall back to the cardinality estimation of a column without histogram:

Card = num_rows/NDV = 568804/6278 = 90.60273972 –-> exactly the value computed by Oracle

By the way we can know that Oracle will ignore the histogram, beforehand, just by selecting the NOTES column from
the all_part_col_statistics table as shown below:

SQL> select
	 partition_name
	,column_name
	,histogram
	,notes
    from
	all_part_col_statistics
    where owner = 'SYSADM'
     and table_name = 'PS_XX_YY_ZZZ_STAT'
     and column_name = 'EMPLID'
     and partition_name in ('PS_XX_YY_ZZZ_STAT_T_41');
		
PARTITION_NAME           COLUMN_NAME  HISTOGRAM       NOTES
------------------------ ----------- --------------- ---------------------
PS_XX_YY_ZZZ_STAT_T_41   EMPLID        YBRID          HIST_FOR_INCREM_STATS

The Notes indicates clearly that the HYBRID histogram for column EMPLID for partition n°41 will be used only for incremental statistics maintenance.

Working around the Scene

To get rid of this Height Balanced-HYBRID histogram cohabitation within the same column we need to delete and regather statistics in the concerned partition:

BEGIN
      dbms_stats.delete_table_stats(ownname  =>'SYSADM',tabname=>'PS_XX_YY_ZZZ_STAT',
                                    partname =>'PS_XX_YY_ZZZ_STAT_T_41');
									
      dbms_stats.gather_table_stats(ownname  =>'SYSADM',tabname=>'PS_XX_YY_ZZZ_STAT',
                                    partname =>'PS_XX_YY_ZZZ_STAT_T_41',
                                    granularity=>'PARTITION');
END ;
/

Querying again gives a perfect cardinality estimation

SQL> sELect /*+ full(a) */
		count(1)
	from
		PS_XX_YY_ZZZ_STAT a
	where EMPLID >= 'XX999501004'
	and EMPLID   <= 'XX999562000';
	
COUNT(1)
----------
546461

--------------------------------------------------------------------------------
| Id | Operation                   | Name               | Rows | Pstart| Pstop |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT             |                    |      |       |       |
  SORT AGGREGATE                   |                    |    1 |       |       |
| 2 |   PARTITION RANGE SINGLE     |                    | 545K |     41|    41 |
| 3 |    PARTITION LIST ALL        |                    | 545K |      1|    58 |
|*4 |    TABLE ACCESS STORAGE FULL | PS_XX_YY_ZZZ_STAT  | 545K |   2321|  2378 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))
    filter(("EMPLID">='XX999501004' AND "EMPLID"<='XX999562000'))

And this time Oracle has clearly used the HYBRID histogram as proved by the corresponding 10053 trace file

Column (#1):
NewDensity:0.000159, OldDensity:0.000159 BktCnt:5560.000000, PopBktCnt:0.000000, PopValCnt:0, NDV:6278
Column (#1): EMPLID(VARCHAR2) Part#: 40
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159 Histogram: Hybrid #Bkts: 254 UncompBkts: 5560 EndPtVals: 254 ActualVal: yes
Column (#1): EMPLID(VARCHAR2)
AvgLen: 12 NDV: 6278 Nulls: 0 Density: 0.000159
Histogram: Hybrid #Bkts: 254 UncompBkts: 5560 EndPtVals: 254 ActualVal: yes
Table: PS_XX_YY_ZZZ_STAT Alias: A
Card: Original: 568804 Rounded: 545382 Computed: 545381.630747 Non Adjusted: 545381.630747

This HYBRID histogram utilization is also backed by the new values of the Notes (null) columns:

SQL> select
		 partition_name
		,column_name
		,histogram
		,notes
     from
        all_part_col_statistics
    where owner = 'SYSADM'
    and table_name = 'PS_XX_YY_ZZZ_STAT'
    and column_name = 'EMPLID'
    and partition_name in ('PS_XX_YY_ZZZ_STAT_T_41');
	
PARTITION_NAME            COLUMN_NAM HISTOGRAM     NOTES
------------------------- ---------- ------------ ------
PS_XX_YY_ZZZ_STAT_T_41      EMPLID    HYBRID

The Scene Bottom-line

Oracle might ignore the presence of histogram when calculating the cardinality estimation of table/partition/sub-partition. This happens for corrupted histogram. You can check whether the histogram for a column at table/partition/sub-partition level will be used or not by the CBO simply by selecting the NOTES column from all_tab/part/subpart_col_statistics table.

2 Comments »

  1. […] you might know, if you have read my two preceding articles here and here, I have been struggling with recurrent PeopleSoft performance issues because of a wrong cardinality […]

    Pingback by Incremental and ruin | Mohamed Houri’s Oracle Notes — November 14, 2019 @ 8:03 pm | Reply

  2. […]  Part II where I examined via a real-life example the impact of this wrong cardinality estimation on the query performance […]

    Pingback by Incremental, histogram and fix control | Mohamed Houri’s Oracle Notes — November 21, 2019 @ 5:36 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud Oracle's blog

Just another Oracle blog : Database topics and techniques

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

%d bloggers like this: