Mohamed Houri’s Oracle Notes

December 10, 2017

Compress enable novalidate

Filed under: Oracle — hourim @ 7:47 pm

We saw here that we can’t DDL-optimise a BASIC compressed table. So I uncompressed my table in a 12.2.0.1.0 database and DDL-optimised it as in the following:

SQL> select 
        pct_free
	,compression
	,compress_for
    from 
	 user_tables
    where 
	   table_name = 'T1';

  PCT_FREE COMPRESS COMPRESS_FOR
---------- -------- -------------
         0 DISABLED

SQL> alter table t1 add c_ddl number default 42 not null;

Table altered.

But I changed my mind and finally decided to revert back and drop this newly added column:

SQL> alter table t1 drop column c_ddl;
alter table t1 drop column c_ddl
                           *
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

Damned database!!!

What’s going on here? My table is not compressed. Is it?

In fact I didn’t show you how I uncompressed the t1 table:

SQL> alter table t1 nocompress;

Table altered.

Compressing or un-compressing a table without moving it does nothing to its existing data. When the COMPRESSION column of a table shows DISABLED, this doesn’t necessarily mean that the table contains no compressed data. And this is probably why the above bizarre error pops up out of nowhere.
In fact, to carry any real meaning, the COMPRESSION column should always be considered together with the PCT_FREE column (notice in passing that this is why I printed the pct_free value above):

SQL> select
         table_name
        ,compression
        ,pct_free
        ,compress_for
     from 
     user_tables
    where table_name = 'T1';

TABLE_NAME COMPRESS   PCT_FREE COMPRESS_FOR
---------- -------- ---------- --------------
T1         DISABLED          0

When PCT_FREE equals zero this is generally a hint that Oracle is considering BASIC compression, since it assumes the table will no longer be updated. The definition Oracle gives for the COMPRESSION column goes in the same direction as well:

COMPRESSION	 	Indicates whether table compression 
                    is enabled (ENABLED) or not (DISABLED)

This is crystal clear: COMPRESSION refers to the current compression status. It says nothing about the compression history of the table, if any.

Compressing/uncompressing a table without moving it is nothing else than enabling or disabling compression for newly inserted rows only. This is why I think that Oracle could have avoided the confusion by using the same syntax as when dealing with table constraints. Instead of this:

SQL> alter table t1 nocompress;

We would have this :

SQL> alter table t1 nocompress enable novalidate;

And of course the COMPRESSION column should have had a status indicating whether it is validated or not.

In the 10046 trace file of a table compressed with a move of its data and of one compressed without a move, we can see, respectively:

alter table t1 move compress basic

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          0          1           0
Execute      1      1.43       1.44          3       1865       2548     1000000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      1.45       1.46          3       1865       2549     1000000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 106  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD AS SELECT  T1 (cr=1905 pr=1 pw=1382 time=1436345 us
   1000000    1000000    1000000   TABLE ACCESS FULL T1 (cr=1797 pr=0 pw=0 time=104378 

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PGA memory operation                         1140        0.00          0.01
  Disk file operations I/O                        1        0.00          0.00
  db file sequential read                         3        0.00          0.00
  direct path write                               1        0.00          0.00
  reliable message                                4        0.00          0.00
  enq: RO - fast object reuse                     2        0.00          0.00
  enq: CR - block range reuse ckpt                2        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        5.93          5.93
********************************************************************************
alter table t1 nocompress


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          2           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.01          0          0          2           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 106  

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Compression analysis                            8        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        6.65          6.65
********************************************************************************

Spot that while Oracle reworked a million rows in the first case, it didn’t touch any in the second case, proving that nocompress/compress without a move doesn’t concern existing data.

Since it concerns only new data, a newly inserted row should be compressed (or not) according to the new setting. Let’s see:

SQL> create table t1 as select
      rownum n1
     ,mod(rownum,10) n2
     from dual
     connect by level <=1e6;

Table created.

SQL> @sizeBysegNameMB
Enter value for segment_name: t1
Enter value for owner: c##mhouri


SEGMENT_TYPE  TABLESPACE_NAME  SEGMENT_NAME PARTITION_NAME          MB
------------- ---------------- ------------ --------------- ----------
TABLE         USERS            T1                                   15
                                                            ----------
Total Segment Size                                                  15

Let’s compress this table without moving its data and get its new size:

SQL> alter table t1 compress;

Table altered.

SQL> @sizeBysegNameMB
Enter value for segment_name: t1
Enter value for owner: c##mhouri


SEGMENT_TYPE  TABLESPACE_NAME  SEGMENT_NAME PARTITION_NAME          MB
------------- ---------------- ------------ --------------- ----------
TABLE         USERS            T1                                   15
                                                            ----------
Total Segment Size                                                  15

As you can see the size of the table remains unchanged, indicating again that compressing without moving does nothing to the existing data.

Let’s now direct path load a single row, dump the corresponding block and analyse the dump to see if compression has been activated or not:

SQL> select rowid, n1,n2 from t1 where n1=1;

ROWID                      N1         N2
------------------ ---------- ----------
AAATYNAAHAAAdpTAAA          1          1

SQL> insert /*+ append */ into t1 select 1,42 from dual;

1 row created.

SQL> select rowid, n1,n2 from t1 where n1=1;
select rowid, n1,n2 from t1 where n1=1
                         *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


SQL> commit;

Commit complete.

SQL> select rowid, n1,n2 from t1 where n1=1;

ROWID                      N1         N2
------------------ ---------- ----------
AAATYNAAHAAAdpTAAA          1          1
AAATYNAAHAADOgcAAA          1         42

SQL> select
  2    dbms_rowid.rowid_relative_fno('AAATYNAAHAADOgcAAA') file_no
  3  , dbms_rowid.rowid_block_number('AAATYNAAHAADOgcAAA') block_no
  4  from dual;

   FILE_NO   BLOCK_NO
---------- ----------
         7     845852
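As a side note, the numbers returned by dbms_rowid can be checked by hand: an extended ROWID is 18 base-64 characters laid out as data object number (6), relative file number (3), block number (6) and row number (3). The following is a minimal sketch of that decoding (a hypothetical helper of mine, not an official API):

```python
# Hypothetical helper mimicking dbms_rowid: an 18-character extended
# ROWID is base-64 encoded as object# (6), file# (3), block# (6), row# (3).
ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")

def b64(chunk):
    """Decode a chunk of ROWID characters as a base-64 number."""
    value = 0
    for ch in chunk:
        value = value * 64 + ALPHABET.index(ch)
    return value

def decode_rowid(rowid):
    """Split an extended ROWID into its four components."""
    return {"object_no": b64(rowid[0:6]),
            "file_no":   b64(rowid[6:9]),
            "block_no":  b64(rowid[9:15]),
            "row_no":    b64(rowid[15:18])}

r = decode_rowid("AAATYNAAHAADOgcAAA")
print(r["file_no"], r["block_no"])   # 7 845852
```

The file and block numbers match those given by the dbms_rowid calls above.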


SQL> alter session set tracefile_identifier='DumpingDirectLoadRowInsert';

Session altered.

SQL> alter system dump datafile 7 block 845852;

System altered.

Unfortunately I couldn’t spot anything interesting in the dump file indicating that the inserted row has been compressed. I will probably come back to this article in the near future.


December 7, 2017

Optimizer statistics gathering

Filed under: direct path — hourim @ 8:24 pm

As from Oracle 12cR1, the Optimizer can automatically collect statistics on empty tables provided they are, notably, direct path loaded. This is recognisable in execution plans through the new row source operation named OPTIMIZER STATISTICS GATHERING:

-----------------------------------------------------------
| Id  | Operation                        | Name   | Rows  |
-----------------------------------------------------------
|   0 | INSERT STATEMENT                 |        |  1000K|
|   1 |  LOAD AS SELECT                  | T1_TAR |       |
|   2 |   OPTIMIZER STATISTICS GATHERING |        |  1000K|
|   3 |    TABLE ACCESS FULL             | T1     |  1000K|
-----------------------------------------------------------

This is, in principle, a good initiative. But look at what I was asked to troubleshoot a couple of days ago:

Global Information
------------------------------
 Status              :  EXECUTING                 
 Instance ID         :  1                    
 Session             :  xxxx (15:32901) 
 SQL ID              :  am6923m45s203        
 SQL Execution ID    :  16777216             
 Execution Started   :  12/02/2017 08:01:17  
 First Refresh Time  :  12/02/2017 08:01:28  
 Last Refresh Time   :  12/05/2017 11:55:51  
 Duration            :  272276s                  
 Module/Action       :  SQL*Plus/-           
 Service             :  xxxxx            
 Program             :  sqlplus@xxxx

===================================================================================================
| Id |             Operation               |  Name  | Activity | Activity Detail                  |
|    |                                     |        |   (%)    |   (# samples)                    |
===================================================================================================
| -> 0 | INSERT STATEMENT                  |        |   0.00   | Data file init write             |   
| -> 1 |   LOAD AS SELECT                  | T1_TAR |   0.06   | Cpu(28)                          |
|                                          |        |          | direct path write(2)             |
| -> 2 |    OPTIMIZER STATISTICS GATHERING |        |  99.93   | Cpu(52956)                       |
| -> 3 |     TABLE ACCESS FULL             | T1     |          |  SQL*Net more data from dblink(1)|               
==================================================================================================    

You have to believe me when I say that this insert had been running for 3 days when the above SQL monitoring report was taken. As you can see, 99% of the direct path load execution time was spent gathering statistics at the operation in line n°2. This is how a new feature designed to help starts disturbing you.

The insert/select statement selects a CLOB from a remote database. I was curious to see whether getting rid of this CLOB from the insert operation would make things better:

SQL>set timing on
 
    INSERT
   /*+ append */
 INTO T1_TAR
    (col1
   -- ,clob_col2
	,col3
	,col4
	,col5
	,col6
	,col7
	,col8
	)
 SELECT
    t1.col1 
   --,t1.clob_col2
   ,t1.col3
   ,t1.col4
   ,t1.col5
   ,t1.col6
   ,t1.col7
   ,t1.col8
  FROM
    t1@dblink t1;

338481182 rows created.

Elapsed: 00:11:38.85

------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes |IN-OUT|
-----------------------------------------------------------
|   0 | INSERT STATEMENT   |        |  318M|    28GB|      |   
|   1 |  LOAD AS SELECT    | T1_TAR |      |        |      |
|   2 |   REMOTE           | T1     |  318M|    28GB|R->S  |
------------------------------------------------------------

That’s really interesting.

Why would the presence or the absence of a CLOB column in the insert/select statement allow or prevent the CBO from gathering statistics online?

In fact the CLOB column has nothing to do with the online statistics gathering decision. Had I commented out any other column I would have had the same behaviour, as the following proves:

INSERT
   /*+ append */
 INTO T1_TAR
    (col1
    ,clob_col2
	,col3
	,col4
	,col5
	,col6
	--,col7
	,col8
	)
 SELECT
    t1.col1 
   ,t1.clob_col2
   ,t1.col3
   ,t1.col4
   ,t1.col5
   ,t1.col6
   --,t1.col7
   ,t1.col8
  FROM
    t1@dblink t1;
------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes |IN-OUT|
-----------------------------------------------------------
|   0 | INSERT STATEMENT   |        |  318M|    28GB|      |   
|   1 |  LOAD AS SELECT    | T1_TAR |      |        |      |
|   2 |   REMOTE           | T1     |  318M|    28GB|R->S  |
------------------------------------------------------------

The OPTIMIZER STATISTICS GATHERING operation requires the presence of all of the table’s columns in order to kick in during direct path load operations. This is probably because Oracle doesn’t want to collect statistics for some columns and leave others without statistics. That being said, this is another not-really-documented restriction that comes with this new 12c feature.
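The observed rule can be summarised in a trivial sketch (my reading of the behaviour above, not a documented predicate): online statistics gathering kicks in only when the insert lists the complete set of target-table columns.

```python
# My reading of the restriction observed above (not a documented rule):
# online statistics gathering only kicks in when the direct path insert
# lists every column of the target table.
TABLE_COLUMNS = {"col1", "clob_col2", "col3", "col4",
                 "col5", "col6", "col7", "col8"}

def online_stats_gathering_kicks_in(insert_columns):
    return set(insert_columns) == TABLE_COLUMNS

print(online_stats_gathering_kicks_in(TABLE_COLUMNS))             # True
print(online_stats_gathering_kicks_in(TABLE_COLUMNS - {"col7"}))  # False
```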

Since I was not going to ask my client to get rid of one column from the insert/select, I finally opted for locally cancelling the online statistics gathering using the corresponding hint, as shown below:

INSERT
   /*+ append 
       no_gather_optimizer_statistics
   */
 INTO T1_TAR
    (col1
    ,clob_col2
    ,col3
    ,col4
    ,col5
    ,col6
    ,col7
    ,col8
    )
 SELECT
    t1.col1 
   ,t1.clob_col2
   ,t1.col3
   ,t1.col4
   ,t1.col5
   ,t1.col6
   ,t1.col7
   ,t1.col8
  FROM
    t1@dblink t1;

Global Information
------------------------------
 Status              :  DONE                 
 Instance ID         :  1                    
 Session             :  xxxx (132:63271) 
 SQL ID              :  Od221zf0srs6q        
 SQL Execution ID    :  16777216             
 Execution Started   :  12/02/2017 15:10:16  
 First Refresh Time  :  12/02/2017 15:10:22  
 Last Refresh Time   :  12/05/2017 15:32:56  
 Duration            :  1360s                  
 Module/Action       :  SQL*Plus/-           
 Service             :  xxxxx            
 Program             :  sqlplus@xxxx

SQL Plan Monitoring Details (Plan Hash Value=2098243032)
================================================================
| Id |             Operation   |  Name  |   Time    |   Rows   |
|    |                         |        | Active(s) | (Actual) |
================================================================
|  0 | INSERT STATEMENT        |        |      1355 |        2 |
|  1 |   LOAD AS SELECT        | T1_TAR |      1355 |        2 |
|  2 |     REMOTE              | T1     |      1355 |     338M |
================================================================    

That’s how we fixed this issue.

Watch out for direct path loads into empty tables selecting from either local or remote databases; the new 12c online table and column statistics gathering might considerably delay your processing.

November 23, 2017

Compressed tables and DDL optimisation

Filed under: compression — hourim @ 8:15 pm

DDL optimisation is a very useful feature with which adding a column with a default value to a very, very big table is almost instantaneous. And, as icing on the cake, this feature comes free of license.

Free of license have I said?

Yes. Until you hit the following in a 12.2.0.1.0 release:

SQL> alter table t1 add c_ddl number default 42 not null;
alter table t1 add c_ddl number default 42 not null
                   *
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

My real-life very, very big table was BASIC compressed, as is the t1 table:


SQL> select
       table_name
      ,compression
      ,pct_free
      ,compress_for
    from 
      user_tables
    where 
      table_name = 'T1';

TABLE_NAME COMPRESS   PCT_FREE COMPRESS_FOR
---------- -------- ---------- -------------
T1         ENABLED           0 BASIC
 

If you want to DDL-optimise a BASIC compressed table you have to switch to the paid Advanced Compression option, as the following demonstrates:


SQL> alter table t1 move compress for oltp;

Table altered.

SQL> select
       table_name
      ,compression
      ,pct_free
      ,compress_for
    from 
      user_tables
    where 
      table_name = 'T1';

TABLE_NAME COMPRESS   PCT_FREE COMPRESS_FOR
---------- -------- ---------- -------------
T1         ENABLED           0 ADVANCED

SQL> alter table t1 add c_ddl number default 42 not null;

Table altered.

SQL> select count(1) from t1 where c_ddl=42;

  COUNT(1)
----------
   1000000

SQL_ID  59ysfcmfx0s7t, child number 0
-------------------------------------
---------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |
---------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |
|*  2 |   TABLE ACCESS FULL| T1   | 10000 | 80000 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(NVL("C_DDL",42)=42)

 

With a table using advanced compression I succeeded in adding a column with a default value to table t1 via the DDL optimisation feature. The presence of the NVL function in the predicate part of the above execution plan is what allows me to say that the DDL optimisation feature has indeed been used during the alter table command.
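The NVL in the predicate also tells us how DDL-optimised rows are read back: rows created before the ALTER TABLE carry no physical c_ddl value, so the default is substituted at read time. A tiny sketch of that semantic (names here are mine, for illustration only):

```python
# Illustration of the NVL("C_DDL",42) predicate: rows stored before the
# ALTER TABLE carry no physical c_ddl value, so the default is
# substituted at read time.
DEFAULT_C_DDL = 42

def read_c_ddl(stored_value):
    return stored_value if stored_value is not None else DEFAULT_C_DDL

print(read_c_ddl(None))  # 42 : pre-existing row, default substituted
print(read_c_ddl(7))     # 7  : row physically containing a value
```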

This is how Oracle implicitly forces you to buy a new license: if you want to DDL-optimise a compressed table then you have to use the paid OLTP compression option.

However, if you are in the same situation as my customer and you don’t want to pay extra license fees, then proceed as shown below:


SQL> alter table t1 move nocompress parallel 8;

Table altered.

SQL> alter table t1 add c_ddl number default 42 not null;

Table altered.

SQL> alter table t1 move compress basic parallel 8;

Table altered.

SQL> alter index idx_t1_pk rebuild parallel 8;

Index altered.

SQL> alter index idx_t1_pk noparallel;

Index altered.

Summary

In contrast to the free BASIC compression, Advanced compression allows DDL optimisation. Take this into account when deciding which type of compression to choose at design time.

Model

create table t1 as select 
rownum n1
,mod(rownum,10) n2
from dual
connect by level <=1e6;

create unique index idx_t1_pk on t1(n1);

alter table t1 add constraint t1_pk primary key(n1) using index;

October 28, 2017

Cursor selectivity cube -Part II

Filed under: Oracle — hourim @ 1:40 pm

In the same vein as the preceding blog post, in this second and last post of the series I will provide three different scripts for a bind aware cursor that owes its bind sensitivity to a bind variable value having a Hybrid histogram. The first one gives the selectivity cube of a popular Hybrid histogram value. The second script does the same thing for a non-popular Hybrid histogram value having an endpoint number. The third and last script gives the selectivity cube of a non-captured Hybrid histogram value.

1. Cursor Selectivity cube for a popular Hybrid histogram

In order to put all this in action I am going to use the model I have found in this article:

SQL> desc acs_test_tab
 Name                Null?    Type
 ------------------- -------- -------------
 ID                  NOT NULL NUMBER
 RECORD_TYPE                  NUMBER
 DESCRIPTION                  VARCHAR2(50)

SQL> alter session set cursor_sharing=force;

SQL> select
       column_name
      ,num_distinct
      ,num_buckets
      ,sample_size
      ,histogram
    from
       user_tab_col_statistics
    where table_name = 'ACS_TEST_TAB'
    and column_name  = 'RECORD_TYPE';

COLUMN_NAME  NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE HISTOGRAM
------------ ------------ ----------- ----------- ------------
RECORD_TYPE         50080         254        5407 HYBRID

As you can see the RECORD_TYPE column has a HYBRID histogram with the following popular and non-popular values distribution:

select
         endpoint_number
        ,endpoint_actual_value
        ,endpoint_repeat_count
        --,bucket_size
        ,case when Popularity > 0 then 'Pop'
                   else 'Non-Pop'
          end Popularity
    from
   (
     select
         uth.endpoint_number
        ,uth.endpoint_actual_value
        ,uth.endpoint_repeat_count
        ,ucs.sample_size/ucs.num_buckets bucket_size      
        ,(uth.endpoint_repeat_count - ucs.sample_size/ucs.num_buckets) Popularity
    from
        user_tab_histograms uth
       ,user_tab_col_statistics ucs
   where
        uth.table_name   = ucs.table_name
    and uth.column_name   = ucs.column_name
    and uth.table_name    = 'ACS_TEST_TAB'
    and uth.column_name   = 'RECORD_TYPE'
    )
   order by endpoint_number;

ENDPOINT_NUMBER ENDPOINT_A ENDPOINT_REPEAT_COUNT POPULAR
--------------- ---------- --------------------- -------
              1 1                              1 Non-Pop
           2684 2                           2683 Pop     -- we will use this
           2695 569                            1 Non-Pop -- we will use this
           2706 1061                           1 Non-Pop
           2717 1681                           1 Non-Pop
           2727 1927                           1 Non-Pop
../..
           5364 98501                          1 Non-Pop
           5375 98859                          1 Non-Pop
           5386 99187                          1 Non-Pop
           5396 99641                          1 Non-Pop
           5407 99999                          1 Non-Pop

254 rows selected.
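The popularity test applied by the query above can be reproduced outside the database: a hybrid histogram value is flagged popular when its endpoint_repeat_count exceeds the average bucket size, i.e. sample_size/num_buckets.

```python
# Popularity test from the query above: a hybrid histogram value is
# popular when its endpoint_repeat_count exceeds the average bucket
# size, i.e. sample_size / num_buckets.
SAMPLE_SIZE = 5407   # from user_tab_col_statistics
NUM_BUCKETS = 254

def popularity(endpoint_repeat_count):
    bucket_size = SAMPLE_SIZE / NUM_BUCKETS   # about 21.3 rows
    return "Pop" if endpoint_repeat_count > bucket_size else "Non-Pop"

print(popularity(2683))  # Pop     : value 2
print(popularity(1))     # Non-Pop : value 569
```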

There are 254 endpoint actual values, of which only one (value 2) is popular. The following query will be used throughout this article to show how we can compute the cursor selectivity cube for the three types of Hybrid histogram values mentioned in the introduction:

-- as the ambassador of popular values
SELECT MAX(id) FROM acs_test_tab WHERE record_type = 2;

We will start by getting the selectivity cube of the popular value 2 as shown below (script can be found at the end of this blog post):

SQL> @CurSelCubeHybridPop

PL/SQL procedure successfully completed.

Enter value for bind: 2

BIND              LOW       HIGH
---------- ---------- ----------
2             ,446588    ,545829
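Here is the arithmetic I believe the script is doing for a popular value (an assumption consistent with the figures above, not an official formula): the bind selectivity is endpoint_repeat_count/sample_size, and the cube spans plus or minus 10% around it.

```python
# Assumed arithmetic behind the script output above: selectivity of the
# popular value 2 is endpoint_repeat_count / sample_size, and the cube
# spans +/- 10% around that selectivity.
SAMPLE_SIZE = 5407
REPEAT_COUNT = 2683                     # endpoint_repeat_count of value 2

selectivity = REPEAT_COUNT / SAMPLE_SIZE
low, high = 0.9 * selectivity, 1.1 * selectivity
print(round(low, 6), round(high, 6))    # 0.446588 0.545829
```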

Now that we have figured out what will be the value of the cursor selectivity cube of a bind aware cursor using the popular value 2, let’s see whether we have been right or wrong:

SELECT MAX(id) FROM acs_test_tab WHERE record_type = 1;-- only for ACS warmup

SELECT MAX(id) FROM acs_test_tab WHERE record_type = 2;
SELECT MAX(id) FROM acs_test_tab WHERE record_type = 2;                            

SQL_ID  9vu42gjuudpvj, child number 1
-------------------------------------
---------------------------------------------------
| Id  | Operation          | Name         | Rows  |
---------------------------------------------------
|   0 | SELECT STATEMENT   |              |       |
|   1 |  SORT AGGREGATE    |              |     1 |
|*  2 |   TABLE ACCESS FULL| ACS_TEST_TAB | 49621 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("RECORD_TYPE"=:SYS_B_0)

select
    child_number
    ,range_id
   ,low
   ,high
 from
   gv$sql_cs_selectivity
 where
    sql_id ='9vu42gjuudpvj';

CHILD_NUMBER   RANGE_ID LOW        HIGH
------------ ---------- ---------- ----------
           1          0 0.446588   0.545829

Spot how the low-high values of child cursor n°1 match perfectly the low and high values given by the CurSelCubeHybridPop.sql script.

2. Cursor Selectivity cube for a non-popular Hybrid histogram value having an endpoint number

Let’s now consider a non-popular value having an endpoint number in the histogram table and let’s figure out what would be the selectivity cube of its underlying bind aware cursor:

SQL> @CurSelCubeHybridNonPop

PL/SQL procedure successfully completed.

PL/SQL procedure successfully completed.

Enter value for bind: 569

BIND              LOW       HIGH
---------- ---------- ----------
569           ,000166    ,000203
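The same assumed plus-or-minus-10% arithmetic seems to hold for a non-popular value having an endpoint number, this time with a selectivity of endpoint_repeat_count/sample_size = 1/5407 (again an assumption that matches the printed cube, not a documented formula):

```python
# Same assumed +/- 10% arithmetic, this time with the non-popular
# selectivity endpoint_repeat_count / sample_size = 1 / 5407.
SAMPLE_SIZE = 5407
REPEAT_COUNT = 1                       # endpoint_repeat_count of value 569

selectivity = REPEAT_COUNT / SAMPLE_SIZE
low, high = 0.9 * selectivity, 1.1 * selectivity
print(round(low, 6), round(high, 6))   # 0.000166 0.000203
```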

And now we are ready to execute the corresponding query, get its execution plan and present the low and high values of the bind aware cursor when run against this Hybrid non-popular value:

SELECT MAX(id) FROM acs_test_tab WHERE record_type = 569;

SQL_ID  9vu42gjuudpvj, child number 2 -- new execution plan
-------------------------------------
-----------------------------------------------------------------------------------
| Id  | Operation                            | Name                       | Rows  |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                            |       |
|   1 |  SORT AGGREGATE                      |                            |     1 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| ACS_TEST_TAB               |    18 |
|*  3 |    INDEX RANGE SCAN                  | ACS_TEST_TAB_RECORD_TYPE_I |    18 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("RECORD_TYPE"=:SYS_B_0)

select
    child_number
    ,range_id
   ,low
   ,high
 from
   gv$sql_cs_selectivity
 where
    sql_id ='9vu42gjuudpvj';

CHILD_NUMBER   RANGE_ID LOW        HIGH
------------ ---------- ---------- ----------
           2          0 0.000166   0.000203  -- new execution plan
           1          0 0.446588   0.545829

Spot again the precision of the forecast. The low and high values of child cursor n°2 correspond exactly to the selectivity cube of the Hybrid non-popular value anticipated by the CurSelCubeHybridNonPop.sql script.

3. Cursor Selectivity cube for a non-popular Hybrid histogram value without an endpoint number

A Hybrid histogram value without an endpoint number is a value that exists in the column but has not been captured by the histogram gathering process, for reasons I am not going to expose here. We can get all those values via an appropriate query; 41 is one of the non-captured values. Let’s use it in the following demonstration:

First we will get its expected selectivity cube:

SQL> @CurSelCubeHybridNonPopWithoutEndPoint

PL/SQL procedure successfully completed.

PL/SQL procedure successfully completed.

Enter value for bind: 41

        41        LOW       HIGH
---------- ---------- ----------
        41    ,000009    ,000011

This selectivity range is included neither in the selectivity range of child cursor n°1 nor in that of child cursor n°2. This is why, if we use it in the following query, it will certainly force ECS to hard parse a new execution plan, as the following proves:

SQL> SELECT MAX(id) FROM acs_test_tab WHERE record_type = 41;

SQL_ID  9vu42gjuudpvj, child number 3 -- new execution plan
-------------------------------------
-----------------------------------------------------------------------------------
| Id  | Operation                            | Name                       | Rows  |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                            |       |
|   1 |  SORT AGGREGATE                      |                            |     1 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| ACS_TEST_TAB               |     1 |
|*  3 |    INDEX RANGE SCAN                  | ACS_TEST_TAB_RECORD_TYPE_I |     1 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("RECORD_TYPE"=:SYS_B_0)

As again correctly anticipated, a new execution plan (child cursor n°3) has been compiled. But since the new execution plan is identical to an existing one (child cursor n°2), Oracle has merged the selectivities of the two cursors, kept the last compiled one shareable and deleted the old one, as the following proves:

select
    child_number
    ,range_id
   ,low
   ,high
 from
   gv$sql_cs_selectivity
 where
    sql_id ='9vu42gjuudpvj';

CHILD_NUMBER   RANGE_ID LOW        HIGH
------------ ---------- ---------- ----------
           3          0 0.000009   0.000203   -- new plan with merge selectivity
           2          0 0.000166   0.000203
           1          0 0.446588   0.545829

select
   child_number
  ,is_shareable
from
   gv$sql
where
   sql_id = '9vu42gjuudpvj';

CHILD_NUMBER I
------------ -
           0 N
           1 Y
           2 N -- deleted
           3 Y -- selectivity cube merged
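The merge visible above is easy to model: the surviving child cursor gets the union of the two cubes, i.e. the minimum of the lows and the maximum of the highs. A sketch:

```python
def merge_cubes(cube_a, cube_b):
    """Model of the merge shown above: the surviving child cursor gets
    the union of the two cubes (minimum low, maximum high)."""
    (low_a, high_a), (low_b, high_b) = cube_a, cube_b
    return (min(low_a, low_b), max(high_a, high_b))

# child cursor n°3 absorbing the cube of child cursor n°2:
print(merge_cubes((0.000009, 0.000011), (0.000166, 0.000203)))
# (9e-06, 0.000203)
```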

4. Conclusion

The first blog post of this series provided a script with which we can anticipate the selectivity cube range of a bind aware cursor when its bind sensitivity is due to a predicate with a Frequency histogram. In this article we presented three new scripts giving the same anticipation for a bind aware cursor that owes its bind sensitivity respectively to a popular Hybrid histogram value, a non-popular Hybrid histogram value having an endpoint number, and a non-captured Hybrid histogram value. This concludes the series.
CurSelCubeHybridNonPop

CurSelCubeHybridNonPopWithoutEndPoint

CurSelCubeHybridPop

October 25, 2017

Cursor selectivity cube -Part I

Filed under: Oracle — hourim @ 6:00 pm

Bind variable selectivity is the building block on which the Extended Cursor Sharing layer code reasons to compile a new good-enough execution plan or share an existing one. It kicks in only for a bind aware cursor. The underlying child cursor is given a selectivity interval between a low and a high value derived from the bind variable selectivity that initiates it. This is what Oracle refers to as a cursor selectivity cube, shown in the following picture:

The ECS layer code launches the bind-aware cursor matching algorithm at each soft parse of a bind aware cursor. If the new bind variable value selectivity is outside an existing selectivity cube (low-high exposed in gv$sql_cs_selectivity) then a new hard parse is done and a new child cursor with a new selectivity cube is created. If, however, the peeked bind variable selectivity falls into a range of an existing child cursor selectivity cube, ECS will then share the corresponding child cursor’s execution plan. Finally if a new hard parsed execution plan is equivalent to an existing one then both child cursors will be merged. The selectivity cube of the last created child cursor will be adjusted while the previous cursor which served the merge process will be marked un-shareable in order to save space in the memory and reduce the time spent during cursor pruning activity.
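The matching logic described above can be sketched as follows (a deliberate simplification for illustration; the real ECS code obviously does much more):

```python
def ecs_decide(bind_selectivity, cubes):
    """Simplified model of bind-aware cursor matching: share the first
    child cursor whose cube contains the peeked bind selectivity,
    otherwise hard parse a new child cursor."""
    for child_number, (low, high) in cubes.items():
        if low <= bind_selectivity <= high:
            return ("share", child_number)
    return ("hard parse", None)

cubes = {1: (0.446588, 0.545829), 2: (0.000166, 0.000203)}
print(ecs_decide(0.5, cubes))      # ('share', 1)
print(ecs_decide(1e-05, cubes))    # ('hard parse', None)
```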

The rest of this article shows, first, how the selectivity cube (low-high value) is computed for a bind variable value with a Frequency histogram. It then explains how two cursors with the same execution plan but different selectivity cubes are merged to form a single child cursor with an updated low-high range interval.

1. Cursor Selectivity cube

For simplicity’s sake I am going to use my old and helpful model:

create table t_acs(n1 number, n2 number);

begin
  for j in 1..1200150 loop
    if j = 1 then
       insert into t_acs values (j, 1);
    elsif j > 1 and j <= 101 then
       insert into t_acs values (j, 100);
    elsif j > 101 and j <= 1011 then
       insert into t_acs values (j, 1000);
    elsif j > 10001 and j <= 110001 then
       insert into t_acs values (j, 10000);
    else
       insert into t_acs values (j, 1000000);
    end if;
  end loop;
  commit;
end;
/
create index t_acs_idx1 on t_acs(n2); 
begin
  dbms_stats.gather_table_stats(user, 't_acs'
                               ,method_opt       => 'for all columns size skewonly'
                               ,cascade          => true
                               ,estimate_percent => dbms_stats.auto_sample_size);
end;
/

The above data set contains 1,200,150 rows in which the n2 column has 5 distinct, highly skewed values as shown below:

SQL> select n2, count(1) from t_acs group by n2 order by 2 ;

        N2   COUNT(1)
---------- ----------
         1          1
       100        100
      1000        910
     10000     100000
   1000000    1099139

The n2 column has a FREQUENCY histogram.

SQL> select column_name, histogram
     from user_tab_col_statistics
     where table_name ='T_ACS'
     and column_name  ='N2';

 COLUMN_NAM HISTOGRAM
---------- -----------
N2         FREQUENCY

The selectivity of each n2 value can then be derived from the histogram information as follows:

SQL> select 
        endpoint_actual_value bind
       ,round( endpoint_actual_value/1200150,6) bind_sel
       ,endpoint_number
       ,endpoint_number - (lag(endpoint_number,1,0)
                         over(order by endpoint_number)) value_count
     from user_tab_histograms
     where table_name ='T_ACS'
     and column_name  ='N2'
     order by endpoint_number;

BIND         BIND_SEL ENDPOINT_NUMBER VALUE_COUNT
---------- ---------- --------------- -----------
1             ,000001               1           1
100           ,000083             101         100
1000          ,000833            1011         910
10000         ,008332          101011      100000
1000000       ,833229         1200150     1099139

The cursor selectivity cube is computed using the selectivity of the n2 bind variable value plus or minus an offset of 10% of that selectivity, forming the low and high bounds of the cursor selectivity cube (see the above figure):

SQL> select
    bind
   ,round((sel_of_bind - offset),6) low
   ,round((sel_of_bind + offset),6) high
from
   (select
      bind
     ,value_count/1200150 sel_of_bind
	 ,0.1*(value_count/1200150) offset
     from
     (
      select 
        endpoint_actual_value bind
       ,round( endpoint_actual_value/1200150,6) bind_sel
       ,endpoint_number
       ,endpoint_number - (lag(endpoint_number,1,0)
                         over(order by endpoint_number)) value_count
     from user_tab_histograms
     where table_name ='T_ACS'
     and column_name  ='N2'
     order by endpoint_number
     )
    )
    ;

BIND              LOW       HIGH
---------- ---------- ----------
1             ,000001    ,000001
100           ,000075    ,000092
1000          ,000682    ,000834
10000         ,074991    ,091655
1000000       ,824251   1,007418

Let’s put this select into a sql script and name it CurSelCubeFreq.sql.
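For readers who want to check the arithmetic outside the database, here is a Python sketch of what CurSelCubeFreq.sql computes. The value counts are taken from the t_acs data set above; the 10% offset is the one used in the script.

```python
# Selectivity cube for a Frequency histogram: the selectivity of a bind
# value is its row count divided by the total row count, and the cube is
# that selectivity plus/minus a 10% offset.

NUM_ROWS = 1200150
VALUE_COUNTS = {1: 1, 100: 100, 1000: 910, 10000: 100000, 1000000: 1099139}

def selectivity_cube(bind_value):
    sel = VALUE_COUNTS[bind_value] / NUM_ROWS
    offset = 0.1 * sel
    return (round(sel - offset, 6), round(sel + offset, 6))

print(selectivity_cube(10000))     # (0.074991, 0.091655), as in the output above
```

For instance, selectivity_cube(1000000) reproduces the (0.824251, 1.007418) cube that will show up for child cursor n°1 below.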

We will come back to this script later in this article. For the moment, let's put it on hold and move on to the cursor merge section.

2. Cursor merge

The cursor of the query I am going to execute is not yet bind aware as the following proves:

select
    child_number
	,range_id
   ,low
   ,high
 from 
   gv$sql_cs_selectivity
 where
    sql_id ='42wmc4buuh6wb';

no rows selected

But the next execution will mark it bind aware (the cursor_sharing parameter is set to FORCE) and will generate a new (full table scan) execution plan:

SQL> select count(1) from t_acs where N2 = 1e6;

  COUNT(1)
----------
   1099139

SQL> start xpsimp

SQL_ID  42wmc4buuh6wb, child number 1
-------------------------------------
--------------------------------------------
| Id  | Operation          | Name  | Rows  |
--------------------------------------------
|   0 | SELECT STATEMENT   |       |       |
|   1 |  SORT AGGREGATE    |       |     1 |
|*  2 |   TABLE ACCESS FULL| T_ACS |  1099K|
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("N2"=:SYS_B_1)
   
select
    child_number
	,range_id
   ,low
   ,high
 from 
   gv$sql_cs_selectivity
 where
    sql_id ='42wmc4buuh6wb';
	
CHILD_NUMBER   RANGE_ID LOW        HIGH
------------ ---------- ---------- ----------
           1          0 0.824251   1.007418

And, for the sake of clarity, let's run the (slightly updated) CurSelCubeFreq.sql script again:

SQL> @CurSelCubeFreq
Enter value for bind: 1e6

BIND              LOW       HIGH
---------- ---------- ----------
1000000       ,824251   1,007418

Spot the coincidence 🙂

Suppose now that I want to know whether a bind variable value (1) with a much lower selectivity will force a new hard parse or share an existing execution plan. For that, I will first get the selectivity cube of this bind variable value as shown below:

-- The selectivity cube of bind variable 1
SQL> @CurSelCubeFreq
Enter value for bind: 1

BIND              LOW       HIGH
---------- ---------- ----------
1             ,000001    ,000001

As you can see, this bind variable value has a selectivity that falls outside the selectivity cube of the existing child cursor n°1. This is why ECS will very likely trigger a new hard parse, as the following proves:

SQL> select count(1) from t_acs where N2 = 1;

  COUNT(1)
----------
         1

SQL> start xpsimp

SQL_ID  42wmc4buuh6wb, child number 2 -- new execution plan
-------------------------------------
------------------------------------------------
| Id  | Operation         | Name       | Rows  |
------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |
|   1 |  SORT AGGREGATE   |            |     1 |
|*  2 |   INDEX RANGE SCAN| T_ACS_IDX1 |     1 |
------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"=:SYS_B_1)


select
    child_number
	,range_id
   ,low
   ,high
 from 
   gv$sql_cs_selectivity
 where
    sql_id ='42wmc4buuh6wb';

CHILD_NUMBER   RANGE_ID LOW        HIGH
------------ ---------- ---------- ----------
           2          0 0.000001   0.000001  --> new execution plan
           1          0 0.824251   1.007418

Notice that we have got, as expected, a new child cursor n°2 with a new range of selectivity that has produced an index range scan execution plan.

Finally, what if, in the next run, I use a bind variable value (10000) having a different selectivity but producing the same index range scan execution plan?

SQL> @CurSelCubeFreq
Enter value for bind: 10000

BIND              LOW       HIGH
---------- ---------- ----------
10000         ,074991    ,091655

SQL> select count(1) from t_acs where N2 = 10000;

  COUNT(1)
----------
    100000

SQL> start xpsimp

SQL_ID  42wmc4buuh6wb, child number 3
-------------------------------------
------------------------------------------------
| Id  | Operation         | Name       | Rows  |
------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |
|   1 |  SORT AGGREGATE   |            |     1 |
|*  2 |   INDEX RANGE SCAN| T_ACS_IDX1 |   100K|
------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"=:SYS_B_1)

select
    child_number
	,range_id
   ,low
   ,high
 from 
   gv$sql_cs_selectivity
 where
    sql_id ='42wmc4buuh6wb';
	
CHILD_NUMBER   RANGE_ID LOW        HIGH
------------ ---------- ---------- ----------
           3          0 0.000001   0.091655  --> new child cursor
           2          0 0.000001   0.000001
           1          0 0.824251   1.007418

Notice how the low value of child cursor n°3 (0.000001) corresponds to the low value of child cursor n°2 and not to the low selectivity of the bind variable value for which it has been compiled (0.074991). This is because the selectivity cube of child cursor n°2 has been merged into that of child cursor n°3, since their execution plans are identical. While the selectivity cube of child cursor n°3 has been enlarged, child cursor n°2 has been deleted (put in a non-shareable status) as shown below:
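The merged low-high range observed for child cursor n°3 can be reproduced with a one-line rule. This is a sketch of the observed behaviour, not Oracle's documented algorithm: the surviving cube is the union of the two merged cubes.

```python
# Cursor merge sketch: when two child cursors end up with the same
# execution plan, the surviving cursor gets the enlarged, merged cube.

def merge_cubes(cube_a, cube_b):
    return (min(cube_a[0], cube_b[0]), max(cube_a[1], cube_b[1]))

child_2  = (0.000001, 0.000001)     # cube compiled for bind value 1
new_cube = (0.074991, 0.091655)     # cube of bind value 10000
child_3  = merge_cubes(child_2, new_cube)
# child_3 combines the low of child n°2 with the high of bind 10000,
# matching the gv$sql_cs_selectivity output above
print(child_3)
```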

select
   child_number
  ,is_bind_aware
  ,is_shareable
 from 
   gv$sql
 where
   sql_id ='42wmc4buuh6wb'; 

CHILD_NUMBER I I
------------ - -
           0 N N
           1 Y Y
           2 Y N → is not shareable
           3 Y Y → includes the selectivity of child n°2

If we want to know whether the bind variable value 100 will share an existing execution plan or force a new hard parse, we have to check whether its selectivity falls into an existing child cursor selectivity cube:

SQL> @CurSelCubeFreq
Enter value for bind: 100


BIND              LOW       HIGH
---------- ---------- ----------
100           ,000075    ,000092

This is going to share the child cursor n°3 since its selectivity falls into the low-high range of that cursor:

SQL> select count(1) from t_acs where N2 = 100;

  COUNT(1)
----------
       100

SQL> start xpsimp

SQL_ID  42wmc4buuh6wb, child number 3
-------------------------------------
------------------------------------------------
| Id  | Operation         | Name       | Rows  |
------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |
|   1 |  SORT AGGREGATE   |            |     1 |
|*  2 |   INDEX RANGE SCAN| T_ACS_IDX1 |   100K|
------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"=:SYS_B_1)

3. Conclusion

That is how the Extended Cursor Sharing layer code works. A combination of bind variable selectivities, with a possibly extendable cursor selectivity cube, allows Oracle to decide, at each execution of a bind aware cursor, whether to share an existing execution plan, compile a brand new one, or merge two cursors into a single one to the detriment of a deleted one. This last action reduces the memory usage and the number of child cursors examined during the non-innocent child cursor pruning that occurs when a shareable parent cursor with multiple children is soft parsed.

October 20, 2017

DDL(constraint), DML(predicate) and SQL CASE

Filed under: SQL — hourim @ 7:36 pm

Check and integrity constraints are DDL (Data Definition Language) operations. They consider TRUE and NULL alike. Query predicates are DML (Data Manipulation Language) operations. They consider FALSE and NULL alike.

Did you spot the difference?

In this article Jonathan Lewis puts it this way:

  • A DDL predicate allows a row if it does not evaluate to FALSE – which means it is allowed to evaluate to TRUE or NULL.
  • A DML predicate returns a row if it evaluates to TRUE – which means it will not return a row if it evaluates to FALSE or NULL.
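The two rules above can be condensed into a tiny three-valued-logic sketch, where Python's None plays the role of SQL's NULL. This is a didactic model of the logic, not SQL itself:

```python
# Three-valued logic sketch: None stands for SQL NULL.

def in_triplet(v1):
    """Mimics "v1 in ('a','b','c')": unknown (None) when v1 is NULL."""
    return None if v1 is None else v1 in ('a', 'b', 'c')

def ddl_allows(pred):
    """A check constraint rejects a row only when the predicate is FALSE."""
    return pred is not False

def dml_returns(pred):
    """A query predicate returns a row only when the predicate is TRUE."""
    return pred is True

# a NULL value is accepted by the constraint but filtered out by a query
print(ddl_allows(in_triplet(None)), dml_returns(in_triplet(None)))
```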

In this article I am going to show that, using a SQL CASE expression, we can make DDL and DML predicates behave identically.

The standard boolean logic difference between DDL and DML predicates having been already explained (see this and this excellent book) I am not going to repeat that demonstration here. But, for simplicity’s sake, I am going to reuse a model from one of those articles as shown below:

create table t1 (v1 varchar2(1));
alter table t1 add constraint t1_ck_v1 check (v1 in ('a','b','c'));
insert into t1 values ('a');
insert into t1 values ('b');
insert into t1 values ('c');

The t1_ck_v1 DDL check constraint is supposed to prevent insertions into the t1 table of any value that doesn't match 'a', 'b' or 'c'. But since a DDL check or integrity constraint considers TRUE and NULL alike, we are allowed to insert a null value into t1 as the following demonstrates:

SQL> insert into t1 values (null);
1 row created.

It suffices to change the t1_ck_v1 check constraint definition and nulls will definitively be banned from entering the t1 table, as shown below:

truncate table t1;
alter table t1 drop constraint t1_ck_v1;

alter table t1 add constraint t1_ck_v1 check
    (case when v1 in ('a','b','c') then 1 else 0 end = 1);

insert into t1 values ('a');
insert into t1 values ('b');
insert into t1 values ('c');

SQL> insert into t1 values (null);
insert into t1 values (null)
*
ERROR at line 1:
ORA-02290: check constraint (C##MHOURI.T1_CK_V1) violated

In contrast to the first check constraint, the new refactored one has successfully prevented a null value from being inserted into the t1 table (we will see shortly why).

The above demonstrates that, using a CASE expression, we made a DDL predicate consider NULL and FALSE alike, while DDL predicates are originally implemented to treat TRUE and NULL alike. Let's now check how the CASE expression works within a DML predicate:

SQL> select count(1) from t1;

  COUNT(1)
----------
         3

SQL> select count(1) from t1
     where
        ( case when v1 in ('a','b','c') then 1 else 0 end = 1);

  COUNT(1)
----------
         3

Indeed, the CASE expression works the same way in both DML and DDL predicates.

And now the legitimate question: why does the CASE expression make the check constraint, which is a DDL predicate, consider FALSE and NULL alike?
The answer can be found in the Oracle official documentation:

“In a searched CASE expression, Oracle searches from left to right until it finds an occurrence of condition that is true, and then returns return_expr. If no condition is found to be true, and an ELSE clause exists, Oracle returns else_expr. Otherwise, Oracle returns null”

Thanks to the ELSE clause of the CASE-WHEN combination we transformed a NULL possibility into a NOT NULL value (0 in the current situation). This is why we should always foresee an ELSE clause in a CASE expression. Had we omitted the ELSE clause in the last t1_ck_v1 constraint, we would have allowed null values to be inserted into the t1 table, as shown below:

truncate table t1;
alter table t1 drop constraint t1_ck_v1;

alter table t1 add constraint t1_ck_v1 check
    (case when v1 in ('a','b','c') then 1 end = 1);

SQL> insert into t1 values(null);
1 row created.

Furthermore, omitting the ELSE in the CASE also allows not null values outside the triplet ('a','b','c') to be inserted into the t1 table, as shown below:

SQL> insert into t1 values ('z');
1 row created.

SUMMARY
Protect yourself against the ever-threatening NULL by declaring your columns NOT NULL whenever possible. Bear in mind that a CASE expression, when used with its ELSE part, can not only make a DDL predicate treat FALSE and NULL alike, but also makes DDL and DML predicates behave identically.

October 13, 2017

Execution plans : blocking and non-blocking operations

Filed under: explain plan — hourim @ 6:52 pm

Below are two 12cR2 row-source execution plans of the same query:

SQL_ID  9yqrqpgux93zk, child number 0
--------------------------------------------------------------------
| Id  | Operation                | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |      1 |        |      5 |
|*  1 |  VIEW                    |      |      1 |   1000 |      5 |
|*  2 |   WINDOW SORT PUSHED RANK|      |      1 |   1000 |      5 |
|   3 |    TABLE ACCESS FULL     | T1   |      1 |   1000 |   1000 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN"<=5)
   2 - filter(RANK() OVER ( ORDER BY "N1")<=5)
SQL_ID  9yqrqpgux93zk, child number 0
--------------------------------------------------------------------------
| Id  | Operation              | Name         | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |              |      1 |        |      5 |
|*  1 |  VIEW                  |              |      1 |   1000 |      5 |
|*  2 |   WINDOW NOSORT STOPKEY|              |      1 |   1000 |      5 |
|   3 |    INDEX FULL SCAN     | IDX_T1_N1_UK |      1 |   1000 |      6 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN"<=5)
   2 - filter(RANK() OVER ( ORDER BY "N1")<=5)
 

Forget the switch from a FULL TABLE SCAN to an INDEX FULL SCAN at line n°3 and the switch from a SORT to a NOSORT operation at line n°2. What other remarkable difference can we still spot there? Have you noticed that operation n°3 produced 1000 rows in the first execution plan and only 6 rows in the second one?

Here’s the model with which I obtained the above execution plans respectively:

create table t1 
as select 
    rownum n1
   ,trunc((rownum-1)/3) n2
from dual 
connect by level <=1e3;

create unique index idx_t1_n1_uk on t1(n1);

exec dbms_stats.gather_table_stats(user, 't1');

select /*+ gather_plan_statistics */ n1
from 
   (select 
       n1
	  ,rank() over (order by n1) rn
	from t1)
where
   rn <=5;

select * from table(dbms_xplan.display_cursor(null,null, 'allstats last'));
alter table t1 modify (n1 not null);

-- execute the same query

select * from table(dbms_xplan.display_cursor(null,null, 'allstats last'));

What happened here for an operation without a filter predicate to produce 6 rows when it is supposed to generate the entire set of rows in the table, i.e. 1,000 rows?

The explanation comes from the way Oracle handles blocking and non-blocking operations. A blocking operation requires receiving all rows from its child operation before a single row can be processed. A non-blocking operation consumes and produces rows at the same time.
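The two behaviours can be mimicked with Python iterators. This is a sketch under the assumption that rows stream one at a time between plan operations; the function names are mine.

```python
# Blocking vs non-blocking row sources modelled as Python iterators.

def table_scan(rows, a_rows):
    """Child operation; a_rows[0] counts rows actually produced (A-Rows)."""
    for r in rows:
        a_rows[0] += 1
        yield r

def window_sort(child):
    """Blocking: must drain its child entirely before emitting one row."""
    yield from sorted(child)

def stopkey(child, k):
    """Non-blocking consumer: stops pulling rows after the k-th one."""
    for n, r in enumerate(child, start=1):
        yield r
        if n == k:
            break

blocking = [0]
list(stopkey(window_sort(table_scan(range(1000), blocking)), 5))
print(blocking[0])      # 1000: the sort forced a full pass over the child

streaming = [0]
list(stopkey(table_scan(range(1000), streaming), 5))
print(streaming[0])     # 5: the scan stopped as soon as the stopkey was satisfied
```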

Applied to the current situation, we see that the two previous execution plans have both a blocking operation at line n°2 which is the VIEW operation.

Right?

In fact, as magnificently explained by Jonathan Lewis in this post, the blocking nature of the VIEW operation is a function of its first child operation. If that one is a blocking operation then the VIEW is a blocking operation as well. If, however, the child operation is non-blocking then its VIEW parent operation is a non-blocking operation that can consume and produce rows at the same time.

The first child of the two VIEW operations in the above execution plans is respectively:

|*  2 |   WINDOW SORT PUSHED RANK|

|*  2 |   WINDOW NOSORT STOPKEY| 

The WINDOW SORT PUSHED RANK child operation of the VIEW is a sort operation. Sorts, as you know, are blocking operations. This is why the VIEW in the first execution plan is a blocking operation. Oracle has to retrieve all rows from table t1 (A-Rows = 1000) at operation n°3 and sort them completely at operation n°2 before applying filter predicate n°2, reducing the number of returned rows to only 5.

In the second execution plan the WINDOW NOSORT STOPKEY child operation of the VIEW is a non-blocking operation thanks to the ordered list of keys it receives from its INDEX FULL SCAN child operation. The VIEW asks for the first row from its first child operation which, in its turn, asks for the same row from its own child operation, the INDEX FULL SCAN at line 3. The index hands the first ordered key it finds to its parent operation, the WINDOW NOSORT STOPKEY, which feeds its own parent operation, the VIEW, provided the index key survives filter predicate n°2. The VIEW then asks for the second row, the third row, and so on until the WINDOW NOSORT STOPKEY declares the end of the query. Indeed, the WINDOW NOSORT STOPKEY knows the upper bound it cannot go beyond. This is why, when it receives the 6th row from the index full scan operation (A-Rows = 6), it realises that this row spans its upper bound and that it is now time to signal to the coordinator (the SELECT statement) the end of the query.
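The reason the index full scan shows A-Rows = 6 for 5 returned rows can be sketched as follows, assuming the keys arrive ordered and distinct (as they do from a unique index, so that rank() increases by 1 per row):

```python
# WINDOW NOSORT STOPKEY sketch: with rank() <= 5 the operation cannot
# know the result is complete until it peeks at one more ordered key,
# hence one extra row pulled from the index (A-Rows = 6 for 5 rows out).

def rank_stopkey(ordered_keys, limit, pulled):
    rank = 0
    for key in ordered_keys:
        pulled[0] += 1          # rows pulled from the child operation
        rank += 1               # distinct ordered keys: rank == row number
        if rank > limit:        # this key spans the upper bound: stop
            return
        yield key

pulled = [0]
rows = list(rank_stopkey(iter(range(1, 1001)), 5, pulled))
print(rows, pulled[0])          # 5 rows returned, 6 rows pulled
```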

Here's another example demonstrating that even with FULL segment scans Oracle is able to stop generating rows at the appropriate moment. All it requires is for the parent operation to be a non-blocking operation able to consume and produce rows at the same time:

-- n1  is declared nullable

select /*+ gather_plan_statistics */ n1
from
   t1
fetch next 5 rows only;

------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |
------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      5 |
|*  1 |  VIEW                  |      |      1 |      5 |      5 |
|*  2 |   WINDOW NOSORT STOPKEY|      |      1 |   1000 |      5 |
|   3 |    TABLE ACCESS FULL   | T1   |      1 |   1000 |      6 |
------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY  NULL )<=5)
-- n1 is declared not null

select /*+ gather_plan_statistics */ n1
from
   t1
fetch next 5 rows only;

--------------------------------------------------------------------------
| Id  | Operation              | Name         | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |              |      1 |        |      5 |
|*  1 |  VIEW                  |              |      1 |      5 |      5 |
|*  2 |   WINDOW NOSORT STOPKEY|              |      1 |   1000 |      5 |
|   3 |    INDEX FAST FULL SCAN| IDX_T1_N1_UK |      1 |   1000 |      6 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY  NULL )<=5)

Summary

Sometimes we need to read and interpret execution plans not only in their order of operation using the “first child first, recursive descent” rule, but also in their order of data flow propagation (or row-source generation). In the latter case, knowing how to identify blocking and non-blocking operations is very important.

September 23, 2017

SQL Plan Baseline, SQL Profile and materialized CTE

Filed under: Materialized view,Sql Plan Managment — hourim @ 5:32 pm

This is a reformulated simple answer to one of my colleague's questions: does a materialized CTE affect the functioning of a SQL Profile, of a SQL Plan Baseline, or of both?

A very short answer is: it will affect neither the first nor the second.

A little bit more elaborated answer is:

  • it will not affect the use of the SQL Profile because the transient name of the materialized SYS TEMP table is not part of the force matching signature of either the SQL query or the SQL Profile
  • the changing generated name of the materialized SYS TEMP table has no effect on the PHV2 of the generated CBO plan
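A toy way to picture the second point (this is in no way Oracle's real PHV2 algorithm, just an illustration of the idea): if the plan shape is hashed after normalizing the transient SYS_TEMP name, two plans differing only by that name hash identically.

```python
import hashlib
import re

# Toy model only: hash the plan lines after replacing the transient
# SYS_TEMP_... name with a fixed token.

def toy_plan_hash(plan_lines):
    normalized = [re.sub(r'SYS_TEMP_\w+', 'SYS_TEMP', line) for line in plan_lines]
    return hashlib.md5('\n'.join(normalized).encode()).hexdigest()

plan_a = ['LOAD AS SELECT SYS_TEMP_0FD9D6667_32BBFD', 'TABLE ACCESS FULL T_ST']
plan_b = ['LOAD AS SELECT SYS_TEMP_0FD9D6668_32BBFD', 'TABLE ACCESS FULL T_ST']
print(toy_plan_hash(plan_a) == toy_plan_hash(plan_b))   # True
```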

An answer with demonstration is:

-- table
create table t_st 
as select 
  rownum n1, mod(rownum,5) n2, trunc((rownum-1)/4) n3, dbms_random.value(20,30) n4
from dual 
connect by level <=1e4;

-- second table probed by the CTE below; its exact definition was not
-- shown, any t2_st with a numeric n1 and a d1 column will do, e.g.:
create table t2_st
as select rownum n1, sysdate + rownum d1
from dual
connect by level <= 100;

-- query
with got_my_t2 as
 (select /*+ materialize */ n1, d1 from t2_st where n1 <= 11)
 select
      max(n4)
 from t_st a
 where exists
    (select null
     from got_my_t2 b
     where a.n1 = b.n1);

-- execution plan
select * from table(dbms_xplan.display_cursor);

SQL_ID  53fhbt0gjrwb3, child number 0
-------------------------------------
Plan hash value: 1035791491
--------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Rows  |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |       |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |       |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6667_32BBFD |       |
|*  3 |    TABLE ACCESS FULL                     | T2_ST                     |    11 |
|   4 |   SORT AGGREGATE                         |                           |     1 |
|*  5 |    HASH JOIN RIGHT SEMI                  |                           |    11 |
|   6 |     VIEW                                 |                           |    11 |
|   7 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9D6667_32BBFD |    11 |
|   8 |     TABLE ACCESS FULL                    | T_ST                      | 10000 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N1"<=11)
   5 – access("A"."N1"="B"."N1")

To the above setup I am going to add a SQL profile and a SQL plan baseline as shown below:

-- first a SQL Profile using the Oracle script
SQL>@coe_xfr_sql_profile

Parameter 1:
SQL_ID (required)
Enter value for 1: 53fhbt0gjrwb3

PLAN_HASH_VALUE AVG_ET_SECS
--------------- -----------
     1035791491        ,014

Parameter 2:
PLAN_HASH_VALUE (required)
Enter value for 2: 1035791491

SQL>@coe_xfr_sql_profile_53fhbt0gjrwb3_1035791491.sql

... manual custom SQL Profile has been created
COE_XFR_SQL_PROFILE_53fhbt0gjrwb3_1035791491 completed

-- second a SQL Plan Baseline
declare
   spm_op pls_integer;
begin
   spm_op := dbms_spm.load_plans_from_cursor_cache (sql_id => '53fhbt0gjrwb3');
end;
/

As such, re-executing the initial query will show the above double execution plan protection in action:

with got_my_t2 as
 (select /*+ materialize */ n1, d1 from t2_st where n1 <= 11)
 select
      max(n4)
 from t_st a
 where exists
    (select null
     from got_my_t2 b
     where a.n1 = b.n1);

Plan hash value: 1035791491
--------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Rows  |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |       |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |       |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6668_32BBFD |       |
|*  3 |    TABLE ACCESS FULL                     | T2_ST                     |    11 |
|   4 |   SORT AGGREGATE                         |                           |     1 |
|*  5 |    HASH JOIN RIGHT SEMI                  |                           |    11 |
|   6 |     VIEW                                 |                           |    11 |
|   7 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9D6668_32BBFD |    11 |
|   8 |     TABLE ACCESS FULL                    | T_ST                      | 10000 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N1"<=11)
   5 - access("A"."N1"="B"."N1")

Note
-----
   - SQL profile coe_53fhbt0gjrwb3_1035791491 used for this statement
   - SQL plan baseline SQL_PLAN_a9tbt2yhxacpz46c3c689 used for this statement

Now that I have built the setup I would like to demonstrate that if the name of the TEMP table transformation

SYS_TEMP_0FD9D6668_32BBFD

changes, it will prevent neither the SQL Profile nor the SPM baseline from constraining the initial query to stick to the fixed execution plan. Before going further, let's get the PHV2 of the above execution plan:

SELECT
               p.sql_id
              ,p.plan_hash_value
              ,p.child_number
              ,t.phv2
        FROM   v$sql_plan p
              ,xmltable('for $i in /other_xml/info
                        where $i/@type eq "plan_hash_2"
                        return $i'
                        passing xmltype(p.other_xml)
                        columns phv2 number path '/') t
          WHERE p.sql_id = '&1'
          and   p.other_xml is not null;

SQL>@phv2
Enter value for 1: 53fhbt0gjrwb3

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER       PHV2
------------- --------------- ------------ ----------
53fhbt0gjrwb3      1035791491            0 1187235465

Let's now force Oracle to use a new name for the materialized CTE:

SQL>alter system flush shared_pool;
System altered.

with got_my_t2 as
 (select /*+ materialize */ n1, d1 from t2_st where n1 <= 11)
 select
      max(n4)
 from t_st a
 where exists
    (select null
     from got_my_t2 b
     where a.n1 = b.n1);

Plan hash value: 1035791491
--------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Rows  |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |       |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |       |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6669_32BBFD |       |
|*  3 |    TABLE ACCESS FULL                     | T2_ST                     |    11 |
|   4 |   SORT AGGREGATE                         |                           |     1 |
|*  5 |    HASH JOIN RIGHT SEMI                  |                           |    11 |
|   6 |     VIEW                                 |                           |    11 |
|   7 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9D6669_32BBFD |    11 |
|   8 |     TABLE ACCESS FULL                    | T_ST                      | 10000 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N1"<=11)
   5 - access("A"."N1"="B"."N1")

Note
-----
   - SQL profile coe_53fhbt0gjrwb3_1035791491 used for this statement
   - SQL plan baseline SQL_PLAN_a9tbt2yhxacpz46c3c689 used for this statement

As you can see, Oracle has derived a new CTE TEMP table name:

SYS_TEMP_0FD9D6669_32BBFD

which, as expected, has prevented neither the SPM baseline nor the SQL Profile from being used. Furthermore, despite the TEMP table name change, the PHV2 of the generated plan is still identical to the first one obtained above:

SQL>@phv2
Enter value for 1: 53fhbt0gjrwb3

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER       PHV2
------------- --------------- ------------ ----------
53fhbt0gjrwb3      1035791491            0 1187235465

Bottom line: don't worry about the materialized Common Table Expressions used in critical queries that you want to stabilize via a SQL Profile or a SQL Plan Baseline. Their changing names will not affect these two execution plan stabilizing techniques.

August 12, 2017

Unnesting of coalesced subqueries

Filed under: Oracle — hourim @ 3:49 pm

The 12cR1 way Oracle coalesces two correlated disjunctive subqueries prevented a real-life query from unnesting its two subqueries, causing a serious performance issue. This article examines a representative model of this real-life query. The first paragraph explains the any-to-exists subquery conversion. The second paragraph shows how two correlated EXISTS subqueries can be coalesced to form a single subquery. Once these two transformations have been introduced, the third paragraph examines how the previously coalesced subqueries can be unnested. It particularly shows that Oracle's ability to unnest two coalesced correlated subqueries depends strongly on the coalescing method, as this method differs from version to version.

1. Any-to-exists subquery conversion

Before coalescing two (or more) subqueries Oracle starts first by transforming ANY and ALL subqueries into EXISTS and NOT EXISTS respectively. Consider, for example, the following query:

select
    a.id1
   ,a.n1
   ,a.start_date
from 
    t1 a
where 
   (a.id1 in
         (select b.id
          from t2 b
          where b.status = 'COM')
   or a.id1 in
         (select c.id1
          from t2 c
          where c.status = 'ERR')
     );

If you examine the corresponding 10053 CBO trace file you will find that the two disjunctive subqueries have first been converted from ANY to EXISTS subqueries, as shown below:

Query After Exists Conversion:******* UNPARSED QUERY IS *******

select
  a.id1 id1,
  a.n1 n1,
  a.start_date start_date
from
  t1 a
where
  exists
  ( select b.id id
    from t2 b
    where b.status='COM'
    and   a.id1 =b.id)
or exists
  ( select c.id1 id1
    from t2 c
    where c.status='ERR'
    and a.id1 =c.id1);
 

This is generally the prelude to, or a preliminary part of, the subquery coalescing transformation presented in the next section.
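The semantic equivalence behind this conversion can be checked outside Oracle. Here is a minimal sketch using Python's sqlite3 with toy stand-ins for t1 and t2 (the data is assumed for illustration only, not the author's actual tables): the IN (ANY) form and the correlated EXISTS form return exactly the same rows.

```python
import sqlite3

# Toy stand-ins for t1 and t2 (assumed data, for illustration only).
con = sqlite3.connect(":memory:")
con.executescript("""
create table t1 (id1 int, n1 int, start_date text);
create table t2 (id int, id1 int, status text);
insert into t1 values (1, 10, '2017-01-01'), (2, 20, '2017-01-02'), (3, 30, '2017-01-03');
insert into t2 values (1, 9, 'COM'), (9, 2, 'ERR'), (9, 9, 'XXX');
""")

# ANY/IN form, as originally written
in_form = con.execute("""
    select a.id1 from t1 a
    where a.id1 in (select b.id  from t2 b where b.status = 'COM')
       or a.id1 in (select c.id1 from t2 c where c.status = 'ERR')
    order by a.id1""").fetchall()

# EXISTS form, as produced by the any-to-exists conversion
exists_form = con.execute("""
    select a.id1 from t1 a
    where exists (select 1 from t2 b where b.status = 'COM' and a.id1 = b.id)
       or exists (select 1 from t2 c where c.status = 'ERR' and a.id1 = c.id1)
    order by a.id1""").fetchall()

print(in_form == exists_form, in_form)  # True [(1,), (2,)]
```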

2. Subquery Coalescing

The preceding any-to-exists transformed query is nothing else than an instance of the following simplified one:

select 
  {list of columns}
from
  table
where 
  predicate

The predicate in this particular case consists of two disjunctive correlated subqueries of the same type:

-- first subquery
exists
  ( select b.id id
    from t2 b
    where b.status = 'COM'
    and   a.id1    = b.id)

-- second subquery
 exists
  ( select c.id1 id1
    from t2 c
    where c.status ='ERR'
    and a.id1 = c.id1)

If these two subqueries are not coalesced they will probably generate two accesses to the same table t2. This is why Oracle can, under certain conditions, coalesce two conjunctive or disjunctive subqueries into a single one. Applied to the current query, the coalescing process yields the following transformed query:

Query After Ex/Any Disj SQ coalescing:******* UNPARSED QUERY IS *******

select
  a.id1 id1,
  a.n1 n1,
  a.start_date start_date
from
  t1 a
where
  exists
  (
    select
      0 id
    from
      t2 b
    where
      b.status  = 'COM'
    and a.id1   = b.id
    or b.status ='ERR'
    and a.id1   = b.id1
  );

The execution plan of the doubly transformed query is shown below:

alter session set statistics_level=all;
select * from table(dbms_xplan.display_cursor(null,null, 'allstats last'));

--------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |   9890 |
|*  1 |  FILTER            |      |      1 |        |   9890 |
|   2 |   TABLE ACCESS FULL| T1   |      1 |  10000 |  10000 |
|*  3 |   TABLE ACCESS FULL| T2   |  10000 |      1 |   9890 |
--------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)
   3 - filter((("B"."ID"=:B1 AND "B"."STATUS"='COM') OR ("B"."ID1"=:B2 AND
              "B"."STATUS"='ERR')))

If you still have any doubts about whether the two subqueries have been coalesced, the single appearance of table T2 in the plan, together with predicate n°3, should clear them up for good. Indeed, it is the subquery coalescing transformation that allows Oracle to visit table T2 only once.

In addition, the outline data used to generate the above execution plan, reproduced below, confirms the transformation:

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      DB_VERSION('12.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$ACD206C8")
      COALESCE_SQ(@"SEL$3")
      OUTLINE_LEAF(@"SEL$1")
      OUTLINE(@"SEL$7C83FF7B")
      COALESCE_SQ(@"SEL$2")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$1" "A"@"SEL$1")
      PQ_FILTER(@"SEL$1" SERIAL)
      FULL(@"SEL$ACD206C8" "B"@"SEL$2")
      END_OUTLINE_DATA
  */

The COALESCE_SQ(@"SEL$3") and COALESCE_SQ(@"SEL$2") hints are the ones used to force the subquery coalescing transformation. If you want to cancel this transformation, the NO_COALESCE_SQ(qb_name) hint is the one to use. You can also change the hidden parameter _optimizer_coalesce_subqueries governing this transformation, which defaults to true in 12cR1.
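To convince yourself that the coalesced predicate is semantically equivalent to the original pair of EXISTS subqueries, here is a small sketch using Python's sqlite3 over toy stand-ins for t1 and t2 (assumed data, for illustration only):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table t1 (id1 int);
create table t2 (id int, id1 int, status text);
insert into t1 values (1), (2), (3);
insert into t2 values (1, 9, 'COM'), (9, 2, 'ERR'), (9, 9, 'XXX');
""")

# The pre-coalescing shape: two correlated EXISTS subqueries ORed together
two_subqueries = con.execute("""
    select a.id1 from t1 a
    where exists (select 1 from t2 b where b.status = 'COM' and a.id1 = b.id)
       or exists (select 1 from t2 c where c.status = 'ERR' and a.id1 = c.id1)
    order by a.id1""").fetchall()

# The coalesced shape: a single subquery with the disjunction moved inside,
# so t2 appears (and is scanned) only once
coalesced = con.execute("""
    select a.id1 from t1 a
    where exists (select 0 from t2 b
                  where (b.status = 'COM' and a.id1 = b.id)
                     or (b.status = 'ERR' and a.id1 = b.id1))
    order by a.id1""").fetchall()

print(two_subqueries == coalesced)  # True
```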

3. Subquery unnesting
        3.1. The 12cR1 effect

The any-to-exists conversion followed by subquery coalescing should normally have paved the way for the subquery unnesting transformation. Unfortunately, in the real-life query that the current model faithfully represents, the coalesced subqueries were not unnested, dramatically degrading the performance of the original query, as shown in its corresponding SQL monitor report:

-- Get the report_id corresponding to the triplet (sql_id,sql_exec_id, sql_exec_start)
 select 
      report_id
    , key1 sql_id
    , key2 sql_exec_id
    , key3 sql_exec_start
  from
     dba_hist_reports
  where
     component_name ='sqlmonitor'
  and key1 = '97r22vwap7x9t';

--  Use this report_id to get the historical SQL monitor report
select
     dbms_auto_report.report_repository_detail
     (rid   => 8934
     ,type => 'text')
  from dual;

Global Information
---------------------  
 Status                :  DONE (ALL ROWS)       
 Instance ID           :  1                     
 Session               :  skskslsls (1343:42100) 
 SQL ID                :  97r22vwap7x9t         
 SQL Execution ID      :  16777216              
 Execution Started     :  06/27/2017 11:11:35   
 First Refresh Time    :  06/27/2017 11:11:41 
 Last Refresh Time     :  06/27/2017 12:02:57 
 Duration              :  3082s                    
 Module/Action         :  xxxxxxxxxxxx          
 Service               :  xxxxxxxxx          
 Program               :  xxxxxxxxxxx                  
 Fetch Calls           :  13             

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    3083 |    3079 |     3.67 |    13 |  294M  |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=4071256796)
========================================================================
| Id |               Operation |   Name   |  Rows   | Execs |   Rows   |
|    |                         |          | (Estim) |       | (Actual) |
========================================================================
|  0 | SELECT STATEMENT        |          |         |     1 |     1010 |
|  1 |   FILTER                |          |         |     1 |     1010 |
|  2 |    TABLE ACCESS FULL    | TABLE1   |   89457 |     1 |    89457 |
|  3 |    TABLE ACCESS FULL    | TABLE2   |       1 | 89457 |     1010 |
========================================================================

The FILTER at operation n°1 acts exactly as a NESTED LOOPS join would: for each row from TABLE1 (A-Rows = 89,457), it probes TABLE2 (Starts = 89,457) for rows that satisfy filter predicate n°3.

This is exactly what happens when a correlated subquery is not unnested: it will be evaluated as many times as there are rows in the outer query block.
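The cost difference between the two shapes can be sketched in a few lines of Python (hypothetical row sets standing in for TABLE1 and TABLE2; the numbers are illustrative, not the real data): the FILTER shape rescans the inner table once per outer row, while the unnested shape builds the distinct key set once and makes a single pass, as a hash semi-join would.

```python
# Hypothetical row sets standing in for TABLE1 and TABLE2.
table1 = [(i, i) for i in range(1, 1001)]                         # (id1, n1)
table2 = [(i, 'COM' if i % 2 else 'ERR') for i in range(1, 101)]  # (id, status)

probes = 0

def subquery_matches(id1):
    """Correlated subquery: re-scanned for every outer row (the FILTER shape)."""
    global probes
    probes += 1
    return any(rid == id1 and status in ('COM', 'ERR') for rid, status in table2)

filtered = [row for row in table1 if subquery_matches(row[0])]
print(probes)  # 1000 -- one scan of table2 per outer row

# Unnested shape: build the distinct key set once, then a single pass over table1.
keys = {rid for rid, status in table2 if status in ('COM', 'ERR')}
unnested = [row for row in table1 if row[0] in keys]
print(filtered == unnested)  # True -- same rows, only one scan of table2
```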

       3.2. The 11gR2 effect

The on-call DBA told me that this query was performing very well before the upgrade. My immediate reflex was then to run the same query under the previous OFE (optimizer_features_enable) setting, 11.2.0.4. Needless to say, it completed very quickly using the following execution plan:

Plan hash value: 3372386476
-------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Time     |
-------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |       |          |
|*  1 |  HASH JOIN            |          |  2202 | 00:00:01 |
|   2 |   VIEW                | VW_NSO_1 |  2202 | 00:00:01 |
|   3 |    HASH UNIQUE        |          |  2202 | 00:00:01 |
|   4 |     UNION-ALL         |          |       |          |
|*  5 |      TABLE ACCESS FULL| TABLE2   |  1101 | 00:00:01 |
|*  6 |      TABLE ACCESS FULL| TABLE2   |  9890 | 00:00:01 |
|   7 |   TABLE ACCESS FULL   | TABLE1   | 89457 | 00:00:01 |
-------------------------------------------------------------

The presence of the name VW_NSO_1 at operation n°2 in the previous plan is a sign that Oracle has unnested the two correlated subqueries in order to eliminate the prolific FILTER effect of multi-starting the TABLE2 full scan operation.

However, the presence of TABLE2 twice in the execution plan suggests that Oracle has not used the 12cR1 technique to coalesce the two correlated subqueries. Indeed, in contrast to 12cR1, in 11gR2 Oracle has merged the two subqueries into a single one, but without eliminating the two selects against TABLE2. In the corresponding 10053 trace file we can read the following lines:

Final query after transformations:******* UNPARSED QUERY IS *******

select
  a.id1 id1,
  a.n1 n1,
  a.start_date start_date
from
  (  (select c.id1 id1 from t2 c where c.status='ERR')
    union
     (select b.id id from t2 b where b.status='COM')
  ) vw_nso_1,
  t1 a
where
  a.id1= vw_nso_1.id1;

-- 11.2.0.4
*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block SEL$1 (#1) that are valid to unnest.
Subquery removal for query block SEL$3 (#2)
RSW: Not valid for subquery removal SEL$3 (#2)
Subquery unchanged.
Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#3)
SU:   Checking validity of unnesting subquery SEL$3 (#2)
../..
SU:   Checking validity of unnesting subquery SET$E74BECDC (#6)
SU:   Passed validity checks.
SU:   Transform an ANY subquery to semi-join or distinct.

With the OFE set to 11.2.0.4, Oracle has transformed the two disjunctive subqueries into a single subquery in which the disjunction has been replaced by a UNION operation (rather than UNION ALL, because a distinct operation has to be applied to the combined rows). The result of this transformation has been turned into an inline view before being unnested into the outer query block:

Registered qb: SET$7FD77EFD 0x1c710c88 (SUBQ INTO VIEW FOR COMPLEX UNNEST SET$E74BECDC)

Thanks to the unnesting of the inline view (VW_NSO_1), a more performant and suitable hash join (at least for the real-life query) has been used instead of the multi-executed FILTER operation.
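The reason the transformation uses UNION rather than UNION ALL can be checked with another sqlite3 sketch (toy data, assumed for illustration): when the same id1 satisfies both branches, UNION deduplicates and preserves the IN semantics, whereas UNION ALL would return the outer row twice after the join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table t1 (id1 int);
create table t2 (id int, id1 int, status text);
insert into t1 values (5);
-- id1 = 5 satisfies both branches: b.id = 5 / 'COM' and c.id1 = 5 / 'ERR'
insert into t2 values (5, 0, 'COM'), (0, 5, 'ERR');
""")

original = con.execute("""
    select a.id1 from t1 a
    where a.id1 in (select b.id  from t2 b where b.status = 'COM')
       or a.id1 in (select c.id1 from t2 c where c.status = 'ERR')""").fetchall()

# The 11gR2 shape: distinct union of the two branches, joined to t1
unnested_union = con.execute("""
    select a.id1
    from ( select c.id1 from t2 c where c.status = 'ERR'
           union
           select b.id  from t2 b where b.status = 'COM' ) vw_nso_1, t1 a
    where a.id1 = vw_nso_1.id1""").fetchall()

# What UNION ALL would do instead: a duplicated outer row
unnested_union_all = con.execute("""
    select a.id1
    from ( select c.id1 from t2 c where c.status = 'ERR'
           union all
           select b.id  from t2 b where b.status = 'COM' ) v, t1 a
    where a.id1 = v.id1""").fetchall()

print(original, unnested_union, unnested_union_all)
# [(5,)] [(5,)] [(5,), (5,)]
```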

4. Why the unnesting has not been possible in 12cR1?

As always the 10053 trace file reproduced below (reduced to the bare minimum) gives an answer to the posed question:

-- 12.1.0.1
*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block SEL$1 (#1) that are valid to unnest.
Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU: Checking validity of unnesting subquery SEL$ACD206C8 (#2)
SU: SU bypassed: Invalid correlated predicates.
SU: Validity checks failed.
*******************************

It seems that Oracle has been unable to unnest the two subqueries it has coalesced because they use two different correlated predicates, as shown below:

where
  exists
  (
    select
      0 id
    from
      t2 b
    where
      b.status  = 'COM'
    and a.id1   = b.id  --1st correlated predicate
    or b.status ='ERR'
    and a.id1   = b.id1 -- 2nd correlated predicate
  );

Had those two correlated predicates been identical, unnesting the coalesced disjunctive correlated subqueries would have been possible, as demonstrated below:

explain plan for
select
    a.id1
   ,a.n1
   ,a.start_date
from 
    t1 a
where 
   (a.id1 in
         (select b.id1 -- spot the change here
          from t2 b
          where b.status = 'COM')
   or a.id1 in
         (select c.id1
          from t2 c
          where c.status = 'ERR')
        );

select * from table(dbms_xplan.display(format =>'+outline'));

------------------------------------------------------
| Id  | Operation                    | Name  | Rows  |
------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    10 |
|   1 |  NESTED LOOPS                |       |    10 |
|   2 |   NESTED LOOPS               |       |    10 |
|   3 |    SORT UNIQUE               |       |  9890 |
|*  4 |     TABLE ACCESS FULL        | T2    |  9890 |
|*  5 |    INDEX UNIQUE SCAN         | T1_PK |     1 |
|   6 |   TABLE ACCESS BY INDEX ROWID| T1    |     1 |
------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      SEMI_TO_INNER(@"SEL$82F4A621" "B"@"SEL$2")
      NLJ_BATCHING(@"SEL$82F4A621" "A"@"SEL$1")
      USE_NL(@"SEL$82F4A621" "A"@"SEL$1")
      LEADING(@"SEL$82F4A621" "B"@"SEL$2" "A"@"SEL$1")
      INDEX(@"SEL$82F4A621" "A"@"SEL$1" ("T1"."ID1"))
      FULL(@"SEL$82F4A621" "B"@"SEL$2")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$3")
      COALESCE_SQ(@"SEL$2")
      OUTLINE(@"SEL$7C83FF7B")
      COALESCE_SQ(@"SEL$3")
      OUTLINE(@"SEL$ACD206C8")
      OUTLINE(@"SEL$1")
      UNNEST(@"SEL$ACD206C8")
      OUTLINE_LEAF(@"SEL$82F4A621")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("B"."STATUS"='COM' OR "B"."STATUS"='ERR')
   5 - access("A"."ID1"="B"."ID1")

Note
-----
   - this is an adaptive plan

5. Summary

The Oracle way of coalescing subqueries seems to be going back and forth between versions. Two disjunctive correlated subqueries are coalesced differently in 11.2.0.4 and 12.1.0.1. While in the former they are classically transformed into a single subquery combining two branches of a union against the same table, in the latter release they are coalesced into a single subquery accessing that table only once. The 12cR1 and 12cR2 way of coalescing two correlated subqueries introduces a new requirement for their unnesting: they need to have the same correlated predicates. You should therefore watch your 12c queries having two (or more) correlated subqueries carefully. If they use different (invalid :-)) predicates they might still be coalesced, but they will not be unnested. If you judge that unnesting your subqueries is better than using their coalesced version as the predicate of a FILTER operation, then you have at your disposal the NO_COALESCE_SQ(qb_name) hint or the _optimizer_coalesce_subqueries hidden parameter to cancel the subquery coalescing transformation.

6. Bonus (more…)

August 5, 2017

Parallel index rebuild: Oracle versus MS-SQL Server

Filed under: Oracle — hourim @ 4:57 pm

A couple of months ago I was prompted by this MS-SQL Server article about rebuilding indexes in parallel. The bottom line of that article is that the MS-SQL Server engine will not use a parallel plan when rebuilding a unique index whose leading column has a single distinct value. This post will first demonstrate this claim using MS-SQL Server 2016 and then extend the same demonstration to Oracle in its latest version.

1. SQL Server

The following creates a table and a unique three-column composite clustered index:

select @@version
Microsoft SQL Server 2016 (RTM-GDR) (KB3210111) - 13.0.1728.2 (X64)  
Dec 13 2016 04:40:28   Copyright (c) Microsoft Corporation 
Developer Edition (64-bit) on Windows 8.1 6.3 <X64> (Build 9600: ) 

-- create table
create table t (c1 int, c2 int, c3 int);

set nocount on
begin tran
declare @i int = 0
while @i < 10000000
begin
    insert into t values (@i, @i, @i)
    set @i = @i + 1
end
commit tran

-- create a unique clustered index
create unique clustered index ix on t(c1,c2,c3)

Now that we are done with the model, we are going to rebuild the above clustered index with a parallel plan:

-- This will use parallel plan
ALTER INDEX ix on [dbo].t rebuild WITH (maxdop=10, online=on)

Before rebuilding the index I activated the "Include Actual Execution Plan" icon of SSMS (SQL Server Management Studio), so that I got the following "row source" plan, where the two yellow arrows indicate that the rebuild has indeed been parallelized:

However, had the first column c1 of the index been single-valued, the index would not have been rebuilt in parallel by the SQL Server engine, as the following demonstrates:

-- set the leading column of the clustered index to a single value
 update t set c1 = 1

-- update statistics with 100% sample
update statistics t with fullscan

-- This will use a serial plan
ALTER INDEX ix on [dbo].t rebuild WITH (maxdop=10, online=on)

The execution plan of the second index rebuild shows that this time MS-SQL Server refuses to obey the parallel hint and goes serial, as shown below:

The MS-SQL documentation stipulates that when the leading key of the index of a non-partitioned table has a single distinct value, it is useless to give each parallel thread a range of that single value to process. In other words, the leading column should have multiple distinct values for parallelism to be worthwhile.
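That reasoning can be sketched with a toy range-partitioner in Python (a hypothetical stand-in, not SQL Server's actual algorithm): splitting the leading-key domain into per-thread ranges distributes the work evenly only when the key has many distinct values; with a single distinct value every row lands in the same range.

```python
def range_distribution(keys, threads):
    """Toy ranger: split [min, max] of the leading key into equal-width
    per-thread ranges and count the rows landing in each range."""
    lo, hi = min(keys), max(keys)
    width = max((hi - lo) / threads, 1e-9)  # avoid zero width for a constant key
    buckets = [0] * threads
    for k in keys:
        buckets[min(int((k - lo) / width), threads - 1)] += 1
    return buckets

# Many distinct values: each thread gets a comparable share of the rows
print(range_distribution(list(range(1_000)), 4))   # [250, 250, 250, 250]

# Single distinct value: one thread gets everything, parallelism is pointless
print(range_distribution([1] * 1_000, 4))          # [1000, 0, 0, 0]
```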

Now that we have set the scene for MS-SQL Server, let's reproduce it in Oracle.

2. Oracle

SQL> select banner from v$version where rownum=1;

BANNER
-----------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> create table t (c1 int, c2 int, c3 int);

SQL> begin
       for j in 1..10000000
     loop
        insert /*+ append */ into t values (j, j, j);
     end loop;
     commit;
     end;
     /

SQL> create unique index ix on t(c1,c2,c3);

SQL> exec dbms_stats.gather_table_stats(user, 't');

SQL> select 
         column_name
         ,num_distinct
         ,histogram
    from 
        all_tab_col_statistics
    where table_name = 'T'
    order by 2;

COLUMN_NAM NUM_DISTINCT HISTOGRAM
---------- ------------ ---------------
C1              9914368 NONE
C3              9914368 NONE
C2              9914368 NONE

It’s now time to use parallelism to rebuild the above index

SQL> alter index ix rebuild parallel 4;

select
    dfo_number, tq_id, server_type, instance, process, num_rows
from
   v$pq_tqstat
order by
    dfo_number, tq_id, server_type, instance, process;

DFO_NUMBER      TQ_ID SERVER_TYPE    INSTANCE PROCESS    NUM_ROWS
---------- ---------- ------------ ---------- -------- ----------
         1          0 Consumer              1 P000        2923571
         1          0 Consumer              1 P001        2623371
         1          0 Consumer              1 P002        2270919
         1          0 Consumer              1 P003        2182139
         1          0 Producer              1 P004        2638901
         1          0 Producer              1 P005        2637316
         1          0 Producer              1 P006        2611722
         1          0 Producer              1 P007        2114251
         1          0 Ranger                1 QC               12
         1          1 Consumer              1 QC                4
         1          1 Producer              1 P000              1
         1          1 Producer              1 P001              1
         1          1 Producer              1 P002              1
         1          1 Producer              1 P003              1


Parallel Execution Details (DOP=4 , Servers Allocated=8)
===================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    |
|                |       |         | Time(s) | Time(s) | Waits(s) |
===================================================================
| PX Coordinator | QC    |         |    0.10 |    0.02 |     0.01 |
| p000           | Set 1 |       1 |    5.53 |    4.12 |     0.02 |
| p001           | Set 1 |       2 |    5.48 |    3.60 |     0.01 |
| p002           | Set 1 |       3 |    4.87 |    3.20 |     0.03 |
| p003           | Set 1 |       4 |    4.97 |    3.03 |     0.01 |
| p004           | Set 2 |       1 |    1.05 |    1.05 |          |
| p005           | Set 2 |       2 |    1.01 |    0.83 |          |
| p006           | Set 2 |       3 |    1.04 |    1.03 |          |
| p007           | Set 2 |       4 |    0.99 |    0.84 |          |
===================================================================

SQL Plan Monitoring Details (Plan Hash Value=3277881472)
===========================================================================
| Id |         Operation          |   Name   |  Rows   | Cost |   Time    |
|    |                            |          | (Estim) |      | Active(s) |
===========================================================================
|  0 | CREATE INDEX STATEMENT     |          |         |      |         1 |
|  1 |   PX COORDINATOR           |          |         |      |         1 |
|  2 |    PX SEND QC (ORDER)      | :TQ10001 |      82 |      |         1 |
|  3 |     INDEX BUILD UNIQUE     | IX       |         |      |         4 |
|  4 |      SORT CREATE INDEX     |          |      82 |      |         6 |
|  5 |       PX RECEIVE           |          |      82 |    2 |         4 |
|  6 |        PX SEND RANGE       | :TQ10000 |      82 |    2 |         2 |
|  7 |         PX BLOCK ITERATOR  |          |      82 |    2 |         1 |
|  8 |          TABLE ACCESS FULL | T        |      82 |    2 |         1 |
===========================================================================

As expected, Oracle, like MS-SQL Server in this configuration, has obeyed the instruction to rebuild the index in parallel.

Let's now collapse the leading column c1 of the index to a single distinct value and then try rebuilding the index in parallel:

 SQL> update t set c1 = 1;

SQL> exec dbms_stats.gather_table_stats(user, 't');

SQL> select 
         column_name
         ,num_distinct
         ,histogram
      from 
        all_tab_col_statistics
      where table_name = 'T'
      order by 2;

COLUMN_NAM NUM_DISTINCT HISTOGRAM
---------- ------------ ----------
C1                    1 NONE
C3              9914368 NONE
C2              9914368 NONE

SQL> alter index ix rebuild parallel 4;

select
    dfo_number, tq_id, server_type, instance, process, num_rows
from
   v$pq_tqstat
order by
    dfo_number, tq_id, server_type, instance, process;
DFO_NUMBER      TQ_ID SERVER_TYPE     INSTANCE PROCESS     NUM_ROWS
---------- ---------- ------------- ---------- --------- ----------
         1          0 Consumer               1 P000         2923571
         1          0 Consumer               1 P001         2623371
         1          0 Consumer               1 P002         2270919
         1          0 Consumer               1 P003         2182139
         1          0 Producer               1 P004         2694213
         1          0 Producer               1 P005         2542784
         1          0 Producer               1 P006         2814202
         1          0 Producer               1 P007         1950991
         1          0 Ranger                 1 QC                12
         1          1 Consumer               1 QC                 4
         1          1 Producer               1 P000               1
         1          1 Producer               1 P001               1
         1          1 Producer               1 P002               1
         1          1 Producer               1 P003               1

Parallel Execution Details (DOP=4 , Servers Allocated=8)
===================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    |
|                |       |         | Time(s) | Time(s) | Waits(s) |
===================================================================
| PX Coordinator | QC    |         |    0.78 |    0.02 |     0.01 |
| p000           | Set 1 |       1 |    5.16 |    3.95 |     0.01 |
| p001           | Set 1 |       2 |    5.10 |    3.56 |     0.01 |
| p002           | Set 1 |       3 |    4.09 |    3.15 |     0.01 |
| p003           | Set 1 |       4 |    4.70 |    3.23 |     0.00 |
| p004           | Set 2 |       1 |    1.38 |    1.08 |     0.02 |
| p005           | Set 2 |       2 |    1.26 |    1.23 |     0.03 |
| p006           | Set 2 |       3 |    1.32 |    1.12 |     0.03 |
| p007           | Set 2 |       4 |    1.31 |    0.99 |     0.03 |
===================================================================

SQL Plan Monitoring Details (Plan Hash Value=3277881472)
===========================================================================
| Id |         Operation          |   Name   |  Rows   | Cost |   Time    |
|    |                            |          | (Estim) |      | Active(s) |
===========================================================================
|  0 | CREATE INDEX STATEMENT     |          |         |      |         2 |
|  1 |   PX COORDINATOR           |          |         |      |         7 |
|  2 |    PX SEND QC (ORDER)      | :TQ10001 |     10M |      |         1 |
|  3 |     INDEX BUILD UNIQUE     | IX       |         |      |         4 |
|  4 |      SORT CREATE INDEX     |          |     10M |      |         5 |
|  5 |       PX RECEIVE           |          |     10M | 2397 |         3 |
|  6 |        PX SEND RANGE       | :TQ10000 |     10M | 2397 |         2 |
|  7 |         PX BLOCK ITERATOR  |          |     10M | 2397 |         1 |
|  8 |          TABLE ACCESS FULL | T        |     10M | 2397 |         1 |
===========================================================================		

In contrast to MS-SQL Server, which refuses to use parallelism when rebuilding a unique clustered index whose leading column has a single distinct value, Oracle has successfully rebuilt the same kind of unique index in parallel, in approximately the same execution time as before.

3. SUMMARY

We’ve considered two instances of the same unique composite index:

  1. A unique composite index starting with a column having a single distinct value
  2. A unique composite index starting with a more selective column

It is interesting to see that Oracle, in contrast to MS-SQL Server, is capable of using parallelism when rebuilding both types of index. Moreover, the rebuild execution times of the two types of index are nearly identical.
