Mohamed Houri’s Oracle Notes

December 7, 2017

Optimizer statistics gathering

Filed under: direct path — hourim @ 8:24 pm

As from Oralce 12cR1, the Optimizer can automatically collect statistics on empty tables provided they are, notably, direct path loaded. This is recognisable in execution plans through the new row source operation named OPTIMIZER STATISTICS GATHERING

-----------------------------------------------------------
| Id  | Operation                        | Name   | Rows  |
-----------------------------------------------------------
|   0 | INSERT STATEMENT                 |        |  1000K|
|   1 |  LOAD AS SELECT                  | T1_TAR |       |
|   2 |   OPTIMIZER STATISTICS GATHERING |        |  1000K|
|   3 |    TABLE ACCESS FULL             | T1     |  1000K|
-----------------------------------------------------------

This is, in principle, a good initiative. But look at what I have been asked to trouble shoot a couple of days ago:

Global Information
------------------------------
 Status              :  EXECUTING                 
 Instance ID         :  1                    
 Session             :  xxxx (15:32901) 
 SQL ID              :  am6923m45s203        
 SQL Execution ID    :  16777216             
 Execution Started   :  12/02/2017 08:01:17  
 First Refresh Time  :  12/02/2017 08:01:28  
 Last Refresh Time   :  12/05/2017 11:55:51  
 Duration            :  272276s                  
 Module/Action       :  SQL*Plus/-           
 Service             :  xxxxx            
 Program             :  sqlplus@xxxx

===================================================================================================
| Id |             Operation               |  Name  | Activity | Activity Detail                  |
|    |                                     |        |   (%)    |   (# samples)                    |
===================================================================================================
| -> 0 | INSERT STATEMENT                  |        |   0.00   | Data file init write             |   
| -> 1 |   LOAD AS SELECT                  | T1_TAR |   0.06   | Cpu(28)                          |
|                                          |        |          | direct path write(2)             |
| -> 2 |    OPTIMIZER STATISTICS GATHERING |        |  99.93   | Cpu(52956)                       |
| -> 3 |     TABLE ACCESS FULL             | T1     |          |  SQL*Net more data from dblink(1)|               
==================================================================================================    

You have to believe me when I say that this insert had been running for 3 days when the above SQL report was taken. As you can see 99% of the direct path load execution time was spent gathering statistics at operation in line n°2. This is how a new feature designed to help starts disturbing you.

The insert/select statement selects a CLOB from a remote database. I was curious to see whether getting rid of this CLOB from the insert operation would make things better:

SQL>set timing on
 
    INSERT
   /*+ append */
 INTO T1_TAR
    (col1
   -- ,clob_col2
	,col3
	,col4
	,col5
	,col6
	,col7
	,col8
	)
 SELECT
    t1.col1 
   --,t1.clob_col2
   ,t1.col3
   ,t1.col4
   ,t1.col5
   ,t1.col6
   ,t1.col7
   ,t1.col8
  FROM
    t1@dblink t1;

338481182 rows created.

Elapsed: 00:11:38.85

------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes |IN-OUT|
-----------------------------------------------------------
|   0 | INSERT STATEMENT   |        |  318M|    28GB|      |   
|   1 |  LOAD AS SELECT    | T1_TAR |      |        |      |
|   2 |   REMOTE           | T1     |  318M|    28GB|R->S  |
------------------------------------------------------------

That’s really interesting.

Why would the presence or the absence of a CLOB column in the insert/select statement allow or prevent the CBO from gathering statistics online?

In fact the CLOB column has nothing to do with the online statistics gathering decision. Should I have commented any other column I would have had the same behaviour as the following prooves:

INSERT
   /*+ append */
 INTO T1_TAR
    (col1
    ,clob_col2
	,col3
	,col4
	,col5
	,col6
	--,col7
	,col8
	)
 SELECT
    t1.col1 
   ,t1.clob_col2
   ,t1.col3
   ,t1.col4
   ,t1.col5
   ,t1.col6
   --,t1.col7
   ,t1.col8
  FROM
    t1@dblink t1;
------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes |IN-OUT|
-----------------------------------------------------------
|   0 | INSERT STATEMENT   |        |  318M|    28GB|      |   
|   1 |  LOAD AS SELECT    | T1_TAR |      |        |      |
|   2 |   REMOTE           | T1     |  318M|    28GB|R->S  |
------------------------------------------------------------

The OPTIMIZER STATISTICS GATHERING operation requires the presence of the totality of the table columns in order to kick in during direct path load operations. This is probably because Oracle doesn’t want to collect statistics for a bunch of columns and keep others without statistics. This being said, this is another non-really documented restriction that comes with this new 12c feature.

Since I was not going to challenge my client getting rid of one column from the insert/select I finally opted for locally cancelling the online statistics gathering using the corresponding hint as shown below:

INSERT
   /*+ append 
       no_gather_optimizer_statistics
   */
 INTO T1_TAR
    (col1
    ,clob_col2
    ,col3
    ,col4
    ,col5
    ,col6
    ,col7
    ,col8
    )
 SELECT
    t1.col1 
   ,t1.clob_col2
   ,t1.col3
   ,t1.col4
   ,t1.col5
   ,t1.col6
   ,t1.col7
   ,t1.col8
  FROM
    t1@dblink t1;

Global Information
------------------------------
 Status              :  DONE                 
 Instance ID         :  1                    
 Session             :  xxxx (132:63271) 
 SQL ID              :  Od221zf0srs6q        
 SQL Execution ID    :  16777216             
 Execution Started   :  12/02/2017 15:10:16  
 First Refresh Time  :  12/02/2017 15:10:22  
 Last Refresh Time   :  12/05/2017 15:32:56  
 Duration            :  1360s                  
 Module/Action       :  SQL*Plus/-           
 Service             :  xxxxx            
 Program             :  sqlplus@xxxx

SQL Plan Monitoring Details (Plan Hash Value=2098243032)
================================================================
| Id |             Operation   |  Name  |   Time    |   Rows   |
|    |                         |        | Active(s) | (Actual) |
================================================================
|  0 | INSERT STATEMENT        |        |      1355 |        2 |
|  1 |   LOAD AS SELECT        | T1_TAR |      1355 |        2 |
|  2 |     REMOTE              | T1     |      1355 |     338M |
================================================================    

That’s how we fixed this issue.

Watch out your direct path load over empty tables selecting from either local or distant databases; the new 12c online table and columns statics gathering might considerably delay your treatement.

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Create a free website or blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud's blog

Just another blog : Databases, Linux and other stuffs

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)