Mohamed Houri’s Oracle Notes

April 18, 2015

Parallel refreshing a materialized view

Filed under: Materialized view — hourim @ 4:58 pm

I have been asked to troubleshoot a monthly on-demand materialized view refresh job which had got into the bad habit of crashing with ORA-01555 after 25,833 seconds (more than 7 hours) of execution. Despite my several years of professional experience, this was the first time I had been asked to look at a materialized view refresh. The issue came up on a Friday afternoon, so I had a weekend to familiarize myself with materialized views. Coincidentally, a couple of days earlier there had been an Oracle webcast on materialized view basics, architecture and internal workings, which I replayed on Saturday, practicing its demo along the way. Christian Antognini's book contains a chapter on this topic which I also went through, since Christian's book is where I always like to start when trying to learn an Oracle concept.

Materialized view capabilities

The following Monday morning, armed with this weekend of accelerated self-training, I re-opened the e-mail I had been sent about the failing refresh job and started re-reading it. The first thing that caught my attention this time, in contrast to my quick pass through it the previous Friday, was a suggestion made by the DBA to try fast refreshing the materialized view instead of completely refreshing it. I had learnt from the Oracle webcast that Oracle is able to tell us whether a materialized view can be fast (also known as incremental) refreshed or not. Here are the steps to follow if you want to get this information:

You first need to create the mv_capabilities_table table (in the schema from which you are going to run the dbms_mview package) using the following script:

SQL> @$ORACLE_HOME/rdbms/admin/utlxmv.sql

SQL> select * from mv_capabilities_table;
no rows selected

Once this table is created you can execute the dbms_mview.explain_mview procedure as shown below:

SQL> exec dbms_mview.explain_mview ('my_materialied_mv');

PL/SQL procedure successfully completed.

SQL> select
  2     mvname
  3    ,capability_name
  4    ,possible
  5  from
  6    mv_capabilities_table
  7  where
  8     mvname = 'MY_MATERIALIED_MV'
  9  and
 10    capability_name  like '%REFRESH%';

MVNAME                         CAPABILITY_NAME                P
------------------------------ ------------------------------ -
MY_MATERIALIED_MV              REFRESH_COMPLETE               Y  
MY_MATERIALIED_MV              REFRESH_FAST                   N --> spot this
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_INSERT      N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_ONETAB_DML  N
MY_MATERIALIED_MV              REFRESH_FAST_AFTER_ANY_DML     N
MY_MATERIALIED_MV              REFRESH_FAST_PCT               N

As spotted above, fast refreshing this materialized view is impossible.

The first lesson learned: instead of creating a materialized view log and trying to fast refresh a complex materialized view which might be impossible to refresh incrementally, first get the capabilities of the view using the explain_mview procedure. You will certainly save time and resources.
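As a side note, the same mv_capabilities_table can also tell you why a capability is impossible: the MSGTXT and RELATED_TEXT columns (created by the utlxmv.sql script) record the reason and the offending object. A sketch of the kind of query I would run to see those reasons:

```sql
SQL> select
       capability_name
      ,msgtxt          -- why the capability is not possible
      ,related_text    -- the object responsible for it
     from
       mv_capabilities_table
     where
        mvname   = 'MY_MATERIALIED_MV'
     and possible = 'N'
     and msgtxt is not null;
```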

SQL> SELECT
         refresh_method
       , refresh_mode
       , staleness
       , last_refresh_type
       , last_refresh_date
    FROM
          user_mviews
    WHERE mview_name = 'MY_MATERIALIED_MV';

REFRESH_ REFRES STALENESS           LAST_REF LAST_REFRES
-------- ------ ------------------- -------- --------------------
COMPLETE DEMAND NEEDS_COMPILE       COMPLETE 02-APR-2015 16:16:35

Parallel clause in the SQL create statement: any effect on the mview creation?

Since I had ruled out an incremental refresh, I decided to get the materialized view definition so that I could investigate its content:

SQL> SELECT
       replace (dbms_metadata.get_ddl(replace(
                                      OBJECT_TYPE, ' ', '_'),    
                                      OBJECT_NAME,OWNER)
                    ,'q#"#'
                    ,'q#''#'
                    )
     FROM DBA_OBJECTS
     WHERE OBJECT_TYPE = 'MATERIALIZED VIEW'
     AND object_name = 'MY_MATERIALIED_MV';

------------------------------------------------------------------
CREATE MATERIALIZED VIEW MY_MATERIALIED_MV
   ({list of columns}) 
  TABLESPACE xxxx
  PARALLEL 16 -----------------------------------> spot this
  BUILD IMMEDIATE
  USING INDEX
  REFRESH COMPLETE ON DEMAND
  USING DEFAULT LOCAL ROLLBACK SEGMENT
  USING ENFORCED CONSTRAINTS DISABLE QUERY REWRITE
AS
-- select n°1
 SELECT
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
 GROUP BY
  {list of columns}
.../...
UNION ALL
-- select n°5
SELECT
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
GROUP BY
  {list of columns} ;

Have you noticed the parallel 16 clause in the materialized view create script? The developer's intention was to create the materialized view using parallel processing. Having a Production-equivalent database at hand, I was happy to try re-creating this materialized view:

SQL> set timing on

SQL> start ddl_mv1.sql

Materialized view created.

Elapsed: 00:22:33.52

Global Information
------------------------------
 Status              :  DONE               
 Instance ID         :  1                  
 Session             :  XZYY (901:25027)  
 SQL ID              :  f9s6kdyysz84m      
 SQL Execution ID    :  16777216           
 Execution Started   :  04/16/2015 09:49:22
 First Refresh Time  :  04/16/2015 09:49:23
 Last Refresh Time   :  04/16/2015 10:11:48
 Duration            :  1346s              
 Module/Action       :  SQL*Plus/-         
 Service             :  XZYY
 Program             :  sqlplus.exe         

Global Stats
========================================================================
| Elapsed |   Cpu   |    IO    | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Gets  | Reqs | Bytes | Reqs  | Bytes |
========================================================================
|   20338 |    5462 |    14205 |    63M |   3M | 716GB |    2M | 279GB |
========================================================================

Parallel Execution Details (DOP=16 , Servers Allocated=32)

SQL Plan Monitoring Details (Plan Hash Value=853136481)
==================================================================================================
| Id  |                       Operation            | Name    |  Rows   | Execs |   Rows   |Temp  |
|     |                                            |         | (Estim) |       | (Actual) |(Max) |
==================================================================================================
|   0 | CREATE TABLE STATEMENT                     |         |         |    33 |       16 |      |
|   1 |   PX COORDINATOR                           |         |         |    33 |       16 |      |
|   2 |    PX SEND QC (RANDOM)                     | :TQ10036|         |    16 |       16 |      |
|   3 |     LOAD AS SELECT                         |         |         |    16 |       16 |      |
|   4 |      UNION-ALL                             |         |         |    16 |     117M |      |
|   5 |       HASH GROUP BY                        |         |    259M |    16 |      58M |  36G |
|   6 |        PX RECEIVE                          |         |    259M |    16 |     264M |      |
|   7 |         PX SEND HASH                       | :TQ10031|    259M |    16 |     264M |      |
|   8 |          HASH JOIN RIGHT OUTER BUFFERED    |         |    259M |    16 |     264M |  61G |
|   9 |           PX RECEIVE                       |         |      4M |    16 |       4M |      |
|  10 |            PX SEND HASH                    | :TQ10013|      4M |    16 |       4M |      |
|  11 |             PX BLOCK ITERATOR              |         |      4M |    16 |       4M |      |
|     |                                            |         |         |       |          |      |
| 180 |                PX RECEIVE                  |         |     19M |    16 |      20M |      |
| 181 |                 PX SEND HASH               | :TQ10012|     19M |    16 |      20M |      |
| 182 |                  PX BLOCK ITERATOR         |         |     19M |    16 |      20M |      |
| 183 |                   TABLE ACCESS FULL        | TABLE_M |     19M |   268 |      20M |      |
==================================================================================================

Surprisingly, the materialized view was created in less than 23 minutes. The creation was parallelized with a DOP of 16, as shown by the corresponding Real Time SQL Monitoring (RTSM) report. The underlying master table was, as a consequence, created with a DOP of 16, as shown below:

SQL> select
  2    table_name
  3   ,degree
  4  from
  5    user_tables
  6  where table_name = 'MY_MATERIALIED_MV';

TABLE_NAME                     DEGREE
------------------------------ ----------
MY_MATERIALIED_MV               16

A simple select against the created materialized view will go parallel as well:

SQL> select count(1) from MY_MATERIALIED_MV;               

SQL Plan Monitoring Details (Plan Hash Value=3672954679)
============================================================================================
| Id |          Operation          |           Name           |  Rows   | Execs |   Rows   |
|    |                             |                          | (Estim) |       | (Actual) |
============================================================================================
|  0 | SELECT STATEMENT            |                          |         |     1 |        1 |
|  1 |   SORT AGGREGATE            |                          |       1 |     1 |        1 |
|  2 |    PX COORDINATOR           |                          |         |    17 |       16 |
|  3 |     PX SEND QC (RANDOM)     | :TQ10000                 |       1 |    16 |       16 |
|  4 |      SORT AGGREGATE         |                          |       1 |    16 |       16 |
|  5 |       PX BLOCK ITERATOR     |                          |    104M |    16 |     117M |
|  6 |        MAT_VIEW ACCESS FULL | MY_MATERIALIED_MV        |    104M |   191 |     117M |
============================================================================================

You might also have noticed in the above RTSM report that the select part of the “create as select” statement was parallelized as well. It is as if the parallel 16 clause in the “create” part of the materialized view script implicitly caused its “select” part to be executed in parallel with a DOP of 16.

Parallel clause in the SQL create statement: any effect on the mview refresh?

However, the problem I was asked to troubleshoot resides in refreshing the materialized view, not in creating it. Since the materialized view was created in 23 minutes, I should have been optimistic about its refresh time, shouldn't I?

SQL> exec dbms_mview.refresh ('MY_MATERIALIED_MV','C',atomic_refresh=>FALSE);

After more than 4,200 seconds of execution I finally gave up and decided to stop this refresh. Below is an overview of the corresponding Real Time SQL Monitoring (RTSM) report:

Global Information
------------------------------
 Status              :  DONE (ERROR) --> I have cancelled it after more than 1 hour   
 Instance ID         :  1                  
 Session             :  XZYY (901:25027)  
 SQL ID              :  d5n03tuht2cg8      
 SQL Execution ID    :  16777216           
 Execution Started   :  04/16/2015 10:55:46
 First Refresh Time  :  04/16/2015 10:55:52
 Last Refresh Time   :  04/16/2015 12:06:39
 Duration            :  4253s               
 Module/Action       :  SQL*Plus/-         
 Service             :  XZYY
 Program             :  sqlplus.exe         

Global Stats
===================================================================================
| Elapsed |   Cpu   |    IO    |  Other   | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) |  Gets  | Reqs | Bytes | Reqs  | Bytes |
===================================================================================
|    4253 |    1640 |     2563 |       50 |    53M | 824K | 227GB |  570K | 120GB |
===================================================================================

SQL Plan Monitoring Details (Plan Hash Value=998958099)
=============================================================================
| Id |                  Operation                   |Name |  Rows   | Cost  |
|    |                                              |     | (Estim) |       |
=============================================================================
|  0 | INSERT STATEMENT                             |     |         |       |
|  1 |   LOAD AS SELECT                             |     |         |       |
|  2 |    UNION-ALL                                 |     |         |       |
|  3 |     HASH GROUP BY                            |     |    259M |       |
|  4 |      CONCATENATION                           |     |         |       |
|  5 |       NESTED LOOPS OUTER                     |     |       7 |  4523 |
|  6 |        NESTED LOOPS OUTER                    |     |       7 |  4495 |
|  7 |         NESTED LOOPS                         |     |       7 |  4474 |
|  8 |          NESTED LOOPS                        |     |       7 |  4460 |
|  9 |           PARTITION REFERENCE ALL            |     |       7 |  4439 |
…/…

In contrast to the creation process, the materialized view refresh was done serially. This confirms that the parallel 16 clause in the create DDL script above concerns only the materialized view creation and not its refresh process.

The second lesson learned: a parallel clause specified in the create statement of a materialized view is not used during the refresh of that materialized view. In this kind of situation, parallelism is honored only at materialized view creation time.
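A quick way to check, from the refreshing session itself, whether the statement just executed actually went parallel is the v$pq_sesstat view, which reports parallel-query statistics for the last query. A sketch (the statistic names below are the standard ones exposed by that view):

```sql
SQL> select
       statistic
      ,last_query      -- value for the most recent statement
      ,session_total   -- cumulative value for the session
     from
       v$pq_sesstat
     where
       statistic in ('Queries Parallelized', 'Allocation Height');
```

A last_query value of 0 for “Queries Parallelized” confirms a serial run.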

dbms_mview.refresh and its parallelism parameter: any effect on the mview refresh?

The tables on which the materialized view is based all have a degree of 1:

 SQL> select
  2      table_name
  3    , degree
  4  from user_tables
  5  where trim(degree) <> '1';

TABLE_NAME            DEGREE
--------------------- -------
MY_MATERIALIED_MV     16

Having said that, what if I tried refreshing this materialized view using the parallelism parameter of the dbms_mview.refresh procedure, as shown below:

SQL> exec dbms_mview.refresh ('MY_MATERIALIED_MV','C', atomic_refresh=>FALSE, parallelism =>16);

SQL Plan Monitoring Details (Plan Hash Value=998958099)
==========================================================================================
| Id |                  Operation                   |           Name           |  Rows   |
|    |                                              |                          | (Estim) |
==========================================================================================
|  0 | INSERT STATEMENT                             |                          |         |
|  1 |   LOAD AS SELECT                             |                          |         |
|  2 |    UNION-ALL                                 |                          |         |
|  3 |     HASH GROUP BY                            |                          |    259M |
|  4 |      CONCATENATION                           |                          |         |
|  5 |       NESTED LOOPS OUTER                     |                          |       7 |
|  6 |        NESTED LOOPS OUTER                    |                          |       7 |
|  7 |         NESTED LOOPS                         |                          |       7 |
|  8 |          NESTED LOOPS                        |                          |       7 |
|  9 |           PARTITION REFERENCE ALL            |                          |       7 |
| 10 |            TABLE ACCESS BY LOCAL INDEX ROWID | TABLE_XX_ZZ              |       7 |
../..
| 94 |           PARTITION RANGE ALL                |                          |    369M |
| 95 |            PARTITION LIST ALL                |                          |    369M |
| 96 |             TABLE ACCESS FULL                | TABLE_AA_BB_123          |    369M |
==========================================================================================

As confirmed by the corresponding RTSM report above, the parallelism parameter was not obeyed and the refresh was done serially in this case as well.

The third lesson learned: the parallelism parameter of the dbms_mview.refresh procedure has no effect on parallelizing the refresh of the underlying materialized view.

Adding a parallel hint in the select part of the mview: any effect on the mview refresh?

At this stage of the troubleshooting process I had established the following points:

  • The parallel clause used in the create statement of a materialized view is considered only during the materialized view creation. This parallel clause is ignored during the refresh process
  • The parallelism parameter of the dbms_mview.refresh procedure will not refresh the materialized view in parallel

Having ruled out all the above options, I was almost convinced that to expedite the refresh process I needed to add a parallel hint directly in the materialized view definition (ddl_mv2.sql):

CREATE MATERIALIZED VIEW MY_MATERIALIED_MV
   ({list of columns}) 
  TABLESPACE xxxx
  PARALLEL 16
  BUILD IMMEDIATE
  USING INDEX
  REFRESH COMPLETE ON DEMAND
  USING DEFAULT LOCAL ROLLBACK SEGMENT
  USING ENFORCED CONSTRAINTS DISABLE QUERY REWRITE
AS
 SELECT /*+ parallel(8) pq_distribute(tab1 hash hash)*/
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
 GROUP BY
  {list of columns}
UNION ALL
 SELECT /*+ parallel(8) pq_distribute(tab1 hash hash)*/
    {list of columns}
 FROM
  {list of tables}
 WHERE
  {list of predicates}
 GROUP BY
    {list of columns}
;

Having changed the select part of the materialized view DDL script, I launched its creation again; it completed in 25 minutes as shown below:

SQL> start ddl_mv2.sql
Materialized view created.
Elapsed: 00:25:05.37

And immediately after the creation I launched the refresh process:

SQL> exec dbms_mview.refresh ('MY_MATERIALIED_MV','C',atomic_refresh=>FALSE);

PL/SQL procedure successfully completed.
Elapsed: 00:26:11.12

This time the refresh completed in 26 minutes, thanks to the parallel run exposed below in the corresponding RTSM report:

Global Information
------------------------------
 Status              :  DONE               
 Instance ID         :  1                  
 Session             :  XZYY
 SQL ID              :  1w1v742mr35g3      
 SQL Execution ID    :  16777216           
 Execution Started   :  04/16/2015 13:38:13
 First Refresh Time  :  04/16/2015 13:38:13
 Last Refresh Time   :  04/16/2015 14:04:24
 Duration            :  1571s              
 Module/Action       :  SQL*Plus/-         
 Service             :  XZYY            
 Program             :  sqlplus.exe         

Parallel Execution Details (DOP=8, Servers Allocated=80)

SQL Plan Monitoring Details (Plan Hash Value=758751629)
===============================================================================
| Id  |                       Operation          |           Name   |  Rows   |
|     |                                          |                  | (Estim) |
===============================================================================
|   0 | INSERT STATEMENT                         |                  |         |
|   1 |   LOAD AS SELECT                         |                  |         |
|   2 |    UNION-ALL                             |                  |         |
|   3 |     PX COORDINATOR                       |                  |         |
|   4 |      PX SEND QC (RANDOM)                 | :TQ10005         |    259M |
|   5 |       HASH GROUP BY                      |                  |    259M |
| 177 |                PX RECEIVE                |                  |     19M |
| 178 |                 PX SEND HASH             | :TQ50004         |     19M |
| 179 |                  PX BLOCK ITERATOR       |                  |     19M |
| 180 |                   TABLE ACCESS FULL      | TABLE_KZ_YX      |     19M |
===============================================================================

I added the pq_distribute(tab1 hash hash) hint above because several refreshes had crashed: the broadcast distribution ended up over-consuming TEMP space, raising the now classical error:

ERROR at line 484:
ORA-12801: error signaled in parallel query server P012
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

The fourth lesson learned: if you want to parallelize your materialized view refresh process, you had better include a parallel hint in the select part of the materialized view definition. This is better than changing the parallel degree of the tables on which the materialized view is based.

April 8, 2015

The dark side of using bind variables : sharing everything

Filed under: Tuning — hourim @ 5:48 pm

An interesting academic situation happened last week which, I honestly believe, is worth a blog article, since experienced DBAs had spent time trying to solve it without success. An overnight job had been running for hours during the night of 01/04 to 02/04. The on-call DBA spent the whole night killing and re-launching the job (sql_id) several times without any success. When I arrived at work the next day I was asked to help. As the job was still running, I generated the Real Time SQL Monitoring (RTSM) report for the corresponding sql_id. It showed the classical NESTED LOOPS with a huge outer row source driving an inner row source in which at least 50 different operations had been started 519K times, while one operation had been executed 2M times. The corresponding execution plan contains 213 operations. The underlying query uses 628 user bind variables and 48 system-generated bind variables (thanks to cursor_sharing set to FORCE).

SQL Plan Monitoring Details (Plan Hash Value=1511784243)

Global Information
------------------------------
 Status              :  EXECUTING               
 Instance ID         :  2                       
 Session             :  xxxxx (350:9211)   
 SQL ID              :  dmh5vhkcm877v           
 SQL Execution ID    :  33554436                
 Execution Started   :  04/02/2015 07:52:03     
 First Refresh Time  :  04/02/2015 07:52:47     
 Last Refresh Time   :  04/02/2015 10:04:28     
 Duration            :  7947s                   
 Module/Action       :  wwwww
 Service             :  zzzzz               
 Program             :  wwwww  
 DOP Downgrade       :  100%    

Global Stats
===================================================================================
| Elapsed |   Cpu   |    IO    | Application | Concurrency | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   |  Gets  | Reqs | Bytes |
====================================================================================
|    7900 |    7839 |       20 |        0.00 |        0.82 |   243M | 5946 | 659MB |
====================================================================================

The above 7,839 seconds spent consuming CPU, with almost no user wait time, is the classical signature of a wrong NESTED LOOPS operation starting its several inner operations many times, as mentioned above.

The job was running without any sign of improvement, the client was waiting for its critical report, and I had a query with almost 700 bind variables, honored via an execution plan of 213 operations, for which I had to figure out how to make the report finish smoothly as soon as possible.

I was dissecting the execution plan when the end user sent me an e-mail saying that the same job had run successfully the day before within 6 minutes. With that information in mind I managed to get the RTSM report of that successful run. The first crucial piece of information was that yesterday's query and today's never-ending one used the same plan_hash_value (same execution plan). Comparing the 628 input bind variable values of both runs, I found that yesterday's job ran for a one-month period (monthly job) while the current job was running for a one-day interval (daily job). Of course, the end user had not supplied any information about the kind of job currently running compared to the previous one. All I had been told was that yesterday's job completed in 6 minutes. It was only once I found the difference in the input bind variable values that the end user said: “the current run is for the daily job while the previous one was for the monthly job”.

And the sun started rising. I was able to figure out that the two sets of bind variables were not doing the same amount of work, so sharing the same execution plan was probably not a good idea. This is why I suggested that the DBA do the following:

  • Kill the not ending session
  • Purge the sql_id from the shared pool
  • Ask the end user to re-launch the job
  • Cross fingers :-)
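As a side note, the purge step can be scripted rather than done by hand. A minimal sketch, assuming dbms_shared_pool is installed (its purge procedure takes the cursor's 'address,hash_value' pair, which can be fetched from v$sqlarea):

```sql
SQL> declare
       l_name varchar2(64);
     begin
       select address || ',' || hash_value
         into l_name
         from v$sqlarea
        where sql_id = 'dmh5vhkcm877v';
       -- 'C' flags the purged object as a cursor
       sys.dbms_shared_pool.purge(l_name, 'C');
     end;
     /
```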

And you know what? The job completed within a couple of hundred seconds:

SQL Plan Monitoring Details (Plan Hash Value=2729107228)

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)         
 Instance ID         :  2                       
 Session             :  xxxxx (1063:62091) 
 SQL ID              :  dmh5vhkcm877v           
 SQL Execution ID    :  33554437                
 Execution Started   :  04/02/2015 10:43:17     
 First Refresh Time  :  04/02/2015 10:43:20     
 Last Refresh Time   :  04/02/2015 10:47:38     
 Duration            :  261s                    
 Module/Action       :  wwwww
 Service             :  zzzzz
 Program             :  wwwww
 Fetch Calls         :  57790     

Global Stats
==============================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs | Bytes |
==============================================================================
|     134 |     107 |     7.10 |          17 | 57790 |    18M | 7857 | 402MB |
==============================================================================

This is the dark side of using bind variables: when sharing resources we also share execution plans. The current daily job was running with the plan optimized for the monthly job. The solution was to force the CBO to compile a new execution plan for the new input bind variable values. The new plan (2729107228) still shows about 200 operations, several of which are started 578K times. I intend to study both execution plans to determine exactly where the improvement comes from. The clue might be that the first, shared, monthly execution plan ran serially, for a reason I have been unable to figure out:

 DOP Downgrade       :  100%   

while the newly hard-parsed execution ran in parallel:

 Parallel Execution Details (DOP=4 , Servers Allocated=20)

Bottom line: when you intend to run a critical report once per day (and once per month), it is worth letting the CBO compile a new execution plan for each execution. All you will pay is one hard parse per execution, which will never hurt from a memory and CPU point of view.

April 6, 2015

BIND_EQUIV_FAILURE – Or when you will regret using Adaptive Cursor Sharing

Filed under: Tuning — hourim @ 4:42 pm

Previously, when I was asked to define the Adaptive Cursor Sharing (ACS) feature, I often used the following definition: “it represents an answer to the always threatening and challenging Oracle task of sharing cursors and optimizing SQL”.

Time passed and I altered this definition a little to become: “it represents a short answer to the always threatening and challenging Oracle task of sharing cursors and optimizing SQL”.

Time passed again and I ended up drastically altering my initial ACS definition to become: “in certain very plausible situations, it might represent a serious threat to your application, where you will be happy to disable it, provided you have enough experience to identify the link between ACS and your threat”.

If you want to know what changed my mind about ACS, then follow this situation, taken from a real-life running system, which I am going to summarize:

When you see something like this in the library cache (11.2.0.3.0):

SQL> select
        sql_id
       ,count(1)
     from
        v$sql
     where executions < 2
     group by sql_id
     having count(1) > 10
     order by 2 desc;

SQL_ID          COUNT(1)
------------- ----------
7zwq7z1nj7vga      44217

You start wondering what makes this sql_id have such a big count of different versions in memory.

After a few minutes of investigation you end up ruling out the bind variable hypothesis. And then you finish by asking yourself: what the heck is this sql_id?

Fortunately, Tanel Poder's nonshared script sheds a small light on that:

SQL> @nonshared 7zwq7z1nj7vga
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)...

-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000042CE36F7E8
CHILD_NUMBER         : 0
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>0</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                       <selectivity>1097868685</selectivity>
                      </ChildNode>

-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000045B5C5E478
CHILD_NUMBER         : 1
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>1</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>915662630</selectivity>
                      </ChildNode>
-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000038841E2868
CHILD_NUMBER         : 2
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>2</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>163647208</selectivity>
                      </ChildNode>
-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000038841E2708
CHILD_NUMBER         : 3
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>3</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>4075662961</selectivity>
                      </ChildNode>

…/…

-----------------
SQL_ID               : 7zwq7z1nj7vga
ADDRESS              : 000000406DBB30F8
CHILD_ADDRESS        : 00000045B5C5E5D8
CHILD_NUMBER         : 99
BIND_EQUIV_FAILURE   : Y
REASON               :<ChildNode><ChildNumber>99</ChildNumber><ID>40</ID>
                      <reason>Bindmismatch(33)</reason><size>2x4</size>
                      <init_ranges_in_first_pass>0</init_ranges_in_first_pass>
                      <selectivity>3246589452</selectivity>
                      </ChildNode>

Moreover, a direct select on v$sql_shared_cursor shows this:

SQL> select
       count(1)
     from
         v$sql_shared_cursor
     where
        sql_id = '7zwq7z1nj7vga';

  COUNT(1)
----------
     45125

SQL> select
        count(1)
     from
       v$sql_shared_cursor
     where
        sql_id = '7zwq7z1nj7vga'
     and BIND_EQUIV_FAILURE = 'Y';

  COUNT(1)
----------
     45121

Hmmm… a huge count of non-shared child cursors due to BIND_EQUIV_FAILURE.

The official Oracle documentation describes BIND_EQUIV_FAILURE as: the bind value’s selectivity does not match that used to optimize the existing child cursor. This definition, together with the selectivity XML tag shown above, gave me a first clue: Adaptive Cursor Sharing (in this case Extended Cursor Sharing).

SQL> select
       count(1)
    from
        v$sql_cs_selectivity
    where
      sql_id = '7zwq7z1nj7vga';

  COUNT(1)
----------
  16,847,320

That is an impressive number of records in this dynamic view. For a single sql_id we have about 17 million rows in this ACS monitoring view!!! This dramatically alters the execution time of the underlying sql_id query.

If you are not familiar with what the v$sql_cs_selectivity view stands for, here is how it works:

Once a cursor becomes bind aware, each time it is executed the Extended Cursor Sharing layer code peeks at the bind variable values (in this particular case there are 9 bind variables) and, behind the scenes, checks v$sql_cs_selectivity to see whether an existing child cursor already covers the selectivity of the peeked bind values. If such a child cursor is found, it is shared. If not, a new child cursor is optimized and inserted into v$sql_cs_selectivity with a new range of bind variable selectivities.
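You can observe those selectivity ranges yourself by querying v$sql_cs_selectivity directly. The sketch below uses the sql_id from this case, but the same query works for any bind-aware cursor:

```sql
-- List the selectivity ranges ECS has recorded for each child cursor.
-- PREDICATE identifies the bind predicate, and [LOW, HIGH] is the
-- selectivity range for which that child cursor is considered valid.
select child_number
      ,predicate
      ,range_id
      ,low
      ,high
from   v$sql_cs_selectivity
where  sql_id = '7zwq7z1nj7vga'
order  by child_number, predicate, range_id;
```

With 9 bind variables and 99 child cursors, each execution has to be matched against a very large set of such ranges, which is part of the cost described above.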

In this particular case, each time the Extended Cursor Sharing layer code failed to find a child cursor in v$sql_cs_selectivity with an adequate selectivity range (BIND_EQUIV_FAILURE), it compiled a new execution plan, dramatically filling the v$sql view with multiple “optimal” plans.

ACS is supposed to give us both cursor sharing and optimally executed SQL. In this particular, and quite plausible, case we ended up with neither.

A few extra words about this running system:

  • Query is using 9 bind variables in 9 different predicates
  • Columns on which bind variables are used have histograms (Frequency and Height Balanced) collected on them
  • Query is a simple select on a single heap table
  • Table has 4 indexes (which produce the 4 distinct execution plans found among those 99 child cursors)
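As an immediate, temporary relief (not a fix for the root cause), one option is to purge the bloated cursor from the shared pool using dbms_shared_pool.purge. A minimal sketch, assuming you have the privilege to execute the package and that the sql_id is still in v$sqlarea:

```sql
-- Flush the parent cursor and all its child cursors from the shared pool.
-- The name parameter is the cursor's 'ADDRESS,HASH_VALUE' pair and the
-- 'C' flag asks for a cursor purge.
declare
  l_name varchar2(100);
begin
  select address || ',' || hash_value
  into   l_name
  from   v$sqlarea
  where  sql_id = '7zwq7z1nj7vga';

  sys.dbms_shared_pool.purge(name => l_name, flag => 'C');
end;
/
```

This only resets the symptom; the 45K child cursors will grow back unless the bind/histogram situation itself is addressed.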

 

April 1, 2015

12c Adaptive Cursor Sharing

Filed under: Oracle — hourim @ 7:21 pm

This is neither a 12c new feature you are not yet aware of, nor an extension of 11g Adaptive Cursor Sharing that I am about to scoop. It is rather something I noticed while writing a complete chapter on Adaptive Cursor Sharing, and that I wanted to share with you. Here we go.

11g Release

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

I have a simple query executed against two bind variable values. The first one favors an index range scan path while the second one calls for a full table scan. The column behind the bind variable has a frequency histogram collected on it, so that after several executions the underlying cursor becomes bind sensitive and then, following a warm-up period, bind aware. Once the cursor is bind aware, the Extended Cursor Sharing (ECS) layer code kicks in: it peeks at the bind variable values (there might be several), checks their selectivity and decides whether to share an existing child cursor or compile a new one, updating the v$sql_cs_selectivity dynamic view accordingly.

SQL> select
        sql_id
       ,child_number
       ,is_bind_aware
       ,is_bind_sensitive
       ,to_char(exact_matching_signature) sig
       ,executions
       ,plan_hash_value
    from v$sql
    where sql_id = '6fbvysnhkvugw'
    and is_shareable = 'Y';

SQL_ID        CHILD_NUMBER I I SIG                  EXECUTIONS PLAN_HASH_VALUE
------------- ------------ - - -------------------- ---------- ---------------
6fbvysnhkvugw            3 Y Y 15340826253708983785    1       3625400295
6fbvysnhkvugw            4 Y Y 15340826253708983785    1       3724264953

To finish setting the scene for this blog article, I have already loaded the above two execution plans (plan_hash_value) from the cursor cache into a SPM baseline, so that the CBO is prevented from using any plan I have not accepted.

SQL> declare
           rs pls_integer;
     begin
           rs := dbms_spm.load_plans_from_cursor_cache('6fbvysnhkvugw');
     end;
     /
PL/SQL procedure successfully completed.
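You can verify that both plans actually made it into the baseline with a query along these lines (the signature is the exact_matching_signature shown in the v$sql output above):

```sql
-- Check the captured baseline plans for our statement:
-- at this point both should be ENABLED and ACCEPTED.
select sql_handle
      ,plan_name
      ,enabled
      ,accepted
      ,origin
from   dba_sql_plan_baselines
where  signature = 15340826253708983785;
```

Two accepted plans for one signature is exactly what lets ACS keep alternating between them under SPM control.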

The engineered model (ACS + SPM) is such that when I run the following query, alternating between a ”full table scan” bind variable value and an ”index range scan” one, I get the following picture:

SQL> select count(*), max(col2) from t1 where flag = 'N1';-- full table scan

  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
     49999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 50135 |  2643K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)
Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

And

SQL> select count(*), max(col2) from t1 where flag = 'Y1'; -- index range scan

  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |    54 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |     9 |   486 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |     9 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg98576eb1f used for this statement

After a short ACS warm-up period, whose length depends on the number of executions, we observe a perfect harmony between ACS and SPM: an index range scan plan for an ”index range scan” bind variable and a full table scan plan for a ”full table scan” bind variable.
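The warm-up itself can be watched through the v$sql_cs_histogram view, which maintains three row-count buckets per child cursor; loosely speaking, when executions of a bind-sensitive cursor land in more than one bucket, the cursor becomes a candidate to be marked bind aware. A sketch:

```sql
-- Three buckets per child cursor (small, medium, large row counts).
-- A bind-sensitive cursor whose executions spread over several
-- buckets is a candidate for promotion to bind aware.
select child_number
      ,bucket_id
      ,"COUNT" as executions
from   v$sql_cs_histogram
where  sql_id = '6fbvysnhkvugw'
order  by child_number, bucket_id;
```

The exact promotion rules are internal and undocumented, so treat this view as a monitoring aid rather than a contract.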

However, it suffices to create an extra index for this perfect harmony to cease, as shown below:

SQL> create index i2 on t1(flag,col2);

SQL> select count(*), max(col2) from t1 where flag = 'N2';

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

SQL> select plan_name
    from dba_sql_plan_baselines
    where accepted = 'NO';

PLAN_NAME
------------------------------
SQL_PLAN_d9tch6banyzg9495f4ddb

The CBO came up with a new execution plan (an index fast full scan) but was constrained to use the accepted full table scan SPM plan; the new plan was captured as a non-accepted baseline plan. It looks like this:

SQL> select * from table(dbms_xplan.display_sql_plan_baseline(plan_name => 'SQL_PLAN_d9tch6banyzg9495f4ddb'));

--------------------------------------------------------------------------------
SQL handle: SQL_d4e59032d54f7de9
SQL text: select count(*), max(col2) from t1 where flag = :"SYS_B_0"
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_d9tch6banyzg9495f4ddb         Plan id: 1230982619
Enabled: YES     Fixed: NO      Accepted: NO      Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------
Plan hash value: 2348726875
------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    54 |   249   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |      |     1 |    54 |            |          |
|*  2 |   INDEX FAST FULL SCAN| I2   | 25000 |  1318K|   249   (1)| 00:00:03 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)

Let’s now execute the query with a bind variable favoring an index range scan:

SQL> select count(*), max(col2) from t1 where flag = 'Y2'; -- index range scan
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
--------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

The index range scan plan has not been used. Maybe it still needs a warm-up period?

SQL> select count(*), max(col2) from t1 where flag = 'Y2';
SQL> /
SQL> /
SQL> /
SQL> /
SQL> /

SQL> select * from table(dbms_xplan.display_cursor);
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=:SYS_B_0)

Note
-----
   - SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

SQL> select
          sql_id
         ,child_number
         ,is_bind_aware
         ,is_bind_sensitive
         ,to_char(exact_matching_signature) sig
         ,executions
         ,plan_hash_value
     from v$sql
     where sql_id = '6fbvysnhkvugw'
     and is_shareable = 'Y'
     ;

SQL_ID        CHILD_NUMBER I I SIG                  EXECUTIONS PLAN_HASH_VALUE
------------- ------------ - - -------------------- ---------- ---------------
6fbvysnhkvugw            3 N N 15340826253708983785  10      3724264953

Ten executions later, the switch to the index range scan still has not occurred.

Bizarrely, it suffices to disable the use of SQL plan baselines for the plan switch to occur immediately:

SQL> alter session set optimizer_use_sql_plan_baselines = FALSE;

SQL> select count(*), max(col2) from t1 where flag = 'Y2';

Plan hash value: 3625400295
-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |    54 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |   972 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("FLAG"=:SYS_B_0)

Re-enable the use of SQL plan baselines and the plan switch immediately ceases to happen again:

SQL> alter session set optimizer_use_sql_plan_baselines = TRUE;

SQL> select count(*), max(col2) from t1 where flag = 'Y2';

Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49846 |  2628K|   273   (1)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

12c Release

Repeat exactly the same experiment in 12c and you will realize that things have changed:

SQL> select * from v$version where rownum =1;

BANNER                                                                       CON_ID
---------------------------------------------------------------------------- ------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production   0

SQL> create index i2 on t1(flag,col2);

SQL> select count(*), max(col2) from t1 where flag = 'N2';

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49874 |  2630K|   273   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

SQL> select count(*), max(col2) from t1 where flag = 'Y2';
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   273 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    54 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 49874 |  2630K|   273   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg9616acf47 used for this statement

Let’s now see whether a second execution of the same query, with a bind variable favoring an index range scan, will switch to an index range scan plan:


SQL> select count(*), max(col2) from t1 where flag = 'Y2';

----------------------------------------------------------------------------------
| Id  | Operation                            | Name | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |       |       |     2 (100)|
|   1 |  SORT AGGREGATE                      |      |     1 |    54 |            |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1   |     1 |    54 |     2   (0)|
|*  3 |    INDEX RANGE SCAN                  | I1   |     1 |       |     1   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("FLAG"=:SYS_B_0)

Note
-----
- SQL plan baseline SQL_PLAN_d9tch6banyzg97823646b used for this statement

And the plan switch happens.

Something new has been introduced in Oracle 12c which has dramatically enhanced the collaboration between ACS and SPM.

In the 11g investigation I executed the same query 10 times without provoking a plan switch. In 12c, I only had to execute the query with the ”index range scan” bind variable as many times as it had been executed with the ”full table scan” bind variable (2 executions) to see ACS kick in, as shown below:

SQL> select
      sql_id
     ,child_number
     ,is_bind_aware
     ,is_bind_sensitive
     ,to_char(exact_matching_signature) sig
     ,executions
     ,plan_hash_value
    from v$sql
    where sql_id = '6fbvysnhkvugw'
    and is_shareable = 'Y';

SQL_ID        CHILD_NUMBER I I SIG                  EXECUTIONS PLAN_HASH_VALUE
------------- ------------ - - -------------------- ---------- ---------------
6fbvysnhkvugw            1 Y Y 15340826253708983785     1      3724264953
6fbvysnhkvugw            2 Y Y 15340826253708983785     2       497086120
6fbvysnhkvugw            4 Y Y 15340826253708983785     1       497086120
6fbvysnhkvugw            6 Y Y 15340826253708983785     1      3724264953
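In 12c, the non-accepted plans that ACS keeps adding to the baseline can be verified and, if found beneficial, accepted through the SPM evolve API. A minimal sketch, using the sql_handle shown earlier by dbms_xplan.display_sql_plan_baseline:

```sql
-- Ask SPM to test-execute the non-accepted baseline plans for this
-- SQL handle and report whether they should be accepted.
declare
  l_report clob;
begin
  l_report := dbms_spm.evolve_sql_plan_baseline(
                sql_handle => 'SQL_d4e59032d54f7de9');
  dbms_output.put_line(l_report);
end;
/
```

Note that from 12c onwards plan evolution can also run automatically via the SPM Evolve Advisor task, so manual evolution is often unnecessary.
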

March 17, 2015

A precise index

Filed under: Oracle — hourim @ 7:50 pm

Below is a select statement that is not performing very well, together with its corresponding row source execution plan:

SQL> select
       {list of columns}
    from
      tx_tables bot
    inner join ty_tables_tmp tmp
     on account_id      = tmp.account_id
     and trade_id       = tmp.trd_id
     where transferred <> 1;
----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                        | Starts | E-Rows | A-Rows | 
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                             |      1 |        |    301K|
|   1 |  NESTED LOOPS                |                             |      1 |        |    301K|
|   2 |   NESTED LOOPS               |                             |      1 |     75 |    301K|
|   3 |    TABLE ACCESS FULL         | TY_TABLES_TMP               |      1 |      2 |      2 |        
|*  4 |    INDEX RANGE SCAN          | TX_TABLES_IDX1              |      2 |  43025 |    301K|
|*  5 |   TABLE ACCESS BY INDEX ROWID| TX_TABLES                   |    301K|     38 |    301K|
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("TRADE_ID"="TMP"."TRD_ID")
   5 - filter(("BOT"."TRANSFERRED"<>1 AND "ACCOUNT_ID"="TMP"."ACCOUNT_ID"))

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     278595  consistent gets
          0  physical reads
          0  redo size
   10597671  bytes sent via SQL*Net to client
     221895  bytes received via SQL*Net from client
      20131  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     301944  rows processed

Let’s put aside the inadequate statistics (the CBO estimates 43K rows at operation 4 while it actually generates 301K) and try to figure out whether we can avoid starting the TX_TABLES TABLE ACCESS BY INDEX ROWID operation 301K times. The double NESTED LOOPS (the 11g NLJ_BATCHING shape) drives an outer row source of 301K rows (NESTED LOOPS operation at line 2), which in turn starts the inner TABLE ACCESS BY INDEX ROWID operation 301K times (see the Starts column at line 5). If we get rid of the NESTED LOOPS at line 1, we might reduce the number of times the operation at line 5 is started and, as a consequence, decrease the corresponding logical I/O. The nlj_batching feature can be disabled with the no_nlj_batching hint, as shown below:

SQL> select
       /*+ no_nlj_batching(bot) */
       {list of columns}
     from
      tx_tables bot
    inner join ty_tables_tmp tmp
     on account_id     = tmp.account_id
     and trade_id       = tmp.trd_id
     where transferred <> 1;
----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                        | Starts | E-Rows | A-Rows |  
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                             |      0 |        |      0 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| TX_TABLES                   |      1 |     38 |    301K|
|   2 |   NESTED LOOPS              |                             |      1 |     75 |    301K|
|   3 |    TABLE ACCESS FULL        | TY_TABLES_TMP               |      1 |      2 |      2 |
|*  4 |    INDEX RANGE SCAN         | TX_TABLES_IDX1              |      2 |  43025 |    301K|
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("BOT"."TRANSFERRED"<>1 AND
       "ACCOUNT_ID"="TMP"."ACCOUNT_ID"))
   4 - access("TRADE_ID"="TMP"."TRD_ID")
Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     278595  consistent gets
          0  physical reads
          0  redo size
   10597671  bytes sent via SQL*Net to client
     221895  bytes received via SQL*Net from client
      20131  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     301944  rows processed

Although the Starts column no longer shows those 301K executions, the number of logical I/Os is still exactly the same. The performance issue, in contrast to what I initially thought, does not come from the nlj_batching feature. As a next step I decided it was time to look carefully at this query from the indexes point of view. The above two execution plans both use the TX_TABLES_IDX1 index, defined as shown below:

TX_TABLES_IDX1(TRADE_ID, EXT_TRD_ID)       

There is still room to create a precise index that might help in this case. It might look like the following one:

SQL> create index TX_TABLES_FBI_IDX2
        on tx_tables
                 (TRADE_ID
                 ,ACCOUNT_ID
                 ,CASE WHEN TRANSFERRED <> 1 THEN -1 ELSE NULL END
                 );

Once created, this index allowed the initial query (without any hint) to be honored with the following execution plan:

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                         | Starts | E-Rows | A-Rows |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                              |      1 |        |    301K|
|   1 |  NESTED LOOPS                |                              |      1 |     75 |    301K|
|   2 |   TABLE ACCESS FULL          | TY_TABLES_TMP                |      1 |      2 |      2 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| TX_TABLES                    |      2 |     38 |    301K|
|*  4 |    INDEX RANGE SCAN          | TX_TABLES_FBI_IDX2           |      2 |   4141 |    301K|
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("BOT"."TRANSFERRED"<>1)
   4 - access("TRADE_ID"="TMP"."TRD_ID" AND
       "ACCOUNT_ID"="TMP"."ACCOUNT_ID")
Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
        199  recursive calls
          0  db block gets
     108661  consistent gets
        229  physical reads
          0  redo size
    8394791  bytes sent via SQL*Net to client
     221895  bytes received via SQL*Net from client
      20131  SQL*Net roundtrips to/from client
          7  sorts (memory)
          0  sorts (disk)
     301944  rows processed

Spot how the new index has not only got rid of the nlj_batching double nested loops and reduced the number of operations Oracle has to start, but has also reduced the logical I/O consumption from the initial 278K to 108K. However, we still have not changed the predicate part of the query to match exactly the function-based part of the new index (CASE WHEN TRANSFERRED <> 1 THEN -1 ELSE NULL END), which explains why we still have a filter on the TX_TABLES operation at line 3. As always with function-based indexes, the predicate part of the query needs to match the function-based index expression. This translates into the following new query (look at its last line):

SQL> select
       {list of columns}
     from
       tx_tables bot
       inner join ty_tables_tmp tmp
     on account_id     = tmp.account_id
     and trade_id       = tmp.trd_id
     where (case  when transferred <> 1 then -1 else null end)  = -1;

Here is the new resulting execution plan:

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                         | Starts | E-Rows | A-Rows |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                              |      1 |        |    301K|
|   1 |  NESTED LOOPS                |                              |      1 |      1 |    301K|
|   2 |   TABLE ACCESS FULL          | TY_TABLES_TMP                |      1 |      1 |      2 |
|   3 |   TABLE ACCESS BY INDEX ROWID| TX_TABLES                    |      2 |     75 |    301K|
|*  4 |    INDEX RANGE SCAN          | TX_TABLES_FBI_IDX2           |      2 |   4141 |    301K|
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("TRADE_ID"="TMP"."TRD_ID"
               AND "ACCOUNT_ID"="TMP"."ACCOUNT_ID"
               AND "BOT"."SYS_NC00060$"=(-1))
Statistics
----------------------------------------------------
          0  recursive calls
          0  db block gets
     108454  consistent gets
          0  physical reads
          0  redo size
    8394791  bytes sent via SQL*Net to client
     221895  bytes received via SQL*Net from client
      20131  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     301944  rows processed

Notice that, now that I have matched the predicate part of the query with the function-based index definition, there is no filter anymore on the TX_TABLES table. Although this time the effect was not noticeable, in other circumstances it can be a drastic enhancement.
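Behind the scenes, the function-based part of the index is implemented as a hidden, system-generated virtual column (the "SYS_NC00060$" visible in the access predicate above). You can map such a column back to its expression with a query like:

```sql
-- Map the function-based index back to its underlying expression.
select index_name
      ,column_expression
from   user_ind_expressions
where  index_name = 'TX_TABLES_FBI_IDX2';
```

This is handy when you inherit a schema and need to know which rewritten predicates can take advantage of an existing FBI.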

Bottom line: a precise index can help the CBO follow a better path.

March 16, 2015

Diagnosing the past

Filed under: Tuning — hourim @ 8:08 pm

This is a simple note about diagnosing a situation that happened in the past.

A running application suffered a delay in many of its multi-user insert statements blocked on an enq: TM-row lock contention.


The on-call DBA was killing the culprit blocking session when I received an e-mail asking me to investigate the root cause of this lock. Since this was a matter of diagnosing the very recent past, using v$active_session_history imposed itself:

select sql_id,event, count(1)
from v$active_session_history
where sample_time > to_date('05032015 10:00:00','ddmmrrrr hh24:mi:ss')
and   sample_time < to_date('05032015 11:00:00','ddmmrrrr hh24:mi:ss')
and event like '%TM%'
group by sql_id,event
order by 3 desc;

3fbwp7qdqxk9v     enq: TM - contention    1

Surprisingly, there was only one recorded enq: TM - contention wait event during the exact period of the blocked insert statements, and the corresponding sql_id had nothing to do with the blocked inserts.

I spent a few minutes puzzling over the above select before going back to basics:

  • v$active_session_history contains samples of all active sessions, taken every second.
  • dba_hist_active_sess_history is a one-in-ten sample of v$active_session_history

I knew as well that, v$active_session_history being an in-memory buffer, its retention period depends on the size of the buffer and the volume of active sessions. But I was diagnosing an almost real-time situation; so why was there no “functional” enq: TM - contention in v$active_session_history?
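A quick sanity check before trusting v$active_session_history for a past period is to look at the time window the in-memory buffer actually covers on the instance you are connected to:

```sql
-- How far back does the in-memory ASH buffer go on this instance?
select min(sample_time) as oldest_sample
      ,max(sample_time) as latest_sample
      ,count(*)         as samples
from   v$active_session_history;
```

If the period you are investigating falls outside that window (or, as we will see, happened on another instance), you have to fall back on dba_hist_active_sess_history.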

Finally, I decided to use the less precise one-in-ten sample table, and this is what I got:

select sql_id,event, count(1)
from dba_hist_active_sess_history
where sample_time > to_date('05032015 10:00:00','ddmmrrrr hh24:mi:ss')
and   sample_time < to_date('05032015 11:00:00','ddmmrrrr hh24:mi:ss')
and event like '%TM%'
group by sql_id,event
order by 3 desc;

53xthsbv8d7yk     enq: TM - contention    4878
1w95zpw2fy021     enq: TM - contention    340
35ghv3bugv22a     enq: TM - contention    264
8b9nqpzs24n0t     enq: TM - contention    163
aqdaq2ybqkrpa     enq: TM - contention    156
50gygyqsha3nr     enq: TM - contention    103
fzfvzhjg0p6y0     enq: TM - contention    82
bs359cfsq4fvc     enq: TM - contention    80
15xpc3um0c3a2     enq: TM - contention    58
d0rndrymh0b18     enq: TM - contention    49
864jbgkbpvcnf     enq: TM - contention    40
9cn21y7hbya46     enq: TM - contention    36
8419w8jnhfa3m     enq: TM - contention    33
f71jbkdy94pph     enq: TM - contention    5
2zpyy8wbnp5d0     enq: TM - contention    3
0d6gq7b9j522p     enq: TM - contention    2

Normally, anything we can see in dba_hist_active_sess_history must have travelled through v$active_session_history; and the more recent the situation, the better the chance of finding it there. Why then had I not found what I was expecting? Before answering this simple question, let me say a few words about how I explained this TM lock and the solution I proposed to get rid of it.

A TM enqueue is almost always related to an unindexed foreign key. The session killed by the DBA had been deleting from a parent table (parent_tab). The underlying child table (child_tab) pointed to this parent table via an unindexed foreign key, which was consequently locked. In the meantime, the application was concurrently inserting into other tables. Those inserts add child values referencing the locked child_tab (which, in the eyes of those insert statements, is itself a parent table), so Oracle had to check their existence in the locked child_tab; hence the blocking experienced by the end users during their inserts. The solution was simply to index the foreign key on child_tab.
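For completeness, such unindexed foreign keys can be spotted beforehand with a dictionary query along these lines (a simplified sketch: it only compares the leading column of each foreign key against the leading column of existing indexes; a complete version would match full column lists):

```sql
-- List foreign key constraints whose leading column is not the
-- leading column of any index on the child table.
select c.table_name, c.constraint_name, cc.column_name
from   user_constraints  c
join   user_cons_columns cc
       on  cc.constraint_name = c.constraint_name
       and cc.position = 1
where  c.constraint_type = 'R'          -- referential (foreign key) constraints
and    not exists (select null
                   from   user_ind_columns ic
                   where  ic.table_name      = c.table_name
                   and    ic.column_name     = cc.column_name
                   and    ic.column_position = 1);
```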

Back to the reason that prompted me to write this note: why was v$active_session_history not showing the same very recent TM lock as dba_hist_active_sess_history? The answer is simply that I had forgotten that the application runs on RAC and I was pointing to the other node, so that when I issued the same select against the gv$ view the discrepancy between the two views disappeared immediately, as shown below:

SQL> select sql_id,event, count(1)
    from gv$active_session_history
    where sample_time > to_date('05032015 10:00:00','ddmmrrrr hh24:mi:ss')
    and   sample_time < to_date('05032015 11:00:00','ddmmrrrr hh24:mi:ss')
    and event like '%TM%'
    group by sql_id,event
    order by 3 desc;

SQL_ID        EVENT                       COUNT(1)
------------- --------------------------- ---------
53xthsbv8d7yk enq: TM - contention        48483
1w95zpw2fy021 enq: TM - contention        3370
35ghv3bugv22a enq: TM - contention        2635
8b9nqpzs24n0t enq: TM - contention        1660
aqdaq2ybqkrpa enq: TM - contention        1548
50gygyqsha3nr enq: TM - contention        1035
fzfvzhjg0p6y0 enq: TM - contention        821
bs359cfsq4fvc enq: TM - contention        801
15xpc3um0c3a2 enq: TM - contention        585
d0rndrymh0b18 enq: TM - contention        491
864jbgkbpvcnf enq: TM - contention        378
9cn21y7hbya46 enq: TM - contention        366
8419w8jnhfa3m enq: TM - contention        331
f71jbkdy94pph enq: TM - contention        46
2zpyy8wbnp5d0 enq: TM - contention        33
0d6gq7b9j522p enq: TM - contention        15
dmpafdd7anvrw enq: TM - contention        1
3fbwp7qdqxk9v enq: TM - contention        1

select sql_id,event, count(1)
from dba_hist_active_sess_history
where sample_time > to_date('05032015 10:00:00','ddmmrrrr hh24:mi:ss')
and   sample_time < to_date('05032015 11:00:00','ddmmrrrr hh24:mi:ss')
and event like '%TM%'
group by sql_id,event
order by 3 desc;

53xthsbv8d7yk     enq: TM - contention    4878
1w95zpw2fy021     enq: TM - contention    340
35ghv3bugv22a     enq: TM - contention    264
8b9nqpzs24n0t     enq: TM - contention    163
aqdaq2ybqkrpa     enq: TM - contention    156
50gygyqsha3nr     enq: TM - contention    103
fzfvzhjg0p6y0     enq: TM - contention    82
bs359cfsq4fvc     enq: TM - contention    80
15xpc3um0c3a2     enq: TM - contention    58
d0rndrymh0b18     enq: TM - contention    49
864jbgkbpvcnf     enq: TM - contention    40
9cn21y7hbya46     enq: TM - contention    36
8419w8jnhfa3m     enq: TM - contention    33
f71jbkdy94pph     enq: TM - contention    5
2zpyy8wbnp5d0     enq: TM - contention    3
0d6gq7b9j522p     enq: TM - contention    2

Note, by the way, the neat illustration of the basics mentioned earlier in this article, i.e. that dba_hist_active_sess_history is a one-in-ten sample of v$active_session_history:

48483/10 ~ 4878 
3370/10  ~  340
2635/10  ~ 264
1660/10  ~ 163
1548/10  ~ 156

March 9, 2015

Index compression: two birds with one stone

Filed under: Index — hourim @ 7:52 pm

Have you ever been asked to troubleshoot a performance issue in a complex query under the following restrictions:

  • You are not allowed to change the code of the query because it belongs to a third party software
  • You are not allowed to create a new index because of disk space stress
  • You are supposed to solve the issue without using either an SPM baseline or a SQL Profile

What do you think you still have on your hands to tackle this issue?

I was quite confident that the performance issue was coming, as is almost always the case, from poor or non-representative statistics which had ultimately biased the optimizer into choosing a wrong execution path. The row source execution plan, taken this time from the Real Time SQL Monitoring report, confirmed my initial feeling about non-representative statistics, as shown by the several cardinality estimates of 1 below (Rows Estim):

SQL Plan Monitoring Details (Plan Hash Value=2278065992)
==============================================================================================
| Id |                Operation                 |          Name |  Rows   | Execs |   Rows   |
|    |                                          |               | (Estim) |       | (Actual) |
==============================================================================================
|  0 | SELECT STATEMENT                         |               |         |     1 |     1059 |
|  1 |   HASH GROUP BY                          |               |       1 |     1 |     1059 |
|  2 |    FILTER                                |               |         |     1 |     135K |
|  3 |     NESTED LOOPS                         |               |       1 |     1 |     135K |
|  4 |      NESTED LOOPS                        |               |       1 |     1 |     135K |
|  5 |       NESTED LOOPS OUTER                 |               |       1 |     1 |     135K |
|  6 |        NESTED LOOPS                      |               |       1 |     1 |     135K |
|  7 |         HASH JOIN OUTER                  |               |       1 |     1 |     145K |
|  8 |          NESTED LOOPS                    |               |         |     1 |     145K |
|  9 |           NESTED LOOPS                   |               |       1 |     1 |     145K |
| 10 |            NESTED LOOPS                  |               |       1 |     1 |     146K |
| 11 |             NESTED LOOPS                 |               |       1 |     1 |     146K |
| 12 |              NESTED LOOPS                |               |       1 |     1 |        1 |
| 13 |               NESTED LOOPS               |               |       1 |     1 |        1 |
| 14 |                FAST DUAL                 |               |       1 |     1 |        1 |
| 15 |                FAST DUAL                 |               |       1 |     1 |        1 |
| 16 |               FAST DUAL                  |               |       1 |     1 |        1 |
| 17 |              TABLE ACCESS BY INDEX ROWID | TABLE_XY      |       1 |     1 |     146K |
| 18 |               INDEX RANGE SCAN           | IDX_TABLE_XY23|       1 |     1 |      12M |
==============================================================================================
17 - filter(("AU"."NEW_VALUE"=:SYS_B_119 AND "AU"."ATTRIBUTE_NAME"=:SYS_B_118))
18 - access("AU"."UPDATED_DATE">=TO_DATE(:SYS_B_001||TO_CHAR(EXTRACT(MONTH FROM
              ADD_MONTHS(CURRENT_DATE,(-:SYS_B_000))))||:SYS_B_002||TO_CHAR(EXTRACT(YEAR FROM
              ADD_MONTHS(CURRENT_DATE,(-:SYS_B_000))))||:SYS_B_003,:SYS_B_004)
              AND "AU"."COURSE_NAME"=:SYS_B_117
              AND "AU"."UPDATED_DATE"<=TO_DATE(TO_CHAR(LAST_DAY(ADD_MONTHS(CURRENT_DATE,(-
             :SYS_B_000))),:SYS_B_005)||:SYS_B_006,:SYS_B_007))
     filter("AU"."COURSE_NAME"=:SYS_B_117)

Collecting statistics adequately was of course the right path to follow; however, looking at the above execution plan I realized that the most consuming operation was an index range scan followed by a table access by index rowid. The index operation at line 18 was supplying its parent operation at line 17 with 12 million rows, of which the filter at that line allowed only about 1% (146K) to survive. This is a classical case of an imprecise index wasting time and energy throwing away rows that should have been eliminated earlier. There is another indication of the imprecision of the index used at line 18: its predicate part contains both an access and a filter operation. To make the picture clear, here below are the index definition and the predicate part used in the problematic query:

IDX_TABLE_XY23 (UPDATED_DATE, COURSE_NAME)

FROM TABLE_XY     
WHERE AU.COURSE_NAME = 'Point'
AND AU.UPDATED_DATE  >= PidDates.END_DATE

The index has been defined to start with the column to which an inequality is applied. We should instead always place at the leading edge of the index the column on which we intend to apply an equality predicate. One solution would be to reverse the order of the index columns. But I was not going to do that without checking the whole application for queries using the UPDATED_DATE column in an equality predicate, which reversing the index columns would harm. Fortunately there were no such queries, and the way was paved for the column-reversal proposition.
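One way to find statements that depend on the current index is to look for cursors still cached in the shared pool that reference it (a sketch; the index name is the one from this article, and of course the library cache only reflects recently executed statements, not the whole application):

```sql
-- Which cached cursors are currently using IDX_TABLE_XY23?
select distinct p.sql_id, s.sql_text
from   v$sql_plan p
join   v$sql s
       on  s.sql_id       = p.sql_id
       and s.child_number = p.child_number
where  p.object_owner = user
and    p.object_name  = 'IDX_TABLE_XY23';
```

The predicate sections of those plans then show whether UPDATED_DATE appears in equality or inequality access predicates.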

A few minutes before going ahead, I remembered that this application suffers from a disk space shortage, so compressing the index would certainly help. Moreover, the “new” leading column is by far the more repetitive of the two (63 distinct values against 1.3 billion), which should give a good level of compressibility:

SQL> select column_name, num_distinct
     from all_tab_col_statistics
     where table_name = 'TABLE_XY'
     and column_name in ('UPDATED_DATE','COURSE_NAME');

COLUMN_NAME                    NUM_DISTINCT
------------------------------ ------------
UPDATED_DATE                   1309016064
COURSE_NAME                    63
SQL> create index idx_TABLE_XYmh on TABLE_XY(COURSE_NAME, UPDATED_DATE) compress 1;

And here are the results

Before the new index

1059 rows selected.
Elapsed: 00:32:37.34

Global Stats
================================================================
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
================================================================
|    2020 |     154 |     1867 |    72 |    10M |   1M |   9GB |
================================================================

After the new index

1059 rows selected.
Elapsed: 00:19:56.08

Global Stats
========================================================
| Elapsed |   Cpu   |    IO    | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Gets  | Reqs | Bytes |
========================================================
|    1204 |      70 |     1134 |     9M | 463K |   4GB |
========================================================

The query execution time dropped from about 33 minutes to about 20 minutes, and the logical and physical I/O dropped as well.

Of course I am still throwing away the same amount of rows at the table level, as shown below:

SQL Plan Monitoring Details (Plan Hash Value=2577971998)
===================================================================
| Id    |  Operation                 |          Name   |   Rows   |
|       |                            |                 | (Actual) |
===================================================================
|    17 |TABLE ACCESS BY INDEX ROWID | TABLE_XY        |     146K |
|    18 | INDEX RANGE SCAN           | IDX_TABLE_XYMH  |      12M |
===================================================================
17 - filter(("AU"."NEW_VALUE"=:SYS_B_119 AND "AU"."ATTRIBUTE_NAME"=:SYS_B_118))
18 - access("AU"."COURSE_NAME"=:SYS_B_117 AND
           "AU"."UPDATED_DATE">=TO_DATE(:SYS_B_001||TO_CHAR(EXTRACT(MONTH FROM
            ADD_MONTHS(CURRENT_DATE,(-:SYS_B_000))))||:SYS_B_002||TO_CHAR(EXTRACT(YEAR FROM
            ADD_MONTHS(CURRENT_DATE,(-:SYS_B_000))))||:SYS_B_003,:SYS_B_004) AND
           "AU"."UPDATED_DATE"<=TO_DATE(TO_CHAR(LAST_DAY(ADD_MONTHS(CURRENT_DATE,(-:SYS_B_000)))
           ,:SYS_B_005)||:SYS_B_006,:SYS_B_007))

But I have drastically reduced the part of the index I traverse compared to before the column reordering, and I have got rid of the filter in the index predicate. And finally, because it is the title of this article, by compressing the index I have gained 67GB:

SQL> select segment_name, trunc(bytes/1024/1024/1024) GB
    from dba_segments
    where segment_type = 'INDEX'
    and segment_name in ('IDX_TABLE_XYMH','IDX_TABLE_XY23');

SEGMENT_NAME                           GB
------------------------------ ----------
IDX_TABLE_XYMH                        103
IDX_TABLE_XY23                        170
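As a side note, Oracle can itself suggest how many leading columns are worth compressing and the expected saving (a sketch; beware that VALIDATE STRUCTURE locks the table for the duration of the analysis, and index_stats holds a single row that is overwritten by each run and visible only in the current session):

```sql
-- Ask Oracle for the optimal compression prefix and expected saving
analyze index idx_TABLE_XYmh validate structure;

select name
     , opt_cmpr_count   -- optimal number of leading columns to compress
     , opt_cmpr_pctsave -- expected percentage of space saved
from   index_stats;
```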

 

February 24, 2015

Parallel Query, Broadcast distribution and TEMP space

Filed under: Parallel Processing — hourim @ 6:24 pm

When it comes to parallel operations there are fundamentally two data distribution methods between individual parallel (PX) server sets: hash-hash and broadcast distribution. Suppose that you are going to run a parallel HASH JOIN between two full table scans using a Degree of Parallelism (DOP) of 2. Since the HASH JOIN operation needs two distinct sets of PX servers, there will actually always be 2*DOP = 4 slaves acting in parallel: two slaves per PX set, (PX1 and PX2) for the first set and (PX3 and PX4) for the second. Each table in the join is read in parallel by one of the PX server sets. When a PX server set has finished collecting its data set, it needs to distribute it to the subsequent parallel HASH JOIN operation. This distribution is commonly done using either a hash-hash or a broadcast distribution. With the former (hash-hash), the result sets gathered by the parallel scans of both the build and the probe tables are distributed to the parallel server set responsible for the hash join. With a broadcast distribution, instead of distributing rows from both result sets, Oracle sends the smaller result set in its entirety to every parallel server slave of the set responsible for the subsequent hash join.

I bolded the words smaller result set to emphasize that smaller is relative to the other row source in the join. Would you consider a build table of 78 million rows a small data set? It might be considered smaller in the eyes of the optimizer when the probe table is estimated to generate 770 million rows. This is exactly what happened to me.

If this concept of parallel data distribution is still not clear to you, and you want to understand what I encountered and how I managed to pull myself out of this nightmare, then reading on might be worth the effort.

I was asked to create a big table from a join of two other big tables; doing this serially was practically impossible to complete in an acceptable execution time, so I decided to create it in parallel. After having enabled parallel DML and forced parallel DDL, I launched the following create table statement with a DOP of 8 for both DML and DDL:
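For reference, the “enabled parallel DML and forced parallel DDL” part corresponds to session-level settings along these lines (a sketch; the DOP matches the one used here):

```sql
alter session enable parallel dml;             -- allow DML to run in parallel
alter session force parallel ddl parallel 8;   -- force DDL with a DOP of 8
alter session force parallel query parallel 8; -- force queries with a DOP of 8
```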

SQL> create /*+ parallel(8) */
       table table_of_dates
       tablespace dtbs
       pctfree 0
    as
    select
        /*+ parallel(8)
          full(t)
          full(tr)
      */
       t.tr_tabl_id
     , t.table1_date_time
     , t.writing_date
     , min(tr.arrivl_date)
     , max(tr.arrivl_date)
   from
      table1 t
   left outer join table2 tr
   on t.tr_tabl_id = tr.tr_tabl_id
   join table3
   on t.order_id = table3.order_id
   and tr.status not in ('CANCELED')
   where t.writing_date <= to_date('17.06.2011', 'dd.mm.yyyy')
   and table3.order_type = 'Broadcast'
   group by
         t.tr_tabl_id
       , t.table1_date_time
       , t.writing_date;

create /*+ parallel(8) */
*
ERROR at line 1:
ORA-12801: error signaled in parallel query server P013
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

As you can notice, it failed with an ORA-01652 error. Below is the corresponding Real Time SQL Monitoring report, showing that the HASH JOIN operation at line 7 reached 67GB, which is beyond the size of the physical TEMP tablespace (64GB); hence the ORA-01652 error.

Parallel Execution Details (DOP=8, Servers Allocated=16)
SQL Plan Monitoring Details (Plan Hash Value=3645515647)
=====================================================================================
| Id |              Operation              |        Name | Execs |   Rows   | Temp  |
|    |                                     |             |       | (Actual) | (Max) |
=====================================================================================
|  0 | CREATE TABLE STATEMENT              |             |    17 |          |       |
|  1 |   PX COORDINATOR                    |             |    17 |          |       |
|  2 |    PX SEND QC (RANDOM)              | :TQ10003    |       |          |       |
|  3 |     LOAD AS SELECT                  |             |       |          |       |
|  4 |      HASH GROUP BY                  |             |       |          |       |
|  5 |       PX RECEIVE                    |             |       |          |       |
|  6 |        PX SEND HASH                 | :TQ10002    |     8 |          |       |
|  7 |         HASH JOIN                   |             |     8 |        0 |   67G |
|  8 |          PX RECEIVE                 |             |     8 |     626M |       |
|  9 |           PX SEND BROADCAST         | :TQ10001    |     8 |     626M |       |
| 10 |            HASH JOIN                |             |     8 |      78M |       |
| 11 |             PX RECEIVE              |             |     8 |     372K |       |
| 12 |              PX SEND BROADCAST      | :TQ10000    |     8 |     372K |       |
| 13 |               PX BLOCK ITERATOR     |             |     8 |    46481 |       |
| 14 |                INDEX FAST FULL SCAN | IDX_TABLE3_3|   182 |    46481 |       |
| 15 |             PX BLOCK ITERATOR       |             |     8 |      88M |       |
| 16 |              TABLE ACCESS FULL      | TABLE1      |   120 |      88M |       |
| 17 |          PX BLOCK ITERATOR          |             |     8 |     717M |       |
| 18 |           TABLE ACCESS FULL         | TABLE2      |   233 |     717M |       |
=====================================================================================

I puzzled over the above execution plan for a few minutes and finally decided to try a second create table with a reduced degree of parallelism (4 instead of 8), and here is what I got:

SQL> create /*+ parallel(4) */
       table table_of_dates
       tablespace dtbs
       pctfree 0
    as
    select
        /*+ parallel(4)
         full(t)
         full(tr)
      */
       t.tr_tabl_id
     , t.table1_date_time
     , t.writing_date
     , min(tr.arrivl_date)
     , max(tr.arrivl_date)
   from
      table1 t
   left outer join table2 tr
   on t.tr_tabl_id = tr.tr_tabl_id
   join table3
   on t.order_id = table3.order_id
   and tr.status not in ('CANCELED')
   where t.writing_date <= to_date('17.06.2011', 'dd.mm.yyyy')
   and table3.order_type = 'Broadcast'
   group by
         t.tr_tabl_id
       , t.table1_date_time
       , t.writing_date;

Table created.
Elapsed: 00:31:42.29

The table was this time successfully created, in approximately 32 minutes.

Before going on to the next issue in the pipe, I wanted to understand why reducing the Degree Of Parallelism (DOP) from 8 to 4 had made the create statement succeed. The obvious thing to attempt was to compare the DOP 8 execution plan with the DOP 4 one. The first plan has already been shown above. The second one is presented below (reduced to only the information vital to the aim of this article):

Parallel Execution Details (DOP=4 , Servers Allocated=8)
SQL Plan Monitoring Details (Plan Hash Value=326881411)
=============================================================================================
| Id |                 Operation                  |       Name     | Execs |   Rows   |Temp |
|    |                                            |                |       | (Actual) |(Max)|
=============================================================================================
|  0 | CREATE TABLE STATEMENT                     |                |     9 |        4 |     |
|  1 |   PX COORDINATOR                           |                |     9 |        4 |     |
|  2 |    PX SEND QC (RANDOM)                     | :TQ10003       |     4 |        4 |     |
|  3 |     LOAD AS SELECT                         |                |     4 |        4 |     |
|  4 |      HASH GROUP BY                         |                |     4 |      75M |     |
|  5 |       PX RECEIVE                           |                |     4 |     168M |     |
|  6 |        PX SEND HASH                        | :TQ10002       |     4 |     168M |     |
|  7 |         HASH JOIN                          |                |     4 |     168M | 34G |
|  8 |          PX RECEIVE                        |                |     4 |     313M |     |
|  9 |           PX SEND BROADCAST                | :TQ10001       |     4 |     313M |     |
| 10 |            HASH JOIN                       |                |     4 |      78M |     |
| 11 |             BUFFER SORT                    |                |     4 |     186K |     |
| 12 |              PX RECEIVE                    |                |     4 |     186K |     |
| 13 |               PX SEND BROADCAST            | :TQ10000       |     1 |     186K |     |
| 14 |                TABLE ACCESS BY INDEX ROWID | TABLE3         |     1 |    46481 |     |
| 15 |                 INDEX RANGE SCAN           | IDX_ORDER_TYPE |     1 |    46481 |     |
| 16 |             PX BLOCK ITERATOR              |                |     4 |      88M |     |
| 17 |              TABLE ACCESS FULL             | TABLE1         |   115 |      88M |     |
| 18 |          PX BLOCK ITERATOR                 |                |     4 |     770M |     |
| 19 |           TABLE ACCESS FULL                | TABLE2         |   256 |     770M |     |
=============================================================================================

They do not share exactly the same execution plan (they have two different plan hash values). The irritating question was: why, by halving the degree of parallelism from 8 to 4, did the same SQL statement need almost half the amount of TEMP space (34GB) and complete successfully?

The answer resides in the parallel distribution (PQ Distrib) method used by the parallel server sets to distribute their collected rows to the subsequent parallel server set (the one doing the hash join).

In the DOP 4 execution plan above we can see four PX parallel server sets, each one responsible for filling up one of the virtual table queues: TQ10000, TQ10001, TQ10002 and TQ10003. Here below is how to read the above execution plan:

  1. The first set of slaves (PX1) reads TABLE3 and broadcasts its data set to the second set (PX2) through the TQ10000 virtual table.
  2. The PX2 set reads TABLE1, hash joins it with the data set it received from PX1 (TQ10000), and broadcasts its result to the next parallel
     server set, PX3, via the second virtual table, TQ10001.
  3. The PX3 set probes TABLE2 via a parallel full scan and hash joins it with the build result set (TQ10001) received from PX2. This operation
     ends up filling the third virtual table, TQ10002, and sending it to the next and last PX set, PX4, using a hash distribution.
  4. Finally, the PX4 set of slaves receives the TQ10002 data, hash groups it, fills the last virtual table (TQ10003) and sends it to the query
     coordinator (QC), which ends up creating the table_of_dates table.

That is a simple way of reading a parallel execution plan. By listing the above parallel operations I aimed to emphasize that the data produced by the PX2 set of slaves (the HASH JOIN operation at line 10) is broadcast (PX SEND BROADCAST operation at line 9) to the next parallel set, PX3. This means that each slave of the PX2 set passes every row it produces to every slave of the PX3 set. A typical reason for this distribution method is that the data set of the first row source (78M rows in this case) is “smaller” than the second row source it is to be joined with (770M rows).

In order to make the picture clear, let's zoom in on the hash join operation in the DOP 4 plan:

Parallel Execution Details (DOP=4 , Servers Allocated=8)
======================================================================
| Id |        Operation    |       Name     | Execs |   Rows   |Temp |
|    |                     |                |       | (Actual) |(Max)|
======================================================================
|  7 |HASH JOIN            |                |     4 |     168M | 34G |
|  8 | PX RECEIVE          |                |     4 |     313M |     |
|  9 |  PX SEND BROADCAST  | :TQ10001       |     4 |     313M |     |
| 10 |   HASH JOIN         |                |     4 |      78M |     |
| 18 | PX BLOCK ITERATOR   |                |     4 |     770M |     |
| 19 |  TABLE ACCESS FULL  | TABLE2         |   256 |     770M |     |
======================================================================

The “smaller” result set produced by the PX2 set of slaves at line 10 is 78M rows. Since Oracle decided to broadcast those 78M rows towards the PX3 set, they were duplicated as many times as there are slaves in the PX3 set to receive the broadcast data. With a DOP of 4, this means that the HASH JOIN operation at line 7 received 4*78M ≈ 313M rows (operation at line 8), which had to be built and hashed in memory in order to probe the result set produced by the second parallel server set coming from operations 18-19.

Double the DOP, keep the same parallel distribution method, and the same HASH JOIN operation has to build and hash a table of 8*78M ≈ 626M rows, which ultimately required more than 67GB of TEMP space (instead of the 34GB at DOP 4), as shown below:

Parallel Execution Details (DOP=8, Servers Allocated=16)
===================================================================
| Id |     Operation     |        Name | Execs |   Rows   | Temp  |
|    |                   |             |       | (Actual) | (Max) |
===================================================================
|  7 |HASH JOIN          |             |     8 |        0 |   67G |
|  8 | PX RECEIVE        |             |     8 |     626M |       |
|  9 |  PX SEND BROADCAST| :TQ10001    |     8 |     626M |       |
| 10 |   HASH JOIN       |             |     8 |      78M |       |
| 17 | PX BLOCK ITERATOR |             |     8 |     717M |       |
| 18 |  TABLE ACCESS FULL| TABLE2      |   233 |     717M |       |
===================================================================
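Incidentally, the duplication effect of a broadcast can also be observed directly, right after a parallel statement finishes in the same session, through v$pq_tqstat (a sketch; for a broadcast table queue, the producer row counts add up to the data-set size multiplied by the number of consumer slaves):

```sql
-- Rows exchanged per slave and per table queue for the last
-- parallel statement executed by this session
select dfo_number, tq_id, server_type, process, num_rows
from   v$pq_tqstat
order  by dfo_number, tq_id, server_type desc, process;
```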

Now that I knew where the initial ORA-01652 error was coming from, changing the parallel distribution method from BROADCAST to HASH might do the job without requiring more than 64GB of TEMP space. A simple way to accomplish this is to hint the parallel select with the desired parallel distribution method. From the outline of the successful DOP 4 execution plan I took the appropriate hint (pq_distribute(alias hash hash)), adapted it, and issued the following create table:

SQL> create /*+ parallel(8) */
       table table_of_dates
       tablespace dtbs
       pctfree 0
    as
    select
        /*+ parallel(8)
          full(t)
          full(tr)
	  pq_distribute(@"SEL$E07F6F7C" "T"@"SEL$2" hash hash)
          px_join_filter(@"SEL$E07F6F7C" "T"@"SEL$2")
          pq_distribute(@"SEL$E07F6F7C" "TR"@"SEL$1" hash hash)
          px_join_filter(@"SEL$E07F6F7C" "TR"@"SEL$1")
      */
       t.tr_tabl_id
     , t.table1_date_time
     , t.writing_date
     , min(tr.arrivl_date)
     , max(tr.arrivl_date)
   from
      table1 t
   left outer join table2 tr
   on t.tr_tabl_id = tr.tr_tabl_id
   join table3
   on t.order_id = table3.order_id
   and tr.status not in ('CANCELED')
   where t.writing_date <= to_date('17.06.2011', 'dd.mm.yyyy')
   and table3.order_type = 'Broadcast'
   group by
         t.tr_tabl_id
       , t.table1_date_time
       , t.writing_date;	   

Table created.
Elapsed: 00:12:46.33

And the job was done in less than 13 minutes, instead of the roughly 32 minutes it took with DOP 4.

The new execution plan is:

SQL Plan Monitoring Details (Plan Hash Value=5257928)
====================================================================================
| Id |              Operation              |        Name |Execs |   Rows   | Temp  |
|    |                                     |             |      | (Actual) | (Max) |
====================================================================================
|  0 | CREATE TABLE STATEMENT              |             |   17 |        8 |       |
|  1 |   PX COORDINATOR                    |             |   17 |        8 |       |
|  2 |    PX SEND QC (RANDOM)              | :TQ10004    |    8 |        8 |       |
|  3 |     LOAD AS SELECT                  |             |    8 |        8 |       |
|  4 |      HASH GROUP BY                  |             |    8 |      75M |   10G |
|  5 |       HASH JOIN                     |             |    8 |     168M |       |
|  6 |        JOIN FILTER CREATE           | :BF0000     |    8 |      78M |       |
|  7 |         PX RECEIVE                  |             |    8 |      78M |       |
|  8 |          PX SEND HASH               | :TQ10002    |    8 |      78M |       |
|  9 |           HASH JOIN BUFFERED        |             |    8 |      78M |    3G |
| 10 |            JOIN FILTER CREATE       | :BF0001     |    8 |    46481 |       |
| 11 |             PX RECEIVE              |             |    8 |    46481 |       |
| 12 |              PX SEND HASH           | :TQ10000    |    8 |    46481 |       |
| 13 |               PX BLOCK ITERATOR     |             |    8 |    46481 |       |
| 14 |                INDEX FAST FULL SCAN | IDX_TABLE3_3|  181 |    46481 |       |
| 15 |            PX RECEIVE               |             |    8 |      88M |       |
| 16 |             PX SEND HASH            | :TQ10001    |    8 |      88M |       |
| 17 |              JOIN FILTER USE        | :BF0001     |    8 |      88M |       |
| 18 |               PX BLOCK ITERATOR     |             |    8 |      88M |       |
| 19 |                TABLE ACCESS FULL    | TABLE1      |  121 |      88M |       |
| 20 |        PX RECEIVE                   |             |    8 |     770M |       |
| 21 |         PX SEND HASH                | :TQ10003    |    8 |     770M |       |
| 22 |          JOIN FILTER USE            | :BF0000     |    8 |     770M |       |
| 23 |           PX BLOCK ITERATOR         |             |    8 |     770M |       |
| 24 |            TABLE ACCESS FULL        | TABLE2      |  245 |     770M |       |
====================================================================================

Changing the parallel distribution method from broadcast to hash-hash has not only reduced the TEMP space usage but also halved the execution time of the create table.

Conclusion
Watch carefully the degree of parallelism you are going to use. With a broadcast distribution method applied to a relatively big "smaller result set" you might end up devouring a huge amount of TEMP space and failing with ORA-01652: unable to extend temp segment by 128 in tablespace TEMP. Fortunately, with the arrival in 12c of the Adaptive Distribution Method and the new HYBRID HASH parallel distribution method, the STATISTICS COLLECTOR operation placed below the distribution method will dynamically switch from a BROADCAST distribution to a HASH distribution whenever the number of rows to be distributed exceeds a threshold of 2*DOP. With my initial DOP of 8 I would have had a threshold of 16, far below the 78 million rows I had to distribute further up the plan.
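For completeness, pre-12c the switch to a hash-hash distribution can also be requested explicitly via the pq_distribute hint rather than by lowering the DOP. Here is a minimal sketch of the idea, with hypothetical table and column names:

```sql
-- Hypothetical sketch: ask for a hash-hash distribution on the join to the
-- big probe table instead of letting the optimizer broadcast the build side.
create /*+ parallel(8) */ table my_ctas_table as
select /*+ leading(t1 t2)
           use_hash(t2)
           pq_distribute(t2 hash hash)  -- hash-hash instead of broadcast
       */
       t1.id, t2.amount
from   table1 t1
       join table2 t2 on t2.id = t1.id;
```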

February 14, 2015

Resumable time out

Filed under: Tuning — hourim @ 9:47 am

I was trying to create a very big table through a create table as select from a join between two huge tables, forcing both parallel DDL and parallel query, and was mainly concerned about the amount of TEMP space the hash join between those two big tables would require to get my table smoothly created. I say hash join because my first attempt to create this table used a nested loop join, which had been driving a dramatic 100 million starts of the table access by index rowid operation for more than 10 hours when I decided to kill the underlying session. The create table being a one-shot process, I decided to hint the optimizer so that it would opt for a hash join instead of the initial nested loop. But my concern then shifted from a long-running, never-finishing SQL statement to how much TEMP space my create table would need in order to complete successfully.
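The hinting mentioned above can be sketched as follows, with hypothetical table and column names:

```sql
-- Hypothetical sketch: steer a one-shot parallel CTAS away from the
-- nested loop and towards a hash join between the two big tables.
create /*+ parallel(8) */ table big_result as
select /*+ leading(a b)
           use_hash(b)    -- join b via hash join
           no_use_nl(b)   -- and forbid the nested loop
           full(a) full(b)
       */
       a.id, a.col1, b.col2
from   big_table_a a
       join big_table_b b on b.id = a.id;
```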

I launched the create table and started monitoring it with Tanel Poder's snapper script, v$active_session_history and the Real Time SQL Monitoring feature (RTSM). This is what the monitoring was showing:

SQL> select event
           ,count(1)
     from   v$active_session_history
     where  sql_id = '0xhm7700rq6tt'
     group  by event
     order  by 2 desc;


EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

However, a few minutes later Tanel Poder's snapper started showing the following dominant wait event:

SQL> @snapper ash 5 1 all

----------------------------------------------------------------------------------------------------
Active% | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
----------------------------------------------------------------------------------------------------
   800% |    1 | 0xhm7700rq6tt   | 0         | statement suspended, wait error to  | Configuration

--  End of ASH snap 1, end=2015-02-12 09:12:57, seconds=5, samples_taken=45

This event kept incrementing in v$active_session_history, as shown below:

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                          7400
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

7 rows selected.

SQL> /

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                          7440
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

7 rows selected.

SQL> /

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                          7480
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                         11943
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        980
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

The only wait event that kept incrementing was that bizarre statement suspended, wait error to be cleared. The session from which I launched the create table statement was hanging without reporting any error.

Since my initial concern was the TEMP space, I checked the available space there:

SQL> select 
       tablespace_name,
       total_blocks,
       used_blocks,
       free_blocks
     from v$sort_segment;

TABLESPACE_NAME                 TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS
------------------------------- ------------ ----------- -----------
TEMP                                 8191744     8191744           0    

There was no free space left in the TEMP tablespace.
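When TEMP is exhausted it also helps to see which sessions are holding the space. A sketch of the kind of query I have in mind, joining v$tempseg_usage to v$session (an 8k block size is assumed in the MB conversion):

```sql
-- Which sessions are consuming TEMP space, biggest consumers first?
select s.sid
      ,s.sql_id
      ,u.tablespace
      ,round(u.blocks * 8 / 1024) as used_mb  -- assumes 8k blocks
from   v$tempseg_usage u
       join v$session s on s.saddr = u.session_addr
order  by u.blocks desc;
```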

The RTSM report of the corresponding sql_id showed the following:

=====================================================================================
| Id    |              Operation              |        Name        |  Rows   | Temp |
|       |                                     |                    |(Actual) |      |
=====================================================================================
|     0 | CREATE TABLE STATEMENT              |                    |       0 |      |
|     1 |   PX COORDINATOR                    |                    |         |      |
|     2 |    PX SEND QC (RANDOM)              | :TQ10003           |         |      |
|     3 |     LOAD AS SELECT                  |                    |         |      |
|     4 |      HASH GROUP BY                  |                    |         |      |
|     5 |       PX RECEIVE                    |                    |         |      |
|     6 |        PX SEND HASH                 | :TQ10002           |         |      |
|  -> 7 |         HASH JOIN                   |                    |       0 |  66G | --> spot this 
|       |                                     |                    |         |      |
|       |                                     |                    |         |      |
|       |                                     |                    |         |      |
|     8 |          PX RECEIVE                 |                    |    626M |      |
|     9 |           PX SEND BROADCAST         | :TQ10001           |    626M |      |
|    10 |            HASH JOIN                |                    |     78M |      |
|    11 |             PX RECEIVE              |                    |    372K |      |
|    12 |              PX SEND BROADCAST      | :TQ10000           |    372K |      |
|    13 |               PX BLOCK ITERATOR     |                    |   46481 |      |
|    14 |                INDEX FAST FULL SCAN | INDX_12453656_TABL |   46481 |      |
|    15 |             PX BLOCK ITERATOR       |                    |     88M |      |
|    16 |              TABLE ACCESS FULL      | TABL1              |     88M |      |
|       |                                     |                    |         |      |
| -> 17 |          PX BLOCK ITERATOR          |                    |    682M |      |
| -> 18 |           TABLE ACCESS FULL         | TABLE123456        |    682M |      |
|       |                                     |                    |         |      |
=====================================================================================

The HASH JOIN operation at line 7 had already gone beyond the available TEMP tablespace size, which is 64G. I was wondering, then, why my session had not been stopped with an ORA-01652 error instead of hanging there and letting the statement suspended, wait error to be cleared wait event pop up in my different monitoring tools.

After a few minutes of googling I ended up executing the following query:

SQL>select
        coord_session_id
       ,substr(sql_text, 1,10)
       ,error_msg
    from dba_resumable;

SESSION_ID  SQL        ERROR_MSG
----------- ---------  -------------------------------------------------------------------
146         create /*+  ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

This SID 146 is my create table session! It is in error but hanging, waiting on that statement suspended wait event.

A few more minutes of googling led me to check the following parameter:

SQL> sho parameter resumable

NAME                           TYPE                             VALUE
------------------------------ -------------------------------- ------
resumable_timeout              integer                          9000

Notice how I have finally made the link between the title and the purpose of this article: resumable_timeout. The Oracle documentation states that when this parameter is set to a nonzero value, resumable space allocation kicks in.

Put simply, what happened in my case is this: since this parameter was set to 9000 seconds and since my create table as select encountered an ORA-01652 TEMP tablespace error, Oracle put my session into a suspended, resumable state until someone noticed the space shortage, corrected the situation and allowed the session to continue its normal processing.
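As resumable_timeout is expressed in seconds, 9000 corresponds to 2.5 hours of suspension. If you would rather have statements fail immediately, the feature can be disabled system-wide and, where wanted, enabled per session; a sketch (the statement name is hypothetical):

```sql
-- Disable automatic resumable space allocation system-wide:
alter system set resumable_timeout = 0;

-- Or enable it only in the session that needs it, e.g. for one hour:
alter session enable resumable timeout 3600 name 'monthly CTAS';
```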

The next step was to abort my session's resumable state:

SQL> exec dbms_resumable.abort(146);
 
PL/SQL procedure successfully completed.

Unfortunately, after a couple of minutes in front of my sqlplus session waiting for it to resume, without success, I decided to use brute force and kill the session:

ERROR:
ORA-03113: end-of-file on communication channel
Process ID: 690975
Session ID: 146 Serial number: 3837

Summary
I would suggest not setting the resumable_timeout parameter to a nonzero value. Otherwise you may find yourself hanging on a statement suspended, wait error to be cleared event for the amount of time indicated by this parameter while looking around for what is making your SQL statement hang.

Footnote
Here are a few extra words from the Oracle documentation about resumable space allocation:

“Resumable space allocation avoids headaches and saves time by suspending, instead
of terminating, a large database operation requiring more disk space than is currently
available. While the operation is suspended, you can allocate more disk space on
the destination tablespace or increase the quota for the user. Once the low space
condition is addressed, the large database operation automatically picks up where it
left off.
As you might expect, the statements that resume are known as resumable statements.
The suspended statement, if it is part of a transaction, also suspends the transaction.
When disk space becomes available and the suspended statement resumes, the
transaction can be committed or rolled back whether or not any statements in
the transactions were suspended. The following conditions can trigger resumable
space allocation:
■ Out of disk space in a permanent or temporary tablespace
■ Maximum extents reached on a tablespace
■ User space quota exceeded
You can also control how long a statement can be suspended. The default time
interval is two hours, at which point the statement fails and returns an error message
to the user or application as if the statement was not suspended at all.
There are four general categories of commands that can be resumable:
(1) SELECT statements, (2) DML commands, (3) SQL*Loader operations, and
(4) DDL statements that allocate disk space.”
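Rather than discovering a suspended statement by accident, an AFTER SUSPEND trigger can make the suspension visible. A sketch, assuming a hypothetical resumable_log table with (logged_at date, error_type varchar2(64), tbs varchar2(128)):

```sql
-- Sketch: record every suspended statement in a log table.
-- The autonomous transaction lets the insert commit even though
-- the triggering statement itself is suspended.
create or replace trigger trg_log_suspend
after suspend on database
declare
  pragma autonomous_transaction;
  v_err_type varchar2(64);
  v_obj_type varchar2(64);
  v_owner    varchar2(128);
  v_tbs      varchar2(128);
  v_obj      varchar2(128);
  v_subobj   varchar2(128);
begin
  if dbms_resumable.space_error_info(v_err_type, v_obj_type, v_owner,
                                     v_tbs, v_obj, v_subobj) then
    insert into resumable_log values (sysdate, v_err_type, v_tbs);
    commit;
  end if;
end;
/
```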

January 6, 2015

Approximate_ndv

Filed under: Oracle,Statistics — hourim @ 8:17 pm

A couple of months ago I was asked to do a pro-active tuning task on an application running Oracle 11.2.0.3.0 under Linux x86 64-bit which was going live in production within the next couple of months. I was supplied a 60-minute pre-production AWR report covering a period during which a critical batch job was running. To my surprise, the following SQL statement popped up at the top of several "SQL ordered by" sections of the AWR report:

select
  /*+ no_parallel(t) 
      no_parallel_index(t) 
      dbms_stats cursor_sharing_exact 
      use_weak_name_resl 
      dynamic_sampling(0) 
      no_monitoring 
      no_substrb_pad 
   */
    count(*),
    count(distinct "SABR_ID"),
    sum(sys_op_opnsize("SABR_ID")),
    substrb(dump(min("SABR_ID"), 16, 0, 32), 1, 120),
    substrb(dump(max("SABR_ID"), 16, 0, 32), 1, 120),
    count(distinct "TYPE_ID"),
    sum(sys_op_opnsize("TYPE_ID")),
    substrb(dump(min(substrb("TYPE_ID", 1, 32)), 16, 0, 32), 1, 120),
    substrb(dump(max(substrb("TYPE_ID", 1, 32)), 16, 0, 32), 1, 120)
from 
    "XXX"."SOM_ET_ORDER_KS" sample (33.0000000000) t

This is a call issued by the dbms_stats package when collecting statistics for the SOM_ET_ORDER_KS table.
So far so good right?
Hmmm……
Trained and experienced eyes would already have been caught by two things when observing the above SQL statement:

 count(distinct)
 sample (33.0000000000)

Do you know what those two points mean in an 11gR2 database?
The count(distinct) indicates that the application is collecting statistics with the global preference approximate_ndv set to false, which I immediately checked via this command:

SQL> select dbms_stats.get_prefs ('approximate_ndv') ndv from dual;
NDV
------
FALSE

While the sample (33) indicates that the estimate_percent parameter used by this application when collecting statistics is set to 33%:

     (estimate_percent => 33)

Is this the best way of collecting statistics?
I don’t think so.
Let’s build a proof of concept. Below is the table on which I will be collecting statistics with and without approximate_ndv

create table t_ndv
as select
      rownum n1
     ,trunc((rownum-1)/3) n2
     ,trunc(dbms_random.value(1,100)) n3
     ,dbms_random.string('L',dbms_random.value(1,5))||rownum v4
  from dual
  connect by level <=1e6;

There are 1 million rows in this table with 1 million distinct n1 values and 1 million distinct v4 values. There are however 333,334 n2 distinct values and only 99 distinct n3 values.
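Those distinct-value counts can be double-checked directly; this is exact, brute-force counting, only practical on a small test table like this one:

```sql
-- Exact distinct counts for the test table built above.
select count(*)           as total_rows
      ,count(distinct n1) as ndv_n1
      ,count(distinct n2) as ndv_n2
      ,count(distinct n3) as ndv_n3
      ,count(distinct v4) as ndv_v4
from   t_ndv;
```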
Let's check Oracle's behaviour when approximate_ndv is disabled. All the following tests have been done on a 12c Oracle database instance:


BANNER                                                                               
---------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production        
PL/SQL Release 12.1.0.1.0 - Production                                                    
CORE    12.1.0.1.0      Production                                                                
TNS for 64-bit Windows: Version 12.1.0.1.0 - Production                                   
NLSRTL Version 12.1.0.1.0 - Production                                                    
SQL> select dbms_stats.get_prefs ('approximate_ndv') ndv from dual;

NDV
-----
FALSE

SQL> alter session set events '10046 trace name context forever, level 12';

SQL> begin
  2      dbms_stats.gather_table_stats
  3      (user
  4      ,'t_ndv'
  5      ,method_opt => 'FOR ALL COLUMNS SIZE 1'
  6      ,estimate_percent => 57);
  7  end;
 8  /

SQL> alter session set events '10046 trace name context off';                                               

The corresponding tkprofed trace file shows a piece of code which is, as expected, similar to the above SQL I found at the top of the "SQL ordered by" sections of the AWR report I was asked to analyse:

select /*+  no_parallel(t) 
            no_parallel_index(t) 
            dbms_stats 
            cursor_sharing_exact 
            use_weak_name_resl 
            dynamic_sampling(0) 
            no_monitoring 
            xmlindex_sel_idx_tbl 
           no_substrb_pad  
       */
  count(*)
, count("N1")
, count(distinct "N1")
, sum(sys_op_opnsize("N1"))
, substrb(dump(min("N1"),16,0,64),1,240)
, substrb(dump(max("N1"),16,0,64),1,240)
, count("N2")
, count(distinct "N2"), sum(sys_op_opnsize("N2")) 
 …etc...
from
 "C##MHOURI"."T_NDV" sample (57.0000000000)  t 

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      8.03       8.03          0       3861          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      8.03       8.04          0       3861          0           1

And the collected statistics information on t_ndv table is

SQL> select num_rows, blocks, avg_row_len, sample_size 
     from user_tables 
     where table_name = 'T_NDV';

  NUM_ROWS     BLOCKS AVG_ROW_LEN SAMPLE_SIZE
---------- ---------- ----------- -----------
   1000586       3924          23      570334

SQL> select
         column_name
        ,num_distinct
        ,sample_size
    from user_tab_columns
    where table_name = 'T_NDV';

COLUMN_NAME          NUM_DISTINCT SAMPLE_SIZE
-------------------- ------------ -----------
V4                        1001467      570836
N3                             99      570836
N2                         333484      570836
N1                        1001467      570836

When I did the same experiment, but with approximate_ndv set to true, this is what I obtained:

SQL> exec dbms_stats.set_global_prefs('approximate_ndv','TRUE');

SQL> exec dbms_stats.delete_table_stats(user ,'t_ndv');

SQL> alter session set events '10046 trace name context forever, level 12';

SQL> begin
  2      dbms_stats.gather_table_stats
  3      (user
  4      ,'t_ndv'
  5      ,method_opt       => 'FOR ALL COLUMNS SIZE 1'
         ,estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
  6     );
  7  end;
  8  /

SQL> alter session set events '10046 trace name context off';

And finally, here is the corresponding trace file, particularly the part that corresponds to gathering the number of distinct values:

select /*+  full(t)    
            no_parallel(t) 
            no_parallel_index(t) 
            dbms_stats 
            cursor_sharing_exact 
            use_weak_name_resl 
            dynamic_sampling(0) 
            no_monitoring 
            xmlindex_sel_idx_tbl 
            no_substrb_pad  */
  to_char(count("N1")),
  to_char(substrb(dump(min("N1"),16,0,64),1,240)),
  to_char(substrb(dump(max("N1"),16,0,64),1,240)),
  to_char(count("N2")),
  to_char(substrb(dump(min("N2"),16,0,64),1,240)),
  to_char(substrb(dump(max("N2"),16,0,64),1,240)),
  to_char(count("N3")),
  to_char(substrb(dump(min("N3"),16,0,64),1,240)),
  to_char(substrb(dump(max("N3"),16,0,64),1,240)),
  to_char(count("V4")),
  to_char(substrb(dump(min("V4"),16,0,64),1,240)),
  to_char(substrb(dump(max("V4"),16,0,64),1,240)) 
from
 "C##MHOURI"."T_NDV" t  /* NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL*/
 
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      2.85       2.88          0       3861          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Total        3      2.87       2.88          0       3861          0           1

SQL> select num_rows, blocks, avg_row_len, sample_size 
     from user_tables 
    where table_name = 'T_NDV';

  NUM_ROWS     BLOCKS AVG_ROW_LEN SAMPLE_SIZE
---------- ---------- ----------- -----------
   1000000       3924          23     1000000

SQL> select
        column_name
       ,num_distinct
       ,sample_size
    from user_tab_columns
    where table_name = 'T_NDV';

COLUMN_NAME          NUM_DISTINCT SAMPLE_SIZE
-------------------- ------------ -----------
V4                         990208     1000000
N3                             99     1000000
N2                         337344     1000000
N1                        1000000     1000000

Several important remarks can be made here. First, the elapsed time (which is almost entirely CPU time) has been reduced from 8 seconds to less than 3 seconds. Second, the sample size automatically used by Oracle is 100% (in response to the dbms_stats.auto_sample_size parameter I set during my second statistics gathering attempt), which has the consequence of computing the exact number of rows (num_rows) actually present in the t_ndv table (1,000,000).

With less time and less CPU consumption, Oracle, with the approximate_ndv preference set to true and estimate_percent set to dbms_stats.auto_sample_size, produced a near-perfect estimation compared to the time and resources it needed when the same preference was set to false.

When approximate_ndv is enabled, Oracle is both fast and accurate. If you compare the SQL used when this preference is disabled with the corresponding SQL when it is enabled, you will notice that in the latter case there is no longer any call to the costly count(distinct) function. There is instead a strange NDV, NIL, NIL comment, as shown below:

from
 "C##MHOURI"."T_NDV" t  /* NDV,NIL,NIL
                          ,NDV,NIL,NIL
                          ,NDV,NIL,NIL
                          ,NDV,NIL,NIL
                         */

which I have intentionally arranged to show that the number of NDV, NIL, NIL entries coincides with the number of columns (4) of the t_ndv table:

SQL> desc t_ndv
     Name      Type
-------------- ------------------
N1             NUMBER
N2             NUMBER
N3             NUMBER     
V4             VARCHAR2(4000)

Immediately after this piece of code, the trace file shows the following SQL statement:

SELECT /*+ parallel */ TO_NUMBER(EXTRACTVALUE(VALUE(T), 
  '/select_list_item/pos') + 1) POS, EXTRACTVALUE(VALUE(T), 
  '/select_list_item/value') VAL, TO_NUMBER(EXTRACTVALUE(VALUE(T), 
  '/select_list_item/nonnulls')) NONNULLS, TO_NUMBER(EXTRACTVALUE(VALUE(T), 
  '/select_list_item/ndv')) NDV, TO_NUMBER(EXTRACTVALUE(VALUE(T), 
  '/select_list_item/split')) SPLIT, TO_NUMBER(EXTRACTVALUE(VALUE(T), 
  '/select_list_item/rsize')) RSIZE, TO_NUMBER(EXTRACTVALUE(VALUE(T), 
  '/select_list_item/rowcnt')) ROWCNT, TO_NUMBER(EXTRACTVALUE(VALUE(T), 
  '/select_list_item/topncnt')) TOPNCNT, EXTRACT(VALUE(T), 
  '/select_list_item/topn_values').GETCLOBVAL() TOPN, NULL MINFREQ, NULL 
  MAXFREQ, NULL AVGFREQ, NULL STDDEVFREQ 
FROM
 TABLE(XMLSEQUENCE(EXTRACT(:B1 , '/process_result/select_list_item'))) T 
  ORDER BY TOPNCNT DESC

which means that Oracle internally uses an in-memory XML structure when calculating the number of distinct values of each table column, making this process both fast and accurate.

Bottom line: I was not asked to look at the way the local DBAs collect statistics. But, being curious about everything I look at when diagnosing performance issues, my attention was caught by this top SQL statement, which one might have dismissed as pure coincidence and left unanalysed. It allowed me to open a discussion with the local DBAs and to change the way they collect statistics.
