Oracle | Mohamed Houri’s Oracle Notes

February 5, 2023

SPM baseline and historical execution plans

Filed under: Oracle — hourim @ 8:33 am

I have often wondered why when we consult a past execution plan that is protected by an SPM baseline, there is no Note at the bottom of that execution plan showing the use of that SPM?

SQL> select banner_full from v$version;

BANNER_FULL
------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0                                                                 

SQL> select count(1) from emp where deptno=30;

  COUNT(1)
----------
         6

SQL> select * from table(dbms_xplan.display_cursor)

SQL_ID  89mvrxmzsd1v3, child number 1
-------------------------------------
Plan hash value: 2083865914
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|*  2 |   TABLE ACCESS FULL| EMP  |     6 |
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=30)

Note
-----
   - SQL plan baseline SQL_PLAN_1vbjb8bc7qryjc392520a used for this statement


exec dbms_workload_repository.add_colored_sql('89mvrxmzsd1v3');
exec dbms_workload_repository.create_snapshot;

SQL> select * from table(dbms_xplan.display_awr('89mvrxmzsd1v3')); --> please note: display_awr


SQL_ID 89mvrxmzsd1v3
--------------------
Plan hash value: 2083865914
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|   2 |   TABLE ACCESS FULL| EMP  |     6 |
-------------------------------------------

--> There is no Note on SPM

There is good news. Starting with 21c this has been corrected by Oracle. Here is the demonstration:

SQL> select banner_full from v$version;

BANNER_FULL
------------------------------------------------------------------------
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0                                                                  

SQL> select count(1) from emp where deptno=30;

  COUNT(1)
----------
         6

SQL> select * from table(dbms_xplan.display_cursor);


SQL_ID  89mvrxmzsd1v3, child number 1
-------------------------------------
select count(1) from emp where deptno=30

Plan hash value: 2083865914

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     3 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| EMP  |     6 |    18 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=30)

Note
-----
   - SQL plan baseline SQL_PLAN_1vbjb8bc7qryjc392520a used for this statement


exec dbms_workload_repository.add_colored_sql('89mvrxmzsd1v3');
exec dbms_workload_repository.create_snapshot;

SQL> select * from table(dbms_xplan.display_awr('89mvrxmzsd1v3')); --> please note: display_awr

SQL_ID 89mvrxmzsd1v3
--------------------
select count(1) from emp where deptno=30

Plan hash value: 2083865914
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|*  2 |   TABLE ACCESS FULL| EMP  |     6 |
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=30)

Note
-----
   - SQL plan baseline "SQL_PLAN_1vbjb8bc7qryjc392520a" used for this statement

This is the first time if my memory doesn’t fail me, that I see this Note on an SPM baseline appearing at the bottom of an execution plan coming from AWR. This becomes possible because Oracle has finally enriched the other_xml column of dba_hist_sql_plan with the SPM baseline information as shown below:

SQL> select banner_full from v$version;

BANNER_FULL
----------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SELECT
     p.sql_id
    ,p.plan_hash_value
    ,t.spm_baseline
FROM   
  dba_hist_sql_plan p
 ,xmltable('for $i in /other_xml/info
               where $i/@type eq "baseline"
               return $i'
               passing xmltype(p.other_xml)
               columns spm_baseline varchar2(100) path '/') t
 WHERE p.sql_id = '89mvrxmzsd1v3'
 and   p.other_xml is not null;

no rows selected


SQL> select banner_full from v$version;

BANNER_FULL
-----------------------------------------------------------------------
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0

SELECT
     p.sql_id
    ,p.plan_hash_value
    ,t.spm_baseline
FROM   
  dba_hist_sql_plan p
 ,xmltable('for $i in /other_xml/info
               where $i/@type eq "baseline"
               return $i'
               passing xmltype(p.other_xml)
               columns spm_baseline varchar2(100) path '/') t
 WHERE p.sql_id = '89mvrxmzsd1v3'
 and   p.other_xml is not null;

SQL_ID        PLAN_HASH_VALUE SPM_BASELINE
------------- --------------- -----------------------------------
89mvrxmzsd1v3      2083865914 "SQL_PLAN_1vbjb8bc7qryjc392520a"

I remind for those who are not yet in 21c, that they can use Dom Brooks’s script to track the use of SPM in AWR

SQL> select banner_full from v$version;

BANNER_FULL
----------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> @wasSPMUsed19c
Enter value for sql_id: 89mvrxmzsd1v3

   SNAP_ID DT                         SQL_ID        PLAN_HASH_VALUE       PHV2        SIG D I WHEN_BASELINE_CREATED
---------- -------------------------- ------------- --------------- ---------- ---------- - - --------------------------
      2241 04-FEB-2023 16:54          89mvrxmzsd1v3      2083865914 3281146378 2,1387E+18 Y Y 04-FEB-2023 16:52

January 22, 2023

Force matching signature

Filed under: Oracle — hourim @ 11:43 am

It often happens that changes are implemented in new Oracle releases without most (if not all) DBAs and developers being informed. Among those changes, introduced most probably in version 19.13, is a behavior change in the force matching the signature of a SQL query using a mix of bind variables and literals. Here is a setup that I will run in two different databases 19.3 and 21.3 respectively:

SQL> select banner_full from v$version;

BANNER_FULL
-----------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> var   empnum number
SQL> exec :empnum := 7839

SQL> select count(1) from emp where empno = :empnum and deptno=10; ----> deptno =10

SQL> select * from table(dbms_xplan.display_cursor);


SQL_ID  9kmf47x7byqq8, child number 0
-------------------------------------
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=10)
   3 - access("EMPNO"=:EMPNUM)


SQL> select count(1) from emp where empno = :empnum and deptno=20; ----> deptno =20

SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  gz41rf5f6cww1, child number 0
-------------------------------------
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=20)
   3 - access("EMPNO"=:EMPNUM)


SQL> select
       sql_id
      ,to_char(force_matching_signature)
    from
       gv$sql
    where
    sql_id in ('9kmf47x7byqq8','gz41rf5f6cww1');

SQL_ID        TO_CHAR(FORCE_MATCHING_SIGNATURE)
------------- ----------------------------------------
gz41rf5f6cww1 13157356546279728935
9kmf47x7byqq8 12268692852591778641

Notice that I ran the same query twice changing only the value of the department 10 and then 20. As a result, the force-matching signatures of the two SQL_IDs are different. This is quite normal and predictable, isn’t it?

But now look what I get when I re-execute the same setup on a 21.3:

SQL> select banner_full from v$version;

BANNER_FULL
------------------------------------------------------------------------
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0

SQL> var   empnum number
SQL> exec :empnum := 7839


SQL> select count(1) from emp where empno = :empnum and deptno=10; ----> deptno =10

SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  9kmf47x7byqq8, child number 0
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=10)
   3 - access("EMPNO"=:EMPNUM)

SQL> select count(1) from emp where empno = :empnum and deptno=20; ----> deptno =20

SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  gz41rf5f6cww1, child number 0
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=20)
   3 - access("EMPNO"=:EMPNUM)


select
   sql_id
  ,to_char(force_matching_signature)
from
   gv$sql
where  
sql_id in ('9kmf47x7byqq8','gz41rf5f6cww1');

SQL_ID        TO_CHAR(FORCE_MATCHING_SIGNATURE)
------------- ---------------------------------
gz41rf5f6cww1 12531360796234248997
9kmf47x7byqq8 12531360796234248997

The two forces matching signatures of the two SQL_IDs are now identical

What impact can this have?

The first positive collateral side effect, if it can be considered as a positive effect, concerns the use of SQL profiles to fix an execution plan for a query using both bind variables and literals. If you read the Hacking Profiles article by Jonathan Lewis, you will understand that Jonathan has already explained the critical limitations (restrictions) of SQL Profiles when used with queries mixing bind variables and literals. They won’t be used in this case even if you force the SQL profile to use their force_mathing property. Here’s below the demonstration starting with 19.3 and then 21.3 respectively.

SQL> @coe_xfr_sql_profile.sql  ----> Carlos Sierra script to fix a SQL profile

Parameter 1:
SQL_ID (required)

Enter value for 1: gz41rf5f6cww1


PLAN_HASH_VALUE AVG_ET_SECS
--------------- -----------
     1126804136        ,002

Parameter 2:
PLAN_HASH_VALUE (required)

Enter value for 2: 1126804136

SQL>select to_char(signature), force_matching from dba_sql_profiles where name like '%gz41rf5f6cww1%';

TO_CHAR(SIGNATURE)                       FOR
---------------------------------------- ---
13157356546279728935                     YES


SQL> select count(1) from emp where empno = :empnum and deptno=20;

SQL_ID  gz41rf5f6cww1, child number 0
-------------------------------------
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=20)
   3 - access("EMPNO"=:EMPNUM)

Note
-----
   - SQL profile coe_gz41rf5f6cww1_1126804136 used for this statement


SQL> select count(1) from emp where empno = :empnum and deptno=10;

SQL_ID  9kmf47x7byqq8, child number 0
-------------------------------------
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=10)
   3 - access("EMPNO"=:EMPNUM)

Exactly as demonstrated by Jonathan Lewis, as soon as we change the value of the département(from 20 to 10) the SQL profile stops being used.

As of version 19.13, this is no longer the case. Indeed, in the same situation as above, the SQL Profile is now usable. Here is a demonstration done on 21.3. version:

SQL> select banner_full from v$version;

BANNER_FULL
------------------------------------------------------------------------
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0


SQL> @coe_xfr_sql_profile.sql

Parameter 1:
SQL_ID (required)

Enter value for 1: gz41rf5f6cww1


PLAN_HASH_VALUE AVG_ET_SECS
--------------- -----------
     1126804136        .125

Parameter 2:
PLAN_HASH_VALUE (required)

Enter value for 2: 1126804136

SQL>select to_char(signature), force_matching from dba_sql_profiles where name like '%gz41rf5f6cww1%';

TO_CHAR(SIGNATURE)                       FOR
---------------------------------------- ---
12531360796234248997                     YES

SQL> select count(1) from emp where empno = :empnum and deptno=20;

SQL> select * from table(dbms_xplan.display_cursor);


SQL_ID  gz41rf5f6cww1, child number 0
-------------------------------------
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=20)
   3 - access("EMPNO"=:EMPNUM)

Note
-----
   - SQL profile coe_gz41rf5f6cww1_1126804136 used for this statement



SQL> select count(1) from emp where empno = :empnum and deptno=10;

SQL> select * from table(dbms_xplan.display_cursor);

 
SQL_ID  9kmf47x7byqq8, child number 0
-------------------------------------
-------------------------------------------------------
| Id  | Operation                    | Name   | Rows  |
-------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |
|   1 |  SORT AGGREGATE              |        |     1 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| EMP    |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP |     1 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DEPTNO"=10)
   3 - access("EMPNO"=:EMPNUM)

Note
-----
   - SQL profile coe_gz41rf5f6cww1_1126804136 used for this statement

As you can see, unlike version 19.3, in 21.3, the SQL profile has been used for both queries even though I changed the literal department value from 20 to 10

Bottom Line

I don’t really know if this change was worth a blog post but, even if some people will say that you should avoid using SQL Profiles at the detriment of SPM Baseline, it’s always a good thing to see a restriction in the use of SQL profiles removed by Oracle.

Comments (1)

August 15, 2022

enq: TM – contention [mode=3/5]

Filed under: Oracle — hourim @ 4:34 pm

Here is a simple but very practical example to show how to identify the root cause of a sudden and intermittent lock that considerably affects the proper functioning of a database:

SQL> @dashtopmho
@dashtop2 instance_number,session_id,sql_id,event2,blocking_inst_id,blocking_session,blocking_session_status 1=1 "timestamp'2020-&from'" "timestamp'2020-&to'"

Enter value for from: 11-19 06:00:00
Enter value for to: 11-19 09:00:00

   Total
  Seconds %This  SESSION_ID SQL_ID        EVENT2                         BLOCKING_INST_ID BLOCKING_SESSION BLOCKING_SE
--------- ----------------- ------------- ------------------------------ ---------------- ---------------- -----------
      860    8%        1368 bnygjh7hdf9v9 enq: TM - contention [mode=3]         1             1363 VALID
      840    8%         976 bnygjh7hdf9v9 enq: TM - contention [mode=3]         1             1363 VALID
      610    6%         574               ON CPU                                                   NOT IN WAIT
      570    5%        1363 1suthd9j77v21 enq: TM - contention [mode=5]         1             1396 VALID
      430    4%        1363 1suthd9j77v21 enq: TM - contention [mode=5]         1              802 VALID
      410    4%        1170 bnygjh7hdf9v9 enq: TM - contention [mode=3]         1             1363 VALID
      370    3%        1171 bnygjh7hdf9v9 enq: TM - contention [mode=3]         1             1363 VALID
      350    3%         244 bd9anzv38cdsh enq: TM - contention [mode=3]         1             1363 VALID
      340    3%         426 fgtt9x4dd6ahn enq: TM - contention [mode=3]         1             1363 VALID
      130    1%           1               ON CPU                                                   NOT IN WAIT
      120    1%         766               ON CPU                                                   NOT IN WAIT
      110    1%        1205 8msv09nr0rcx0 ON CPU                                                   NOT IN WAIT
      100    1%         195 2b064ybzkwf1y ON CPU                                                   NOT IN WAIT
      100    1%        1170 8ju64nrz3r1b9 ON CPU                                                   NOT IN WAIT
      100    1%        1205 8ayncy95bk88w ON CPU                                                   NOT IN WAIT
      100    1%        1205 8ayncy95bk88w direct path read                                         UNKNOWN
      100    1%        1205 ay8kmp9188rvw ON CPU                                                   NOT IN WAIT
       90    1%         786 8msv09nr0rcx0 ON CPU                                                   NOT IN WAIT
       80    1%         226 bd9anzv38cdsh enq: TM - contention [mode=3]         1             1363 VALID
       80    1%         408 8msv09nr0rcx0 ON CPU                                                   NOT IN WAIT

SQL> @dashtopmho
Enter value for from: 11-18 19:10:00
Enter value for to: 11-18 19:30:00

    Total
  Seconds %This   SESSION_ID SQL_ID        EVENT2                         BLOCKING_INST_ID BLOCKING_SESSION BLOCKING_SE
--------- ------- ---------- ------------- ------------------------------ ---------------- ---------------- -----------
       80    6%          786 8ayncy95bk88w direct path read                                         UNKNOWN
       80    6%         1368 bnygjh7hdf9v9 enq: TM - contention [mode=3]         1             1374 VALID
       60    4%         1004 ay8kmp9188rvw direct path read                                         UNKNOWN
       60    4%         1374 19fan32ayuy6w enq: TM - contention [mode=5]         1              792 VALID
       50    3%          574               ON CPU                                                   NOT IN WAIT
       40    3%          213 2qpjahhyzy7mm direct path read                                         UNKNOWN
       40    3%          792 254vjyr1z0svd direct path read                                         UNKNOWN
       40    3%         1004 8ayncy95bk88w direct path read                                         UNKNOWN
       30    2%            1               ON CPU                                                   NOT IN WAIT
       30    2%          408 ay8kmp9188rvw direct path read                                         UNKNOWN
       30    2%          599 ay8kmp9188rvw direct path read                                         UNKNOWN
       20    1%          213 2qpjahhyzy7mm ON CPU                                                   NOT IN WAIT
       20    1%          213 ab5v870f5xa82 ON CPU                                                   NOT IN WAIT
       20    1%          408 8msv09nr0rcx0 direct path read                                         UNKNOWN
       20    1%          599 ay8kmp9188rvw ON CPU                                                   NOT IN WAIT
       20    1%          617 8msv09nr0rcx0 direct path read                                         UNKNOWN
       20    1%          792 6p188706cjbdc direct path read                                         UNKNOWN
       20    1%         1004 8msv09nr0rcx0 ON CPU                                                   NOT IN WAIT
       20    1%         1170 fvvtz709a9pb4 direct path read                                         UNKNOWN
       20    1%         1374 19fan32ayuy6w enq: TM - contention [mode=5]         1             1170 VALID

Above, I have presented the output of the dashtop Tanel Poder script applied during two periods where the application was suffering a clear slowdown.

As you can see, dashtop, not only shows the lock type the above sessions were waiting on, enq: TM – contention, but it shows as well the mode on which those enqueues are requested : [mode=3] and [mode=5]. If this script had not already done this work for us, we would have been obliged to interpret the name and the mode from the value of the parameter p1(name|mode) linked with this enqueue wait event

SQL> @sed "enq: TM - contention"
Show wait event descriptions matching %enq: TM - contention%..

EVENT# EVENT_NAME            WAIT_CLASS   PARAMETER1  PARAMETER2  PARAMETER3       ENQUEUE_NAME 
------ --------------------- ------------ ----------- ----------- ---------------- -------------
   301 enq: TM - contention  Application  name|mode   object #    table/partition  DML

So, let me start by pointing out that this is already quite a great simplification implemented into the ashtop/dashtop script to make the diagnosis easier and faster.

Well, now that the diagnosis is well established, we will have to move on to the interpretation stage before concluding with the ideal remedy for this situation. We see two pieces of information here:

The top sessions wait events shown by dashtop concern a table lock contention : enq: TM – contention
Few of these table locks are held on mode 3 while others are held at mode 5

Jonathan Lewis did a very nice summary about the definition of the different Oracle lock modes:

mode 3 : Sub-exclusive(SX)/Row-exclusive(RX)
- Update (also “select for update” from 9.2.0.5/6)
- Lock table in row exclusive mode
- Since 11.1 used at opposite end of RI during DML

mode 5 : share sub exclusive (SSX)/share row exclusive (SRX)
- Lock table in share row exclusive mode
- Less common symptom of “foreign key locking” but likely to be more frequent if the FK constraint is defined with “on delete cascade.”

If I want to simplify this situation, I can say that when you see a couple of concurrent sessions waiting on an enq: TM contention held on mode 3 and/or requested on mode 5, then this can almost always be traced back to a non-indexed foreign key in an existing parent-child relationship.

Since this lock situation appeared immediately after a maintenance operation on the database, I immediately understood that I had to check the status of all the application’s indexes as a priority. Indeed, if this lock occurred just after an intervention on the database it probably means that some indexes are in an UNUSABLE status.

SQL> select table_name, index_name,status  
    from all_indexes 
	where owner='CTEST' 
	and status ='UNUSABLE';
								  
TABLE_NAME      INDEX_NAME          STATUS
--------------- ------------------ --------- 
ABC_M        ABC_M_I_FK_ME_IDX1    UNUSABLE
ABC_M        ABC_M_I_FK_ME_IDX2    UNUSABLE
ABC_M        ABC_M_I_FK_ME_IDX3    UNUSABLE
.../...                           
DEF_M        DEF_M_I_FK_DEF_IDX1   UNUSABLE
DEF_M        DEF_M_I_FK_DEF_IDX2   UNUSABLE
DEF_M        DEF_M_I_FK_DEF_IDX3   UNUSABLE
../..                             
GHI_M  		 GHI_M_I_FK_SIT_IDX1   UNUSABLE
GHI_M        GHI_M_I_FK_SIT_IDX2   UNUSABLE
GHI_M        GHI_M_I_FK_SIT_IDX3   UNUSABLE
GHI_M        GHI_M_I_FK_SIT_IDX4   UNUSABLE

Effectively, not only there are a bunch of UNUSABLE indexes, but all these indexes have the suffix FK in their name. This clearly indicates that they were created to cover the lock and deadlock threat an unindexed foreign key may introduce. All I had to do then was to rebuild these indexes:

SQL> select 'alter index '||owner ||'.'|| index_name ||' rebuild;' 
     from all_indexes 
     where owner='CTEST' 
	 and status ='UNUSABLE';

alter index CTEST.ABC_M_I_FK_ME_IDX1 rebuild;
alter index CTEST.ABC_M_I_FK_ME_IDX2 rebuild;
alter index CTEST.ABC_M_I_FK_ME_IDX3 rebuild;
../..
alter index CTEST.DEF_M_I_FK_DEF_IDX1 rebuild;
alter index CTEST.DEF_M_I_FK_DEF_IDX2 rebuild;
alter index CTEST.DEF_M_I_FK_DEF_IDX3 rebuild;
../..
alter index CTEST.GHI_M_I_FK_SIT_IDX1 rebuild;
alter index CTEST.GHI_M_I_FK_SIT_IDX2 rebuild;
alter index CTEST.GHI_M_I_FK_SIT_IDX3 rebuild;

Bottom line

Sometimes it is enough to follow a simple logic of investigation which consists in saying that if an event happened following a particular action then this action must have very certainly paved the way for this event by removing what prevented it from happening. Applied to the current situation it becomes: that a maintenance action has invalidated several indexes. These indexes, created to prevent a DML lock on tables linked by a parent-child relationship, have stopped working causing an application slowdown. The use of a standard naming convention for the indexes made the diagnosis much easier. In itself, this is another proof of the usefulness of a good naming convention

Comments (5)

January 30, 2022

Or expansion: Linear

Filed under: Oracle — hourim @ 2:39 pm

PREAMBLE

In the previous article, we discussed the two-pass technique used by Oracle to evaluate the cost of the different OR-Expansion states. We have shown that this technique evaluates the cost of only two states, the initial NT state, and the FORE state. We have demonstrated as well that this strategy is used by Oracle whenever the number of conjuncts (aka OR operands) equals 2 or is greater or equal to 5. In this article, we will examine the so-called Linear technique. We will show that this technique is used when the number of OR operands equals to 3 or 4. We will also demonstrate that, under this strategy, Oracle might EXPAND the NT or FUSE the FORE. The former is when the cost of the NT is less than that of the FORE while the latter takes place in the opposite situation. We will also show that, although the or-expansion strategy obeys the Stirling number of the second kind, Oracle has, nevertheless, implemented safeguards to prevent this number from exploding. For example, we will show that when the FORE is FUSED, Oracle will evaluate S(n,k) iterations where k ranges from 1 to n-1. However, the order of the conjunctions (the order in which the FORE has been costed) cannot be permuted. This will implicitly cap the value of S(n,k). Finally, we will also see that when Oracle decides to EXPAND the NT, although the permutation of states is possible, the maximum number of states cannot exceed n+2.

LINEAR TECHNIQUE

Since we’ve demonstrated in the previous article that the two-pass technique is reserved for a DNF having m number of conjuncts where m =2 or m >=5, let’s then analyze a DNF of 3 and 4 conjuncts respectively:

alter session set tracefile_identifier='3Ored';
@53on
select 
   *
from
   t1
where
   (n1 =1          -- conjunct n°1
      or n2  = 42  -- conjunct n°2
      or vc1 ='20' -- conjunct n°3
    );
@53off

egrep "ORE: Using search type|conjunction chain|Switching" ORCLCDB_ora_16232_3Ored.trc 

ORE: Using search type: linear
ORE: # conjunction chain – 3

Notice that Oracle did start by considering the linear technique in order to explore the different or-expansion states, and did not switch, this time, to the two-pass technique contrary to what we observed in the previous article.

egrep "ORE: Switching to|state space|Updated best state|Not update best state|conjunction chain" ORCLCDB_ora_16232_3Ored.trc

ORE: # conjunction chain - 3
ORE: Starting iteration 1, state space = [{ 1 2 3 }]
ORE: Updated best state, Cost = 4.046110
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 2.000095
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 3.000134
ORE: Starting iteration 3, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 3, state space = [{ 2 3 }]
ORE: Not update best state, Cost = 3.023117
ORE: Starting iteration 4, state space = [{ 1 2 }]
ORE: Not update best state, Cost = 3.034589
ORE:   Transferring best state space to preserved query.
ORE:   Transferring best state space to original query.

From the above trace file, we can see that Oracle has explored 4 iterations for a DNF of 3 conjunctions.

•	The NT state[{ 1 2 3 }]
•	The FORE  state [{ 1 }] or [{ 2 }]or [{ 3 }]
•	[{ 1 }] or [{ 2 3 }]
•	[{ 1 2 }] or [{ 3 }]

However, should Oracle used the Stirling number to explore all possible combinations it would have then have considered S(3,k) iterations where k ranges from 1 to 3. That is 5 iterations instead of 4 as shown below:

Number of Iterations = S(3,1) + S(3,2) + S(3,3)
                     = S(3,1) + S(3,2) + S(3,3)= 1+3+1= 5

So why did Oracle only consider 4 iterations in this case?

There are 3 ways to write a DNF of 3 conjuncts into boxes of 2 elements; namely S(3,2)=3. Why then iteration 3 has not considered the 3rd possibility?

[{ 1 3 }] or [{ 2 }]

The answer is again in the 10053 trace file. The linear technique starts by evaluating the cost of both the NT and the FORE. If the cost of the NT is smaller than that of the FORE, Oracle will iteratively EXPAND the NT to cost the different or-expansion states. If it is the opposite then Oracle will iteratively FUSE the FORE.

Let’s go back to the trace file to make it clearer. The cost of the NT is 4.046110

ORE: Starting iteration 1, state space = [{ 1 2 3 }]
ORE: Updated best state, Cost = 4.046110

While the cost of the FORE is 3.000134

ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 2.000095
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 3.000134

Sine Cost (FORE) < Cost (NT) Oracle will then FUSE the FORE as the 10053 trace file clearly indicates:

kkoqbc: finish optimizing query block SEL$1_9 (#1)
ORE: Updated best state, Cost = 3.000134
ORE: Applying FUSE on FORE state
ORE: Starting iteration 3, state space = [{ 1 }]

However, there is a subtlety here. When the FORE is FUSED, this fusion can only be done in the order in which the FORE has been costed. In other words, the order of the conjunctions (the order of the states) cannot be permuted. Or more precisely each state of the DNF is fused with its element to the right. This is precisely why the iteration [{ 1 3 }] or [{ 2 }] has not been considered because it permutes states 2 and 3.

Hopefully this answer Nenad Noveljic question you can find in this article and reproduced here below:

Interestingly, what wasn’t considered at all was the UNION ALL consisting of the state space = [{ 2 }] and state space = [{ 1 3 }], which, generally, could yield a good execution plan.

Finally, when Oracle FUSE the FORE it uses S(n,k) combinations where n represents the number of conjunctions (3 here) and where k ranges from 1 to n-1. Applied to the current case Oracle would then have fused the FORE into S(3,1)+S(3,2) = 1+3 = 4. This is exactly what the corresponding 10053 trace file shows.

Let’s now repeat the experiment for a DNF of 4 conjunctions

alter session set tracefile_identifier='4Ored';
@53on
select 
   *
from
   t1
where
   (n1 =1           -- conjunct n°1 
      or n2  = 42   -- conjunct n°2
      or vc1 = '20' -- conjunct n°3
      or n3  = 9    -- conjunct n°4
     );

@53off

egrep "ORE: Using search type|conjunction chain" ORCLCDB_ora_16232_4Ored.trc

ORE: Using search type: linear
ORE: # conjunction chain – 4

egrep "ORE: Switching to|state space|Applying FUSE|Updated best state|Not update best state|conjunction chain" ORCLCDB_ora_16232_4Ored.trc

ORE: # conjunction chain - 4
ORE: Starting iteration 1, state space = [{ 1 2 3 4 }]
ORE: Updated best state, Cost = 5.069131
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 2.000095
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 3.000134
ORE: Starting iteration 2, state space = [{ 4 }]
ORE: Updated best state, Cost = 4.000174
ORE: Applying FUSE on FORE state              -----> cost(FORE) = 4.000174 < cost (NT)=5.069131
ORE: Starting iteration 3, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 3, state space = [{ 2 }]
ORE: Updated best state, Cost = 2.000095
ORE: Starting iteration 3, state space = [{ 3 4 }]
ORE: Not update best state, Cost = 4.034666
ORE: Applying FUSE on FORE state
ORE: Starting iteration 4, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 4, state space = [{ 2 3 }]
ORE: Updated best state, Cost = 3.023117
ORE: Starting iteration 4, state space = [{ 4 }]
ORE: Not update best state, Cost = 4.023156
ORE: Applying FUSE on FORE state
ORE: Starting iteration 5, state space = [{ 1 2 }]
ORE: Updated best state, Cost = 3.034589
ORE: Starting iteration 5, state space = [{ 3 }]
ORE: Not update best state, Cost = 4.034628
ORE:   Transferring best state space to preserved query.

As you can see, since the cost of the NT is greater than that of the FORE, Oracle decided to FUSE the FORE leading to an exploration of 5 states. Since we are dealing with a set of 4 conjuncts, the size of the exhaustive state space would have been S(4,k) where k ranges from 1 to 3:

S(4,k) = S(4,1)+S(4,2)+S(4,3)= 1+7+6=14

In other words, there should have been at least more than 5 iterations considering the non-permutation of OR-expansion states when the FORE is fused. Why then there are only 5 states.

Here’s a case where Oracle decides to EXPAND the NT of 4 conjuncts

kkoqbc: finish optimizing query block SEL$1_9 (#1)
ORE: Not update best state, Cost = 7.001340
ORE: Applying EXPAND on no transformation state
ORE: Starting iteration 3, state space = [{ 1 }]

And, where the size of the expansion states is still equal to 5 despite the permutation states is possible under EXPAND of the NT:

egrep "ORE: Switching to|state space|Updated best state|Not update best state" ORCLCDB_ora_25335_Expand4Ored.trc
ORE: Starting iteration 1, state space = [{ 1 2 3 4 }]
ORE: Updated best state, Cost = 6.689797   ----> cost of NT
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 3.000551
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 5.000945
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Not update best state, Cost = 7.001340     ----> cost of FORE
ORE: Starting iteration 3, state space = [{ 1 }]
ORE: Updated best state, Cost = 3.000551
ORE: Starting iteration 3, state space = [{ 2 3 4 }]
ORE: Updated best state, Cost = 6.460474
ORE: Starting iteration 4, state space = [{ 1 }]
ORE: Updated best state, Cost = 3.000551
ORE: Starting iteration 4, state space = [{ 2 }]
ORE: Updated best state, Cost = 5.000945
ORE: Starting iteration 4, state space = [{ 3 4 }]
ORE: Not update best state, Cost = 7.345876
ORE: Starting iteration 5, state space = [{ 1 }]
ORE: Updated best state, Cost = 3.000551
ORE: Starting iteration 5, state space = [{ 3 }]
ORE: Updated best state, Cost = 5.000945
ORE: Starting iteration 5, state space = [{ 2 4 }]
ORE: Not update best state, Cost = 7.345876
ORE:   Transferring best state space to preserved query.
ORE:   Transferring best state space to original query.

CONCLUSION

In fact, as per the US Patent (Selecting From OR-Expansion States Of A Query) whatever the number of conjuncts is, under a linear technique, when Oracle decides to FUSE the FORE or to EXPAND the NT, although it will use the Stirling number of the second kind, the maximum number of states it can explore is n+2 where n is the number of conjuncts.

Comments (1)

January 23, 2022

Or expansion: Two-pass, Linear or Greedy

Filed under: Oracle — hourim @ 5:21 pm

Abstract

The OR Expansion represents the ability of the CBO to transform an initial query that includes predicates linked together by an OR operator into a UNION ALL operator applied on two or more query blocks known as the UNION ALL branches. This transformation allows the CBO to consider new index access paths and join methods that would have been impossible should the initial query has not been transformed. But, the higher the number of ORs in the original query, the higher the number of semantically equivalent OR-expansion transformations. Letting the CBO evaluate the cost-benefit of this high percentage of OR-expansion states would have been very expensive. This is why Oracle has implemented three different best-cost evaluation techniques when the OR Expansion transformation is considered: Two-pass, Linear, and Greedy. This article aims to explain when and how these techniques are used.

Warning

This is not an article that will help you in your daily performance and trouble-shooting work since it doesn’t really matter which strategy Oracle has used to evaluate the best-costed OR-expansion transformation, Two-pass, Linear, or Greedy. What matters, from a performance point of view, is whether Oracle has used the OR expansion or not. It even seems that an alter session to switch from one strategy to another is ignored. So don’t be surprised if the content of this article does not add any value to your diagnostic and performance tuning skills:

Terminology

a) Disjunctive Normal Form

Before the CBO considers the method or the procedure of evaluating the best cost of one or many OR-Expansion states, it must first transform the initial query into a form called Disjunctive Normal Form. Let’s consider the following query

select 
    *
from
   t1
where
   (n1 =1)                  -- conjunct n°1
 and ( n2 = 42 or vc1 ='20')-- conjunct n°2
;

The predicate part of the above query is not a DNF form because one of its conjuncts (n°2) is a disjunction (contains or). Before applying one of the three OR expansion techniques Oracle will transform the initial query into its DNF form by distributing the conjunction (and) over the disjunction (or) as shown below:

select 
    *
from
   t1
where
     (n1 =1 and n2=42)     -- conjunct n°1
 or  (n1 =1 and vc1 ='20') -- conjunct n°2;

The corresponding 10053 trace file contains a part related to this normalization which is labeled DNF Matrix as shown below:

ORE: # conjunction chain - 2
ORE: Checking validity of disjunct chain

DNF Matrix (Before sorting OR branches)
            P1  P2  P3
CNJ (#1) :   1   1   0
CNJ (#2) :   1   0   1

ORE:  Predicate list
P1 : "T1"."N1"=1
P2 : "T1"."N2"=42
P3 : "T1"."VC1"='20'

DNF Matrix (After OR branch sorting)
            P1  P2  P3
CNJ (#1) :   1   1   0
CNJ (#2) :   1   0   1

It is thus very clear that Oracle has built a chain of conjunctions with two elements on which it will apply a disjunction. For more details please read this Nenad Noveljic blog post.

b) NT and FORE

NT represents the Non Transformed initial query when the OR expansion transformation is not applied to the query. FORE refers to the Full OR Expansion where all the disjunctive predicates are transformed into UNION ALL branches. Concretely the following query is in an NT state:

select 
    *
from
   t1
where
     (n1 =1 and n2=42)     -- conjunct n°1
 or  (n1 =1 and vc1 ='20') -- conjunct n°2;

But the following query is in a FORE state:

select 
    *
from
   t1
where
  (n1 =1 and n2=42)    
union all
select 
    *
from
   t1
where
  (n1 =1 and vc1 ='20')
and LNVNL( ((n1 =1 and n2=42));

c) Stirling Number of the second kind

There exist two types of Stirling Number (SN): SN of the first kind and SN of the second kind. The SN of the second kind is denoted by S(n,k) and it represents the number of ways to partition n objects into k non-empty similar boxes. In the context of the OR-expansion, the total number of expansion states costed by Oracle follows the Stirling Number S(n,k) where n represents the number of disjunctive predicates (aka the number of conjunction chains) and where k (could very well represent the number of Or predicate) ranges from 1 to n. To calculate the number of possible Or-expansion states that obey the Stirling number of the second kind we will use the recursive way as shown below:

S(n,k) = S(n-1,k-1)+ k*S(n-1,k)

With the following fixed values known in advance:

S(0,0) = 1
S(n,0) = 0 for n>=1 
S(n,n) = 1

Following the above recursive formula, S(4,3) will be found to equal 6 as shown below:

S(4,3) = S(3,2) + 3*S(3,3)
S(4,3) = S(3,2) + 3*1 
S(4,3) = S(3,2) + 3 
 
-- where
S(3,2) = S(2,1) + 2*S(2,2) = 1 + 2 = 3
-- Hence
S(4,3) = 3+3 = 6

Two-pass technique

There are five possible values for the _optimizer_cbqt_or_expansion parameter:

SQL> @pvalid _optimizer_cbqt_or_expansion
Display valid values for multioption parameters matching "_optimizer_cbqt_or_expansion"...

  PAR# PARAMETER                     ORD VALUE    
------ ----------------------------- --- ---------
  4675 _optimizer_cbqt_or_expansion    1 OFF
       _optimizer_cbqt_or_expansion    2 ON
       _optimizer_cbqt_or_expansion    3 LINEAR
       _optimizer_cbqt_or_expansion    4 GREEDY
       _optimizer_cbqt_or_expansion    5 TWO_PASS

In the two-pass search strategy, Oracle will evaluate the cost of only two states, the initial NT state, and the FORE state. My investigations have shown that the two-pass technique is systematically used when one of the following conditions is met:

either the number of conjunct chains= 2
or the number of conjunct chains>=5

Here’s below the demonstration starting with 2 conjuncts, 5 conjuncts and then 6 conjuncts:

alter session set tracefile_identifier='1Ored';
@53on
-- query n°1
select 
   *
from
   t1
where
   (n1    = 1  -- conjunct 1
    or n2 = 42 -- conjunct 2       
    );
@53off

egrep "ORE: Using search type|conjunction chain" ORCLCDB_ora_9365_1Ored.trcORE: 
ORE: Using search type: linear
ORE: # conjunction chain – 2

As you can see, Oracle has started by considering to evaluate the cost of the different OR expansion states using the Linear technique. It has also recognized that it has to deal with a DNF of two conjunctions.

However, a few lines further down in the same trace file we can see that Oracle has changed its mind and decided to switch to the two-pass technique:

egrep "ORE: Switching to|state space|Updated best state|Not update best state|conjunction chain" ORCLCDB_ora_9365_1Ored.trc
ORE: # conjunction chain - 2
ORE: Switching to two pass because of # conjunction: 2 -----> switch occurs here
ORE: Starting iteration 1, state space = [{ 1 2 }]
ORE: Updated best state, Cost = 581.063707
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 2.000095
ORE:   Transferring best state space to preserved query.
ORE:   Transferring best state space to original query.

As you can read, Oracle decided to switch to two-pass because the DNF form to be processed contains 2 conjunctions.

By the way, it is useless to re-explain here what Nenad Noveljic has already done with certain brilliance in the popularization of what a state space, [{ 1 2 }] for example, represents. I invite you to read his article to get a clear picture. Let me just say the following:

state space = [{ 1 2 }] = original non transformed NT query 
state space = [{ 1 }]   = select * from t1 where (n1 =1)
state space = [{ 2 }]   = select * from t1 where (n12 =42) and lnnvl (n1=1)

Let’s now repeat the same experiment for a DNF of 5 and 6 conjuncts respectively:

alter session set tracefile_identifier='5Ored';
@53on
select 
   *
from
   t1
where
   (n1 =1
      or n2  = 42
      or vc1 = '20'
	  or n3  = 9 
	  or vc2 = '10'
    );
@53off

egrep "ORE: Using search type|conjunction chain" ORCLCDB_ora_9365_5Ored.trc
ORE: Using search type: linear
ORE: # conjunction chain - 5

egrep "ORE: Switching to|state space|Updated best state|Not update best state|conjunction chain" ORCLCDB_ora_9365_5Ored.trc
ORE: # conjunction chain - 5
ORE: Switching to two pass because of # conjunction: 5   -----> switch occurs here
ORE: Starting iteration 1, state space = [{ 1 2 3 4 5 }]
ORE: Updated best state, Cost = 581.744310
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 2.000095
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 3.000134
ORE: Starting iteration 2, state space = [{ 4 }]
ORE: Updated best state, Cost = 4.000174
ORE: Starting iteration 2, state space = [{ 5 }]
ORE: Updated best state, Cost = 5.000247
ORE:   Transferring best state space to preserved query.
ORE:   Transferring best state space to original query.
ORE: # conjunction chain – 5

alter session set tracefile_identifier='6Ored';
@53on
select 
   *
from
   t1
where
   (n1 =1
      or n2  = 42
      or vc1 = '20'
	  or n3  = 9 
	  or vc2 = '10'
	  or n5  = 36
    );
@53off

egrep "ORE: Using search type|conjunction chain" ORCLCDB_ora_9365_6Ored.trc
ORE: Using search type: linear
ORE: # conjunction chain - 6

egrep "ORE: Switching to|state space|Updated best state|Not update best state|conjunction chain" ORCLCDB_ora_9365_6Ored.trc
ORE: # conjunction chain - 6
ORE: Switching to two pass because of # conjunction: 6  -----> switch occurs here
ORE: Starting iteration 1, state space = [{ 1 2 3 4 5 6 }]
ORE: Updated best state, Cost = 581.870344
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 1.000055
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 2.000095
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 3.000134
ORE: Starting iteration 2, state space = [{ 4 }]
ORE: Updated best state, Cost = 4.000174
ORE: Starting iteration 2, state space = [{ 5 }]
ORE: Updated best state, Cost = 5.000247
ORE: Starting iteration 2, state space = [{ 6 }]
ORE: Updated best state, Cost = 6.000287
ORE:   Transferring best state space to preserved query.
ORE:   Transferring best state space to original query.

It is more and more clear now that whatever the DNF is, if it has a number of conjunctions equal to 2 or greater than or equal to 5, then Oracle will use the two-pass technique when considering the Cost based Or Expansion. This technique consists in evaluating the cost of the original query (NT) and that of the FULL transformed UNION ALL query (FORE). There is no use of the Stirling number in this case.

Moreover, as already mentioned in the abstract section, it seems that forcing the linear technique has no effect. It is fairly likely that the condition of the number of conjunction 2 (or >=5) that governs the two-pass technique is hard coded in the CBO code as the following tends to prove:

alter session set tracefile_identifier='Linear';

***************************************
PARAMETERS USED BY THE OPTIMIZER
********************************
  *************************************
  PARAMETERS WITH ALTERED VALUES
  ******************************
Compilation Environment Dump
optimizer_index_cost_adj            = 10
_optimizer_cbqt_or_expansion        = linear
_swat_ver_mv_knob                   = 0

egrep "ORE: Using search type|conjunction chain" ORCLCDB_ora_16232_Linear.trc
ORE: Using search type: linear
ORE: # conjunction chain – 5

egrep "ORE: Switching to|state space|Updated best state|Not update best state|conjunction chain" ORCLCDB_ora_16232_Linear.trc
ORE: # conjunction chain - 5
ORE: Switching to two pass because of # conjunction: 5
ORE: Starting iteration 1, state space = [{ 1 2 3 4 5 }]

In order not to make this blog post too heavy I will try to devote a separate article to the Linear technique as soon as possible.

Model

Here is the model I used to conduct my experiments:

CREATE TABLE t1 (
    n1   NUMBER,
    n2   NUMBER,
    n3   NUMBER,
    n4   NUMBER,
    n5   NUMBER,
    n6   NUMBER,
    vc1  VARCHAR2(10),
    vc2  VARCHAR2(100),
    d1   DATE
);

INSERT INTO t1
    SELECT
        rownum,
        mod(rownum, 10),
        trunc((rownum - 1 / 3)),
        trunc((rownum - 1 / 5)),
        trunc((rownum - 1 / 7)),
        mod(rownum,10),
        lpad('x', 10),
        lpad('y',100),
        date '2022-01-01' + (level-1) * interval '15' minute
    FROM
        dual
    CONNECT BY
        level <= 1e5;
		
create index t1_idx1 on t1(n1,n2);
create index t1_idx2 on t1(n2);
create index t1_idx3 on t1(n3);
create index t1_idx4 on t1(n4);
create index t1_idx5 on t1(n5);
create index t1_idx6 on t1(n6);
create index t1_idx7 on t1(vc1);
create index t1_idx8 on t1(vc2);
create index t1_idx9 on t1(d1);

exec dbms_stats.gather_table_stats(user, 't1');
alter session set optimizer_index_cost_adj =10;

Comments (2)

July 28, 2021

Why my execution plan has not been shared: Part 7

Filed under: cursor sharing,Oracle — hourim @ 4:21 pm

In this 7^th episode of the series of articles about why my execution plan has not been shared, I will deep dive into the AUTO_CHECK_MISMATCH. Before I start, I wanted, first, to remind you that I actually already explained this reason in part 5 of the same series. I have, in fact, demonstrated in that part, that the implementation of a RLS (Row Level Security) was the ingredient of this reason. But what I missed at that time was that this reason was accompanied by a piece of information that is not at all innocent and that I overlooked:

<ChildNode>
   <ChildNumber>0</ChildNumber>
     <ID>37</ID>
     <reason>Authorization Check failed(9)</reason>
      <size>0x0</size>
    <details>row_level_access_control_mismatch</details>
</ChildNode>

The number (9) that appears in

<reason>Authorization Check failed(9)</reason>

is not trivial. Indeed, 9 refer to a Row Level Security rule that is causing a new execution plan optimisation.

So, in the same vein as for the different values of Optimizermismatch() that goes hand in hand with the OPTIMIZER_MODE_MISMATCH explained in part 4

•	Optimizermismatch(1)
•	Optimizermismatch(2)
•	Optimizermismatch(3)
•	Optimizermismatch(4)  → points to a px_mismatch for serial plan (not clear)
•	Optimizermismatch(5)
•	Optimizermismatch(6)
•	Optimizermismatch(7)
•	Optimizermismatch(8)
•	Optimizermismatch(9)
•	Optimizermismatch(10) → optimizer mode (first_rows, all_rows) mismatch
•	Optimizermismatch(11)
•	Optimizermismatch(12) → optimizer parameter mismatch
•	Optimizermismatch(13) → cardinality feedback

It looks like Oracle has also used the same philosophy to differentiate the root-causes for AUTO_CHECK_MISMATCH using the Authorization Check failed() tag:

•	Authorization Check failed(1)
•	Authorization Check failed(2)
•	Authorization Check failed(3)
•	Authorization Check failed(4) → we are going to see in this blog post
•	Authorization Check failed(5)
•	Authorization Check failed(6)
•	Authorization Check failed(7)
•	Authorization Check failed(8)
•	Authorization Check failed(9) → VPD and Row Level Security

Let’s start by trying to explain what we should think about when we see this double information

AUTO CHECK MISMTACH
- Authorization Check failed(4)

To simulate this situation, I will simply execute a query in two different schemas. I will start with the c##mhouri schema followed by the c##test schema. This will lead us to see that the optimized execution plan in the first schema cannot be shared in the second. The reason for this hard parse is, as you may have guessed, Authorization Check failed(4)

SQL> show user

USER is "C##MHOURI"

SQL> create table t1 as select rownum n1 from dual connect by level <=10;

Table created.

SQL> select count(1) from t1;

  COUNT(1)
----------
        10

SQL> @xpsimp

SQL_ID  a2d8wpfzju8fr, child number 0
-------------------------------------
select count(1) from t1

Plan hash value: 3724264953

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |     3 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |    10 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------

SQL> show user

USER is "C##TEST"

SQL> create table t1 as select rownum n1 from dual connect by level <=10;

Table created.

SQL> select count(1) from t1;

  COUNT(1)
----------
        10

SQL> @xpsimp

SQL_ID  a2d8wpfzju8fr, child number 1
-------------------------------------
select count(1) from t1

Plan hash value: 3724264953

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |     3 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |    10 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------


14 rows selected.

SQL> @nonshared a2d8wpfzju8fr
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)...

SQL_ID                        : a2d8wpfzju8fr
ADDRESS                       : 0000000064C9EED0
CHILD_ADDRESS                 : 0000000064BC2640
CHILD_NUMBER                  : 0
REASON                        : <ChildNode><ChildNumber>0</ChildNumber><ID>36</ID> 
                                <reason>Authorization Check failed(4)</reason> ---> here
                                <size>5x4</size<translation_table_position>0
                                </translation_table_position><original_handle>
                                  2141856400</original_handle><temp_handle>
                                  2147116632</temp_handle><schema>107</schema> 
                                <synonym_object_number>0</synonym_object_number>
                                 </ChildNode>
CON_ID                        : 1
-----------------
SQL_ID                        : a2d8wpfzju8fr
ADDRESS                       : 0000000064C9EED0
CHILD_ADDRESS                 : 00000000696A23A0
CHILD_NUMBER                  : 1
AUTH_CHECK_MISMATCH           : Y
TRANSLATION_MISMATCH          : Y
REASON                        :
CON_ID                        : 1
-----------------

PL/SQL procedure successfully completed.

As a conclusion, I can say that for the moment I managed to decipher two reasons among at least 9 that can generate a hard parse identified by AUTO_CHECK_MISMATCH

•	Authorization Check failed(4) → query using same objects but for different user schemas
•	Authorization Check failed(9) → VPD and Row Level Security

I hope to be able to explain the other 7 Authorization Check failed() that remain

Comments (6)

July 2, 2021

Statistics: On the importance of the Notes column

Filed under: Oracle,Statistics — hourim @ 4:43 pm

Notice the output of the following query:

SQL> getPartCol

col partition_name format a20
col num_distinct   format a20
col last_anal      format a20
col column_name    format a12
col notes          format a35

select
       partition_name
      ,column_name
      ,num_distinct
      ,density
      ,to_char(last_analyzed,'dd/mm/yyyy hh24:mi:ss') last_anal
      ,histogram
      ,notes
from
      all_part_col_statistics
where owner     = upper('&owner')
and table_name  = upper('&table_name');

PARTITION_NAME       COLUMN_NAME  LAST_ANAL            HISTOGRAM       NOTES
-------------------- ------------ -------------------- --------------- -----------------------------------
P_A                  PKEY         30/06/2021 17:39:50  FREQUENCY       HYPERLOGLOG
P_A                  N1           30/06/2021 17:39:50  FREQUENCY       HYPERLOGLOG
P_A                  N2           30/06/2021 17:39:50  HYBRID          HYPERLOGLOG
P_A                  N_50         30/06/2021 17:39:50  FREQUENCY       HYPERLOGLOG
P_A                  N_800        30/06/2021 17:39:50  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_B                  PKEY         30/06/2021 17:39:51  FREQUENCY       HYPERLOGLOG
P_B                  N1           30/06/2021 17:39:51  FREQUENCY       HYPERLOGLOG
P_B                  N2           30/06/2021 17:39:51  HYBRID          HYPERLOGLOG
P_B                  N_50         30/06/2021 17:39:51  FREQUENCY       HYPERLOGLOG
P_B                  N_800        30/06/2021 17:39:51  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_C                  PKEY         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_C                  N1           30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_C                  N2           30/06/2021 17:39:52  HYBRID          HYPERLOGLOG
P_C                  N_50         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_C                  N_800        30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_D                  PKEY         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_D                  N1           30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_D                  N2           30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG
P_D                  N_50         30/06/2021 17:39:52  FREQUENCY       HYPERLOGLOG
P_D                  N_800        30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG

20 rows selected.

Do you know what the impact of the Notes HIST_FOR_INCREM_STATS HYPERLOGLOG is?

Impact n°1

The first impact I have recently discovered concerns the bind sensitive property of cursors using predicates on columns flagged with the above Notes. Here’s a simple demonstration. I will first start with a “normal column” N_50 and then with an “abnormal” N_800 one

SQL> select count(1) from t1_part where n_50=14;

  COUNT(1)
----------
     31984

SQL> @xpsimp

SQL_ID  18fhcmjy68zv6, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     3 |       |       |
|   2 |   PARTITION LIST ALL|         | 31984 | 95952 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| T1_PART | 31984 | 95952 |     1 |     4 | -- very good Estimation
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N_50"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: 18fhcmjy68zv6

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
18fhcmjy68zv6      2324535125            0 Y          1 -- cursor is bind sensitive

SQL> select count(1) from t1_part where n_800=14;

  COUNT(1)
----------
      2072

SQL> @xpsimp

SQL_ID  b2b4nm8tqqukg, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST ALL|         |  1958 |  7832 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| T1_PART |  1958 |  7832 |     1 |     4 | -- not a very good Estimation
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N_800"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: b2b4nm8tqqukg

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
b2b4nm8tqqukg      2324535125            0 N          1  -- cursor is not bind sensitive

Thus, once again, this needs to be brought to someone’s attention: a cursor will not be marked bind sensitive when it uses a predicate column having histogram but signaled by the Notes HIST_FOR_INCREM_STATS HYPERLOGLOG.

And what about a column that has a mix of different not null Notes values?

PARTITION_NAME       COLUMN_NAME  LAST_ANAL            HISTOGRAM       NOTES
-------------------- ------------ -------------------- --------------- -----------------------------------
P_A                  N2           30/06/2021 17:39:50  HYBRID          HYPERLOGLOG
P_B                  N2           30/06/2021 17:39:51  HYBRID          HYPERLOGLOG
P_C                  N2           30/06/2021 17:39:52  HYBRID          HYPERLOGLOG
P_D                  N2           30/06/2021 17:39:52  HYBRID          HIST_FOR_INCREM_STATS HYPERLOGLOG

In order to answer this question we need also to see the corresponding Notes of this column(N2) at the global level

select       
       column_name
      ,num_distinct
      ,density
      ,to_char(last_analyzed,'dd/mm/yyyy hh24:mi:ss') last_anal
      ,histogram
      ,notes
from
      all_tab_col_statistics
where owner     = 'C##MHOURI'
and table_name  = 'T1_PART'
and column_name ='N2';


COLUMN_NAME  NUM_DISTINCT    DENSITY LAST_ANAL            HISTOGRAM       NOTES
------------ ------------ ---------- -------------------- --------------- ------------
N2                   1031    .000814 30/06/2021 17:39:55  HYBRID          INCREMENTAL

When Oracle uses statistics at global level, the cursor will be marked bind sensitive as the following proves:

SQL> select count(1) from t1_part where n2=100;

  COUNT(1)
----------
      1725

SQL_ID  a7car6agnw2xd, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST ALL|         |   571 |  2284 |     1 |     4 | -- global level
|*  3 |    TABLE ACCESS FULL| T1_PART |   571 |  2284 |     1 |     4 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: a7car6agnw2xd

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
a7car6agnw2xd      2324535125            0 Y          2 -- cursor is bind sensitive

When Oracle decides to visit any of the first 3 partitions, the cursor will be also marked bind sensitive as shown below:

SQL> select count(1) from t1_part partition(p_a) where n2=100;

  COUNT(1)
----------
       420

SQL_ID  cz3vd6pyh783s, child number 0
-------------------------------------
--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |       |       |       |       |
|   1 |  SORT AGGREGATE        |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST SINGLE|         |   724 |  2896 |     1 |     1 |
|*  3 |    TABLE ACCESS FULL   | T1_PART |   724 |  2896 |     1 |     1 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: cz3vd6pyh783s

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
cz3vd6pyh783s      2382952772            0 Y          1 -- cursor is bind sensitive

However, when Oracle visits the p_d partition using this particular N2 column, the underlying cursor will not be marked bind sensitive in this case:

SQL> select count(1) from t1_part partition(p_d) where n2=100;

  COUNT(1)
----------
       465

SQL> @xpsimp

SQL_ID  2aju26ccnxrzp, child number 0
-------------------------------------
-------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes |Pstart| Pstop |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |       |       |      |       |
|   1 |  SORT AGGREGATE        |         |     1 |     4 |      |       |
|   2 |   PARTITION LIST SINGLE|         |   490 |  1960 |    4 |     4 |
|*  3 |    TABLE ACCESS FULL   | T1_PART |   490 |  1960 |    4 |     4 |
-------------------------------------------------------------------------
														  
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: 2aju26ccnxrzp

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
2aju26ccnxrzp      2382952772            0 N          2 -- cursor is not bind sensitive

Spot, in passing, how I have managed to force a single partition scan without using the partition key column (pkey) in the predicate part. I proceeded as such because, whenever the partition key is present in the predicate part, the underlying cursor will be marked bind sensitive regardless of the Notes column that goes along with the N2 predicate or even with the partition key as the following proves:

SQL> select count(1) from t1_part where n2=100 and pkey='D';

  COUNT(1)
----------
       465

SQL> @xpsimp

SQL_ID  dfr7cz8640wav, child number 0
-------------------------------------

--------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Pstart| Pstop |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |       |       |       |       |
|   1 |  SORT AGGREGATE        |         |     1 |     6 |       |       |
|   2 |   PARTITION LIST SINGLE|         |   490 |  2940 |   KEY |   KEY |
|*  3 |    TABLE ACCESS FULL   | T1_PART |   490 |  2940 |   KEY |   KEY |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N2"=:SYS_B_1)


SQL> @gvsql
Enter value for sql_id: dfr7cz8640wav

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER I EXECUTIONS
------------- --------------- ------------ - ----------
dfr7cz8640wav       573263709            0 Y          2 --> cursor is bind sensitive

Impact n°2

The second impact is that the histogram will be ignored when computing cardinality estimation. This is why the cardinality estimation of column N_800 is not very precise. Here’s the corresponding 10053 trace file:

SQL> select count(1) from t1_part where n_800=14;

  COUNT(1)
----------
      2072

SQL> @xpsimp

SQL_ID  b2b4nm8tqqukg, child number 0
-------------------------------------
-----------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Pstart| Pstop |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |       |       |       |       |
|   1 |  SORT AGGREGATE     |         |     1 |     4 |       |       |
|   2 |   PARTITION LIST ALL|         |  1958 |  7832 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| T1_PART |  1958 |  7832 |     1 |     4 | -- not a very good Estimation
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("N_800"=:SYS_B_1)

kkecdn: Single Table Predicate:"T1_PART"."N_800"=:B1
NOTE: ignoring histogram of column (T1_PART.N_800) -------> it ignores the Hybrid Histogram
      used only for incremental stats maintenance  -------> spot this
  Column (#5): N_800(NUMBER)
    AvgLen: 4 NDV: 817 Nulls: 0 Density: 0.001224 Min: 0.000000 Max: 799.000000
  Estimated selectivity: 0.001224 , col: #5
  Table: T1_PART  Alias: T1_PART
    Card: Original: 1600000.000000  Rounded: 1958  Computed: 1958.384333  Non Adjusted: 1958.384333
  Scan IO  Cost (Disk) =   2183.000000
  Scan CPU Cost (Disk) =   425370320.640000

Instead of using this formula to get the cardinality estimation of a predicate having HYBRID histogram, Oracle simply has ignored this histogram and used the following classical cardinality estimation:

E-Rows = sample_size/num_distinct 

SELECT
    sample_size / num_distinct e_rows
FROM
    user_tab_col_statistics
WHERE
        table_name = 'T1_PART'
    AND column_name = 'N_800';

    E_ROWS
----------
1958.38433

Conclusion

I think that from now and on, the Notes column should always go along with the histogram column (both in tab_col_statistics and part_col_statistics) as this can clearly remove the confusion of why a cursor is not bind sensitive. It can also explain why Oracle is ignoring histograms during its cardinality estimation when it looks like it shouldn’t ignore them.

PS1-here is below the model I built to reproduce the above case and for further incoming blogs about this subject

create table t1
as
select
     rownum n1	
	,trunc(dbms_random.value(0,50))  n_50
    ,trunc(dbms_random.value(0,800)) n_800    
	,trunc(50* dbms_random.normal)   n2
from
    (select /*+ materialize */ rownum from dba_objects where rownum <= 3000),
    (select /*+ materialize */ rownum from dba_objects where rownum <= 3000)
where
    rownum <= 4e5
;

create table t1_part
  (pkey varchar2(11)
  ,n1    number 
  ,n_50  number
  ,n_800 number
  ,n2    number
  )
 partition by list(pkey)
                (partition p_a values ('A')
				,partition p_b values ('B')
				,partition p_c values ('C')
				,partition p_d values ('D')
				);

insert into t1_part select 'A', t1.* from t1;
insert into t1_part select 'B', t1.* from t1;
insert into t1_part select 'C', t1.* from t1;
insert into t1_part select 'D', t1.* from t1;

-- create skew into n2 column
update t1_part set n2=n_800 where pkey='D';

--set incremental mode
execute dbms_stats.set_table_prefs(user, 't1_part', 'incremental', 'true');

-- gather statistics
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_a', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_b', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_c', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', partname =>'p_d', method_opt => 'for all columns size skewonly', granularity => 'PARTITION');
exec dbms_stats.gather_table_stats(user, 't1_part', method_opt => 'for all columns size skewonly', granularity => 'GLOBAL');

PS2-scripts used in this blog post can be found here

December 21, 2020

Disk file operations I/O

Filed under: Oracle — hourim @ 9:58 am

When Disk file operations I/O wait event starts showing at the top of the AWR Top 10 Foreground Events by Total Wait Time it can consume a lot of CPU and freeze your database. Since I have gone through that myself a couple of weeks ago, I thought it might be interesting to write a succinct blog article explaining how to find and fix the root cause of this non-common wait event.

When I started investigating the real-life case all that I have on my hand was the p1 and p3 parameters of this wait event:

p1=8 corresponds to wait for miscellaneous io (ftp, block dump, passwd file)
p3=8 corresponds to any other type of file

To say the least I have been unfortunate with this number.

As I was searching online, I came across a forum question where one of the contributors was pointing to an audit file type and operation. And from there I built the following simple model

SQL> show parameter audit

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
audit_file_dest                      string      /u01/app/oracle/product/19c/dbhome_1/rdbms/audit                                                
audit_sys_operations                 boolean     TRUE
audit_syslog_level                   string
audit_trail                          string      OS
unified_audit_common_systemlog       string
unified_audit_sga_queue_size         integer     1048576
unified_audit_systemlog              string

AUDIT ALL BY c##test BY ACCESS;
AUDIT SELECT TABLE, UPDATE TABLE, INSERT TABLE, DELETE TABLE BY c##test BY ACCESS;
AUDIT EXECUTE PROCEDURE BY c##test BY ACCESS;

create table t1 as select rownum n1, mod(rownum,2) n2, lpad('x',10) vc from dual connect by level <=1e5;

create or replace procedure p1
as
   cursor c1 is
 	select *
    from t1;
begin
  for r1 in c1
  loop
    update t1 set vc = 'yz' where n1 = r1.n1;
    insert into t1 values (r1.n1, r1.n2, r1.vc);
    delete from t1 where n1=r1.n1 and n2=r1.n2 and vc =r1.vc;
   end loop;
commit;
end p1;

I’ve set the audit trail parameter at OS level and I managed to audit several DML operations done by the c##test user.

All that I have to do now to reproduce the Disk file operations I/O wait event on the p1 and p3 values set to 8 is to execute the p1 procedure as shown below

  SQL> exec p1();

While the session executing the p1 procedure (sid=205) was running I snapped it to see what it was doing behind the scene

SQL> @snapper all 5 1 205
Sampling SID 205 with interval 5 seconds, taking 1 snapshots...

-- Session Snapper v4.30 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) 
- Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! 

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                            ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    205, C##TEST   , STAT, opened cursors cumulative            ,          1859,     188.24,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, recursive calls                      ,          1866,     188.95,         ,             ,          ,           ,     1.28ms recursive CPU per recursive call
    205, C##TEST   , STAT, recursive cpu usage                  ,           238,       24.1,         ,             ,          ,           ,      2.38s total recursive CPU usage
    205, C##TEST   , STAT, session logical reads                ,        413327,     41.85k,         ,             ,          ,           ,    414.55k total buffer visits
    205, C##TEST   , STAT, CPU used by this session             ,           241,       24.4,         ,             ,          ,           ,      2.41s total CPU used by this session
    205, C##TEST   , STAT, user I/O wait time                   ,             6,        .61,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, non-idle wait time                   ,             6,        .61,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, non-idle wait count                  ,          1858,     188.14,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, physical read total IO requests      ,             1,         .1,         ,             ,          ,           ,     16.38k bytes per request
    205, C##TEST   , STAT, physical read total bytes            ,         16384,      1.66k,         ,             ,          ,           ,       8.94 per execution
    205, C##TEST   , STAT, cell physical IO interconnect bytes  ,         16384,      1.66k,         ,             ,          ,           ,       8.94 per execution
    205, C##TEST   , STAT, db block gets                        ,          1911,      193.5,         ,             ,          ,           ,       1.04 per execution
    205, C##TEST   , STAT, db block gets from cache             ,          1911,      193.5,         ,             ,          ,           ,       1.04 per execution
    205, C##TEST   , STAT, db block gets from cache (fastpath)  ,           644,      65.21,         ,             ,          ,           ,        .35 per execution
    205, C##TEST   , STAT, consistent gets                      ,        412820,      41.8k,         ,             ,          ,           ,     225.34 per execution
    205, C##TEST   , STAT, consistent gets from cache           ,        412811,      41.8k,         ,             ,          ,           ,     225.33 per execution
    205, C##TEST   , STAT, consistent gets pin                  ,        412811,      41.8k,         ,             ,          ,           ,     225.33 per execution
    205, C##TEST   , STAT, consistent gets pin (fastpath)       ,        412811,      41.8k,         ,             ,          ,           ,     225.33 per execution
    205, C##TEST   , STAT, consistent gets direct               ,             9,        .91,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, logical read bytes from cache        ,    3397402624,    344.01M,         ,             ,          ,           ,      1.85M per execution
    205, C##TEST   , STAT, physical reads                       ,             2,         .2,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, physical reads direct                ,             2,         .2,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, physical read IO requests            ,             1,         .1,         ,             ,          ,           ,     16.38k bytes per request
    205, C##TEST   , STAT, physical read bytes                  ,         16384,      1.66k,         ,             ,          ,           ,       8.94 per execution
    205, C##TEST   , STAT, db block changes                     ,          3749,     379.61,         ,             ,          ,           ,       2.05 per execution
    205, C##TEST   , STAT, change write time                    ,            49,       4.96,         ,             ,          ,           ,        .03 per execution
    205, C##TEST   , STAT, free buffer requested                ,          1265,     128.09,         ,             ,          ,           ,        .69 per execution
    205, C##TEST   , STAT, dirty buffers inspected              ,            10,       1.01,         ,             ,          ,           ,        .01 per execution
    205, C##TEST   , STAT, pinned buffers inspected             ,             1,         .1,         ,             ,          ,           ,  .55*10^-3 per execution
    205, C##TEST   , STAT, hot buffers moved to head of LRU     ,            65,       6.58,         ,             ,          ,           ,        .04 per execution
    205, C##TEST   , STAT, free buffer inspected                ,           123,      12.45,         ,             ,          ,           ,        .07 per execution
    205, C##TEST   , STAT, switch current to new buffer         ,          1240,     125.56,         ,             ,          ,           ,        .68 per execution
    205, C##TEST   , STAT, switch current caused by our pin     ,          1240,     125.56,         ,             ,          ,           ,        .68 per execution
    205, C##TEST   , STAT, shared hash latch upgrades - no wait ,           726,      73.51,         ,             ,          ,           ,         .4 per execution
    205, C##TEST   , STAT, calls to kcmgcs                      ,          3714,     376.07,         ,             ,          ,           ,       2.03 per execution
    205, C##TEST   , STAT, calls to kcmgas                      ,             4,        .41,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, calls to get snapshot scn: kcmgss    ,          1855,     187.83,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, redo entries                         ,          1856,     187.93,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, redo size                            ,        521376,     52.79k,         ,             ,          ,           ,          ~ bytes per user commit
    205, C##TEST   , STAT, redo ordering marks                  ,             4,        .41,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, file io wait time                    ,         22467,      2.27k,         ,             ,          ,           ,    22.47ms bad guess of IO wait time per IO request
    205, C##TEST   , STAT, Number of read IOs issued            ,             1,         .1,         ,             ,          ,           ,  .55*10^-3 per execution
    205, C##TEST   , STAT, undo change vector size              ,        182320,     18.46k,         ,             ,          ,           ,      99.52 per execution
    205, C##TEST   , STAT, no work - consistent read gets       ,        406064,     41.12k,         ,             ,          ,           ,     221.65 per execution
    205, C##TEST   , STAT, table scans (long tables)            ,          1231,     124.65,         ,             ,          ,           ,        .67 per execution
    205, C##TEST   , STAT, table scan rows gotten               ,     123963294,     12.55M,         ,             ,          ,           ,     67.67k per execution
    205, C##TEST   , STAT, table scan disk non-IMC rows gotten  ,     123898306,     12.55M,         ,             ,          ,           ,     67.63k per execution
    205, C##TEST   , STAT, table scan blocks gotten             ,        404643,     40.97k,         ,             ,          ,           ,     220.88 per execution
    205, C##TEST   , STAT, heap block compress                  ,             6,        .61,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, HSC Heap Segment Block Changes       ,          1853,     187.63,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, session cursor cache hits            ,          1840,     186.31,         ,             ,          ,           ,      1.84k softparses avoided thanks to cursor cache
    205, C##TEST   , STAT, buffer is pinned count               ,          1222,     123.74,         ,             ,          ,           ,        .29 % buffer gets avoided thanks to buffer pin caching
    205, C##TEST   , STAT, buffer is not pinned count           ,          2446,     247.68,         ,             ,          ,           ,       1.34 per execution
    205, C##TEST   , STAT, execute count                        ,          1832,      185.5,         ,             ,          ,           ,          ~ executions per parse
    205, C##TEST   , TIME, PL/SQL execution elapsed time        ,         11791,     1.19ms,      .1%, [          ],          ,           ,
    205, C##TEST   , TIME, DB CPU                               ,       2460784,   249.17ms,    24.9%, [@@@       ],          ,           ,
    205, C##TEST   , TIME, sql execute elapsed time             ,       8059797,   816.12ms,    81.6%, [######### ],          ,           ,
    205, C##TEST   , TIME, DB time                              ,       8059797,   816.12ms,    81.6%, [######### ],          ,           ,      18.39 % unaccounted time
    205, C##TEST   , WAIT, Disk file operations I/O             ,         46307,     4.69ms,      .5%, [          ],      1828,      185.1,    25.33us average wait ----------> here
    205, C##TEST   , WAIT, ASM IO for non-blocking poll         ,             8,      .81us,      .0%, [          ],         1,         .1,        8us average wait

--  End of Stats snap 1, end=2020-12-20 09:02:36, seconds=9.9

------------------------------------------------------------------------------------------
  ActSes   %Thread | INST | SQL_ID          | SQL_CHILD | EVENT          | WAIT_CLASS
------------------------------------------------------------------------------------------
     .48     (48%) |    1 | fnbt9ts7bsqzn   | 0         | ON CPU         | ON CPU
     .48     (48%) |    1 | 7wv0zz7965u54   | 0         | ON CPU         | ON CPU
     .03      (3%) |    1 | 1nf0v48kt3a30   | 0         | ON CPU         | ON CPU
     .03      (3%) |    1 | 8myfvr1fb8gju   | 0         | ON CPU         | ON CPU

--  End of ASH snap 1, end=2020-12-20 09:02:36, seconds=5, samples_taken=40, AAS=1


SQL> @snapper ash 5 1 205
Sampling SID 205 with interval 5 seconds, taking 1 snapshots...

-- Session Snapper v4.30 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) 
- Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet!
----------------------------------------------------------------------------------------------------
  ActSes   %Thread | INST | SQL_ID          | SQL_CHILD | EVENT                    | WAIT_CLASS
----------------------------------------------------------------------------------------------------
     .46     (46%) |    1 | fnbt9ts7bsqzn   | 0         | ON CPU                   | ON CPU
     .46     (46%) |    1 | 7wv0zz7965u54   | 0         | ON CPU                   | ON CPU
     .03      (3%) |    1 | 1nf0v48kt3a30   | 0         | ON CPU                   | ON CPU
     .03      (3%) |    1 | fnbt9ts7bsqzn   | 0         | Disk file operations I/O | User I/O ----------> here
     .03      (3%) |    1 | 8myfvr1fb8gju   | 0         | ON CPU                   | ON CPU

As you can see there is a certain proportion of time spent waiting on Disk file operations I/O. And, naturally, the p1 and p3 values, in this case, are respectively 8 and 8 as the followings prove:

SQL>  @ashtopSID
Enter value for sid: 205
Enter value for from: 12-20 09:00:00
Enter value for to: 12-20 09:15:00

  Total                                                                                                                                                                        
  Seconds     AAS %This   SQL_ID        SQL_OPNAME      TOP_LEVEL_SQL EVENT2                     P1TEXT           P1 P3TEXT        P3 
--------- ------- ------- ------------- --------------- ------------- -------------------------- --------------- --- ----------- ---- 
      309      .3   45% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
      306      .3   44% | fnbt9ts7bsqzn UPDATE          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
       24      .0    3% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
       13      .0    2% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju db file sequential read    file#             1 blocks         1 
        8      .0    1% | fnbt9ts7bsqzn UPDATE          8myfvr1fb8gju db file sequential read    file#             1 blocks         1 
        7      .0    1% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju Disk file operations I/O   FileOperation     8 filetype       8 ----------> here
        6      .0    1% | fnbt9ts7bsqzn UPDATE          8myfvr1fb8gju Disk file operations I/O   FileOperation     8 filetype       8 ----------> here
        5      .0    1% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju Disk file operations I/O   FileOperation     8 filetype       8 ----------> here
        3      .0    0% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju db file scattered read     file#             1 blocks         3 
        3      .0    0% |                               8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
        2      .0    0% | 0jjc60pmrntdv SELECT          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
        1      .0    0% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju ON CPU                     file#             1 blocks         1 
        1      .0    0% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju db file sequential read    file#             5 blocks         1 
        1      .0    0% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju resmgr:cpu quantum         location          3                0 
        1      .0    0% | 8myfvr1fb8gju PL/SQL EXECUTE  8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8

Interestingly ASH shows that the CPU consumed by Oracle is also related to the manipulation of files of type 8. I don’t know really how to correctly interpret these p1 and p3 values in the context of CPU


select decode(event,null, 'on cpu', event), p1,p3, count(1)
    from dba_hist_active_sess_history
    where    sample_time between to_date('20122020 09:00:00', 'ddmmyyyy hh24:mi:ss')
                and     to_date('20122020 09:15:00', 'ddmmyyyy hh24:mi:ss')
group by decode(event,null, 'on cpu', event), p1,p3 
order by 4 desc;

DECODE(EVENT,NULL,'ONCPU',EVENT)          P1         P3   COUNT(1)
--------------------------------  ---------- ---------- ----------
on cpu                                     8          8         64
db file sequential read                    1          1         10
on cpu                                   100          0          6
Disk file operations I/O                   8          8          4
on cpu                                     1          1          4
db file scattered read                     3          3          3
on cpu                                     0          0          3
control file sequential read               0          1          3
on cpu                                     3          1          2
ASM file metadata operation                0          0          2
on cpu                               2490368          0          1
db file parallel write                     1 2147483647          1
db file async I/O submit                 106          0          1
on cpu                                     3          0          1
on cpu                                     2          1          1
on cpu                                131072          0          1
on cpu                                     5          1          1
on cpu                                 65536          0          1
oracle thread bootstrap              5910576          0          1
on cpu                                262144          0          1

Summary

Whenever you see a Disk file operations I/O wait event consuming a lot of time and CPU then get the p1,p2,p3 values from ASH or AWR


SQL> @sed "Disk file operations"
Show wait event descriptions matching %Disk file operations%..

EVENT_NAME                     WAIT_CLASS           PARAMETER1           PARAMETER2           PARAMETER3
------------------------------ -------------------- -------------------- -------------------- --------------------
Disk file operations I/O       User I/O             FileOperation        fileno               filetype

If p1 and p3 values equal to 8 and 8 respectively then check if you are, inadvertently, auditing too many repetitive actions by means of the following select


SQL>  select user_name, audit_option from DBA_STMT_AUDIT_OPTS where user_name='C##TEST';

USER_NAME                 AUDIT_OPTION
------------------------- ----------------------------------------
C##TEST                   ALTER SYSTEM
C##TEST                   SYSTEM AUDIT
C##TEST                   CREATE SESSION
C##TEST                   TABLE
C##TEST                   CLUSTER
C##TEST                   TABLESPACE
C##TEST                   USER
C##TEST                   ROLLBACK SEGMENT
C##TEST                   TYPE
C##TEST                   SYNONYM
C##TEST                   PUBLIC SYNONYM
C##TEST                   VIEW
C##TEST                   SEQUENCE
C##TEST                   DATABASE LINK
C##TEST                   PUBLIC DATABASE LINK
C##TEST                   ROLE
C##TEST                   DIMENSION
C##TEST                   PROCEDURE
C##TEST                   TRIGGER
C##TEST                   PROFILE
C##TEST                   DIRECTORY
C##TEST                   MATERIALIZED VIEW
C##TEST                   MINING MODEL
C##TEST                   INDEX
C##TEST                   SELECT TABLE
C##TEST                   INSERT TABLE
C##TEST                   UPDATE TABLE
C##TEST                   DELETE TABLE
C##TEST                   NOT EXISTS
C##TEST                   SYSTEM GRANT
C##TEST                   EXECUTE PROCEDURE
C##TEST                   CONTEXT
C##TEST                   SQL TRANSLATION PROFILE
C##TEST                   LOCKDOWN PROFILE
C##TEST                   ATTRIBUTE DIMENSION
C##TEST                   HIERARCHY
C##TEST                   ANALYTIC VIEW
C##TEST                   READ DIRECTORY
C##TEST                   WRITE DIRECTORY
C##TEST                   EXECUTE DIRECTORY

If so then reduce the number of audited actions if you want to reduce the impact of this wait event.
In passing, I think that this wait event would not have shown up at the TOP if I have set the audit trail parameter value to DB. I left this not-demonstrated though.

Comments (12)

November 29, 2020

OR-Expansion by-passed: top query block of a DML

Filed under: Oracle — hourim @ 4:48 pm

Here’s an interesting OR-Expansion limit which reared its head a little bit after a 12cR1-to-19c Oracle upgrade. It happens during a Create Table As Select (CTAS) which was using a CONCATENATION operation under 12cR1 completing in a couple of seconds. However, under 19c version the same CTAS uses a dramatic NESTED LOOPS to the detriment of the more powerful OR-Expansion and took more than 6 hours.

Here’re below the 12cR1 and 19c execution plan respectively

------------------------------------------------------------------------
| Id  | Operation                        | Name       | Rows  |Time     |
------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT           |            |       |         |
|   1 |  LOAD AS SELECT                  | TARGET_T   |       |         |
|   2 |   OPTIMIZER STATISTICS GATHERING |            | 71227 |00:00:05 |
|   3 |    HASH UNIQUE                   |            | 71227 |00:00:05 |
|   4 |     CONCATENATION                |            |       |         |
|   5 |      HASH JOIN                   |            | 38426 |00:00:03 |
|   6 |       TABLE ACCESS STORAGE FULL  | SOURCE_T2  | 48067 |00:00:01 |
|   7 |       TABLE ACCESS STORAGE FULL  | SOURCE_T1  |    16M|00:00:03 |
|   8 |      HASH JOIN                   |            | 32801 |00:00:03 |
|   9 |       TABLE ACCESS STORAGE FULL  | SOURCE_T2  | 48067 |00:00:01 |
|  10 |       TABLE ACCESS STORAGE FULL  | SOURCE_T1  |    11M|00:00:03 |
-------------------------------------------------------------------------

-------------------------------------------------------------------------
| Id  | Operation                        | Name       | Rows  | Time     |
-------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT           |            |       |          |
|   1 |  LOAD AS SELECT                  | TARGET_T   |       |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |            | 71227 | 08:53:39 |
|   3 |    HASH UNIQUE                   |            | 71227 | 08:53:39 |
|   4 |     NESTED LOOPS                 |            | 71227 | 08:53:39 |
|   5 |      TABLE ACCESS STORAGE FULL   | SOURCE_T1  |    24M| 00:00:03 |
|   6 |      TABLE ACCESS STORAGE FULL   | SOURCE_T2  |     1 | 00:00:01 |
-------------------------------------------------------------------------

There is a simple root cause for this performance deterioration: OR-Expansion can’t be used by Oracle during a DDL operation.

You can locate the root cause of this transformation limit by using the or_expand hint and observing the generated Hint Report at the bottom of the corresponding execution plan:

create table t1 (n1 number, n2 number, c1 varchar2(10));
create index t1_idx1 on t1(n1, n2);
create index t1_idx2 on t1(n2);

create table t2 as
select
   rownum id
  ,trunc((rownum-1/6)) n2
  ,trunc((rownum-1/8)) n3
  ,mod(rownum,5) n4
  ,lpad('z', 4) vc
from
 dual
connect by level <=1e3;

  
exec dbms_stats.gather_table_stats(user, 't1')
exec dbms_stats.gather_table_stats(user, 't2');

SQL>  explain plan for
	create table t1_bis
	 as
	   select /*+ or_expand */
		t1.*
	   from
		t1, t2
	   where
		t1.n1 = t2.id
	  or
	  t1.n1 = t2.n2;

Explained.

-----------------------------------------------------------
| Id  | Operation                        | Name   | Rows  |
-----------------------------------------------------------
|   0 | CREATE TABLE STATEMENT           |        |     1 |
|   1 |  LOAD AS SELECT                  | T1_BIS |       |
|   2 |   OPTIMIZER STATISTICS GATHERING |        |     1 |
|   3 |    NESTED LOOPS                  |        |     1 |
|   4 |     TABLE ACCESS FULL            | T1     |     1 |
|*  5 |     TABLE ACCESS FULL            | T2     |     1 |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("T1"."N1"="T2"."ID" OR "T1"."N1"="T2"."N2")

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$1
         U -  or_expand / Top query block of a DML  -------> Note the reason for not using the hint

You can also get the reason for which Oracle by-passed the or-expansion for the CTAS case via the corresponding 10053

**************************
Query transformations (QT)
**************************
TE: Checking validity of table expansion for query block SEL$1 (#0)
TE: Bypassed: Top query block of a DML.
ORE: Checking validity of OR Expansion for query block SEL$1 (#0)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."N1"="T2"."ID" OR "T1"."N1"="T2"."N2"
ORE: Predicate chain after QB validity check - SEL$1
"T1"."N1"="T2"."ID" OR "T1"."N1"="T2"."N2"
ORE: bypassed - Top query block of a DML. ---> Note the by-pass reason

While the OR-expansion transformation is by passed during the CTAS it is, however, used for the select part in 19c:

explain plan for
   select /*+ or_expand */
       t1.*
  from
    t1, t2
  where
    t1.n1 = t2.id
  or
    t1.n1 = t2.n2;

Explained.

--------------------------------------------------------
| Id  | Operation            | Name            | Rows  |
--------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |     2 |
|   1 |  VIEW                | VW_ORE_F79C84EE |     2 |
|   2 |   UNION-ALL          |                 |       |
|*  3 |    HASH JOIN         |                 |     1 |
|   4 |     TABLE ACCESS FULL| T1              |     1 |
|   5 |     TABLE ACCESS FULL| T2              |  1000 |
|*  6 |    HASH JOIN         |                 |     1 |
|   7 |     TABLE ACCESS FULL| T1              |     1 |
|   8 |     TABLE ACCESS FULL| T2              |  1000 |
--------------------------------------------------------
/*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SET$2A13AF86_2" "T2"@"SET$2A13AF86_2")
      LEADING(@"SET$2A13AF86_2" "T1"@"SET$2A13AF86_2" "T2"@"SET$2A13AF86_2")
      FULL(@"SET$2A13AF86_2" "T2"@"SET$2A13AF86_2")
      FULL(@"SET$2A13AF86_2" "T1"@"SET$2A13AF86_2")
      USE_HASH(@"SET$2A13AF86_1" "T2"@"SET$2A13AF86_1")
      LEADING(@"SET$2A13AF86_1" "T1"@"SET$2A13AF86_1" "T2"@"SET$2A13AF86_1")
      FULL(@"SET$2A13AF86_1" "T2"@"SET$2A13AF86_1")
      FULL(@"SET$2A13AF86_1" "T1"@"SET$2A13AF86_1")
      NO_ACCESS(@"SEL$9162BF3C" "VW_ORE_F79C84EE"@"SEL$F79C84EE")
      OUTLINE(@"SEL$1")
      OR_EXPAND(@"SEL$1" (1) (2))
      OUTLINE_LEAF(@"SEL$9162BF3C")
      OUTLINE_LEAF(@"SET$2A13AF86")
      OUTLINE_LEAF(@"SET$2A13AF86_1")
      OUTLINE_LEAF(@"SET$2A13AF86_2")
      ALL_ROWS
      OPT_PARAM('_optimizer_use_feedback' 'false')
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N1"="T2"."ID")
   6 - access("T1"."N1"="T2"."N2")
       filter(LNNVL("T1"."N1"="T2"."ID"))

Workaround

To work around this limit, you can set back the optimizer features enable (OFE) to the previous version (12cR1 or less) where the OR-Expansion was not yet implemented so that your CTAS might be able to use the CONCATENATION operation. Or, you can opt for the following workaround provided by Mauro Pagano(@Mautro)

SQL> explain plan for
   create table t1_bis
   as
   select *
   from
   ( select /*+ no_merge
               or_expand
            */
        t1.*
   from
     t1, t2
   where
     t1.n1 = t2.id
   or
     t1.n1 = t2.n2
     );

Explained.

----------------------------------------------------------------------------
| Id  | Operation                        | Name            | Rows  | Bytes |
----------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT           |                 |     2 |    66 |
|   1 |  LOAD AS SELECT                  | T1_BIS          |       |       |
|   2 |   OPTIMIZER STATISTICS GATHERING |                 |     2 |    66 |
|   3 |    VIEW                          |                 |     2 |    66 |
|   4 |     VIEW                         | VW_ORE_4C1C5B2C |     2 |    66 |
|   5 |      UNION-ALL                   |                 |       |       |
|*  6 |       HASH JOIN                  |                 |     1 |    37 |
|   7 |        TABLE ACCESS FULL         | T1              |     1 |    33 |
|   8 |        TABLE ACCESS FULL         | T2              |  1000 |  4000 |
|*  9 |       HASH JOIN                  |                 |     1 |    41 |
|  10 |        TABLE ACCESS FULL         | T1              |     1 |    33 |
|  11 |        TABLE ACCESS FULL         | T2              |  1000 |  8000 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("T1"."N1"="T2"."ID")
   9 - access("T1"."N1"="T2"."N2")
       filter(LNNVL("T1"."N1"="T2"."ID"))

It consists of wrapping the select part into an inline view and pre-empting this inline view from being merged with the DDL part of the CTAS. As such the SELECT part becomes a valid candidate for the OR-Expansion transformation.

Oddity

When I tried to put the or_expand hint at the create table level here’s what I got


SQL> select banner_full from v$version;

BANNER_FULL
--------------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.8.0.0.0

SQL> explain plan for
create /*+ or_expand */ table t1_bis
as
  select
     t1.*
from
  t1, t2
where
  t1.n1 = t2.id
or
  t1.n1 = t2.n2;  2    3    4    5    6    7    8    9   10   11
explain plan for
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 8658
Session ID: 198 Serial number: 5465

With the following error in the alert log

ORA-07445: exception encountered: core dump [qkshtQBStore()+219] [SIGSEGV] [ADDR:0x258] [PC:0x2628B8B] [Address not mapped to object] []

Comments (2)

November 15, 2020

Null-Aware anti-join, parsing and _optimizer_squ_bottomup

Filed under: Oracle — hourim @ 10:29 am

I originally wrote about Null-Aware anti join in 2017 just as something I’d keep seeing several times at client sites. But it turned out that it was rather to explain Null-Accepting Semi-Join. For the sake of clarity let me show, below, two execution plans where these two CBO transformations are in action respectively:

----------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |
----------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |
|   1 |  SORT AGGREGATE     |      |     1 |     6 |
|*  2 |   HASH JOIN ANTI NA |      |     1 |     6 | -- Null-Aware ANTI Join
|   3 |    TABLE ACCESS FULL| T1   |    10 |    30 |
|   4 |    TABLE ACCESS FULL| T2   |    10 |    30 |
----------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 – access("T1"."N1"="N1")

----------------------------------------------------
----------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |
----------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |
|   1 |  SORT AGGREGATE     |      |     1 |     6 |
|*  2 |   HASH JOIN SEMI NA |      |     7 |    42 | --Null Accepting SEMI-join
|   3 |    TABLE ACCESS FULL| T1   |    10 |    30 |
|   4 |    TABLE ACCESS FULL| T2   |    10 |    30 |
----------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."N1"="T1"."N1")

A simple way to don’t get confused is to remember this:

Null-Aware Anti Join is for rows that don’t join (ANTI)
Null-Accepting Semi join is for rows that (SEMI) join.

In this blog post, I will try to show how fixing a really weird real life parsing issue by changing the value of the _optimizer_squ_bottomup parameter has created a performance issue in another query. I will demonstrate that this bad side effect occurred because the Null-Aware anti-join transformation is by-passed by Oracle under a non-default value of this hidden parameter

SQL> select banner_full from gv$version;

BANNER_FULL
----------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

create table t1 as select rownum n1, trunc((rownum -1/5)) n2 from dual connect by level <= 1e6; 
create table t2 as select rownum n1, trunc((rownum -1/3)) n2 from dual connect by level <= 1e6;
 
update t1 set n1 = null where n1<=1e3;
 
exec dbms_stats.gather_table_stats(user, 't1');
exec dbms_stats.gather_table_stats(user, 't2');


SELECT 
   count(1)
FROM t1
   WHERE t1.n1 NOT IN (select n1 from t2);
   
Elapsed: 00:00:00.31 -----> only 31 ms

Execution Plan
----------------------------------------------------------
Plan hash value: 1650861019
------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc|
------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    10 |       |
|   1 |  SORT AGGREGATE     |      |     1 |    10 |       |
|*  2 |   HASH JOIN ANTI NA |      |  1000K|  9765K|    16M|
|   3 |    TABLE ACCESS FULL| T1   |  1000K|  4882K|       |
|   4 |    TABLE ACCESS FULL| T2   |  1000K|  4882K|       |
------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."N1"="N1")

Statistics
------------------------------------------
          0  recursive calls
          0  db block gets
       4154  consistent gets
          0  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        430  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

As you can see, Oracle has used a Null-Aware anti join (HASH JOIN ANTI NA) and executed the query in a couple of milliseconds with 4k of consistent gets

But, if you come to encounter a parsing issue (mainly in 12cR2) and you workaround it by changing the following parameter:

SQL> alter session set "_optimizer_squ_bottomup" = false;

Then see how your original query will behave:

SELECT count(1) FROM t1 WHERE t1.n1 NOT IN (select n1 from t2)

Global Information
------------------------------
 Status              :  EXECUTING             
 Instance ID         :  1                     
 Session             :  C##MHOURI (274:59197) 
 SQL ID              :  7009s3j53bdgv         
 SQL Execution ID    :  16777226              
 Execution Started   :  11/14/2020 14:43:59   
 First Refresh Time  :  11/14/2020 14:44:05   
 Last Refresh Time   :  11/14/2020 15:00:43   
 Duration            :  1004s     ------> still in execution phase after 1004s            
 Module/Action       :  SQL*Plus/-            
 Service             :  orcl19c               
 Program             :  sqlplus.exe           

Global Stats
=========================================
| Elapsed |   Cpu   |  Other   | Buffer |
| Time(s) | Time(s) | Waits(s) |  Gets  |
=========================================
|    1003 |     982 |       21 |    50M |
========================================= 
SQL Plan Monitoring Details (Plan Hash Value=59119136)

=========================================================================================
| Id   |       Operation       | Name |  Rows   |   Time    | Start  | Execs |   Rows   |
|      |                       |      | (Estim) | Active(s) | Active |       | (Actual) |
=========================================================================================
|    0 | SELECT STATEMENT      |      |         |           |        |     1 |          |
|    1 |   SORT AGGREGATE      |      |       1 |           |        |     1 |          |
| -> 2 |    FILTER             |      |         |       999 |     +6 |     1 |        0 |
| -> 3 |     TABLE ACCESS FULL | T1   |      1M |       999 |     +6 |     1 |     219K |
| -> 4 |     TABLE ACCESS FULL | T2   |       1 |      1004 |     +1 |  218K |     218K |
=========================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( IS NULL)
   4 - filter(LNNVL("N1"<>:B1))

It used a dramatic execution plan.

And this is a perfect representation of what can happen in a real-life production system. The FILTER operation acts as a NESTED LOOPS operation would have done: the inner row source (operation id n°4) has been started 218K times. Moreover, the performance pain would have been even worse if your real-life query runs under an Exadata machine where the LNNVL function (used to overcome the always threatening null) will impeach smart scan from kicking in.

Here’s below a snippet of the 10053 when this transformation is used:

CBQT: Validity checks passed for 51rdr10tfdcb1.

Subquery removal for query block SEL$2 (#2)
RSW: Not valid for subquery removal SEL$2 (#2)
Subquery unchanged.
SU:   Transform ALL subquery to a null-aware antijoin.
SJC: Considering set-join conversion in query block SEL$5DA710D3 (#1)

AP: Adaptive joins bypassed for query block SEL$5DA710D3 due to null-aware anti-join

***************
Now joining: T2[T2]#1
***************
NL Join

Best:: JoinMethod: HashNullAwareAnti
       Cost: 2809.074624  Degree: 1  Resp: 2809.074624  Card: 1000000.000000 Bytes:

And here’s a snippet of the same trace file when this transformation is not used:

*************************************
  PARAMETERS WITH ALTERED VALUES
  ******************************
Compilation Environment Dump
_pga_max_size                       = 333600 KB
_optimizer_squ_bottomup             = false
_optimizer_use_feedback             = false
Bug Fix Control Environment 

Considering Query Transformations on query block SEL$1 (#0)
**************************
Query transformations (QT)
**************************
CBQT bypassed for query block SEL$1 (#0): Disabled by parameter.
CBQT: Validity checks failed for cw6z7z8cajdbq.

Can we infer that that changing the default value of _optimizer_squ_bottomup will cancel the Cost-Based Query Transformation?

Summary

Changing the _optimizer_squ_bottomup value at a global level will fix several parsing issues occurring because of a not yet published bug Bug 26661798: HIGH PARSE TIME FOR COMPLEX QUERY 12C but, bear in mind that it might also introduce a performance deterioration for queries using NOT IN predicate where the Null-Aware anti join transformation is invalidated by the non-default value of this hidden parameter .

Mohamed Houri’s Oracle Notes

February 5, 2023

SPM baseline and historical execution plans

January 22, 2023

Force matching signature

What impact can this have?

Bottom Line

August 15, 2022

enq: TM – contention [mode=3/5]

January 30, 2022

Or expansion: Linear

January 23, 2022

Or expansion: Two-pass, Linear or Greedy

July 28, 2021

July 2, 2021

Statistics: On the importance of the Notes column

Impact n°2

Conclusion

December 21, 2020

Disk file operations I/O

Summary

November 29, 2020

OR-Expansion by-passed: top query block of a DML

Workaround

Oddity

November 15, 2020

Null-Aware anti-join, parsing and _optimizer_squ_bottomup

Summary

Follow Blog via Email

Categories

Blog Stats

Blogs I Follow

Archives