Mohamed Houri’s Oracle Notes

November 28, 2019

DOP Downgrade

Filed under: Oracle — hourim @ 5:53 pm

Suppose you are confronted to a performance issue due to an initially optimized parallel execution plan that has been 100% DOP downgraded. Suppose also that this DOP downgrade is due to an Oracle Resource Manager max DOP limit directive.
Would you be able to point out this DOP downgrade root cause using gv$sql_monitor view? Would you be able as well to say, a posteriori (using AWR tables), that your issue was due to ORM max DOP limit?

This blog post aims to answer the above two questions.

1.Replaying the issue using a reproducible model

While it is not so straightforward to link a real-life performance issue to a DOP downgrade due to a resource manager maxim allowed DOP, it is nevertheless, very easy to model and reproduce this case at will.

RDEV>  create table t1 as select a.* from dba_objects a,dba_objects where rownum<= 1e6;

RDEV2> alter table t1 parallel 4;

RDEV2> select count(1) from t1 where owner >= 'A';

Parallel Execution Details (DOP=4 , Servers Allocated=4)
===================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    |
|                |       |         | Time(s) | Time(s) | Waits(s) |
===================================================================
| PX Coordinator | QC    |         |    0.12 |    0.01 |          |
| p000           | Set 1 |       1 |    0.20 |    0.03 |     0.15 |
| p001           | Set 1 |       2 |    0.20 |    0.03 |     0.15 |
| p002           | Set 1 |       3 |    0.20 |    0.04 |     0.15 |
| p003           | Set 1 |       4 |    0.20 |    0.04 |     0.12 |
===================================================================
-------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  |    TQ  |IN-OUT|
-------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |        |      |
|   1 |  SORT AGGREGATE        |          |     1 |        |      |
|   2 |   PX COORDINATOR       |          |       |        |      |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |  Q1,00 | P->S |
|   4 |     SORT AGGREGATE     |          |     1 |  Q1,00 | PCWP |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  Q1,00 | PCWC |
|*  6 |       TABLE ACCESS FULL| T1       |  1000K|  Q1,00 | PCWP |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("OWNER">='A')

Note
-----
   - Degree of Parallelism is 4 because of table property
 

As you can see, the Degree Of Parallelism 4 has been honored and the SQL query has been executed using Parallelism.
However, if under the hood, I create an Oracle Resource Manager which will limit the maximum DOP allowed to be used by the current user (RDEV) then the same query will be DOP downgraded causing performance pain like
in one of my real-life case (a blog post will follow):

BEGIN
-- clear pending area
  DBMS_RESOURCE_MANAGER.clear_pending_area();
  DBMS_RESOURCE_MANAGER.create_pending_area();
  
-- create resource plan
  DBMS_RESOURCE_MANAGER.create_plan(
    plan    => 'DOPDOWNGRAD_RPL',
    comment => 'Resource Plan limiting DOP');
    
-- create consumer group
  DBMS_RESOURCE_MANAGER.create_consumer_group(
    consumer_group => 'DOPDONWGRAD_RSG',
    comment        => 'LOW Parallel Priority group');

-- assign the DOP limitation to the consumer group
  DBMS_RESOURCE_MANAGER.create_plan_directive (
    plan                     => 'DOPDOWNGRAD_RPL',
    group_or_subplan         => 'DOPDONWGRAD_RSG',
    comment                  => 'LOW PDOP Priority',    
    parallel_degree_limit_p1 => 1);

-- This is mandatory to avoid ORA-29377: consumer group OTHER_GROUPS is not part of top-plan DOPDOWNGRAD_RPL
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    PLAN             => 'DOPDOWNGRAD_RPL',
    GROUP_OR_SUBPLAN => 'OTHER_GROUPS',
    MGMT_P1          => 10,
    COMMENT          => 'Directive for OTHER_GROUPS (mandatory)');
    
 -- validate the resource plan
  DBMS_RESOURCE_MANAGER.validate_pending_area;
  DBMS_RESOURCE_MANAGER.submit_pending_area();
END;
/

The preceding anonymous PL/SLQ block creates the DOPDOWNGRAD_RPL resource plan, attaches to it the DOPDONWGRAD_RSG consumer group and assigns a directive to this consumer group so that maximum DOP allowed by this consumer group would be 1.

To finish the setup, I will assign the user RDEV to this consumer group and set the DOPDOWNGRAD_RPL plan as the default resource plan:

-- Affect user RDEV to resource group
BEGIN
    DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
    DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
       ATTRIBUTE      => DBMS_RESOURCE_MANAGER.ORACLE_USER,
       VALUE          => 'RDEV',
       CONSUMER_GROUP => 'DOPDONWGRAD_RSG');
	  
   DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
   DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

We are now ready to run again the same query and check whether it has been executed parallelly or it has been pre-empted by the DOPDOWNGRAD_RPL resource plan to use parallelism:

RDEV> alter system set resource_manager_plan='DOPDOWNGRAD_RPL';

System altered.

RDEV> show parameter resource_manager_plan

NAME                    TYPE        VALUE
----------------------- ----------- ----------------
resource_manager_plan   string      DOPDOWNGRAD_RPL

Let’s execute the same query using RDEV user. Remember that this user belongs to an oracle resource manager group with a parallel degree limit directive as shown above:

SQL> SELECT /*+ monitor */ count(1) from t1 where owner >= 'A';

  COUNT(1)
----------
   1000000

RDEV> @xpsimp

SQL_ID  b60wdaqys7wf9, child number 0
-------------------------------------
SELECT /*+ monitor */ count(1) from t1 where owner >= 'A'

Plan hash value: 3110199320
---------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes |   TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |       |       |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |     4 |       |      |            |
|   2 |   PX COORDINATOR       |          |       |       |       |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     4 | Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |     4 | Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  3906K| Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| T1       |  1000K|  3906K| Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("OWNER">='A')

The absence of the Note line at the bottom of the execution plan represents, already, a first indication that the query has not been executed using parallelism. Indeed, this is confirmed via the corresponding SQL Monitoring report:

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  1                   
 Session             :  RDEV (273:24189)    
 SQL ID              :  b60wdaqys7wf9       
 SQL Execution ID    :  16777216            
 Execution Started   :  11/07/2019 09:02:53 
 First Refresh Time  :  11/07/2019 09:02:53 
 Last Refresh Time   :  11/07/2019 09:02:53 
 Duration            :  .414398s            
 Module/Action       :  SQL*Plus/-          
 Service             :  diiqazbx            
 Program             :  sqlplus.exe         
 DOP Downgrade       :  100%        --> here            
 Fetch Calls         :  1   

The ACTIVE type of the same SQL Monitoring report shows the reason:

Other Plan Line Statistics

   Min DOP after downgrade 1
         downgrade reason  352

Thanks to Yasin Baskan, I knew that the description of the downgrade reason codes differs from release to release. We need to use the following query in the corresponding database (12cR2 in the current case) to get the actual significance of each code:

-- +-----------------------------------|
-- query supplied by Yasin Baskan
-- SQL> desc x$qksxa_reason
--		 Name           Type
--		 -------------- ---------------|
--		 ADDR           RAW(8)
--		 INDX           NUMBER
--		 INST_ID        NUMBER
--		 CON_ID         NUMBER
--		 QKSXAREASONS   VARCHAR2(255)
-- +-----------------------------------|
col qksxareasons format a100
col indx         format 999
SELECT
    indx
   ,qksxareasons
FROM
    x$qksxa_reason
WHERE
    qksxareasons like '%DOP downgrade%';   	


INDX QKSXAREASONS
---- -----------------------------------------------------
 351 DOP downgrade due to adaptive DOP
 352 DOP downgrade due to resource manager max DOP
 353 DOP downgrade due to insufficient number of processes
 354 DOP downgrade because slaves failed to join

We finally, have found the real root cause of the simulated performance issue: 352 DOP downgrade due to resource manager max DOP

2. Back to the aim of the article

The first questions that came to my mind when I have started troubleshooting the DOP Downgrade real-life case were:

• Can I find any Oracle Resource manager related information in the gv$sql_monitor view ?
• Can I find historical (AWR) execution statistics of any active Oracle Resource manager actions ?

2.1. v$sql_monitor

As per regards to the first question, hopefully the gv$sql_monitor view contains 4 columns which seem to be strictly related to the Oracle resource manager as shown below with column name starting by RM_:

SQL> @desc gv$sql_monitor

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 INST_ID                                            NUMBER
 KEY                                                NUMBER
 REPORT_ID                                          NUMBER
 STATUS                                             VARCHAR2(19)
 USER#                                              NUMBER
 USERNAME                                           VARCHAR2(128)
../..
 RM_LAST_ACTION                                     VARCHAR2(48)
 RM_LAST_ACTION_REASON                              VARCHAR2(128)
 RM_LAST_ACTION_TIME                                DATE
 RM_CONSUMER_GROUP                                  VARCHAR2(128)

Let’s see then what Oracle has stored in those four RM_ columns:

RDEV> SELECT
 decode(process_name, 'ora', 'QC', process_name) as process_name
 ,rm_consumer_group
 ,rm_last_action
 ,rm_last_action_reason
 ,rm_last_action_time
     FROM
        gv$sql_monitor
    WHERE 
        sql_id = 'b60wdaqys7wf9';

PROCES RM_CONSUMER_GROUP RM_LAST_ACTION RM_LAST_ACTION_REASO RM_LAST_ACTION_TIME
------ ----------------- -------------- -------------------- --------------------
QC     DOPDONWGRAD_RSG   {null}               {null}               {null}

Unfortunately, it looks like only the column RM_CONSUMER_GROUP is filled up by Oracle. The remaining RM_ columns are always null.

However, it looks like that whenever the column RM_CONSUMER_GROUP contains a value which is not equal to OTHER_GROUPS (like in the current example where RM_CONSUMER_GROUP= DOPDONWGRAD_RSG), we can infer that the current monitored query
is subject to a resource manager directive which might influence its execution.

For example, let’s execute the same query when connected with a different user (RDEV2) and see what information will be stored in the gv$sql_monitor view:

RDEV2> SELECT /*+ monitor */ count(1) from t1 where owner >= 'A'

COUNT(1)
----------
   1000000

SQL_ID  b60wdaqys7wf9, child number 2
-------------------------------------
SELECT /*+ monitor */ count(1) from t1 where owner >= 'A'
Plan hash value: 3110199320
--------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |       |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |        |      |            |
|   2 |   PX COORDINATOR       |          |       |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |  1000K|  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| T1       |  1000K|  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter("OWNER">='A')

Note
-----
   - Degree of Parallelism is 4 because of table property

RDEV2> SELECT  
   username
  ,decode(process_name, 'ora', 'QC', process_name) as process_name
  ,rm_consumer_group
  ,rm_last_action
  ,rm_last_action_reason
  ,rm_last_action_time
      FROM
        gv$sql_monitor
      WHERE 
        sql_id = 'b60wdaqys7wf9'
order by last_refresh_time;

USERNAME  PROCES  RM_CONSUMER_GROUP  RM_LAST_ACTION  RM_LAST_ACTION_REASO  RM_LAST_ACTION_TIME
-------- -------- ----------------- ---------------- -------------------- --------------------
RDEV      QC     DOPDONWGRAD_RSG    {null}           {null}                {null}
RDEV2     QC     OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p002   OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p000   OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p003   OTHER_GROUPS       {null}           {null}                {null}                 
{null}    p001   OTHER_GROUPS       {null}           {null}                {null}     

When the same query is executed by user RDEV2 there is no DOP downgrade because this user doesn’t belong to any user created resource manager group. There are two information backing this claim:

• User RDEV2 has OTHER_GROUPS as a default consumer group
• There is 4 PX server associated with the QC of user RDEV2

3.2. AWR and Oracle Resource manager
The second investigation was to see whether I can, a posteriori, diagnose that my query underwent a resource manager directive; particularly those concerning parallel degree limit. For that reason, I wrote the following script

SQL> @getConsGrpHist
SELECT
   begin_time
  ,execs 
  ,avg_etime
  ,avg_px
 -- ,trunc(avg_etime/decode(avg_px,0,1,avg_px) ,2) avg_px_time
  ,plan_name
  ,cpu_managed
  ,parallel   
FROM
   (SELECT 	   
        sn.begin_interval_time      begin_time
       ,st.executions_delta         execs 
       ,st.end_of_fetch_count_delta end_of_fetch 
       ,trunc(st.elapsed_time_delta/1e6/decode(st.executions_delta, 0, 1,st.executions_delta))   avg_etime
       ,round(st.px_servers_execs_delta/decode(st.executions_delta,0,1, st.executions_delta), 1) avg_px	  
       ,sr.plan_name
       ,sr.cpu_managed
       ,sr.parallel_execution_managed parallel    
    FROM 
         dba_hist_rsrc_plan sr
        ,dba_hist_snapshot  sn
        ,dba_hist_sqlstat   st
    WHERE
	sr.snap_id = sn.snap_id
    AND sn.snap_id = st.snap_id
    AND sn.begin_interval_time between to_date('&from_date', 'ddmmyyyy hh24:mi:ss')
				      and    to_date('&to_date'  , 'ddmmyyyy hh24:mi:ss')
    AND st.sql_id ='&sql_id'
    AND plan_name != 'DEFAULT_MAINTENANCE_PLAN'	
	) 
order by begin_time asc;      

SQL> @GetConsGrpHist
Enter value for from_date: 08112019 15:07:00
Enter value for to_date: 08112019 15:09:00
Enter value for sql_id: b60wdaqys7wf9

BEGIN_TIME                 EXECS  AVG_ETIME  AVG_PX PLAN_NAME       CPU_MANAGED PARALLEL
------------------------- ------ ---------- ------- --------------- ----------- ----------
08-NOV-19 03.07.08.369 PM      1          0       0 DOPDOWNGRAD_RPL ON          FULL
08-NOV-19 03.07.08.369 PM      1          0       0 DOPDOWNGRAD_RPL ON          FULL

Indeed, I can see that my query has used a non-default resource plan (DOPDOWNGRAD_RPL) in which a parallel statement queuing is enabled as indicated by the FULL value of the PARALLEL column.

Bottom-line

In this simple Note I outlined two main points:

• Show how to identify a DOP downgrade and its reason
• Show how to link a non-default Oracle resource manager to a SQL_ID by checking the value of the column RM_CONSUMER_GROUP from the gv$sql_monitor view
• Show that the remaining 3 resource manager columns (RM_LAST_ACTION, RM_LAST_ACTION_REASON, RM_LAST_ACTION_TIME) in the gv$sql_monitor view are, unfortunately, not maintained by Oracle
• Show how to look in AWR historical execution for queries subject to user defined Oracle Resource Manager directives

1 Comment »

  1. Great Mohamed as usual…very helpfull.

    Comment by Yahya — November 30, 2019 @ 1:18 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud Oracle's blog

Just another Oracle blog : Database topics and techniques

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

%d bloggers like this: