Mohamed Houri’s Oracle Notes

September 28, 2016

AWR and superfluous historical statistics

Filed under: ASH — hourim @ 6:11 pm

An active sql_id is subject to being sampled and loaded into ASH (gv$active_session_history). As long as this sql_id is present in ASH, each of its samples has a 1-in-10 chance of being persisted into the AWR historical tables as well. In addition to the ASH view gv$active_session_history and its AWR alter ego dba_hist_active_sess_history, performance and tuning specialists make extensive use of the dba_hist_sqlstat table, into which execution statistics from gv$sql are periodically flushed by the MMON background process at each AWR snapshot.
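As a side note, this 1-in-10 sampling can be observed directly: gv$active_session_history exposes an is_awr_sample flag marking the samples that are candidates for persistence into dba_hist_active_sess_history. A minimal sketch (the sql_id filter is only an example):

SELECT is_awr_sample, COUNT(*) samples
FROM   gv$active_session_history
WHERE  sql_id = '&sql_id'
GROUP  BY is_awr_sample;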

One of the queries I use against this table is the following one, which I refer to as histstats.sql (I think I originally borrowed it from the internet):

SELECT
  sn.snap_id,
  plan_hash_value,
  st.sql_profile,
  executions_delta execs,
  TRUNC(elapsed_time_delta/1e6/DECODE(executions_delta, 0, 1, executions_delta)) avg_etime,
  ROUND(disk_reads_delta     /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_pio,
  ROUND(buffer_gets_delta    /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_lio,
  ROUND(rows_processed_delta /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_rows
FROM
  dba_hist_sqlstat st,
  dba_hist_snapshot sn
WHERE st.snap_id        = sn.snap_id
AND sql_id              = '&sql_id'
AND begin_interval_time >= to_date('&date','ddmmyyyy')
ORDER BY 1 ASC;
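Note in passing that on a RAC database both dba_hist_snapshot and dba_hist_sqlstat hold one row per instance, so a join on snap_id alone returns each statistics row once per instance snapshot row; this most likely explains the doubled lines in the outputs below. A minimal sketch of the extra join predicate (an assumption on my side, not part of the original script):

WHERE st.snap_id          = sn.snap_id
AND   st.instance_number  = sn.instance_number  -- extra predicate for RAC
AND   sql_id              = '&sql_id'
AND   begin_interval_time >= to_date('&date','ddmmyyyy')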

That said, how would you read and interpret the following output of the above query, taken from a running system for a particular sql_id?

SNAP_ID     PLAN_HASH_VALUE SQL_PROFILE                     EXECS AVG_ETIME AVG_LIO 
----------  --------------- ------------------------------ ----- ---------- ----------      
30838        726354567                                      0       7227    3945460      
30838        726354567                                      0       7227    3945460      
30839       4211030025      prf_2yhzvghb06vh4_4211030025    1          3      28715      
30839       4211030025      prf_2yhzvghb06vh4_4211030025    1          3      28715      
30839        726354567                                      0       7140    5219336      
30839        726354567                                      0       7140    5219336 
30840        726354567                                      0       7203    9389840      
30840        726354567                                      0       7203    9389840      
30840       4211030025      prf_2yhzvghb06vh4_4211030025    0       2817    7831649      
30840       4211030025      prf_2yhzvghb06vh4_4211030025    0       2817    7831649      
30841        726354567                                      0       7192    5200201      
30841        726354567                                      0       7192    5200201      
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            
30842       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0      
30842       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0      
30842        726354567                                      0       4956    3183667      
30842        726354567                                      0       4956    3183667  

Or, to make things less confusing, how would you interpret the different rows of snapshot 30841, reproduced below?

SNAP_ID     PLAN_HASH_VALUE SQL_PROFILE                     EXECS AVG_ETIME AVG_LIO 
----------  --------------- ------------------------------ ----- ---------- ----------      
30841        726354567                                      0       7192    5200201      
30841        726354567                                      0       7192    5200201      
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            

What do those zero values in the lines with a non-null SQL Profile actually mean?

The answer to this question resides in the way Oracle dumps the statistics of a SQL statement from memory into the AWR tables. When an active sql_id has multiple child cursors in gv$sql, MMON will (provided the sql_id qualifies for capture, of course) dump into dba_hist_sqlstat the statistics of all its distinct child cursors, including those not used by the sql_id at capture time (even the statistics of non-shareable cursors are dumped, provided those cursors are still present in gv$sql at the capture moment).

For example, in the above list you can see four rows of execution statistics for snapshot 30841, of which the first two genuinely correspond to executions using plan_hash_value 726354567. Judging by their elapsed time (7192 seconds), these executions span multiple snapshots, which is why the EXECS column shows 0 (this is a serial execution plan, by the way). The remaining lines, those showing a SQL Profile, are in fact superfluous and confusing. They have been captured only because a second shareable child cursor (plan_hash_value 4211030025) was present in gv$sql at the moment of the capture (snapshot 30841). One indication that they are superfluous is the zeros in their average logical I/O, physical I/O, executions and number of rows processed.
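One way to convince yourself that the plan carrying the SQL Profile was not actually running during that interval is to cross-check the ASH history, which records the plan hash value of every sample. A minimal sketch (snapshot id to be adapted):

SELECT snap_id, sql_plan_hash_value, COUNT(*) samples
FROM   dba_hist_active_sess_history
WHERE  sql_id  = '&sql_id'
AND    snap_id = 30841
GROUP  BY snap_id, sql_plan_hash_value;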

Now that the scene has been set, we need a reproducible example. For that purpose, any SQL statement having multiple child cursors in gv$sql will do the trick. An easy way of engineering such a case (among other possible scenarios) is to use Adaptive Cursor Sharing, here in 12.1.0.2.0:

The model

-- t1 is assumed to be a simple two-column table matching the insert below
CREATE TABLE t1 (n1 NUMBER, n2 NUMBER);

INSERT INTO t1
SELECT level n1 ,
  CASE
    WHEN level = 1
    THEN 1
    WHEN level > 1
    AND level <= 101
    THEN 100
    WHEN level > 101
    AND level <= 1101
    THEN 1000
    WHEN level > 10001
    AND level <= 11000
    THEN 10000
    ELSE 1000000
  END n2
FROM dual
  CONNECT BY level < 1200150;
  
CREATE INDEX t1_i1 ON t1(n2);

BEGIN
  dbms_stats.gather_table_stats 
         (user 
         ,'t1' 
         ,method_opt => 'for all columns size auto'
         ,cascade => true 
         ,estimate_percent => dbms_stats.auto_sample_size 
         );
END;
/ 
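A quick sanity check of the resulting data skew on the n2 column, for illustration only; the selectivity of the n2 <= :ln2 predicate will vary widely with the bind value, which is precisely what we need to exercise Adaptive Cursor Sharing:

-- very different row counts per distinct n2 value
SELECT n2, COUNT(*) row_count
FROM   t1
GROUP  BY n2
ORDER  BY n2;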

Generating multiple child cursors using ACS

var ln2 number;
exec :ln2 := 100;

SQL> select count(1) FROm t1 WHERE n2 <= :ln2;

  COUNT(1)
----------
       101

exec :ln2 := 1000000
SQL> select count(1) FROm t1 WHERE n2 <= :ln2;

  COUNT(1)
----------
   1200149

SQL_ID  9sp9wvczrvpty, child number 0
-------------------------------------
select count(1) FROm t1 WHERE n2 <= :ln2

Plan hash value: 2432955788

---------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes |
---------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |
|   1 |  SORT AGGREGATE   |       |     1 |     3 |
|*  2 |   INDEX RANGE SCAN| T1_I1 |   218 |   654 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"<=:LN2)

In order for the above sql_id to be bind aware and get a new optimal execution plan at each execution, it needs to be executed once again (more details can be found here):

SQL> select count(1) FROm t1 WHERE n2 <= :ln2;

  COUNT(1)
----------
   1200149

SQL_ID  9sp9wvczrvpty, child number 1
-------------------------------------
select count(1) FROm t1 WHERE n2 <= :ln2

Plan hash value: 3724264953
----------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |
----------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |
|*  2 |   TABLE ACCESS FULL| T1   |  1191K|  3489K|
----------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("N2"<=:LN2)
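Before going further, we can check the Adaptive Cursor Sharing status of the two child cursors; a minimal sketch relying on the ACS flag columns of gv$sql:

SELECT child_number,
       plan_hash_value,
       is_bind_sensitive,
       is_bind_aware,
       is_shareable
FROM   gv$sql
WHERE  sql_id = '9sp9wvczrvpty'
ORDER  BY child_number;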

Notice now that a new child cursor (n° 1) has been generated, while the existing child cursor n° 0 has switched to a non-shareable status. In order to wake that plan up again, we need to run the same query with the corresponding bind variable value:

exec :ln2 := 100;

select count(1) FROm t1 WHERE n2 <= :ln2;

SQL_ID  9sp9wvczrvpty, child number 2
-------------------------------------
select count(1) FROm t1 WHERE n2 <= :ln2

Plan hash value: 2432955788
---------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes |
---------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |
|   1 |  SORT AGGREGATE   |       |     1 |     3 |
|*  2 |   INDEX RANGE SCAN| T1_I1 |   218 |   654 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"<=:LN2)

Finally, we have achieved the initial goal of having multiple execution plans in gv$sql for the above SQL statement:

SELECT 
  sql_id ,
  child_number ,
  first_load_time ,
  last_load_time ,
  plan_hash_value ,
  executions execs ,
  rows_processed ,
  trunc(buffer_gets/executions) lio ,
  is_shareable
FROM gv$sql
WHERE sql_id = '9sp9wvczrvpty' 
--and is_shareable = 'Y'    
order by last_load_time desc; 

SQL_ID        CHILD_NUMBER FIRST_LOAD_TIME     LAST_LOAD_TIME      PLAN_HASH_VALUE  EXECS ROWS_PROCESSED        LIO I
------------- ------------ ------------------- ------------------- --------------- ------ -------------- ---------- -
9sp9wvczrvpty            0 2016-09-28/07:19:21 2016-09-28/07:19:21      2432955788      2              2       1225 N
9sp9wvczrvpty            1 2016-09-28/07:19:21 2016-09-28/07:20:44      3724264953      1              1       2222 Y
9sp9wvczrvpty            2 2016-09-28/07:19:21 2016-09-28/07:22:03      2432955788      1              1          3 Y

Notice that the sql_id (9sp9wvczrvpty) now has three child cursors, of which child cursor 0 is no longer shareable. Nothing has been sent to the AWR tables yet, as shown below:

SQL>  @histstats
Enter value for sql_id: 9sp9wvczrvpty
Enter value for date: 27092016

no rows selected
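Bear in mind that an AWR snapshot normally captures only the top resource-consuming statements, so a cheap sql_id may never qualify on its own. If you want to be certain that a given statement is captured at every snapshot, it can be "colored"; this is just an aside, assuming the dbms_workload_repository.add_colored_sql procedure available since 11g:

-- force AWR to capture this sql_id at every snapshot,
-- regardless of whether it ranks among the top consumers
exec dbms_workload_repository.add_colored_sql(sql_id => '9sp9wvczrvpty');

-- and to revert to the default behaviour later
exec dbms_workload_repository.remove_colored_sql(sql_id => '9sp9wvczrvpty');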

Forcing AWR to dump SQL statistics

In order to make Oracle dump the statistics of this SQL statement, we are going to force a manual AWR snapshot by means of the following package call:

SQL> exec dbms_workload_repository.create_snapshot;  

The call creates a snapshot and dumps the above content of gv$sql into the dba_hist_sqlstat table, as shown below:

SQL>  @histstats
Enter value for sql_id: 9sp9wvczrvpty
Enter value for date: 27092016

   SNAP_ID SQL_ID        PLAN_HASH_VALUE SQL_PROFILE                     EXECS  AVG_ETIME    AVG_LIO  AVG_PIO  AVG_ROWS
---------- ------------- --------------- ------------------------------ ------ ---------- ---------- -------- ---------
      4305 9sp9wvczrvpty      2432955788                                     3          0        817        0         1
      4305 9sp9wvczrvpty      3724264953                                     1          0       2222        0         1

That is what was expected: Oracle has captured into the dba_hist_sqlstat table 3 executions of plan_hash_value 2432955788 and 1 execution of plan_hash_value 3724264953.

Now that we have multiple child cursors present in gv$sql, if AWR captures the sql_id it will dump all of its child cursors, even those not used during the captured interval. For example, if I run the same query again with the bind variable ln2 = 1000000, Oracle will use the plan with plan_hash_value 3724264953; but it will nevertheless dump the statistics of the other, unused plan_hash_value into dba_hist_sqlstat as well.

exec :ln2 := 1000000

select count(1) FROm t1 WHERE n2 <= :ln2;

exec dbms_workload_repository.create_snapshot;

SQL> @histstats
Enter value for sql_id: 9sp9wvczrvpty
Enter value for date: 27092016

SNAP_ID SQL_ID        PLAN_HASH_VALUE SQL_PROFILE    EXECS  AVG_ETIME    AVG_LIO  AVG_PIO  AVG_ROWS
----- ------------- --------------- ------------- ------ ---------- ---------- -------- ---------
 4305 9sp9wvczrvpty      2432955788                    3          0        817        0         1
 4305 9sp9wvczrvpty      3724264953                    1          0       2222        0         1
 
 4306 9sp9wvczrvpty      2432955788                    0          0          0        0         0
 4306 9sp9wvczrvpty      3724264953                    1          0       2222        0         1

That's it. Notice how, as expected, Oracle has dumped two rows for the new snapshot 4306, of which the first one (plan_hash_value 2432955788) is superfluous: it has been captured in the AWR tables only because it was present in gv$sql at the snapshot capture time.

Now that we know Oracle dumps every plan_hash_value present in gv$sql at snapshot capture time, even those not used during the snapshot interval, I have added a minor where clause to the above, extensively used, script so that the superfluous lines no longer appear in the historical execution statistics:

SELECT
  *
FROM
  (SELECT
     sn.snap_id,
     plan_hash_value,
     sql_profile,
     executions_delta execs,
     TRUNC(elapsed_time_delta/1e6/DECODE(executions_delta, 0, 1, executions_delta)) avg_etime,
     ROUND(disk_reads_delta      /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_pio,
     ROUND(buffer_gets_delta     /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_lio,
     ROUND(px_servers_execs_delta/DECODE(executions_delta, 0, 1, executions_delta), 1) avg_px,
     ROUND(rows_processed_delta  /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_rows
   FROM
     dba_hist_sqlstat st,
     dba_hist_snapshot sn
   WHERE st.snap_id        = sn.snap_id
   AND sql_id              = '&sql_id'
   AND begin_interval_time > to_date('&from_date','ddmmyyyy')
  )
WHERE avg_lio != 0 -- added clause
ORDER BY 1 ASC;

SQL> @histstats2
Enter value for sql_id: 9sp9wvczrvpty
Enter value for from_date: 27092016

SNAP_ID PLAN_HASH_VALUE SQL_PROFILE  EXECS  AVG_ETIME  AVG_PIO    AVG_LIO     AVG_PX  AVG_ROWS
------- --------------- ----------- ------ ---------- -------- ---------- ---------- ---------
   4305      2432955788                  3          0        0      817.7          0         1
   4305      3724264953                  1          0        0       2222          0         1

   4306      3724264953                  1          0        0       2222          0         1

The added where clause excludes the rows showing zero average logical I/O.

But wait a moment!!!

This might, at the same time, exclude vital information from the historical statistics. For example, the following situation will not consume any logical I/O but can spend a lot of time blocked, waiting for a resource to be released:

SQL-1 > select * from t1 where rownum <=  3 for update;

        N1         N2
---------- ----------
      1825    1000000
      1826    1000000
      1827    1000000

SQL-2> lock table t1 in exclusive mode;  

The second session will obviously hang. Wait a couple of seconds, go back to session n° 1, issue a rollback, and watch the new situation of session n° 2:

SQL-1> rollback;

SQL-2> lock table t1 in exclusive mode;

Table(s) Locked.

All that remains is to take a manual snapshot and check how many logical I/Os have been consumed by the SQL statement that tried to lock the table in exclusive mode:

SQL> exec dbms_workload_repository.create_snapshot;

select sql_id
from gv$sql where sql_text like '%lock table t1%'
and sql_text not like '%v$sql%';

SQL_ID
-------------
a9nb52un8wqf4

SQL> @histstats
Enter value for sql_id: a9nb52un8wqf4
Enter value for date: 27092016

SNAP_ID SQL_ID        PLAN_HASH_VALUE SQL_PROFILE  EXECS  AVG_ETIME    AVG_LIO  AVG_PIO  AVG_ROWS
------- ------------- --------------- ----------- ------ ---------- ---------- -------- ---------
   4308 a9nb52un8wqf4               0                  1        159          0        0         0

That's how a SQL statement can take a long time to complete while consuming 0 logical I/O. This is why I have added a second where clause to the histstats2 script, as shown below:

SELECT
  *
FROM
  (SELECT
     sn.snap_id,
     plan_hash_value,
     executions_delta execs,
     TRUNC(elapsed_time_delta/1e6/DECODE(executions_delta, 0, 1, executions_delta)) avg_etime,
     ROUND(disk_reads_delta      /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_pio,
     ROUND(buffer_gets_delta     /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_lio,
     ROUND(px_servers_execs_delta/DECODE(executions_delta, 0, 1, executions_delta), 1) avg_px,
     ROUND(rows_processed_delta  /DECODE(executions_delta, 0, 1, executions_delta), 1) avg_rows
   FROM
     dba_hist_sqlstat st,
     dba_hist_snapshot sn
   WHERE st.snap_id        = sn.snap_id
   AND sql_id              = '&sql_id'
   AND begin_interval_time > to_date('&from_date','ddmmyyyy')
  )
WHERE avg_lio != 0
   OR (avg_lio = 0 AND avg_etime > 0)
ORDER BY 1 ASC;

This new script (histstats3.sql), when applied to the two sql_ids we have investigated in this post, gives respectively:

SQL> @HistStats3
Enter value for sql_id: 9sp9wvczrvpty
Enter value for from_date: 27092016

   SNAP_ID PLAN_HASH_VALUE  EXECS  AVG_ETIME  AVG_PIO    AVG_LIO     AVG_PX  AVG_ROWS
---------- --------------- ------ ---------- -------- ---------- ---------- ---------
      4305      2432955788      3          0        0      817.7          0         1
      4305      3724264953      1          0        0       2222          0         1
      4306      3724264953      1          0        0       2222          0         1

SQL> @HistStats3
Enter value for sql_id: a9nb52un8wqf4
Enter value for from_date: 27092016

   SNAP_ID PLAN_HASH_VALUE  EXECS  AVG_ETIME  AVG_PIO    AVG_LIO     AVG_PX  AVG_ROWS
---------- --------------- ------ ---------- -------- ---------- ---------- ---------
      4308               0      1        159        0          0          0         0

It goes without saying that I would be very grateful for feedback about the added where clause. I might have neglected a situation in which both logical I/O and elapsed time are not greater than 0 but which nevertheless deserves to appear in the historical statistics.
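One possible variant, just a sketch and not necessarily better, would be to filter in the inner query on the raw delta columns rather than on the derived averages, keeping any row where at least one of the deltas is non-zero:

-- e.g. added to the inner query in place of the outer filter
AND (executions_delta      > 0
  OR elapsed_time_delta    > 0
  OR buffer_gets_delta     > 0
  OR rows_processed_delta  > 0)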

Last but not least, you might already have noticed the appearance of the AVG_PX column in the above output. That will be the subject of the next article, about ASH/AWR statistics for parallel queries. Stay tuned: I have a nice example from a running system to share with you, where this column proves to be very helpful. I may also show a version of the script that includes the end_of_fetch_count column, which also turns out to be valuable information when trying to understand the average elapsed time of queries coming from a web service where a timeout has been implemented.

1 Comment »

  1. Hi Mohamed,

    It seems the HistStats query needs a join on instance_number in order to work properly on RAC.

    Good post!

    Regards,

    Joaquín Gonzalez

    Comment by Joaquín González — September 29, 2016 @ 6:31 am

