August | 2018 | Mohamed Houri’s Oracle Notes

August 17, 2018

Wrong utilisation of Result Cache

Filed under: result cache — hourim @ 5:43 pm

Still, not everyone is so in love with the Oracle result cache feature. Franck Pachot has already explained why here and here. In this article I will show you a practical real life case where result cache has been wrongly used. A couple of weeks ago I had to look at an annoying sporadic 5 to 10 min spike occurring randomly during the day. A simple ASH query during the spike shows high CPU consumption coupled with a latch free wait event popping up at the TOP. Using ASH again I found that almost all latch free wait events were due to the following TOP query:

1. Real life case

SQL> select sql_id, count(1)
    from gv$active_session_history
    where
        sample_time between to_date('03082018 10:00:00', 'ddmmyyyy hh24:mi:ss')
                    and     to_date('03082018 11:00:00', 'ddmmyyyy hh24:mi:ss')
    and event = 'latch free'
    group by sql_id
    order by 2 desc;

SQL_ID            COUNT(1)
--------------- ----------
3djqkyz0taafr           77 -->
                        15
9fxqwts01uzzw            6
43sh58j74rr0a            2
b7qrux84y1hna            1
dr3s0wjmn5h2s            1
g4k4tt8uh1tv6            1

The corresponding SQL monitoring report outlines the evidence:

SQL Text
------------------------------
SELECT
  /*+ RESULT_CACHE */
  {list_of_columns}
FROM table
WHERE 
  {list_of_predicates}
ORDER BY col1;

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  1                   
 SQL ID              :  3djqkyz0taafr       
 SQL Execution ID    :  21511509            
 Execution Started   :  08/01/2018 10:48:03 
 First Refresh Time  :  08/01/2018 10:48:15 
 Last Refresh Time   :  08/01/2018 10:48:21 
 Duration            :  18s                  
 Fetch Calls         :  1                

Global Stats
===========================================================================
| Elapsed |   Cpu   |    IO    |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
===========================================================================
|      18 |    0.05 |     0.86 |       17 |     1 |   1656 |  654 |   5MB |
===========================================================================

SQL Plan Monitoring Details (Plan Hash Value=289067707)
==========================================================================================================
| Id |            Operation            |            Name            | Execs |   Rows   | Activity Detail |
|    |                                 |                            |       | (Actual) |   (# samples)   |
==========================================================================================================
|  0 | SELECT STATEMENT                |                            |     1 |        0 |                 |
|  1 |   RESULT CACHE                  | gv2j746786jamb4s329xj0ur8g |     1 |        0 | latch free (18) |
|  2 |    FILTER                       |                            |     1 |     1540 |                 |
|  3 |     TABLE ACCESS BY INDEX ROWID | HIST_TABLE                 |     1 |     1540 |                 |
|  4 |      INDEX RANGE SCAN           | HIST_IDX                   |     1 |     1540 |                 |
==========================================================================================================

As you can see out of the 18 seconds of the total query response time, 18 were due to the wait event latch free. This latch free itself is caused by what seems to be a wrong utilisation of the result cache hint.

Of course that was the ideal moment to use the so-famous Tanel Poder latchprof.sql script, which, when launched, shows the following picture:

SQL> @latchprof sid,name,sqlid % "Result" 100000

-- LatchProf 2.02 by Tanel Poder ( http://www.tanelpoder.com )

       SID NAME                     SQLID               Held       Gets  Held % 
---------- ------------------------ ------------- ---------- ---------- ------- 
      1753 Result Cache: RC Latch   3djqkyz0taafr        248         62     .25 
       298 Result Cache: RC Latch   3djqkyz0taafr        151         45     .15 
        35 Result Cache: RC Latch   3djqkyz0taafr        148         32     .15 
      3671 Result Cache: RC Latch   3djqkyz0taafr        136         35     .14 
      2681 Result Cache: RC Latch   3djqkyz0taafr        130         34     .13 
       273 Result Cache: RC Latch   3djqkyz0taafr         94         18     .09 
      1238 Result Cache: RC Latch   6hbsjju24n8d1         56         56     .06

Not only did the latchprof.sql succeed to show 7 different sessions waiting on Result Cache: RC Latch wait event; but it outlined also that all those sessions were executing the same query already spotted above via ASH and SQL monitor report.

Close on the heels of the above latchprof.sql script, I executed the following query to check the content of result cache object view:

SQL> select
         to_char(dbms_sqltune.sqltext_to_signature(name, 1))
        , count(1)
     from
        gv$result_cache_objects
     group by
        to_char(dbms_sqltune.sqltext_to_signature(name, 1))
     having count(1) > 3
     order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
11688800737312996943                         106782 --> this is a portion of 3djqkyz0taafr
778588378770889215                              879
9760132213098432565                              62
13511637134398334555                              7
10994613278769629249                              7
13666841246362081009                              6
2234831540847838164                               5
16412641633620715561                              4

8 rows selected.

I don’t think that it is normal to see so many rows (106,782) for the same force matched cache_id into the result cache object memory. This is rather an indication of something trashing the result cache.

Another clue about this miss-utilisation of the result cache resides in the high number of invalidation for the same force matched name as shown below:

select 
    status
  , count(1)
  from
    gv$result_cache_objects
where 
   to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = '11688800737312996943'
  group by status;
  
  STATUS      COUNT(1)
--------- ----------
Invalid        77583
Published      30185
New	           1

At this stage of the investigation, there was enough low-level evidence to have some cause for concern about the result cache wrong utilisation done by this query. This is why I asked the developer to get rid of the

 /*+ result_cache */

hint, tracked back the new sql_id and confirmed the disappearance of the Result Cache: RC Latch wait event.

2. General case

For the sake of completeness I will try to show, below, how attractive the result cache pool looks like when it is very well used and when it isn’t:

exec dbms_result_cache.flush();

alter session set cursor_sharing=force;

declare
    x number;
begin
    for i in 1..&1 loop
    	execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
    end loop;
end;
/		
Enter value for 1: 100

PL/SQL procedure successfully completed.

All what I did is:

Flush the result cache memory
Force the cursor sharing in order to use system bind variable
Execute 100 times an exactly identical query using the same bind variable value at each execution

And here’s below how the result cache memory looks like in this particular ideal case:

 select
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
   ,count(1)
 from
   gv$result_cache_objects
 where
  type != 'Dependency'
 group by
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
 having count(1) >= 1
 order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
9830716796482200263                               1

select 
    status
  , scan_count 
  , count(1)
  from
    gv$result_cache_objects
where 
   to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = ‘9830716796482200263’
and type != 'Dependency'
group by 
    status
   ,scan_count
order by 3 desc
    ;    

STATUS    SCAN_COUNT   COUNT(1)
--------- ---------- ----------
Published         99          1

The presence of a single row in gv$result_cache_objects is a clear indication of the result cache being very well reused. The scan_count figure indicates that out of 100 executions we’ve initiated 99 scans in the cached result. This is a perfect use of the result cache feature.

But let’s now disturb a little bit this ideal setup by changing the above PL/SQL anonymous block so that cached result will not be always found and re-used because of different bind variable values used for the last 30 executions:

exec dbms_result_cache.flush();

declare
    x number;
begin
    for i in 1..&1 loop
     if i <= 70 then execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     else
       execute immediate 'select /*+ result_cache */ mgr from emp where sal >= '||2*i          
                  || 'and rownum=1' into x;	

     end if;
    end loop;
end;
/		
Enter value for 1: 100

PL/SQL procedure successfully completed.

And here’s below how the result cache memory looks like now:

 select
     to_char(dbms_sqltune.sqltext_to_signature(name, 1))
    ,count(1)
 from
    gv$result_cache_objects
 where
  type != 'Dependency'
 group by
    to_char(dbms_sqltune.sqltext_to_signature(name, 1))
 having count(1) >= 1
 order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
9830716796482200263                              31

select 
    status
  , scan_count 
  , count(1)
  from
    gv$result_cache_objects
where 
   to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = '9830716796482200263'
and type != 'Dependency'
group by 
    status
   ,scan_count
order by 3 desc;    

STATUS    SCAN_COUNT   COUNT(1)
--------- ---------- ----------
Published          0         30
Published         69          1

Since I have used 30 different bind variable values for the last 30 executions I have then failed to initiate any scan count into the result cache for these particular executions. However the first 70 executions have perfectly re-used the result cache 69 times (the first execution is the one that stored the result cache).

You might have probably already guessed that we can fail reusing a stored result cache even in the absence of any dependent object invalidation. It suffices, as shown via the above 30 particular query executions, to use a different bind variable value to produce a new not already cached result. But let’s now introduce a couple of invalidations in the above setup:

exec dbms_result_cache.flush();

declare
    x number;
begin
    for i in 1..&1 loop
     if i <= 50 then execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     elsif mod(i,3) = 0 then
      execute immediate 'alter table emp modify (sal default 42)';
      execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     else
      execute immediate 'select /*+ result_cache */ mgr from emp where sal >= 1000          
                  and rownum=1' into x;	
     end if;
    end loop;
end;
/	
Enter value for 1: 100

PL/SQL procedure successfully completed.

All what I did above is launch a query 100 times with:

The first 50 executions using the same bind variable value and thereby using the result cache
The next 50 executions flip flopping between reusing a result cache and failing to reuse it because of a DDL on the result cache dependent table

select
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
   ,count(1)
from
   gv$result_cache_objects
where
   type != 'Dependency'
group by
   to_char(dbms_sqltune.sqltext_to_signature(name, 1))
   having count(1) >= 1
order by 2 desc;

TO_CHAR(DBMS_SQLTUNE.SQLTEXT_TO_SIGNATUR   COUNT(1)
---------------------------------------- ----------
9830716796482200263                              18

SQL> @getRCStatus.sql
Enter value for force_match_sign: 9830716796482200263

STATUS                    SCAN_COUNT        CNT
------------------------- ---------- ----------
Invalid                           49          1
Invalid                            2         16
Published                          1          1
                                     ----------
Total cache creation                         18

The output of the query above shows that the 100 executions have been decomposed as follows:

1 execution (the first one) + 49 hits of the result cache for the next 49 executions
16 invalidations with scan count = 2 meaning that 32 executions hit the cache before being invalidated
And since there are 18 rows (16 +1+1) in the result cache this means that they inevitably correspond to 18 result cache creations : the first execution and the 17 other one following each of the 17 invalidations.

And here we are done: 1+ 49 + 32 + 18 make the 100 executions.

Oracle gives also a way to retrieve statistics on the cache memory usage via the following package call:

SQL> exec dbms_result_cache.Memory_Report
R e s u l t   C a c h e   M e m o r y   R e p o r t
[Parameters]
Block Size          = 1K bytes
Maximum Cache Size  = 10496K bytes (10496 blocks)
Maximum Result Size = 524K bytes (524 blocks)
[Memory]
Total Memory = 205320 bytes [0.016% of the Shared Pool]
... Fixed Memory = 12592 bytes [0.001% of the Shared Pool]
... Dynamic Memory = 192728 bytes [0.015% of the Shared Pool]
....... Overhead = 159960 bytes
....... Cache Memory = 32K bytes (32 blocks)
........... Unused Memory = 13 blocks
........... Used Memory = 19 blocks
............... Dependencies = 1 blocks (1 count)
............... Results = 18 blocks
................... SQL     = 1 blocks (1 count)
................... Invalid = 17 blocks (17 count)

PL/SQL procedure successfully completed.

This output doesn’t say anything else then what the following query lists:

select 
   status 
  ,count(*) 
  ,round(max(scan_count)) max_scan_cnt
  ,round(sum(block_count)) tot_blk_cnt
from 
  gv$result_cache_objects
where
  to_char(dbms_sqltune.sqltext_to_signature(name, 1)) = '9830716796482200263'
and type = 'Result'
group by 
  status; 

STATUS      COUNT(*) MAX_SCAN_CNT TOT_BLK_CNT
--------- ---------- ------------ -----------
Invalid           17           49          17
Published          1            1           1

There are 18 result cache creations : Results = 18 blocks
One result cache has been created and published : SQL = 1 blocks (1 count)
17 result caches have been invalidated : Invalid = 17 blocks (17 count)

3. Summary

The SQL query result cache can introduce a serious performance threat if not correctly used. In the same vein as you want to parse once and execute many, you would also like to store a result cache once and reuse it many times. To do so you need to take care that the query you intend to cache the result will almost always return the same result (use the same bind variable values). You will also take care to keep the result in the result cache pool valid as long as possible. To do so you will try to avoid DMLing or DDLing the SQL query dependent objects.

Mohamed Houri’s Oracle Notes

August 17, 2018

Wrong utilisation of Result Cache

Follow Blog via Email

Categories

Blog Stats

Blogs I Follow

Archives