Mohamed Houri’s Oracle Notes

December 21, 2020

Disk file operations I/O

Filed under: Oracle — hourim @ 9:58 am

When Disk file operations I/O wait event starts showing at the top of the AWR Top 10 Foreground Events by Total Wait Time it can consume a lot of CPU and freeze your database. Since I have gone through that myself a couple of weeks ago, I thought it might be interesting to write a succinct blog article explaining how to find and fix the root cause of this non-common wait event.

When I started investigating the real-life case all that I have on my hand was the p1 and p3 parameters of this wait event:

  •  p1=8 corresponds to wait for miscellaneous io (ftp, block dump, passwd file)
  •  p3=8 corresponds to any other type of file

To say the least I have been unfortunate with this number.

As I was searching online, I came across a forum question where one of the contributors was pointing to an audit file type and operation. And from there I built the following simple model

SQL> show parameter audit

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
audit_file_dest                      string      /u01/app/oracle/product/19c/dbhome_1/rdbms/audit                                                
audit_sys_operations                 boolean     TRUE
audit_syslog_level                   string
audit_trail                          string      OS
unified_audit_common_systemlog       string
unified_audit_sga_queue_size         integer     1048576
unified_audit_systemlog              string
AUDIT ALL BY c##test BY ACCESS;
AUDIT SELECT TABLE, UPDATE TABLE, INSERT TABLE, DELETE TABLE BY c##test BY ACCESS;
AUDIT EXECUTE PROCEDURE BY c##test BY ACCESS;
create table t1 as select rownum n1, mod(rownum,2) n2, lpad('x',10) vc from dual connect by level <=1e5;

create or replace procedure p1
as
   cursor c1 is
 	select *
    from t1;
begin
  for r1 in c1
  loop
    update t1 set vc = 'yz' where n1 = r1.n1;
    insert into t1 values (r1.n1, r1.n2, r1.vc);
    delete from t1 where n1=r1.n1 and n2=r1.n2 and vc =r1.vc;
   end loop;
commit;
end p1;

I’ve set the audit trail parameter at OS level and I managed to audit several DML operations done by the c##test user.

All that I have to do now to reproduce the Disk file operations I/O wait event on the p1 and p3 values set to 8 is to execute the p1 procedure as shown below

  SQL> exec p1();

While the session executing the p1 procedure (sid=205) was running I snapped it to see what it was doing behind the scene

SQL> @snapper all 5 1 205
Sampling SID 205 with interval 5 seconds, taking 1 snapshots...

-- Session Snapper v4.30 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) 
- Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! 

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                            ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    205, C##TEST   , STAT, opened cursors cumulative            ,          1859,     188.24,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, recursive calls                      ,          1866,     188.95,         ,             ,          ,           ,     1.28ms recursive CPU per recursive call
    205, C##TEST   , STAT, recursive cpu usage                  ,           238,       24.1,         ,             ,          ,           ,      2.38s total recursive CPU usage
    205, C##TEST   , STAT, session logical reads                ,        413327,     41.85k,         ,             ,          ,           ,    414.55k total buffer visits
    205, C##TEST   , STAT, CPU used by this session             ,           241,       24.4,         ,             ,          ,           ,      2.41s total CPU used by this session
    205, C##TEST   , STAT, user I/O wait time                   ,             6,        .61,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, non-idle wait time                   ,             6,        .61,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, non-idle wait count                  ,          1858,     188.14,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, physical read total IO requests      ,             1,         .1,         ,             ,          ,           ,     16.38k bytes per request
    205, C##TEST   , STAT, physical read total bytes            ,         16384,      1.66k,         ,             ,          ,           ,       8.94 per execution
    205, C##TEST   , STAT, cell physical IO interconnect bytes  ,         16384,      1.66k,         ,             ,          ,           ,       8.94 per execution
    205, C##TEST   , STAT, db block gets                        ,          1911,      193.5,         ,             ,          ,           ,       1.04 per execution
    205, C##TEST   , STAT, db block gets from cache             ,          1911,      193.5,         ,             ,          ,           ,       1.04 per execution
    205, C##TEST   , STAT, db block gets from cache (fastpath)  ,           644,      65.21,         ,             ,          ,           ,        .35 per execution
    205, C##TEST   , STAT, consistent gets                      ,        412820,      41.8k,         ,             ,          ,           ,     225.34 per execution
    205, C##TEST   , STAT, consistent gets from cache           ,        412811,      41.8k,         ,             ,          ,           ,     225.33 per execution
    205, C##TEST   , STAT, consistent gets pin                  ,        412811,      41.8k,         ,             ,          ,           ,     225.33 per execution
    205, C##TEST   , STAT, consistent gets pin (fastpath)       ,        412811,      41.8k,         ,             ,          ,           ,     225.33 per execution
    205, C##TEST   , STAT, consistent gets direct               ,             9,        .91,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, logical read bytes from cache        ,    3397402624,    344.01M,         ,             ,          ,           ,      1.85M per execution
    205, C##TEST   , STAT, physical reads                       ,             2,         .2,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, physical reads direct                ,             2,         .2,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, physical read IO requests            ,             1,         .1,         ,             ,          ,           ,     16.38k bytes per request
    205, C##TEST   , STAT, physical read bytes                  ,         16384,      1.66k,         ,             ,          ,           ,       8.94 per execution
    205, C##TEST   , STAT, db block changes                     ,          3749,     379.61,         ,             ,          ,           ,       2.05 per execution
    205, C##TEST   , STAT, change write time                    ,            49,       4.96,         ,             ,          ,           ,        .03 per execution
    205, C##TEST   , STAT, free buffer requested                ,          1265,     128.09,         ,             ,          ,           ,        .69 per execution
    205, C##TEST   , STAT, dirty buffers inspected              ,            10,       1.01,         ,             ,          ,           ,        .01 per execution
    205, C##TEST   , STAT, pinned buffers inspected             ,             1,         .1,         ,             ,          ,           ,  .55*10^-3 per execution
    205, C##TEST   , STAT, hot buffers moved to head of LRU     ,            65,       6.58,         ,             ,          ,           ,        .04 per execution
    205, C##TEST   , STAT, free buffer inspected                ,           123,      12.45,         ,             ,          ,           ,        .07 per execution
    205, C##TEST   , STAT, switch current to new buffer         ,          1240,     125.56,         ,             ,          ,           ,        .68 per execution
    205, C##TEST   , STAT, switch current caused by our pin     ,          1240,     125.56,         ,             ,          ,           ,        .68 per execution
    205, C##TEST   , STAT, shared hash latch upgrades - no wait ,           726,      73.51,         ,             ,          ,           ,         .4 per execution
    205, C##TEST   , STAT, calls to kcmgcs                      ,          3714,     376.07,         ,             ,          ,           ,       2.03 per execution
    205, C##TEST   , STAT, calls to kcmgas                      ,             4,        .41,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, calls to get snapshot scn: kcmgss    ,          1855,     187.83,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, redo entries                         ,          1856,     187.93,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, redo size                            ,        521376,     52.79k,         ,             ,          ,           ,          ~ bytes per user commit
    205, C##TEST   , STAT, redo ordering marks                  ,             4,        .41,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, file io wait time                    ,         22467,      2.27k,         ,             ,          ,           ,    22.47ms bad guess of IO wait time per IO request
    205, C##TEST   , STAT, Number of read IOs issued            ,             1,         .1,         ,             ,          ,           ,  .55*10^-3 per execution
    205, C##TEST   , STAT, undo change vector size              ,        182320,     18.46k,         ,             ,          ,           ,      99.52 per execution
    205, C##TEST   , STAT, no work - consistent read gets       ,        406064,     41.12k,         ,             ,          ,           ,     221.65 per execution
    205, C##TEST   , STAT, table scans (long tables)            ,          1231,     124.65,         ,             ,          ,           ,        .67 per execution
    205, C##TEST   , STAT, table scan rows gotten               ,     123963294,     12.55M,         ,             ,          ,           ,     67.67k per execution
    205, C##TEST   , STAT, table scan disk non-IMC rows gotten  ,     123898306,     12.55M,         ,             ,          ,           ,     67.63k per execution
    205, C##TEST   , STAT, table scan blocks gotten             ,        404643,     40.97k,         ,             ,          ,           ,     220.88 per execution
    205, C##TEST   , STAT, heap block compress                  ,             6,        .61,         ,             ,          ,           ,          0 per execution
    205, C##TEST   , STAT, HSC Heap Segment Block Changes       ,          1853,     187.63,         ,             ,          ,           ,       1.01 per execution
    205, C##TEST   , STAT, session cursor cache hits            ,          1840,     186.31,         ,             ,          ,           ,      1.84k softparses avoided thanks to cursor cache
    205, C##TEST   , STAT, buffer is pinned count               ,          1222,     123.74,         ,             ,          ,           ,        .29 % buffer gets avoided thanks to buffer pin caching
    205, C##TEST   , STAT, buffer is not pinned count           ,          2446,     247.68,         ,             ,          ,           ,       1.34 per execution
    205, C##TEST   , STAT, execute count                        ,          1832,      185.5,         ,             ,          ,           ,          ~ executions per parse
    205, C##TEST   , TIME, PL/SQL execution elapsed time        ,         11791,     1.19ms,      .1%, [          ],          ,           ,
    205, C##TEST   , TIME, DB CPU                               ,       2460784,   249.17ms,    24.9%, [@@@       ],          ,           ,
    205, C##TEST   , TIME, sql execute elapsed time             ,       8059797,   816.12ms,    81.6%, [######### ],          ,           ,
    205, C##TEST   , TIME, DB time                              ,       8059797,   816.12ms,    81.6%, [######### ],          ,           ,      18.39 % unaccounted time
    205, C##TEST   , WAIT, Disk file operations I/O             ,         46307,     4.69ms,      .5%, [          ],      1828,      185.1,    25.33us average wait ----------> here
    205, C##TEST   , WAIT, ASM IO for non-blocking poll         ,             8,      .81us,      .0%, [          ],         1,         .1,        8us average wait

--  End of Stats snap 1, end=2020-12-20 09:02:36, seconds=9.9

------------------------------------------------------------------------------------------
  ActSes   %Thread | INST | SQL_ID          | SQL_CHILD | EVENT          | WAIT_CLASS
------------------------------------------------------------------------------------------
     .48     (48%) |    1 | fnbt9ts7bsqzn   | 0         | ON CPU         | ON CPU
     .48     (48%) |    1 | 7wv0zz7965u54   | 0         | ON CPU         | ON CPU
     .03      (3%) |    1 | 1nf0v48kt3a30   | 0         | ON CPU         | ON CPU
     .03      (3%) |    1 | 8myfvr1fb8gju   | 0         | ON CPU         | ON CPU

--  End of ASH snap 1, end=2020-12-20 09:02:36, seconds=5, samples_taken=40, AAS=1

SQL> @snapper ash 5 1 205
Sampling SID 205 with interval 5 seconds, taking 1 snapshots...

-- Session Snapper v4.30 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) 
- Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet!
----------------------------------------------------------------------------------------------------
  ActSes   %Thread | INST | SQL_ID          | SQL_CHILD | EVENT                    | WAIT_CLASS
----------------------------------------------------------------------------------------------------
     .46     (46%) |    1 | fnbt9ts7bsqzn   | 0         | ON CPU                   | ON CPU
     .46     (46%) |    1 | 7wv0zz7965u54   | 0         | ON CPU                   | ON CPU
     .03      (3%) |    1 | 1nf0v48kt3a30   | 0         | ON CPU                   | ON CPU
     .03      (3%) |    1 | fnbt9ts7bsqzn   | 0         | Disk file operations I/O | User I/O ----------> here
     .03      (3%) |    1 | 8myfvr1fb8gju   | 0         | ON CPU                   | ON CPU

As you can see there is a certain proportion of time spent waiting on Disk file operations I/O. And, naturally, the p1 and p3 values, in this case, are respectively 8 and 8 as the followings prove:

SQL>  @ashtopSID
Enter value for sid: 205
Enter value for from: 12-20 09:00:00
Enter value for to: 12-20 09:15:00

  Total                                                                                                                                                                        
  Seconds     AAS %This   SQL_ID        SQL_OPNAME      TOP_LEVEL_SQL EVENT2                     P1TEXT           P1 P3TEXT        P3 
--------- ------- ------- ------------- --------------- ------------- -------------------------- --------------- --- ----------- ---- 
      309      .3   45% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
      306      .3   44% | fnbt9ts7bsqzn UPDATE          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
       24      .0    3% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
       13      .0    2% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju db file sequential read    file#             1 blocks         1 
        8      .0    1% | fnbt9ts7bsqzn UPDATE          8myfvr1fb8gju db file sequential read    file#             1 blocks         1 
        7      .0    1% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju Disk file operations I/O   FileOperation     8 filetype       8 ----------> here
        6      .0    1% | fnbt9ts7bsqzn UPDATE          8myfvr1fb8gju Disk file operations I/O   FileOperation     8 filetype       8 ----------> here
        5      .0    1% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju Disk file operations I/O   FileOperation     8 filetype       8 ----------> here
        3      .0    0% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju db file scattered read     file#             1 blocks         3 
        3      .0    0% |                               8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
        2      .0    0% | 0jjc60pmrntdv SELECT          8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8 
        1      .0    0% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju ON CPU                     file#             1 blocks         1 
        1      .0    0% | 1nf0v48kt3a30 INSERT          8myfvr1fb8gju db file sequential read    file#             5 blocks         1 
        1      .0    0% | 7wv0zz7965u54 DELETE          8myfvr1fb8gju resmgr:cpu quantum         location          3                0 
        1      .0    0% | 8myfvr1fb8gju PL/SQL EXECUTE  8myfvr1fb8gju ON CPU                     FileOperation     8 filetype       8

Interestingly ASH shows that the CPU consumed by Oracle is also related to the manipulation of files of type 8. I don’t know really how to correctly interpret these p1 and p3 values in the context of CPU


select decode(event,null, 'on cpu', event), p1,p3, count(1)
    from dba_hist_active_sess_history
    where    sample_time between to_date('20122020 09:00:00', 'ddmmyyyy hh24:mi:ss')
                and     to_date('20122020 09:15:00', 'ddmmyyyy hh24:mi:ss')
group by decode(event,null, 'on cpu', event), p1,p3 
order by 4 desc;

DECODE(EVENT,NULL,'ONCPU',EVENT)          P1         P3   COUNT(1)
--------------------------------  ---------- ---------- ----------
on cpu                                     8          8         64
db file sequential read                    1          1         10
on cpu                                   100          0          6
Disk file operations I/O                   8          8          4
on cpu                                     1          1          4
db file scattered read                     3          3          3
on cpu                                     0          0          3
control file sequential read               0          1          3
on cpu                                     3          1          2
ASM file metadata operation                0          0          2
on cpu                               2490368          0          1
db file parallel write                     1 2147483647          1
db file async I/O submit                 106          0          1
on cpu                                     3          0          1
on cpu                                     2          1          1
on cpu                                131072          0          1
on cpu                                     5          1          1
on cpu                                 65536          0          1
oracle thread bootstrap              5910576          0          1
on cpu                                262144          0          1

Summary

Whenever you see a Disk file operations I/O wait event consuming a lot of time and CPU then get the p1,p2,p3 values from ASH or AWR


SQL> @sed "Disk file operations"
Show wait event descriptions matching %Disk file operations%..

EVENT_NAME                     WAIT_CLASS           PARAMETER1           PARAMETER2           PARAMETER3
------------------------------ -------------------- -------------------- -------------------- --------------------
Disk file operations I/O       User I/O             FileOperation        fileno               filetype

If p1 and p3 values equal to 8 and 8 respectively then check if you are, inadvertently, auditing too many repetitive actions by means of the following select


SQL>  select user_name, audit_option from DBA_STMT_AUDIT_OPTS where user_name='C##TEST';

USER_NAME                 AUDIT_OPTION
------------------------- ----------------------------------------
C##TEST                   ALTER SYSTEM
C##TEST                   SYSTEM AUDIT
C##TEST                   CREATE SESSION
C##TEST                   TABLE
C##TEST                   CLUSTER
C##TEST                   TABLESPACE
C##TEST                   USER
C##TEST                   ROLLBACK SEGMENT
C##TEST                   TYPE
C##TEST                   SYNONYM
C##TEST                   PUBLIC SYNONYM
C##TEST                   VIEW
C##TEST                   SEQUENCE
C##TEST                   DATABASE LINK
C##TEST                   PUBLIC DATABASE LINK
C##TEST                   ROLE
C##TEST                   DIMENSION
C##TEST                   PROCEDURE
C##TEST                   TRIGGER
C##TEST                   PROFILE
C##TEST                   DIRECTORY
C##TEST                   MATERIALIZED VIEW
C##TEST                   MINING MODEL
C##TEST                   INDEX
C##TEST                   SELECT TABLE
C##TEST                   INSERT TABLE
C##TEST                   UPDATE TABLE
C##TEST                   DELETE TABLE
C##TEST                   NOT EXISTS
C##TEST                   SYSTEM GRANT
C##TEST                   EXECUTE PROCEDURE
C##TEST                   CONTEXT
C##TEST                   SQL TRANSLATION PROFILE
C##TEST                   LOCKDOWN PROFILE
C##TEST                   ATTRIBUTE DIMENSION
C##TEST                   HIERARCHY
C##TEST                   ANALYTIC VIEW
C##TEST                   READ DIRECTORY
C##TEST                   WRITE DIRECTORY
C##TEST                   EXECUTE DIRECTORY 

If so then reduce the number of audited actions if you want to reduce the impact of this wait event.
In passing, I think that this wait event would not have shown up at the TOP if I have set the audit trail parameter value to DB. I left this not-demonstrated though.

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud's blog

Just another blog : Databases, Linux and other stuffs

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)