Mohamed Houri’s Oracle Notes

February 14, 2015

Resumable time out

Filed under: Tuning — hourim @ 9:47 am

I was trying to create a very big table through a create table as select from a join between two huge tables forcing both parallel DDL and parallel QUERY and was mainly concerned about the amount of temp space the hash join between those two big tables will require to get my table smoothly created. I said hash join because my first attempt to create this table used a nested loop join which was driving a dramatic 100 million starts of table access by index rowid since more than 10 hours when I decided to kill the underlying session. The create table being a one-shot process I decided to hint the optimizer so that it will opt for a hash join operation instead of the initial nested loop. But my concern has been transferred from a long running non finishing SQL statement to how much my create table will need of TEMP space in order to complete successfully.

I launched the create table and started monitoring it with Tanel Poder snapper script, v$active_session_history and the Real Time SQL Monitoring feature (RTSM). This is what the monitoring was showing

SQL> select 
          event
	 ,count(1) 
      from 
          v$active_session_history 
      where sql_id = '0xhm7700rq6tt' 
      group by event 
      order by 2 desc;


EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

However few minutes later Tanel Poder snapper started showing the following dominant wait event

SQL> @snapper ash 5 1 all

----------------------------------------------------------------------------------------------------
Active% | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
----------------------------------------------------------------------------------------------------
   800% |    1 | 0xhm7700rq6tt   | 0         | statement suspended, wait error to  | Configuration

--  End of ASH snap 1, end=2015-02-12 09:12:57, seconds=5, samples_taken=45

Event which keeps incrementing in v$active_session_history as shown below:

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                          7400
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

7 rows selected.

SQL> /

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                          7440
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

7 rows selected.

SQL> /

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                          7480
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        979
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

EVENT                                                              COUNT(1)
---------------------------------------------------------------- ----------
statement suspended, wait error to be cleared                         11943
direct path read                                                       3202
direct path write temp                                                 1629
                                                                        980
db file scattered read                                                    8
CSS initialization                                                        7
CSS operation: query                                                      1

The unique wait event that was incrementing is that bizarre statement suspended, wait error to be cleared. The session from which I launched the create table statment was hanging without reporting any error.

As far as I was initially concerned by the TEMP space I checked the available space there

SQL> select 
       tablespace_name,
       total_blocks,
       used_blocks,
       free_blocks
     from v$sort_segment;

TABLESPACE_NAME                 TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS
------------------------------- ------------ ----------- -----------
TEMP                                 8191744     8191744           0    

No available free space left in TEMP tablespace

The RTSM of the corresponding sql_id was showing the following

=====================================================================================
| Id    |              Operation              |        Name        |  Rows   | Temp |
|       |                                     |                    |(Actual) |      |
=====================================================================================
|     0 | CREATE TABLE STATEMENT              |                    |       0 |      |
|     1 |   PX COORDINATOR                    |                    |         |      |
|     2 |    PX SEND QC (RANDOM)              | :TQ10003           |         |      |
|     3 |     LOAD AS SELECT                  |                    |         |      |
|     4 |      HASH GROUP BY                  |                    |         |      |
|     5 |       PX RECEIVE                    |                    |         |      |
|     6 |        PX SEND HASH                 | :TQ10002           |         |      |
|  -> 7 |         HASH JOIN                   |                    |       0 |  66G | --> spot this 
|       |                                     |                    |         |      |
|       |                                     |                    |         |      |
|       |                                     |                    |         |      |
|     8 |          PX RECEIVE                 |                    |    626M |      |
|     9 |           PX SEND BROADCAST         | :TQ10001           |    626M |      |
|    10 |            HASH JOIN                |                    |     78M |      |
|    11 |             PX RECEIVE              |                    |    372K |      |
|    12 |              PX SEND BROADCAST      | :TQ10000           |    372K |      |
|    13 |               PX BLOCK ITERATOR     |                    |   46481 |      |
|    14 |                INDEX FAST FULL SCAN | INDX_12453656_TABL |   46481 |      |
|    15 |             PX BLOCK ITERATOR       |                    |     88M |      |
|    16 |              TABLE ACCESS FULL      | TABL1              |     88M |      |
|       |                                     |                    |         |      |
| -> 17 |          PX BLOCK ITERATOR          |                    |    682M |      |
| -> 18 |           TABLE ACCESS FULL         | TABLE123456        |    682M |      |
|       |                                     |                    |         |      |
=====================================================================================

The HASH JOIN operation at line 7 has already gone above the available TEMP tablespace size which is 64G. I was wondering then, why my session has not been stopped with an ORA-01652 error instead of hanging there and letting the wait event statement suspended, wait error to be cleared popping up in my different monitoring tools.

After few minutes of googling tuning steps I ended up by executing the following query

SQL>select
        coord_session_id
       ,substr(sql_text, 1,10)
       ,error_msg
    from dba_resumable;

SESSION_ID  SQL        ERROR_MSG
----------- ---------  -------------------------------------------------------------------
146         create /*+  ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

This 146 SID is my create table session! It is in error but hanging and waiting on that statement suspend wait event.

After few minutes of googling tuning again I ended up by checking the following parameter:

SQL> sho parameter resumable

NAME                           TYPE                             VALUE
------------------------------ -------------------------------- ------
resumable_timeout              integer                          9000

Notice now how I finally have made a link between the title and the purpose of this article: resumable_timeout. The Oracle documentation states that when this parameter is set to a nonzero value the resumable space allocation might kicks in.

Put is simply what happens in my case is that since this parameter has been set to 9000 minutes and since my create table as select has encountered an ORA-01652 temp tablespace error, Oracle decided to put my session in a resumable state until someone will point out that space shortage,correct the situation and allow the session to continue its normal processing

The next step I did is to abort my session from its resumable state

SQL> exec dbms_resumable.abort(146);
 
PL/SQL procedure successfully completed.

Unfortunately I waited a couple of minutes in front of my sqlplus session waiting for it to resume without success so that I decided to use the brute force

ERROR:
ORA-03113: end-of-file on communication channel
Process ID: 690975
Session ID: 146 Serial number: 3837

Summary
I would suggest to do not set the resumable_timeout parameter to a nonzero value. Otherwise you will be hanging on a statement suspended, wait error to be cleared event during an amount of time indicated by this resumable_timeout parameter looking around for what is making your SQL statement hanging.

Footnote
Here below few extra words from Oracle documentation about Resumable space allocation

“Resumable space allocation avoids headaches and saves time by suspending, instead
of terminating, a large database operation requiring more disk space than is currently
available. While the operation is suspended, you can allocate more disk space on
the destination tablespace or increase the quota for the user. Once the low space
condition is addressed, the large database operation automatically picks up where it
left off.
As you might expect, the statements that resume are known as resumable statements.
The suspended statement, if it is part of a transaction, also suspends the transaction.
When disk space becomes available and the suspended statement resumes, the
transaction can be committed or rolled back whether or not any statements in
the transactions were suspended. The following conditions can trigger resumable
space allocation:
■ Out of disk space in a permanent or temporary tablespace
■ Maximum extents reached on a tablespace
■ User space quota exceeded
You can also control how long a statement can be suspended. The default time
interval is two hours, at which point the statement fails and returns an error message
to the user or application as if the statement was not suspended at all.
There are four general categories of commands that can be resumable:
(1) SELECT statements, (2) DML commands, (3) SQL*Loader operations, and
(4) DDL statements that allocate disk space.”

Advertisements

2 Comments »

  1. Sorry to be frank in this case, but I´m really incredulous that you are not aware of this old Oracle feature.

    Comment by Luis Santos — February 25, 2015 @ 6:57 pm | Reply

  2. […] having ruled out the resumable time out hypothesis I came back to the v$sql_monitor and issued the following two […]

    Pingback by Real Time SQL Monitoring oddity | Mohamed Houri’s Oracle Notes — June 23, 2015 @ 2:10 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud Oracle's blog

Just another Oracle blog : Database topics and techniques

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

EU Careers info

Your career in the European Union

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

%d bloggers like this: