February 25, 2012

Tuning an insert/select via dblink

Filed under: Trouble shooting — hourim @ 11:40 am

Recently an interesting question came up on an Oracle French Forum about a performance problem during an insert/select operation via a dblink. The poster was suffering from the very long time (several hours without any response) the insert was taking. This is why he came to that forum and posted a question which looks like “what do you think if I commit every 1000 rows? Will my insert speed up”?  That’s not the right way to follow in the process of troubleshooting an Oracle performance problem. I always remember Cary Millsap mantra “Why guess when you can know”. Fortunately Oracle gives us a lot fo tools and trace events to instrument what going on inside the Oracle database. Here below are the steps of discussion we’ve had (the original poster, others and I) in that forum.

I first have asked the poster to trace his insert select using events 10046


alter session set events '10046 trace name context forever, level 12';

INSERT /*+ append */ INTO my_table
(
champs_1,
champs_2,
champs_3,
champs_4,
champs_5,
champs_6,
champs_7
)
SELECT * FROM la_table_cible@mon_dblink ;

commit ;

alter session set events '10046 trace name context off';

I am not going to stress that I have asked the poster to ensure that his direct path load will not be silently ignored by Oracle in case the table “my_table” has trigger or foreign key integrity constraints implemented on it. I have also asked if there will be subsequent deletes (after the insert) on this table making future direct path load not reusing any freed space made by that delete.

The generated trace file profiled via orasrp profiler shows that 98% of the insert/select response time is spent waiting in the event SQL*Net message from dblink.

Up to this step, we persuaded the poster that his initial guess of committing every 1000 records was not the ultimate solution; but instead he should focus on reducing the volume of data transferred from the distant data base to the local one. The poster sends us the explain plan and Statistics of the select part when this one is executed in the distant data base

Statistics
----------------------------------------------
212        recursive calls
3          db block gets
678993     consistent gets
483744     physical reads
592        redo size
124692665  bytes sent via SQL*Net TO client
1391305    bytes received via SQL*Net FROM client
126437     SQL*Net roundtrips TO/FROM client
17         sorts (memory)
0          sorts (disk)
1896535    rows processed

He is selecting 1,896,535 records via 126,437 SQL*Net roundtrips meaning that he is using the default arraysize which is 15 (1,896,535/126,437). For this, he did 0.358 (678,993/1,896,535) logical read for each selected row which seems to be quite acceptable. But, he is sending a very large amount of data 124MB over a data base link and making a hundred of thousands SQL*Net roundtrips. Note, that he is doing also a fairly large amount of physical read (483,744) when compared to the total number of logical reads (678,993).

I’ve then asked the poster to stop selecting all the rows from the distant data base and to start selecting only necessary rows. I asked also to post the explain plan of this query in attempt to help him tuning the select part of his problematic insert. When he said that he can’t intervene into the distant data base, I have then asked him to manage his insert with a PL/SQL stored procedure as shown below:


CREATE type distant_typ AS object (champs_1 number,champs_2 number,champs_3 number, ... champs_7 date);
2  /

CREATE type distant_typ_tab AS TABLE of distant_typ;
2  /

DECLARE
CURSOR get_data_from_db IS
SELECT
     champs_1,
     champs_2,
     champs_3,
     champs_4,
     champs_5,
     champs_6,
     champs_7
FROM la_table_cible@mon_dblink
;

BEGIN

OPEN get_data_from_db;
LOOP
FETCH get_data_from_db BULK collect INTO distant_typ_tab LIMIT 100;
FOR i in 1..distant_typ_tab.COUNT
LOOP
INSERT INTO INSERT /*+ append */ INTO ma_table
(champs_1,
 champs_2,
 champs_3,
 champs_4,
 champs_5,
 champs_6,
 champs_7
)
SELECT
 champs_1,
 champs_2,
 champs_3,
 champs_4,
 champs_5,
 champs_6,
 champs_7
FROM table (distant_typ_tab);
EXIXT when get_data_from_db%notfound;
END LOOP;
END LOOP;
COMMIT;
END;

And magically the poster comes out and said “fantastic my insert/select now is completing in a very acceptable time”.

February 11, 2012

Clustering Factor explored in Moliere Language

Filed under: French-Translation — hourim @ 3:15 pm

It’s has been a great pleasure for me to translate, from English to French, Chapter 5 ( Le-clustering-factor ) of Cost-Based Oracle Fundamentals Book written by Jonathan Lewis. Although I do prefer the original version to the translated one, I was surprisingly happy to see that many French oracle colleagues have been  interested by this translation. May be a sign for a new carrier :-)

February 9, 2012

/*+ materialized */ hint and ORA-04091 error

Filed under: Uncategorized — hourim @ 12:12 pm

It is funny to know that the /*+ materialized */ hint is, behind the scene, materialized by Oracle with a creation of a global temporary table holding the data generated by the corresponding “with subquery”. What is also funny for me is, that when trouble-shooting oracle issues, I am always cross checking, all what I have read in the past that could have any relation to the problem I am confronted to at the present time. Dom Brook post is what I have recently read and is what “my cross checking” work pointed out to be eventually related to the situation explained below.

So, recently a question came up on an Oracle French forum about an insert/select statement that was working well in 10g while it started failing in error following an upgrade to 11g. There was an excellent discussion in that French forum with valuable remarks from different posters. But my curiosity was catalyzed by the fact that no one was able to clearly show which release is correct? 10g or 11g? Before coming back to the /*+ materialized */ hint which motivates this blog let me reproduce the insert/select test case here below:

10gR2 > create table t_read_consistency (id number, vc varchar2(15));
Table created.
10gR2 > insert into
2        t_read_consistency
3      select
4        rownum id,
5        rpad('a',15,'a')
6      from
7        dual
8      connect by
9        level <= 1000;
1000 rows created.

10gR2 > create type t_read_cs as object (id number,vc varchar2(15));
/
Type created.
mhouri.world > create type t_read_cs_tab  as table of t_read_cs;
/
Type created.
10gR2 > create or replace function f_read_consistency_tab
  2  return t_read_cs_tab
  3    as
  4      lc_t_read_cs_tab t_read_cs_tab := t_read_cs_tab();
  5      j  binary_integer := 0;
  6    begin
  7        for x in (select
  8                      id,
  9                      vc
 10                   from  t_read_consistency trs
 11                  where trs.id <= 10
 12       ) loop
 13 
 14          j := j +1;
 15          lc_t_read_cs_tab.extend;
 16          lc_t_read_cs_tab(j) := t_read_cs(x.id, x.vc);
 17     end loop;
 18     RETURN lc_t_read_cs_tab;
 19   end f_read_consistency_tab;
 20  /

 Function created.

This is the model. Lets now start the insert/select operations both in 10g and 11g

10gR2 > select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0    Production
TNS for Solaris: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

10gR2 > insert into t_read_consistency
2         (id,vc)
3      select id,vc
4   from table(f_read_consistency_tab)
5  ;
10 rows created.

11gR2 > select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

11gR2 > insert into t_read_consistency
2            (id,vc)
3          select id,vc
4      from table(f_read_consistency_tab)
5     ;
from table(f_read_consistency_tab)
*
ERROR at line 4:
ORA-04091: table MOHAMED.T_READ_CONSISTENCY is mutating, trigger/function may
not see it
ORA-06512: at "MOHAMED.F_READ_CONSISTENCY_TAB", line 7

Spot the difference. What is working well in 10g is throwing an error in 11g!!! Starting from this point I was religiously trying to persuade myself that the 11g release is where the correct situation is in contrast to what were claiming several posters in the French forum. From otn, I got the link to the related documentation:

which clearly states this:

  • When called from an INSERT, UPDATE, or DELETE statement, the function cannot query or modify any database tables modified by that statement. (10g)
  • When invoked from an INSERT, UPDATE, or DELETE statement, the subprogram cannot query or modify any database tables modified by that statement (11g)

The 11g release is, at least, in accordance with its corresponding documentation.

Then the /*+ materialized */ hint entered the scene. From ask tom, I knew that the following slightly modified insert/select is not throwing any error in 11g:


11gR2> insert into t_read_consistency(id,vc)
2          with data as (select /*+ materialize */ id, vc from table(f_read_consistency_tab))
3           select * from data
4          ;

10 rows created.

Consequently, if the above insert is working well this means that the insert is

  1. Either not selecting from the pipelined function and hence from the inserted table
  2. Or it is selecting from this pipelined function but before the start of the insert (exactly if I have used a PL/SQL block in which I would have first declared a local t_read_cs_tab type in which I would have then sent the content of the pipelined function and finally  would have used a select from this local type in my insert/select operation)

11gR2> explain plan for
2  insert into t_read_consistency(id,vc)
3          with data as (select /*+ materialize */ id, vc from table(f_read_consistency_tab))
4           select * from data
5          ;

Explained.

11gR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------
Plan hash value: 1931462686
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                    |                             |  8168 |   175K|    27   (0)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION          |                             |       |       |            |          |
|   2 |   LOAD AS SELECT                    | T_READ_CONSISTENCY          |       |       |            |          |
|   3 |    COLLECTION ITERATOR PICKLER FETCH| F_READ_CONSISTENCY_TAB      |       |       |            |          |
|   4 |   VIEW                              |                             |  8168 |   175K|     2   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL                | SYS_TEMP_0FD9D6622_C1854848 |  8168 | 16336 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

 

According to the above explain plan, the first executed operation is operation 3(read from the pipelined function) which feeds back its parent operation 2 followed by a read from that famous temporary table (SYS_TEMP_0FD9D6622_C1854848) created on the fly in response to the /*+ materialized */ hint. The insert is so that, before the first row is inserted, the function is read in its entirety and its returned rows are stored into the temporary table which is used by the insert statement. This is why the insert is not failing when the /*+ materialize */ hint is used.

Well, here below are the chronological steps

  1. I read Dom Brooks blog about the /*+ materialize */ hint where I learn that there is a creation of a temporary table behind the scene
  2. I was confronted to an issue about insert/select that is working well in 10g and not in 11g
  3. I knew that the insert/select will work in 11g if the hint /*+ materialized */ is used
  4. I cross check what I read recently and linked that to Dom Brooks article
  5. Wrote this blog article to put these chronological steps in action

February 6, 2012

How to solve ORA-08102: index key not found

Filed under: Index — hourim @ 2:28 pm

In one of our test databases we encountered the following error during an overnight PL/SQL purge job:

ORA-08102: index key not found, obj# 124885, file 48, block 55492 (2) 01-FEB-12

The purpose of the job is to delete data introduced since more than 6 days respecting for that several hierarchical parent-child relationships.  It has been working without any relatives problems since more than 2 years until is started crashing out with the above ORA-08102 error. I aim via this blog article to show how I have resolved this problem.

The first thing I did is of course to identify the corrupted object via the following select:

mhouri.world> ed
    Wrote file afiedt.buf
  1  select substr(object_name,1,30), object_type
  2  from user_objects
  3* where object_id = 154711
;

 SUBSTR(OBJECT_NAME,1,30)       OBJECT_TYPE
------------------------------ ----------------
XXX_ZZZ_UK                        INDEX

 

The naming standard (_UK) is as such that I immediately identified this index as an automatic index created to enforce a unique constraint. I thought first that I need to rebuild this unique index:

 


mhouri.world> alter index XXX_ZZZ_UK rebuild;

Index altered.

 

Unfortunately, when I launched again the purge batch the same error has been raised again. So my second tentative was to drop and recreate the unique constraint and implicitly the unique index.

 


mhouri.world> alter table t1 drop constraint XXX_ZZZ_UK;

Table altered.

mhouri.world > alter table t1 add constraint XXX_ZZZ_UK unique (id1, id2, dat1);

Table altered.

 

Of course that you have first to get the DDL definition of the constraint before dropping it so that you will be able to re-created it again.

 

When the purge process has been re-started following this last intervention, it, unfortunately, raised again the same ORA-08102 on the same object. Finally I have decided to work on the underneath table instead of its unique constraint and unique index.

 


mhouri.world>> alter table t1 move;

 Command that has invalidated all attached indexes and which obliged me to re-build them:


mhouri.world > select 'alter index ' || index_name || ' rebuild;'
from user_indexes
where status != 'VALID';

 

Once all indexes rebuilt, I launched again the purge job which surprisingly completed successfully.

 

 Frankly speaking I don’t know if I would have been able to work around this error if I have tried instead of the move command a rebuild or shrink command. But what I have learnt from this experience is that when an index is corrupted it could be repaired by moving table data instead of dropping and recreating the corrupted index as initially indicated by the ORA-08102 error. I have also to admit that I don’t know yet the side effect of the move command and will certainly write something about that if eventually I will come to learn something about that.

.

January 19, 2012

Bitmap Indexes and Foreign key

Filed under: Deadlock — hourim @ 12:09 pm

1.     Introduction

Several months ago I have published a sql script that checks the existence or the absence of an index covering the deadlock threat of unindexed foreign key constraint when DML are issued against the parent table. That script doesn’t bother about the index type. If an index of a bitmap type exists it will tell you that it is not necessary to create an extra index for your FK constraint. Let’s see that in action

2.     Bitmap Index and FK constraint

I will start by creating a classical parent/child table relation ship

houri> create table t_parent as select * from all_users;
Table created.
mhouri> alter table t_parent add constraint t_p_pk primary key (user_id);
Table altered.
mhouri> create table t_child (user_id references t_parent, data varchar2(10));
Table created.                 

Then I will create a bitmap index to cover my Foreign Key:

mhouri > create bitmap index t_bitmap_fk_i on t_child(user_id);
Index created.

Using my traditional script to check if I need to create an index to cover my FK or not, gives me the following situation

mhouri> start index_fk.sql
Enter value for m_table: t_child
Enter value for m_column: user_id
Enter value for m_column2: none
Enter value for m_column3: none
TABLE_NAME                INDEX_NAME                     COLUMN_NAM COLUMN_POS
------------------------- ------------------------------ ---------- ----------
T_CHILD                   T_BITMAP_FK_I                  USER_ID             1

Oh! Yes; it is telling me that I don’t need to create an extra index because there is already one index starting with the foreign key column. If I execute Tom Kyte script,  it is also not pointing out the need to index my user_id FK column

mhouri> start tkyte.sql
TABLE_NAME                     CONSTRAINT_NAME                COLUMNS
------------------------------ ------------------------------ -------------------
MY_XXX_TAB                   C_FK                           ID1,ID2

My T_CHILD table is not returned by Tom Kyte script as a table having an unindexed foreign key. So, according to both scripts it is safe to do DML operations on T_PARENT table. Let’s then submit our parent table to a classical OLTP operations

mhouri> select min(user_id), max(user_id) from t_parent;
MIN(USER_ID) MAX(USER_ID)
------------ ------------
0           73
mhouri> insert into t_child(user_id,data) values (0,'test');
1 row created.
mhouri> declare
2   pragma autonomous_transaction;
3   begin
4   delete from t_parent where user_id = 73;
5   commit;
6   end;
7  /

declare
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at line 4

mhouri> rollback;
Rollback complete.

Ooops!!!  The existing bitmap index does not cover the deadlock threat induced by a DML operation on a parent table having a child table with an unindexed foreign key.  The bitmap index type has to be excluded from my script. The new script index_fk when executed against the t-child table gives this:

mhouri> start index_fk_2.sql
Enter value for m_table: t_child
Enter value for m_column: user_id
Enter value for m_column2: none
Enter value for m_column3: none
no rows selected

And now, it’s OK. The script is telling me that I need to index my user_id FK columns because the existing bitmap index will not cover the deadlock threat.

          2.1 Remark

It would have been very bizarre that a bitmap index covers the deadlock threat induced by an unindexed foreign key because the bitmap index itself is responsible of several deadlocks situations when it is used in an OLTP application and this is why Tom Kyte script and my initial script has not foresee to exclude this type of indexes from the verification script

3.     Function Base Index and FK constraint

In order for an index to successfully cover the deadlock threat induced by a DML on a parent/child relationship it needs:

  1. to start with the FK column(or columns for a composite key)
  2. to not be of a bitmap type

 Looking to the above two conditions, there is nothing that impeaches a function based index to cover the FK threat. Let’s work with few examples:

mhouri> create index t_fbi_fk_i on t_child(user_id desc);
Index created.
mhouri> insert into t_child(user_id,data) values (0,'test');
1 row created.
mhouri> declare
  2       pragma autonomous_transaction;
  3      begin
  4      delete from t_parent where user_id = 73;
  5      commit;
  6      end;
  7    /

declare
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at line 4

My Function based index has only one column in it and it is a hidden column (SYS_NC00003$ for example). This is why it didn’t succeed to cover the above deadlock situation. Let’s check this with my new script:

mhouri> start index_fk_2
Enter value for m_table: t_child
Enter value for m_column: user_id
Enter value for m_column2: none
Enter value for m_column3: none
no rows selected

 My new updated script is working well in this case as far as it is telling me that I need to create an extra index.  Let’s now drop and change a little bit this function based index

mhouri> drop index t_fbi_fk_i;
Index dropped.
mhouri> create index t_fbi_fk_i on t_child(user_id, data desc);
Index created.
mhouri> insert into t_child(user_id,data) values (0,'testfbi');
1 row created.
mhouri> declare
  2       pragma autonomous_transaction;
  3      begin
  4      delete from t_parent where user_id = 73;
  5      commit;
  6      end;
  7    /

PL/SQL procedure successfully completed.

The newly created function based index is covering very well the FK on the child table. My new script is also working well in this case as it is telling me that I don’t need to create an extra index

mhouri> start index_fk_2
Enter value for m_table: t_child
Enter value for m_column: user_id
Enter value for m_column2: none
Enter value for m_column3: none
<pre>TABLE_NAME                INDEX_NAME                     COLUMN_NAM COLUMN_POS

------------------------- ------------------------------ ---------- ----------

T_CHILD                   T_FBI_FK_I                     USER_ID             1

4.     Conclusion

In order for an index to successfully cover the deadlock threat induced by a DML on a parent/child table relationship it needs:

  1. to start with the FK column(or columns for a composite key in any order)
  2. to not be of a bitmap type

Even a function based index when it starts with the FK columns will play the same role as a b-tree index will do when it will be asked to cover the deadlock threats induced by a DML operation on a parent/child relationship

December 29, 2011

Getting Explain Plan

Filed under: explain plan — hourim @ 3:04 pm

1. Introduction

In order to tune a query you need first to get its real execution plan that describes the operations dictated by the Oracle optimizer and followed by the SQL engine to execute this query. It can’t be enough emphasized that getting the exact explain plan (and not the expected one) represents an important step when troubleshooting query performance problems. This document aims to describe in detail how to obtain the real explain plan followed by the SQL engine.

2. Obtaining Explain plans

First of all it’s worth mentioning to avoid using tools like TOAD to get the explain plan of your query because, more often than not, it does not provide all the information you need for a thorough analysis. This is why I am going to present only tools that have been developed by Oracle Corporation. Among those tools the first one is the classical explain plan for as shown in the below example:

     2.1 Explain plan for command and its limitation          

Unfortunately there exist situations where the ‘explain plan for’ command may not report the real explain plan followed by the Oracle Optimizer to execute the query. This situation occurs particularly when using bind variables.  You should know that the ‘explain plan for’ command treats all bind variables as of a VARCHAR2 data type; this is why an implicit data type conversion might happen during the ‘explain plan for’ command while there is no implicit data type conversion when the query is executed. And that’s where the limit of the ‘explain plan for’ command comes from.

mhouri.world> create table t (id varchar2(10), name varchar2(100));

Table created.

mhouri.world> insert into t select to_char(object_id), object_name from user_objects;

1004 rows created.

mhouri.world> create index i on t(id);

Index created.

mhouri.world> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);

PL/SQL procedure successfully completed.
mhouri.world> var x number

mhouri.world> exec :x:=99999

PL/SQL procedure successfully completed.

mhouri.world> explain plan for select sum(length(name)) from t where id >:x;

Explained.

mhouri.world> select * from table(dbms_xplan.display) ;

-------------------------------------------------------------------------------------
Plan hash value: 1188118800
-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |    24 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |      |     1 |    24 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T    |    50 |  1200 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I    |     9 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ID">:X)

Remember that the id column has been declared as varchar2 and that we issued our ‘explain plan for’ command using a bind variable declared as number (var x number). As far as the ‘explain plan for’ command treats all bind variables as VARCHAR2 it has reported that the index has been used; which is, unfortunately, completely false. And that is the topic of the next paragraph.

            2.2     dbms_xplan.display_cursor             

In order to get the real explain plan we need to use the display_cursor function of the dbms_xplan package as shown below:

mhouri.world> select sum (length (name)) from t where id >:x;
SUM (LENGTH (NAME))
-----------------
8145

mhouri.world> select * from table (dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
SQL_ID  7zm570j6kj597, child number 0
-------------------------------------
select sum(length(name)) from t where id >  :x
Plan hash value: 1842905362
--------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     5 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    24 |            |          |
|*  2 |   TABLE ACCESS FULL| T    |    50 |  1200 |     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER("ID")>:X)

Notice how the dbms_xplan.display_cursor() function has correctly reported a TABLE ACCESS FULL instead of the INDEX RANGE SCAN path (initially reported by the ‘explain plan for’ command) due to the TO_NUMBER (see the predicate information) conversion that occurred during the query execution.

            2.3     dbms_xplan.display_cursor (null, null, ‘ALLSTATS LAST’)            

Now that we are quite confident with the use of dbms_xplan.display_cursor function instead of the approximate ‘explain plan for’ function, It is worth mentioning that the function dbms_xplan.display_cursor when invoked via specific parameters, can give extra valuable information that might be of a great help for explaining query performance problems and very often to show stale statistics. Let’s look to that through the following example:

mhouri.world> select /*+ gather_plan_statistics */
2  ename, hiredate, sal, deptno
3  from emp
4  where deptno = 20;

ENAME      HIREDATE         SAL     DEPTNO
---------- --------- ---------- ----------
allen      30-MAR-10        815         20
jones      02-APR-81       2975         20
scott      09-DEC-82       3000         20
adams      12-JAN-83       1100         20
ford       03-DEC-81       3000         20

mhouri.world> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  7xs5xf4bnkmgs, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ ename, hiredate, sal, deptno from emp where
deptno = 20
Plan hash value: 2872589290
----------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers| Reads  |
--------------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| EMP  |      1 |      5 |      5 |00:00:00.01 |    16 |    14 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPTNO"=20)

The new extra information together with their explanations are shown here below:

  1. Starts   : the number of time the operation has been started or executed
  2. E-Rows : the number of Estimated Rows for the current Operation
  3. A-Rows : the number of Actual Rows retrieved by the current Operation

It is well known that the Optimizer will do a good job only if E-Rows and A-Rows have close values. When, instead, these two cardinalities present two largely different values, then this is a clear indication of inefficient execution plan probably due to stale statistics on related tables and indexes.  However, be aware that, before jumping to conclusions, you need always to compare E-Rows with round (A-Rows/Starts).

 3 Grants and privileges

In order to be able to call the dbms_xplan.display_cursor() function you need to access to the following dynamic performance views

  • v$session
  • v$sql
  • v$sql_plan
  • v$sql_plan_statistics_all

And if you don’t supply the sql_id or the child_number to the dbms_xplan.display_cursor() function, it will reports the last executed SQL statement.

December 22, 2011

On tuning via explain plan

Filed under: explain plan — hourim @ 4:46 pm

Explain plan taken from memory, i.e. explain plan generated by the dbms_xplan.display_cursor () function, contains important information such as where time is most spent by the related query operations and, hence, on which operations the tuning effort should be concentrated.   As far as it is well known that a picture is worth a thousand words, let me then present you with a real life explain plan I have used to diagnose the cause of a query performing badly in production and which I have used also to tune this query using the valuable information delivered by this explain plan (I have edited a little bit this explain plan for clarity):


-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name                          | Starts | E-Rows | A-Rows A-Time       |
---------------------------------------------------------------------------------------------------------------------------------
|   1 |  SORT UNIQUE                                  |                               |      1 |      2 |   3494 |00:00:11.41 |
|   2 |   UNION-ALL                                   |                               |      1 |        |   3497 |00:00:11.38 |
|*  3 |    FILTER                                     |                               |      1 |        |   3482 |00:00:11.29 |
|*  4 |     FILTER                                    |                               |      1 |        |   4309 |00:00:11.27 |
|*  5 |      HASH JOIN OUTER                          |                               |      1 |      1 |   4309 |00:00:11.27 |
|   6 |       NESTED LOOPS OUTER                      |                               |      1 |      1 |   4309 |00:00:11.15 |
|   7 |        NESTED LOOPS                           |                               |      1 |      1 |   4309 |00:00:11.03 |
|   8 |         NESTED LOOPS OUTER                    |                               |      1 |      1 |   4309 |00:00:10.90 |
|   9 |          NESTED LOOPS OUTER                   |                               |      1 |      1 |   4309 |00:00:10.76 |
|  10 |           NESTED LOOPS                        |                               |      1 |      1 |   3482 |00:00:10.57 |
|  11 |            NESTED LOOPS OUTER                 |                               |      1 |      1 |   3482 |00:00:10.00 |
|  12 |             TABLE ACCESS BY GLOBAL INDEX ROWID| SXPS_P_ACCNT_AMPM_J_ICIPD     |      1 |      1 |   1534 |00:00:03.27 |
|* 13 |              INDEX RANGE SCAN                 | SXPS_JTPP_PTC_DAT_ACC_HR_NI   |      1 |      1 |   1534 |00:00:00.20 |
|  14 |             PARTITION RANGE ITERATOR          |                               |   1534 |      1 |   3222 |00:00:06.73 |
|* 15 |              TABLE ACCESS BY LOCAL INDEX ROWID| SXPS_SIGNALIS                 |   1534 |      1 |   3222 |00:00:06.71 |
|* 16 |               INDEX RANGE SCAN                | SXPS_SIG_NUI_FK_I             |   1534 |      3 |  59193 |00:00:05.58 |
|  17 |            TABLE ACCESS BY GLOBAL INDEX ROWID | SXPS_ACCNT_AMPM_J             |   3482 |      1 |   3482 |00:00:00.56 |
|* 18 |             INDEX UNIQUE SCAN                 | SXPS_NUI_PK                   |   3482 |      1 |   3482 |00:00:00.40 |
|  19 |           PARTITION RANGE ITERATOR            |                               |   3482 |      1 |   1437 |00:00:00.19 |
|  20 |            TABLE ACCESS BY LOCAL INDEX ROWID  | SXPS_ACCNT_COMMENTS           |   3482 |      1 |   1437 |00:00:00.17 |
|* 21 |             INDEX RANGE SCAN                  | SXPS_TRC_NUI_FK_I             |   3482 |      2 |   1437 |00:00:00.14 |
|* 22 |          TABLE ACCESS BY INDEX ROWID          | SXPS_PTREF                    |   4309 |      1 |   4006 |00:00:00.13 |
|* 23 |           INDEX RANGE SCAN                    | SXPS_PTF_PK                   |   4309 |      3 |  17910 |00:00:00.04 |
|* 24 |         TABLE ACCESS BY INDEX ROWID           | SXPS_ICIPD                    |   4309 |      1 |   4309 |00:00:00.12 |
|* 25 |          INDEX RANGE SCAN                     | SXPS_PTC_PK                   |   4309 |      3 |  30163 |00:00:00.04 |
|* 26 |        TABLE ACCESS BY INDEX ROWID            | SXPS_PTREF                    |   4309 |      1 |   4018 |00:00:00.11 |
|* 27 |         INDEX RANGE SCAN                      | SXPS_PTF_PK                   |   4309 |      3 |  18416 |00:00:00.03 |
|  28 |       VIEW                                    |                               |      1 |     28 |   1955 |00:00:00.08 |
|  29 |        HASH GROUP BY                          |                               |      1 |     28 |   1955 |00:00:00.08 |
|  30 |         TABLE ACCESS BY GLOBAL INDEX ROWID    | SXPS_ACCNT_COMMENTS           |      1 |     28 |   1955 |00:00:00.08 |
|* 31 |          INDEX RANGE SCAN                     | SXPS_TRC_COM_FR_NI            |      1 |     29 |   1955 |00:00:00.04 |
|  32 |     SORT AGGREGATE                            |                               |    232 |      1 |    232 |00:00:00.01 |
|  33 |      PARTITION RANGE SINGLE                   |                               |    232 |      1 |    528 |00:00:00.01 |
|* 34 |       TABLE ACCESS BY LOCAL INDEX ROWID       | SXPS_ACCNT_COMMENTS           |    232 |      1 |    528 |00:00:00.01 |
|* 35 |        INDEX RANGE SCAN                       | SXPS_TRC_NUI_FK_I             |    232 |      1 |    528 |00:00:00.01 |
|* 36 |    FILTER                                     |                               |      1 |        |     15 |00:00:00.08 |
|* 37 |     FILTER                                    |                               |      1 |        |     15 |00:00:00.08 |
|* 38 |      HASH JOIN OUTER                          |                               |      1 |      1 |     15 |00:00:00.08 |
|  39 |       NESTED LOOPS OUTER                      |                               |      1 |      1 |     15 |00:00:00.07 |
|  40 |        NESTED LOOPS                           |                               |      1 |      1 |     15 |00:00:00.07 |
|  41 |         NESTED LOOPS                          |                               |      1 |      1 |     15 |00:00:00.07 |
|  42 |          NESTED LOOPS OUTER                   |                               |      1 |      1 |     15 |00:00:00.07 |
|  43 |           NESTED LOOPS OUTER                  |                               |      1 |      1 |     15 |00:00:00.05 |
|  44 |            NESTED LOOPS OUTER                 |                               |      1 |      1 |     15 |00:00:00.03 |
|  45 |             TABLE ACCESS BY GLOBAL INDEX ROWID| SXPS_P_ACCNT_AMPM_J_ICIPD_SUP |      1 |      1 |     14 |00:00:00.03 |
|* 46 |              INDEX RANGE SCAN                 | SXPS_PTPS_PTC_HR_NI           |      1 |      1 |     14 |00:00:00.03 |
|  47 |             PARTITION RANGE ITERATOR          |                               |     14 |     33 |      8 |00:00:00.01 |
|* 48 |              TABLE ACCESS BY LOCAL INDEX ROWID| SXPS_SIGNALIS                 |     14 |     33 |      8 |00:00:00.01 |
|* 49 |               INDEX RANGE SCAN                | SXPS_SIG_NUI_FK_I             |     14 |      3 |    390 |00:00:00.01 |
|* 50 |            TABLE ACCESS BY INDEX ROWID        | SXPS_PTREF                    |     15 |      1 |      8 |00:00:00.02 |
|* 51 |             INDEX RANGE SCAN                  | SXPS_PTF_PK                   |     15 |      3 |     65 |00:00:00.01 |
|* 52 |           TABLE ACCESS BY INDEX ROWID         | SXPS_PTREF                    |     15 |      1 |      8 |00:00:00.02 |
|* 53 |            INDEX RANGE SCAN                   | SXPS_PTF_PK                   |     15 |      3 |     63 |00:00:00.01 |
|  54 |          TABLE ACCESS BY GLOBAL INDEX ROWID   | SXPS_ACCNT_AMPM_J             |     15 |      1 |     15 |00:00:00.01 |
|* 55 |           INDEX UNIQUE SCAN                   | SXPS_NUI_PK                   |     15 |      1 |     15 |00:00:00.01 |
|* 56 |         TABLE ACCESS BY INDEX ROWID           | SXPS_ICIPD                    |     15 |      1 |     15 |00:00:00.01 |
|* 57 |          INDEX RANGE SCAN                     | SXPS_PTC_PK                   |     15 |      3 |    105 |00:00:00.01 |
|  58 |        PARTITION RANGE ITERATOR               |                               |     15 |      1 |      0 |00:00:00.01 |
|  59 |         TABLE ACCESS BY LOCAL INDEX ROWID     | SXPS_ACCNT_COMMENTS           |     15 |      1 |      0 |00:00:00.01 |
|* 60 |          INDEX RANGE SCAN                     | SXPS_TRC_NUI_FK_I             |     15 |      2 |      0 |00:00:00.01 |
|  61 |       VIEW                                    |                               |      1 |     28 |   1955 |00:00:00.01 |
|  62 |        HASH GROUP BY                          |                               |      1 |     28 |   1955 |00:00:00.01 |
|  63 |         TABLE ACCESS BY GLOBAL INDEX ROWID    | SXPS_ACCNT_COMMENTS           |      1 |     28 |   1955 |00:00:00.01 |
|* 64 |          INDEX RANGE SCAN                     | SXPS_TRC_COM_FR_NI            |      1 |     29 |   1955 |00:00:00.01 |
|  65 |     SORT AGGREGATE                            |                               |      0 |      1 |      0 |00:00:00.01 |
|  66 |      PARTITION RANGE SINGLE                   |                               |      0 |      1 |      0 |00:00:00.01 |
|* 67 |       TABLE ACCESS BY LOCAL INDEX ROWID       | SXPS_ACCNT_COMMENTS           |      0 |      1 |      0 |00:00:00.01 |
|* 68 |        INDEX RANGE SCAN                       | SXPS_TRC_NUI_FK_I             |      0 |      1 |      0 |00:00:00.01 |
-------------------------------------------------------------------------------------------------------------------------------

The corresponding SQL query has not been shown intentionally because what I intend to show here is how to enhance a query execution time using exclusively its corresponding explain plan. The predicate part is left outside the picture for the moment.   I will get back to it later in this article. In addition, looking at the Estimations (E-Rows) done by the CBO and the real returned rows (A-Rows), shows clearly that statistics are not up-to-date and should be collected again. However, I am not going to concentrate my effort here on the stale statistics but either on how to point out operations that need to be tuned. It could be that stale statistics are the main problem here, but we will suppose that they aren’t and we will focus our attention on reading explain plan operations.

Having said that, from where are we going to start investigating this complex explain plan?  Well, there are several points in this explain plan to be emphasized:

  1. The total execution time of the query is 11,41 seconds and it returned 3494 rows

---------------------------------------------------------------------------------------------------
| Id  | Operation         | Name                          | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------
|   1 |  SORT UNIQUE      |                               |      1 |      2 |   3494 |00:00:11.41 |

2. There are two sets of operations that are mainly contributing to this 11,41 seconds:

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name             | Starts | E-Rows | A-Rows |   A-Time
-----------------------------------------------------------------------------------------------------------------
|  14 |     PARTITION RANGE ITERATOR           |                    |   1534 |   1      |   3222   |00:00:06.73 |
|* 15 |      TABLE ACCESS BY LOCAL INDEX ROWID | SXPS_SIGNALIS      |   1534 |   1      |   3222   |00:00:06.71 |
|* 16 |        INDEX RANGE SCAN                | SXPS_SIG_NUI_FK_I  |   1534 |   3      |  59193   |00:00:05.58 |

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                     | Starts | E-Rows | A-Rows |   A-Time
-----------------------------------------------------------------------------------------------------------------
|  12 | TABLE ACCESS BY GLOBAL INDEX ROWID| SXPS_P_ACCNT_AMPM_J_ICIPD   |   1     |  1      |   1534 |00:00:03.27|
|* 13 |   INDEX RANGE SCAN                | SXPS_JTPP_PTC_DAT_ACC_HR_NI |   1     |  1      |   1534 |00:00:00.20|

These two sets of operations consume 90% of the total response time which is 10 seconds (6,73+3,27). Let’s now think about how to tune them. Looking at operation 16(INDEX RANGE SCAN) we can notice that it generates 59193 records and feeds back it parent operation 16 (TABLE ACCESS BY GLOBAL INDEX ROWID|) with the corresponding rowid of these 59193 records. The anomaly here is that the parent operation 16 visits the table SXPS_SIGNALIS and discards 55971 records letting only 3222 records to be sent back to its parent operation 14. That is an enormous waste of time and energy which is very often an indication of a non precise index. How would we remedy to this anomaly? And here where the Predicate part of the explain plan comes into play. Each time you see an asterisk in front of an operation Id (|* 15 |) this means that this operation has a predicate part which has to be investigated:

15 - filter("R"."NUM_OR"="P"."NUM_OR")
16 - access("R"."ACCNT_ID"="P"."ACCNT_ID" AND "R"."DAT_ACC"="P"."DAT_ACC")

There is an access predicate on operation 16 (INDEX RANGE SCAN   SXPS_SIG_NUI_FK_I) with ACCNT_ID and DAT_ACC followed by a filter predicate on NUM_OR column done by operation 15 (TABLE ACCESS BY LOCAL INDEX ROWID). The index SXPS_SIG_NUI_FK_I is defined on (ACCNT_ID, DAT_ACC); why not create a precise index with these three critical columns (ACCNT_ID, DAT_ACC, NUM_OR)?

create index MHO_SXPS_REP on SXPS_SIGNALIS(DAT_ACC, ACCNT_ID, NUM_OR);

Now that the new index is in place, queering again gives the new explain plan (in which I am showing only interested operations)

--------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name             | Starts | E-Rows | A-Rows |   A-Time
--------------------------------------------------------------------------------------------------------
|  15 |    TABLE ACCESS BY GLOBAL INDEX ROWID | SXPS_SIGNALIS    |   1534 |     1  |   3222 |00:00:00.60
|* 16 |      INDEX RANGE SCAN                 | MHO_SXPS_REP     |   1534 |     1  |   3222 |00:00:00.17

With the corresponding predicate part reported here below:

16 - access("R"."DAT_ACC"="P"."DAT_ACC" AND "R"."ACCNT_ID"="P"."ACCNT_ID" AND "R"."NUM_OR"="P"."NUM_OR")

Observe the difference:  the infamous filter on SXPS_SIGNALIS table has gone! The index MHO_SXPS_REP is very precise because all the 3222 rowid passed to its parent operation have been kept and no time has been spent throwing out unnecessary records. The effect of this object change on the execution time is worth noticing: from 6,73 seconds to 0,6 seconds.

What, now, if the access to SXPS_SIGNALIS table is totally eliminated from the query operations?  To do so, we need to know the number of columns that are  selected from this table.  Observing the query reveals that, in addition to the three indexed columns, the query selects from SXPS_SIGNALIS table two supplementary columns, ACC_ID_ORG and ACC_ID_DEST. Let’s then re-create the above index with those two supplementary columns as shown below:

create index SXPS_SIG_NUM_OR_ptf_ni on SXPS_SIGNALIS(DAT_ACC,ACCNT_id, NUM_OR,ACC_ID_ORG, ACC_ID_DEST) compress 2;

I added a compress 2 clause to the creation of this index in order to make it small and to favor its placement on the data buffer cash generating less physical I/O. Compressing an index is a very interesting option that I will discuss about it in another occasion.  Re-executing the same query gives the following new situation:

------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                          | Starts | E-Rows | A-Rows |   A-Time
------------------------------------------------------------------------------------------------------
|* 14 |  INDEX RANGE SCAN    | SXPS_JTPP_PTC_DAT_ACC_HR_NI   |    1   |    1   |   1534 |00:00:00.26 |
|* 15 |  INDEX RANGE SCAN    | MHO_SXPS_REP                  |   1534 |    1   |   3222 |00:00:00.10 |

15 - access("R"."DAT_ACC"="P"."DAT_ACC" AND "R"."ACCNT_ID"="P"."ACCNT_ID" AND "R"."NUM_OR"="P"."NUM_OR")

Access to the SXPS_SIGNALIS table has gone away. Remains only the access to the new MHO_SXPS_REP index which is completed in 0.1 seconds

At this tuning point, let’s summarise where we stand. We succeed via the creation of a precise index to replace two operations by one operation and to reduce the execution time from 6,73 to 0.1 seconds. No let’s now go further and try to find other oddities in the original explain plan. Look at the following operation and spot the problem:

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Starts | E-Rows | A-Rows |   A-Time
-----------------------------------------------------------------------------------------------
|* 24 | TABLE ACCESS BY INDEX ROWID  | SXPS_ICIPD     |   4309 |      1 |   4309 |00:00:00.12 |
|* 25 |  INDEX RANGE SCAN            | SXPS_PTC_PK    |   4309 |      3 |   30163|00:00:00.04 |

24 - filter("S"."VALID_TO_DATE">=SYSDATE@!)
25 - access("S"."ICIPD_NO"="P"."ICIPD_NO" AND "S"."VALID_FROM_DATE"<=SYSDATE@!)

In close comparison to what has been said above, there is here also a waste of time and energy consumed by the parent operation throwing out 25854(30163-4309) records. Generally speaking, each time you see such a kind of situation, you should immediately think about a non precise index. The index used here is, as its name clearly indicates it, the primary key index defined as (ICIPD_NO , VALID_FROM_DATE ) while the query involves (ICIPD_NO , VALID_FROM_DATE, VALID_TO_DATE ).  It seems that there is no such a kind of three columns index. An investigation shows however a presence of the following index SXPS_PTC_DAT_NI(VALID_FROM_DATE, VALID_TO_DATE , ICIPD_NO ). So, why this precise index has not been used?  There is a fairly simple answer to this question: the existing index has not been used because its leading column is used in a range based predicate (“S”.”VALID_TO_DATE”>=SYSDATE@!)and not in the equality ((“S”.”ICIPD_NO”=”P”.”ICIPD_NO”). It is worth advising not to put a column, on which you are going to apply range predicates (>, <), at the leading edge of the index.

To demonstrate this let’s create a new index (SXPS_PTC_PTCNO_VDAT_NI) in which we will change the order of its columns to be (ICIPD_NO , VALID_FROM_DATE , VALID_TO_DATE) and re-execute the query

--------------------------------------------------------------------------------------------------
| Id  | Operation           | Name                     | Starts | E-Rows | A-Rows |   A-Time
--------------------------------------------------------------------------------------------------
|* 17 |  INDEX RANGE SCAN   | SXPS_PTC_PTCNO_VDAT_NI   | 4309   |    3   | 4309   |00:00:00.03   |

17 - access("S"."ICIPD_NO"="P"."ICIPD_NO" AND "S"."VALID_TO_DATE">=SYSDATE@! AND "S"."VALID_FROM_DATE"<=SYSDATE@!)
filter("S"."VALID_TO_DATE">=SYSDATE@!)

The access to the table has gone away and the new index became very precise because its leading column (ICIPD_NO) is the one used in the equality predicate.

The bottom line from this article is

  1. Know how to locate in explain plans operations that are worth considering in the tuning process
  2. Each time a big number of rows is discarded between an index operation and its parent table operation think about creating a precise index. The predicate part of the explain plan will help you finding the appropriate columns that are candidate to be in the precise index
  3. Think carefully when designing a composite index to start this index by columns figuring in an equality predicate and not in a range based predicate.

October 10, 2011

DBMS_XPLAN : Starts in action

Filed under: explain plan — hourim @ 12:50 pm

Very often, if not always, I am advocating the use of


select * from table(dbms_xplan.display_cursor(null,null, ‘ALLSTATS LAST’));

in order to  

  1.  Get the real explain plan and not the approximate onet
  2.  Judge about the Estimations the CBO is doing

And today I want to add

            3.  See if an operation has been started or not by the CBO

Here below a little demonstration of point 3.


mhouri > select empno, ename
  2  from emp
  3  where empno is null;
no rows selected

mhouri > select * from table(dbms_xplan.display_cursor);
-------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time    |    
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     1 (100)|         |       
|*  1 |  FILTER            |      |       |       |            |         |      
|   2 |   TABLE ACCESS FULL| EMP  |    15 |   150 |     5   (0)| 00:00:01|
--------------------------------------------------------------------------
Predicate Information (identified by operation(id):
---------------------------------------------------                    
   1 - filter(NULL IS NOT NULL)                                                                                         
19 rows selected.

It’s a pity to apply a FILTER to operation 2 (TABLE ACCESS FULL) when you know that this filter will discard all the rows because NULL is never NOT NULL. So why the CBO execute a TABLE ACCESS FULL on EMP when it knows that the filter will discard all returned rows.

But wait a moment. Are you sure that the CBO did execute operation 2? Let’s then demonstrate that:

mhouri > select /*+ gather_plan_statistics */ empno, ename
  2  from emp
  3  where empno is null;
no rows selected

mhouri > select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
-------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time  |
-------------------------------------------------------------------------
|*  1 |  FILTER            |      |      1 |        |      0 |00:00:00.01|
|   2 |   TABLE ACCESS FULL| EMP  |      0 |     15 |      0 |00:00:00.01|
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NULL IS NOT NULL)
19 rows selected.

Ah ha ha!!! When we look at the column Starts for operation 2, you will notice that it equals 0 which means that this operation has never been started by the CBO. This is confirmed also by the information A-Rows of the FILTER operation which equals 0 meaning that the filter has been started only one time and applied on 0 rows

 Bottom line here, is that using the ALLSTATS LAST property of the DBMS_XPLAN package can reveals valuable information one can use to understand the work done by the CBO

September 7, 2011

Deadlock – Part 3: Non Indexed Foreign key in action

Filed under: Deadlock — hourim @ 12:46 pm

A recent question on otn about deadlock came up last week. The deadlock graph was as such that I was wondering if I have already encountered it before.

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-001a0002-0002a0fe       196     197     X            166    1835           S
TM-0000c800-00000000       166    1835    SX            196     197    SX   SSX

Albeit it is an incomplete deadlock graph because it doesn’t present the type of sql session 197 and 1835 were executing before they deadlocked, we can point out that this deadlock involves two types of locks Transaction (TX) and DML (TM) locks. I have never seen yet both type of locks within the same deadlock graph.

Knowing that TM enqueue can almost always be due to a DML operation on a parent–child table where the foreign key is not indexed, I asked then the Original Poster (OP) to check using Tom Kyte script if his OLTP application contains non indexed foreign keys. But I was a little bit struggled when the OP said that the deadlock occurs on an insert statement. Thought that he was unable to feed us with the sql the other session was doing before the deadlock occurred. In a follow up to the same thread, Jonathan Lewis gave a possible scenario for this kind of deadlock.  I decided to implement this scenario and to analyze the generated trace file. I opened two sessions (session3 is just to select from v$lock) and followed the steps described below:

session1 > select distinct sid from v$mystat;
  SID                                                 
----------                                                 
  1074  

session1 > create table parent (x number primary key);
Table created.

session1 > create table child (y number references parent);
Table created.

session1 > insert into parent values (1);
1 row created.

session1 > insert into parent values (2);
1 row created.

session1 > insert into parent values (3);
1 row created.

session1 > insert into child values (1);
1 row created.

session1 > insert into child values (3);
1 row created.

session1 > commit;
Commit complete.

session1 > delete from child where y = 1;
1 row deleted.

session1 > delete from parent where x = 1;
1 row deleted.

 At this step there are still no locks acquired as it can be verified via a simple query against v$session and v$lock

  
session3 > start getlocks.sql
no rows selected

Now, open a new session (session2) and issue the following insert

session2 > select distinct sid from v$mystat;
SID                                                                     
-----                                                                     
333

session2 > insert into child values (1);

session2 (SID 333) starts hanging!!! From session3, we can see that a lock has been acquired

  
session3 > start getlocks.sql
SID       WSID LOCK_TYPE           MODE_HELD          MODE_REQUESTED                                                                                                
---------- ---------- ------------------- ------------- ---------------------                                                                    
1074      333 Transaction         Exclusive           Share        

Then, go back to session 1 (SID 1074) and issue the following delete

  
session1 > delete from parent where x = 2;

The last delete hangs at about 3 seconds before it successfully complete.  If you manage to check the new lock situation before the deadlock happens you will see the following picture:

  
session3 > start getlocks.sql
SID       WSID LOCK_TYPE           MODE_HELD              MODE_REQUESTED                                                                                               
---------- ---------- ------------------- ----------------------------------------
333       1074 DML                 Row-X (SX)             S/ROW-X (SSX)                                                                                                
1074       333 Transaction          Exclusive              Share                 

 And finally after about 3 seconds of wait, session2 with SID 333 has been deadlocked and its transaction stopped by Oracle

  
insert into child values (1)
            *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

 The corresponding generated deadlock graph looks like:

  
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-00011fe5-00000000        28      333    SX             27     1074    SX   SSX  --> the session timed out
TX-000a0013-0000039d        27     1074     X             28      333           S  --> the persistent session
 session 333: DID 0001-001C-00000006      session 1074: DID 0001-001B-00000030
session 1074: DID 0001-001B-00000030     session 333: DID 0001-001C-00000006
 Rows waited on:
  Session 333: no row
  Session 1074: no row
 ----- Information for the OTHER waiting sessions -----
Session 1074:
  sid: 1074 ser: 93 audsid: 540090 user: 84/MOHAMED flags: 0x8000045
  pid: 27 O/S info: user: SYSTEM, term: FSC401216102802, ospid: 3708
    image: ORACLE.EXE (SHAD)
  client details:
    O/S info: user: FSC401216102802\Mohamed, term: FSC401216102802, ospid: 2816:748
    machine: WORKGROUP\FSC401216102802 program: sqlplus.exe
    application name: SQL*Plus, hash value=3669949024
  current SQL:
  delete from parent where x = 2
 ----- End of information for the OTHER waiting sessions -----
 Information for THIS session:
 ----- Current SQL Statement for this session (sql_id=6cnrmyxw03k6f) -----
insert into child values (1)
===================================================

 Since then, I know that an unindexed foreign key can present a deadlock situation with a graph showing both TX and TM enqueue.

August 6, 2011

Special Merge

Filed under: Oracle — hourim @ 3:50 pm

An interesting question came up on the otn thread ( http://forums.oracle.com/forums/thread.jspa?messageID=9772504&#9772504) a couple of days ago. The original poster was asking on how to avoid superfluous updates when using the MERGE command. He said that he wants the update to start only if one of the columns to be matched has been changed; otherwise he would like the update to be ignored by the MERGE command when a matching key has been found.

I included my answer within the thread. How, in your case, would you proceed to accomplish the original poster requirement?

Next Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.