May | 2014 | Mohamed Houri’s Oracle Notes

May 12, 2014

Disjunctive subquery

Filed under: Trouble shooting — hourim @ 1:29 pm

Here is a SQL query, called from a .NET web service, which is timing out and for which I have been asked to provide a fix.

SELECT      d.col_id,
            d.col_no,
            d.col_dost_id,
            d.col_r_id,
            d.xxx,
            d.yyy,
            ……..
            d.zzz
      FROM table_mho d
      WHERE (d.COL_UK = 'LRBRE-12052014'
          OR EXISTS (select 1
                     from table_mho d1
                     where d1.col_id = d.col_id
                       and exists (select 1
                                   from table_mho d2
                                   where d2.COL_UK = 'LRBRE-12052014'
                                     and d1.master_col_id = d2.col_id
                                     and d2.col_type = 'M' )
                       and d1.col_type = 'S'
                       )
              )
    order by d.col_id;

Looking carefully at this query, I immediately had a clue about what might be happening here: a disjunctive subquery.

A disjunctive subquery is a subquery that appears in an OR predicate (a disjunction). The above query does indeed have an OR predicate followed by an EXISTS clause:

          OR EXISTS (select 1
                     from table_mho d1
                     where d1.col_id = d.col_id
                       and exists (select 1
                                   from table_mho d2
                                   where d2.COL_UK = 'LRBRE-12052014'
                                     and d1.master_col_id = d2.col_id
                                     and d2.col_type = 'M' )
                       and d1.col_type = 'S'
                       )

I am not going to dig into the details of disjunctive subqueries and the CBO's inability to unnest them in releases prior to 12c. I hope to write, in the near future, a general article in which disjunctive subqueries are explained and made accessible via a reproducible model. The goal of this brief blog post is just to show how I successfully troubleshot the above web service performance issue by transforming the disjunctive subquery into a UNION ALL SQL statement, giving the CBO an opportunity to choose an optimal plan.

Here is the sub-optimal plan for the original query:

---------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------
|   1 |  SORT ORDER BY                 |                  |      1 |  46854 |      1 |00:00:10.70 |
|*  2 |   FILTER                       |                  |      1 |        |      1 |00:00:10.70 |
|   3 |    TABLE ACCESS FULL           | TABLE_MHO        |      1 |    937K|    937K|00:00:00.94 |
|   4 |    NESTED LOOPS                |                  |    937K|      1 |      0 |00:00:07.26 |
|*  5 |     TABLE ACCESS BY INDEX ROWID| TABLE_MHO        |    937K|      1 |     60 |00:00:06.65 |
|*  6 |      INDEX UNIQUE SCAN         | COL_MHO_PK       |    937K|      1 |    937K|00:00:04.14 |
|*  7 |     TABLE ACCESS BY INDEX ROWID| TABLE_MHO        |     60 |      1 |      0 |00:00:00.01 |
|*  8 |      INDEX UNIQUE SCAN         | COL_MHO_UK       |     60 |      1 |     60 |00:00:00.01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("D"."COL_UK"='LRBRE-12052014' OR  IS NOT NULL))
   5 - filter(("D1"."MASTER_COL_ID" IS NOT NULL AND "D1"."COL_TYPE"='S'))
   6 - access("D1"."COL_ID"=:B1)
   7 - filter(("D2"."COL_TYPE"='M' AND "D1"."MASTER_COL_ID"="D2"."COL_ID"))
   8 - access("D2"."COL_UK"='LRBRE-12052014')

Statistics
---------------------------------------------------
          0  recursive calls
          0  db block gets
    3771234  consistent gets
      22748  physical reads
          0  redo size
       1168  bytes sent via SQL*Net to client
        244  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

Note the appearance of the FILTER operation (Id 2), which is the less efficient option here. One dramatic consequence is that the NESTED LOOPS operation (Id 4) was started 937K times, once per row returned by the full table scan, without producing any rows, yet it generated almost 4 million buffer gets. Because of this disjunctive subquery, Oracle is not able to unnest the subquery and merge it with the rest of the query in order to consider a better access path.
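The post does not say how the plan above was collected. One common way to obtain this kind of row-source output (Starts, E-Rows, A-Rows, A-Time) is the gather_plan_statistics hint followed by a call to dbms_xplan.display_cursor. The sketch below assumes a SQL*Plus session and uses a shortened column list purely for illustration:

```sql
-- Assumption: SQL*Plus session. SERVEROUTPUT is switched off so that
-- display_cursor(NULL, NULL, ...) reports on the query just executed
-- rather than on an internal dbms_output call.
SET SERVEROUTPUT OFF

-- Run the statement once with row-source statistics enabled.
-- The column list is shortened here; the predicates are those of the post.
SELECT /*+ gather_plan_statistics */
       d.col_id,
       d.col_no
  FROM table_mho d
 WHERE (d.col_uk = 'LRBRE-12052014'
        OR EXISTS (SELECT 1
                     FROM table_mho d1
                    WHERE d1.col_id = d.col_id
                      AND d1.col_type = 'S'
                      AND EXISTS (SELECT 1
                                    FROM table_mho d2
                                   WHERE d2.col_uk = 'LRBRE-12052014'
                                     AND d1.master_col_id = d2.col_id
                                     AND d2.col_type = 'M')))
 ORDER BY d.col_id;

-- Display the actual execution plan of the last statement of this session,
-- including estimated versus actual row counts per operation.
SELECT *
  FROM TABLE(dbms_xplan.display_cursor(NULL, NULL, 'ALLSTATS LAST'));
```

Comparing the Starts and A-Rows columns operation by operation is what reveals a subquery being executed once per driving row.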

There is a simple technique for rewriting the above query to get rid of the disjunctive subquery: use a UNION ALL, as I did for my original query (bear in mind that in my actual case the COL_UK column is NOT NULL):

SELECT ww.*
FROM
(SELECT     d.col_id,
            d.col_no,
            d.col_dost_id,
            d.col_r_id,
            d.xxx,
            d.yyy,
            ……..
            d.zzz
      FROM table_mho d
      WHERE d.COL_UK = 'LRBRE-12052014'
UNION ALL
SELECT      d.col_id,
            d.col_no,
            d.col_dost_id,
            d.col_r_id,
            d.xxx,
            d.yyy,
            ……..
            d.zzz
      FROM table_mho d
      WHERE d.COL_UK != 'LRBRE-12052014'
      AND EXISTS (select 1
                     from table_mho d1
                     where d1.col_id = d.col_id
                       and exists (select 1
                                   from table_mho d2
                                   where d2.COL_UK = 'LRBRE-12052014'
                                     and d1.master_col_id = d2.col_id
                                     and d2.col_type = 'M' )
                       and d1.col_type = 'S'
                       )
) ww
 order by ww.col_id;

And here is the corresponding new, optimal plan:

------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                  | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------------------------
|   1 |  SORT ORDER BY                     |                       |      1 |      2 |      1 |00:00:00.01 |
|   2 |   VIEW                             |                       |      1 |      2 |      1 |00:00:00.01 |
|   3 |    UNION-ALL                       |                       |      1 |        |      1 |00:00:00.01 |
|   4 |     TABLE ACCESS BY INDEX ROWID    | TABLE_MHO             |      1 |      1 |      1 |00:00:00.01 |
|*  5 |      INDEX UNIQUE SCAN             | COL_MHO_UK            |      1 |      1 |      1 |00:00:00.01 |
|   6 |     NESTED LOOPS                   |                       |      1 |      1 |      0 |00:00:00.01 |
|   7 |      VIEW                          | VW_SQ_1               |      1 |      1 |      0 |00:00:00.01 |
|   8 |       HASH UNIQUE                  |                       |      1 |      1 |      0 |00:00:00.01 |
|   9 |        NESTED LOOPS                |                       |      1 |      1 |      0 |00:00:00.01 |
|* 10 |         TABLE ACCESS BY INDEX ROWID| TABLE_MHO             |      1 |      1 |      0 |00:00:00.01 |
|* 11 |          INDEX UNIQUE SCAN         | COL_MHO_UK            |      1 |      1 |      1 |00:00:00.01 |
|* 12 |         TABLE ACCESS BY INDEX ROWID| TABLE_MHO             |      0 |      1 |      0 |00:00:00.01 |
|* 13 |          INDEX RANGE SCAN          | COL_COL_MHO_FK_I      |      0 |     62 |      0 |00:00:00.01 |
|* 14 |      TABLE ACCESS BY INDEX ROWID   | TABLE_MHO             |      0 |      1 |      0 |00:00:00.01 |
|* 15 |       INDEX UNIQUE SCAN            | COL_MHO_PK            |      0 |      1 |      0 |00:00:00.01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("D"."COL_UK"='LRBRE-12052014')
  10 - filter("D2"."COL_TYPE"='M')
  11 - access("D2"."COL_UK"='LRBRE-12052014')
  12 - filter("D1"."COL_TYPE"='S')
  13 - access("D1"."MASTER_COL_ID"="D2"."COL_ID")
       filter("D1"."MASTER_COL_ID" IS NOT NULL)
  14 - filter("D"."COL_UK"<>'LRBRE-12052014')
  15 - access("COL_ID"="D"."COL_ID")

Statistics
------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
       1168  bytes sent via SQL*Net to client
        403  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

I went from a massive 3,771,234 consistent gets to just 8 logical I/Os. The subquery has been transformed into an in-line view (VW_SQ_1), while the not-always-good FILTER operation has disappeared, giving way to fast operations accurately estimated by the CBO, without any statistics having been re-gathered in between.
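The rewrite relies on COL_UK being declared NOT NULL: with a nullable column, the != predicate of the second branch would silently drop rows where COL_UK is NULL, even though the original OR query could return them through the EXISTS branch. A minimal sketch of a variant for that nullable case, which is an assumption and not the situation of this post, uses the LNNVL function so that the second branch also keeps rows for which the equality is unknown (the column list is shortened for illustration):

```sql
-- Sketch for a hypothetical nullable COL_UK.
-- LNNVL(condition) is TRUE when the condition is FALSE or UNKNOWN,
-- so the two branches stay mutually exclusive without losing NULL rows.
SELECT ww.*
  FROM (SELECT d.col_id,
               d.col_no
          FROM table_mho d
         WHERE d.col_uk = 'LRBRE-12052014'
        UNION ALL
        SELECT d.col_id,
               d.col_no
          FROM table_mho d
         WHERE LNNVL(d.col_uk = 'LRBRE-12052014')
           AND EXISTS (SELECT 1
                         FROM table_mho d1
                        WHERE d1.col_id = d.col_id
                          AND d1.col_type = 'S'
                          AND EXISTS (SELECT 1
                                        FROM table_mho d2
                                       WHERE d2.col_uk = 'LRBRE-12052014'
                                         AND d1.master_col_id = d2.col_id
                                         AND d2.col_type = 'M'))
       ) ww
 ORDER BY ww.col_id;
```

Whether the optimizer picks an equally good plan for this variant would have to be checked against the actual data.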

Bottom Line: Troubleshooting a query performance problem can sometimes be achieved by reformulating the query so that you give the CBO a way around a transformation it cannot do with the original query.


May 7, 2014

ORA-04030 when parsing XML messages

Filed under: Trouble shooting — hourim @ 2:56 pm

Two days ago I was asked to look at the following error, which occurred in a PRE-PRODUCTION database just a few weeks before going live:

ORA-04030: out of process memory when trying to allocate 4032 bytes (qmxdGetElemsBy,qmemNextBuf:alloc)

The aim of this article is to explain how (with the help of my colleagues) I managed to work around this ORA-04030 error.

First, I was aware that ORA-04031 is linked to an issue with the SGA, while the current ORA-04030 is linked to abnormal consumption of PGA memory. Unfortunately, having no prior acquaintance with this application made the issue hard to solve. All I was told is that when the PRE-PRODUCTION application was asked to parse 10 XML messages of 3MB each, it crashed with the above error while parsing the 6th XML message.

Nevertheless, I asked the customer whether processing 10 x 3MB XML messages simultaneously is something that could really happen in their PRODUCTION application, or whether they were just trying to stress the database with extreme situations. The answer was, to my disappointment, YES: this kind of XML load is very plausible once the application goes live.

I then asked whether I could have those 10 XML messages in order to inject them into a DEVELOPMENT database. Once they handed me a copy of the messages, I tried to parse them in the DEVELOPMENT instance which, fortunately (yes, fortunately), crashed with the same error.

There is an ORA-04030 diagnostic tool developed by Oracle Support which, if you give it the corresponding alert log file, generates a set of recommendations for your system that, if implemented, might solve the issue.

This tool, when provided with the corresponding trace file, suggested, among many other things, that we:

Set the hidden parameter _realfree_heap_pagesize_hint
Use BULK COLLECT with the LIMIT clause (the tool gave us a link showing how to do that, based on Steven Feuerstein's article published in Oracle Magazine)
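For readers unfamiliar with the second suggestion, here is a minimal sketch of the BULK COLLECT ... LIMIT pattern. It is not the code of the application discussed here: it reads from the ALL_OBJECTS dictionary view only so that the block runs in any schema, and the batch size of 100 is an arbitrary assumption.

```sql
DECLARE
   -- Illustrative cursor over a dictionary view (not the application's data)
   CURSOR c_objects IS
      SELECT object_name
        FROM all_objects;

   TYPE t_names IS TABLE OF all_objects.object_name%TYPE;
   l_names  t_names;
   l_total  PLS_INTEGER := 0;
BEGIN
   OPEN c_objects;
   LOOP
      -- Fetch at most 100 rows per round trip so that the collection,
      -- which lives in PGA memory, never grows beyond one batch.
      FETCH c_objects BULK COLLECT INTO l_names LIMIT 100;
      EXIT WHEN l_names.COUNT = 0;

      FOR i IN 1 .. l_names.COUNT
      LOOP
         l_total := l_total + 1;  -- placeholder for the real per-row work
      END LOOP;
   END LOOP;
   CLOSE c_objects;

   dbms_output.put_line('Rows processed: ' || l_total);
END;
/
```

The exit test uses the collection count rather than %NOTFOUND, so the last, partially filled batch is still processed.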

Given my ignorance of the application's business logic and of exactly how the code parses the XML messages, I was looking for a tactical fix: one that would be safe, require a minimum of testing, and have a good chance of causing no side effects. Since I was in a development database, I instrumented the PL/SQL code (added extra logging) until I had clearly isolated the piece of code causing the error:


BEGIN
   -- New parser
   lx_parser := xmlparser.newParser;

   -- Parse XML file
   xmlparser.parseClob(lx_parser, pil_xml_file);

   -- Get the parsed DOM document
   lxd_docxml := xmlparser.getDocument(lx_parser);

   -- Get the root element of the document
   lxd_element := xmldom.getDocumentElement(lxd_docxml);

   -- Parse the XML elements of each top-level metadata entry (no father)
   FOR li_metadata_id IN 1..pit_metadata.COUNT
   LOOP
     IF pit_metadata(li_metadata_id).MXTA_FATHER_ID IS NULL
     THEN
       gi_num_row_table := 1;
       P_PARSE_XML_ELEMENTS(lxd_element
                           ,NULL
                           ,pit_metadata
                           ,li_metadata_id
                           ,pit_metadata(li_metadata_id).TAG_PATH
                            );
     END IF;
   END LOOP;
   …./….
 END;

The above piece of code is called in a loop and is exhausting the PGA memory.

With the help of a colleague, we decided on a tactical fix which consists of freeing the PGA memory each time an XML document has been processed, using the Oracle xmldom.freeDocument API. In other words, I added the following line at the end of the above stored procedure:


     END IF;

   END LOOP;

   …./….

xmldom.freedocument(lxd_docxml); -- added this

END;

And guess what?

The ORA-04030 error stopped bothering us.
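The fix described above frees the document on the normal path only. A slightly more defensive sketch, offered here as an assumption rather than as what was deployed, also releases the parser handle with xmlparser.freeParser and frees the document when an exception interrupts the processing. The variable l_xml and its sample content are hypothetical stand-ins for the real 3MB messages:

```sql
DECLARE
   lx_parser    xmlparser.Parser;
   lxd_docxml   xmldom.DOMDocument;
   lxd_element  xmldom.DOMElement;
   -- Hypothetical sample message, for illustration only
   l_xml        CLOB := '<root><item>1</item></root>';
BEGIN
   lx_parser := xmlparser.newParser;

   BEGIN
      xmlparser.parseClob(lx_parser, l_xml);
      lxd_docxml := xmlparser.getDocument(lx_parser);
   EXCEPTION
      WHEN OTHERS THEN
         -- Parsing failed: release the parser before propagating the error
         xmlparser.freeParser(lx_parser);
         RAISE;
   END;

   -- The parser is no longer needed once the document has been obtained
   xmlparser.freeParser(lx_parser);

   BEGIN
      -- Placeholder for the real DOM processing
      lxd_element := xmldom.getDocumentElement(lxd_docxml);
      dbms_output.put_line('Root tag: ' || xmldom.getTagName(lxd_element));
   EXCEPTION
      WHEN OTHERS THEN
         -- Processing failed: free the document before propagating the error
         xmldom.freeDocument(lxd_docxml);
         RAISE;
   END;

   -- Normal path: free the memory held by the document
   xmldom.freeDocument(lxd_docxml);
END;
/
```

Whether the parser handle contributed to the PGA growth in this particular case is not known; freeing it is simply the symmetrical counterpart of newParser.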

Bottom Line: When troubleshooting an issue in a real-life production application, try first to reproduce it in a TEST environment. Then instrument the PL/SQL code in this TEST environment until you have narrowed the error down to the code that triggers it. Then look for a tactical fix that will not need a lot of non-regression testing and that should have limited side effects.

