Mohamed Houri’s Oracle Notes

October 1, 2016

SQL Profile: or when the hint exceeds 500 bytes

Filed under: Oracle — hourim @ 5:13 pm

A couple of days ago I found myself in a weird situation following a 12c upgrade:

  • a critical query (46h7mfaac03yv) started to perform very poorly, impacting the overall response time of a critical report
  • there were no "good" execution plans in the query's execution history that could be used to fix a SQL Profile
  • the current execution plan contains more than a hundred operations, making the issue very hard to solve quickly
  • the Real-Time SQL Monitoring report shows several parts of the plan contributing to the degraded response time

One good habit I have in situations like the one exposed here is to check the historical execution plans of the same query in a TEST environment. I also sometimes back up the outlines of critical queries into a dedicated Windows directory, to be used if the need arises. Fortunately, one of the TEST environments contained an execution plan with a very good average elapsed time. Having this plan at my disposal, I used the following strategy to fix the good plan for the badly performing query:

  • I took the outline of the TEST execution plan and injected it into the sql_text of the PRODUCTION sql_id
  • I used the Real-Time SQL Monitoring report to fill in the corresponding bind variable values
  • I opened a SQL*Plus session in PRODUCTION and executed the new hinted query
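The first of these steps, extracting the outline of the good TEST plan, can be sketched with dbms_xplan; the sql_id below is a placeholder, and the OUTLINE format prints the full hint set between BEGIN_OUTLINE_DATA and END_OUTLINE_DATA:

```sql
-- A sketch, run in the TEST environment: dump the outline section of the
-- good plan so that it can be pasted, as a /*+ ... */ comment, into the
-- sql_text of the PRODUCTION statement
select * from table(
   dbms_xplan.display_cursor('&good_test_sql_id', 0, 'OUTLINE'));
```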

And, as expected, the hinted query, identified by the pair (sql_id: 3ts967mzugyxw, child number: 0), completed in a few seconds. All that remained to do before announcing the good news to the client was to use a custom SQL script to transfer the execution plan of the hinted query to the non-hinted production one. Something resembling this:

create table t1 as select rownum n1 from dual connect by level <=1e2;
select count(1) from t1 where n1 <= 5;

---------------------------------------
SQL_ID  addzft9frsckw, child number 0
-------------------------------------
select count(1) from t1 where n1 <= 5

Plan hash value: 3724264953
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|*  2 |   TABLE ACCESS FULL| T1   |     5 |
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("N1"<=5)


SELECT count(1) from t1 where n1 <= 5;

----------------------------------------
SQL_ID  2w9a295mxcjgx, child number 0
-------------------------------------
SELECT count(1) from t1 where n1 <= 5

Plan hash value: 3724264953
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|*  2 |   TABLE ACCESS FULL| T1   |     5 |
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("N1"<=5)   

Creating a SQL profile for the first sql_id using the metadata of the second one is accomplished by means of the following call:

SQL> @fixProfilefromCache
Enter value for sql_id_from: 2w9a295mxcjgx
Enter value for child_no_from: 0
Enter value for sql_id_to: addzft9frsckw
Enter value for sql_id_to: addzft9frsckw
Enter value for sql_id_to: addzft9frsckw

PL/SQL procedure successfully completed.
select count(1) from t1 where n1 <= 5;

----------------------------------------
SQL_ID  addzft9frsckw, child number 0
-------------------------------------
select count(1) from t1 where n1 <= 5

Plan hash value: 3724264953
-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|*  2 |   TABLE ACCESS FULL| T1   |     5 |
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("N1"<=5)

Note
-----
   - SQL profile profile_addzft9frsckw_attach used for this statement
 

Unfortunately, there are situations where your luck runs out. Look at the weird situation I faced when I applied the same script to the real-life query:

SQL> @fixProfilefromCache
Enter value for sql_id_from: 3ts967mzugyxw
Enter value for child_no_from: 0
Enter value for sql_id_to: 46h7mfaac03yv
Enter value for sql_id_to: 46h7mfaac03yv
Enter value for sql_id_to: 46h7mfaac03yv
declare
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: Bulk Bind: Truncated Bind
ORA-06512: at line 11

A few minutes of PL/SQL investigation revealed that the above error occurs around the following piece of code in the fixProfilefromCache.sql script:

declare
   ar_profile_hints sys.sqlprof_attr;
   cl_sql_text clob;
begin
   select
      extractvalue(value(d), '/hint') as outline_hints
         bulk collect into ar_profile_hints
   from
      xmltable('/*/outline_data/hint'
         passing (
            select
               xmltype(other_xml) as xmlval
            from
               gv$sql_plan
            where
               sql_id = '&sql_id_from'
               and child_number = &child_no_from
               and other_xml is not null
               and rownum = 1
         )
      ) d;

And, to be more precise, exactly at these lines:

   select
      extractvalue(value(d), '/hint') as outline_hints
        bulk collect into ar_profile_hints

So what is the definition of the sys.sqlprof_attr object type, which I have been using very often without ever being preoccupied by its underlying data type?

   desc SQLPROF_ATTR
        SQLPROF_ATTR VARRAY(2000) OF VARCHAR2(500)

It's a list of strings, each of which is not allowed to exceed 500 bytes. Does this mean that one of my real-life query's outline hints exceeded 500 bytes? Let's check:

  select
    substr(outline_hints,1,45) outline_hints
   ,outline_hints_length
from
 (
   select
      extractvalue(value(d), '/hint') as outline_hints  
     ,length(extractvalue(value(d), '/hint')) as outline_hints_length
   from
      xmltable('/*/outline_data/hint'
         passing (
            select
               xmltype(other_xml) as xmlval
            from
               gv$sql_plan
            where
               sql_id = '3ts967mzugyxw'
               and child_number = 0
               and other_xml is not null
               and rownum = 1
         )
      ) d
  order by outline_hints_length desc
  )
where rownum <= 1;

OUTLINE_HINTS                                 OUTLINE_HINTS_LENGTH
--------------------------------------------- --------------------
USE_CONCAT(@"SEL$C59E9DD6" 8 OR_PREDICATES(3)                  508

That's it. One of the outline hints exceeds the 500-byte upper limit imposed by the sys.sqlprof_attr type. Don't try to create your own version of the sqlprof_attr type allowing strings of more than 500 bytes: the import_sql_profile procedure of the dbms_sqltune package doesn't accept a parameter of a different data type:

 PROCEDURE import_sql_profile(
                   sql_text      IN CLOB,
                   profile       IN sqlprof_attr,
                   name          IN VARCHAR2 := NULL,
                   description   IN VARCHAR2 := NULL,
                   category      IN VARCHAR2 := NULL,
                   validate      IN BOOLEAN  := TRUE,
                   replace       IN BOOLEAN  := FALSE,
                   force_match   IN BOOLEAN  := FALSE);

Fortunately, Oracle seems to have foreseen this kind of situation and has overloaded the above procedure to accept the hints as a CLOB via the profile_xml parameter:

PROCEDURE import_sql_profile(
                   sql_text      IN CLOB,
                   profile_xml   IN CLOB,
                   name          IN VARCHAR2 := NULL,
                   description   IN VARCHAR2 := NULL,
                   category      IN VARCHAR2 := NULL,
                   validate      IN BOOLEAN  := TRUE,
                   replace       IN BOOLEAN  := FALSE,
                   force_match   IN BOOLEAN  := FALSE);
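A minimal sketch of how this overload can be used (the XML extraction and the names below are assumptions, not the exact content of my script): the profile_xml parameter accepts the <outline_data> document that can be pulled straight out of gv$sql_plan.other_xml.

```sql
declare
   cl_sql_text    clob;
   cl_profile_xml clob;
begin
   -- the whole <outline_data>...</outline_data> section as one CLOB:
   -- no 500-byte-per-hint limit applies here
   select xmltype(other_xml).extract('/*/outline_data').getclobval()
     into cl_profile_xml
   from  gv$sql_plan
   where sql_id = '&sql_id_from'
   and   child_number = &child_no_from
   and   other_xml is not null
   and   rownum = 1;

   -- full text of the statement to which the profile will be attached
   select sql_fulltext
     into cl_sql_text
   from  gv$sqlarea
   where sql_id = '&sql_id_to'
   and   rownum = 1;

   dbms_sqltune.import_sql_profile(
        sql_text    => cl_sql_text
       ,profile_xml => cl_profile_xml
       ,name        => 'profile_&sql_id_to'
       ,replace     => true);
end;
/
```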

Finally, I created a new script, which I named FixProfileFromXmlFromCache.sql and which I successfully used to transfer the good plan to the sql_id of the real-life query, as shown below:

SQL> @FixProfileFromXmlFromCache
Enter value for sql_id_from: 3ts967mzugyxw
Enter value for child_no_from: 0
Enter value for sql_id_to: 46h7mfaac03yv
Enter value for sql_id_to: 46h7mfaac03yv
Enter value for sql_id_to: 46h7mfaac03yv

PL/SQL procedure successfully completed.

SUMMARY

While it is extremely rare to need a SQL Profile with hints exceeding 500 bytes, it is good to know that there is a way to overcome this limit: the second, overloaded import_sql_profile procedure of the dbms_sqltune package, which accepts the outline hints as a CLOB instead of a varray of 500-byte strings.

September 28, 2016

AWR and superfluous historical statistics

Filed under: ASH — hourim @ 6:11 pm

An active sql_id is subject to capture and load into ASH (gv$active_session_history). As long as this sql_id is present in ASH, there is a 1-in-10 chance for it to be captured into the persisted AWR historical tables as well. In addition to ASH's gv$active_session_history and its AWR alter ego, the dba_hist_active_sess_history table, performance and tuning specialists make extensive use of the dba_hist_sqlstat table, into which snapshots of gv$sql are periodically sent by the MMON Lite process via the SGA-ASH circular buffer.

One of the queries I use against this table is the following, which I refer to as histstats.sql (I think I originally borrowed it from the internet):

SELECT 
sn.snap_id,
plan_hash_value,
st.sql_profile,
executions_delta execs,
TRUNC(elapsed_time_delta/1e6/DECODE(executions_delta, 0, 1, executions_delta)) avg_etime,
  ROUND(disk_reads_delta    /DECODE(executions_delta,0,1, executions_delta),1) avg_pio,
  ROUND(buffer_gets_delta   /DECODE(executions_delta,0,1, executions_delta), 1) avg_lio ,
  ROUND(rows_processed_delta/DECODE(executions_delta,0, 1, executions_delta), 1) avg_rows
FROM 
  dba_hist_sqlstat st,
  dba_hist_snapshot sn
WHERE st.snap_id        = sn.snap_id
AND sql_id              = '&sql_id'
AND begin_interval_time >= to_date('&date','ddmmyyyy')
ORDER BY 1 ASC;

That said, how would you read and interpret the following output of the above query, taken from a running system for a particular sql_id?

SNAP_ID     PLAN_HASH_VALUE SQL_PROFILE                     EXECS AVG_ETIME AVG_LIO 
----------  --------------- ------------------------------ ----- ---------- ----------      
30838        726354567                                      0       7227    3945460      
30838        726354567                                      0       7227    3945460      
30839       4211030025      prf_2yhzvghb06vh4_4211030025    1          3      28715      
30839       4211030025      prf_2yhzvghb06vh4_4211030025    1          3      28715      
30839        726354567                                      0       7140    5219336      
30839        726354567                                      0       7140    5219336 
30840        726354567                                      0       7203    9389840      
30840        726354567                                      0       7203    9389840      
30840       4211030025      prf_2yhzvghb06vh4_4211030025    0       2817    7831649      
30840       4211030025      prf_2yhzvghb06vh4_4211030025    0       2817    7831649      
30841        726354567                                      0       7192    5200201      
30841        726354567                                      0       7192    5200201      
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            
30842       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0      
30842       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0      
30842        726354567                                      0       4956    3183667      
30842        726354567                                      0       4956    3183667  

Or, to make things less confusing, how would you interpret the rows of snapshot 30841, reproduced below?

SNAP_ID     PLAN_HASH_VALUE SQL_PROFILE                     EXECS AVG_ETIME AVG_LIO 
----------  --------------- ------------------------------ ----- ---------- ----------      
30841        726354567                                      0       7192    5200201      
30841        726354567                                      0       7192    5200201      
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            
30841       4211030025      prf_2yhzvghb06vh4_4211030025    0          0          0            

What do those zero values in the lines with a non-null SQL Profile actually mean?

The answer resides in the way Oracle dumps the statistics of a SQL statement from memory into the AWR tables. When an active sql_id has multiple child cursors in gv$sql, the MMON Lite process will (if the sql_id qualifies for capture, of course) dump into dba_hist_sqlstat the statistics of all corresponding distinct child cursors, including those not used by the active sql_id at capture time (even the statistics of non-shareable cursors are dumped into this table, provided these cursors are present in gv$sql at the moment of capture).

For example, in the above list you can see four execution-statistics rows at snapshot 30841, of which the first two actually correspond to the plan that was really used, plan_hash_value 726354567. Judging by their elapsed time (7192 seconds), these executions spanned multiple snapshots (this is a serial execution plan, by the way). But the remaining lines, those showing a SQL Profile, are in fact superfluous and confusing. They were captured only because a second shareable child cursor (plan_hash_value 4211030025) was present in gv$sql at the moment of the capture (snapshot 30841). One indication that they are superfluous is the zero statistics in their average logical I/O, physical I/O, executions and number of rows processed.

Now that the scene has been set, we need a reproducible example. For that purpose, any SQL statement having multiple child cursors in gv$sql will do the trick. An easy way of engineering such a case, among other possible scenarios, is to use Adaptive Cursor Sharing (here in 12.1.0.2.0):

The model

create table t1 (n1 number, n2 number); -- t1 definition assumed; not shown in the original post

INSERT INTO t1
SELECT level n1 ,
  CASE
    WHEN level = 1
    THEN 1
    WHEN level > 1
    AND level <= 101
    THEN 100
    WHEN level > 101
    AND level <= 1101
    THEN 1000
    WHEN level > 10001
    AND level <= 11000
    THEN 10000
    ELSE 1000000
  END n2
FROM dual
  CONNECT BY level < 1200150;
  
CREATE INDEX t1_i1 ON t1(n2);

BEGIN
  dbms_stats.gather_table_stats 
         (user 
         ,'t1' 
         ,method_opt => 'for all columns size auto'
         ,cascade => true 
         ,estimate_percent => dbms_stats.auto_sample_size 
         );
END;
/ 

Generating multiple child cursors using ACS

var ln2 number;
exec :ln2 := 100;

SQL> select count(1) FROm t1 WHERE n2 <= :ln2;

  COUNT(1)
----------
       101

exec :ln2 := 1000000
SQL> select count(1) FROm t1 WHERE n2 <= :ln2;

  COUNT(1)
----------
   1200149

SQL_ID  9sp9wvczrvpty, child number 0
-------------------------------------
select count(1) FROm t1 WHERE n2 <= :ln2

Plan hash value: 2432955788

---------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes |
---------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |
|   1 |  SORT AGGREGATE   |       |     1 |     3 |
|*  2 |   INDEX RANGE SCAN| T1_I1 |   218 |   654 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"<=:LN2)

In order for the above sql_id to become bind aware and get a new optimal execution plan at each execution, it needs to be executed once again (more details can be found here):

SQL> select count(1) FROm t1 WHERE n2 <= :ln2;

  COUNT(1)
----------
   1200149

SQL_ID  9sp9wvczrvpty, child number 1
-------------------------------------
select count(1) FROm t1 WHERE n2 <= :ln2

Plan hash value: 3724264953
----------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |
----------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |
|*  2 |   TABLE ACCESS FULL| T1   |  1191K|  3489K|
----------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("N2"<=:LN2)

Notice that a new child cursor n°1 has been generated, while the existing child cursor n°0 has switched to a non-shareable status. In order to wake it up, we need to run the same query again with the corresponding bind variable value:

exec :ln2 := 100;

select count(1) FROm t1 WHERE n2 <= :ln2;

SQL_ID  9sp9wvczrvpty, child number 2
-------------------------------------
select count(1) FROm t1 WHERE n2 <= :ln2

Plan hash value: 2432955788
---------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes |
---------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |
|   1 |  SORT AGGREGATE   |       |     1 |     3 |
|*  2 |   INDEX RANGE SCAN| T1_I1 |   218 |   654 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"<=:LN2)

Finally, we have achieved the initial goal of having multiple execution plans in gv$sql for the above SQL statement:

SELECT 
  sql_id ,
  child_number ,
  first_load_time ,
  last_load_time ,
  plan_hash_value ,
  executions execs ,
  rows_processed ,
  trunc(buffer_gets/executions) lio ,
  is_shareable
FROM gv$sql
WHERE sql_id = '9sp9wvczrvpty' 
--and is_shareable = 'Y'    
order by last_load_time desc; 

SQL_ID        CHILD_NUMBER FIRST_LOAD_TIME     LAST_LOAD_TIME      PLAN_HASH_VALUE  EXECS ROWS_PROCESSED        LIO I
------------- ------------ ------------------- ------------------- --------------- ------ -------------- ---------- -
9sp9wvczrvpty            0 2016-09-28/07:19:21 2016-09-28/07:19:21      2432955788      2              2       1225 N
9sp9wvczrvpty            1 2016-09-28/07:19:21 2016-09-28/07:20:44      3724264953      1              1       2222 Y
9sp9wvczrvpty            2 2016-09-28/07:19:21 2016-09-28/07:22:03      2432955788      1              1          3 Y

Notice that the sql_id (9sp9wvczrvpty) has three child cursors, of which child cursor 0 is no longer shareable. Nothing has yet been sent to the AWR tables, as shown below:

SQL>  @histstats
Enter value for sql_id: 9sp9wvczrvpty
Enter value for date: 27092016

no rows selected

Forcing AWR to dump SQL statistics
In order to make Oracle dump the statistics of this SQL statement, we are going to force a manual AWR snapshot by means of the following package call:

SQL> exec dbms_workload_repository.create_snapshot;  

The call will create a snapshot and dump the above content of gv$sql into the dba_hist_sqlstat table, as shown below:

SQL>  @histstats
Enter value for sql_id: 9sp9wvczrvpty
Enter value for date: 27092016

   SNAP_ID SQL_ID        PLAN_HASH_VALUE SQL_PROFILE                     EXECS  AVG_ETIME    AVG_LIO  AVG_PIO  AVG_ROWS
---------- ------------- --------------- ------------------------------ ------ ---------- ---------- -------- ---------
      4305 9sp9wvczrvpty      2432955788                                     3          0        817        0         1
      4305 9sp9wvczrvpty      3724264953                                     1          0       2222        0         1

That is exactly what was expected: Oracle has captured into the dba_hist_sqlstat table 3 executions at plan_hash_value 2432955788 and 1 execution at plan_hash_value 3724264953.

Now that we have multiple child cursors present in gv$sql, if AWR is to dump the sql_id it will dump all of its child cursors, even those not used during the capture interval. For example, if I run the same query again with bind variable :ln2 = 1000000, Oracle will use the plan with plan_hash_value 3724264953; but it will, nevertheless, dump the statistics of the other, unused plan_hash_value into dba_hist_sqlstat as well.

exec :ln2 := 1000000

select count(1) FROm t1 WHERE n2 <= :ln2;

exec dbms_workload_repository.create_snapshot;

SQL> @histstats
Enter value for sql_id: 9sp9wvczrvpty
Enter value for date: 27092016

SNAP_ID SQL_ID        PLAN_HASH_VALUE SQL_PROFILE    EXECS  AVG_ETIME    AVG_LIO  AVG_PIO  AVG_ROWS
----- ------------- --------------- ------------- ------ ---------- ---------- -------- ---------
 4305 9sp9wvczrvpty      2432955788                    3          0        817        0         1
 4305 9sp9wvczrvpty      3724264953                    1          0       2222        0         1
 
 4306 9sp9wvczrvpty      2432955788                    0          0          0        0         0
 4306 9sp9wvczrvpty      3724264953                    1          0       2222        0         1

That's it. Notice how, as expected, Oracle has dumped two rows for the new snapshot 4306, of which the first one, with plan_hash_value 2432955788, is superfluous: it was captured in the AWR tables only because it was present in gv$sql at the snapshot capture time.

Now that we know that Oracle dumps all plan_hash_values present in gv$sql at snapshot capture time, even those not used during the snapshot interval, I added a minor where clause to the above extensively used script so that the superfluous lines no longer appear in the historical execution statistics:

SELECT
 *
FROM
 (SELECT
   sn.snap_id
   ,plan_hash_value
   ,sql_profile
   ,executions_delta execs
   ,trunc(elapsed_time_delta/1e6/decode(executions_delta, 0, 1, executions_delta)) avg_etime
   ,round(disk_reads_delta/decode(executions_delta,0,1, executions_delta),1) avg_pio
   ,round(buffer_gets_delta/decode(executions_delta,0,1, executions_delta), 1) avg_lio
   ,round(px_servers_execs_delta/decode(executions_delta,0,1, executions_delta), 1) avg_px
   ,round(rows_processed_delta/decode(executions_delta,0, 1, executions_delta), 1) avg_rows
  FROM
     dba_hist_sqlstat st,
     dba_hist_snapshot sn
  WHERE st.snap_id = sn.snap_id
  AND sql_id = '&sql_id'
  AND begin_interval_time > to_date('&from_date','ddmmyyyy')
 )
 WHERE avg_lio != 0 -- added clause
 ORDER by 1 asc;

SQL> @histstats2
Enter value for sql_id: 9sp9wvczrvpty
Enter value for from_date: 27092016

SNAP_ID PLAN_HASH_VALUE SQL_PROFILE  EXECS  AVG_ETIME  AVG_PIO    AVG_LIO     AVG_PX  AVG_ROWS
------- --------------- ----------- ------ ---------- -------- ---------- ---------- ---------
   4305      2432955788                  3          0        0      817.7          0         1
   4305      3724264953                  1          0        0       2222          0         1

   4306      3724264953                  1          0        0       2222          0         1

The above added where clause excludes SQL statements with 0 logical I/O.

But wait a moment!!!

This might, at the same time, exclude vital information from the historical statistics. For example, the following situation will not consume any logical I/O, but it can spend a lot of time locked, waiting for a resource to be released:

SQL-1 > select * from t1 where rownum <=  3 for update;

        N1         N2
---------- ----------
      1825    1000000
      1826    1000000
      1827    1000000

SQL-2> lock table t1 in exclusive mode;  

The second session will obviously hang. Wait a couple of seconds, go back to session n°1, issue a rollback, and watch the new situation of session n°2:

SQL-1> rollback;

SQL-2> lock table t1 in exclusive mode;

Table(s) Locked.

All that remains is to take a manual snapshot and check how many logical I/Os were consumed by the SQL statement that tried to lock the table in exclusive mode:

SQL> exec dbms_workload_repository.create_snapshot;

select sql_id
from gv$sql where sql_text like '%lock table t1%'
and sql_text not like '%v$sql%';

SQL_ID
-------------
a9nb52un8wqf4

SQL> @histstats
Enter value for sql_id: a9nb52un8wqf4
Enter value for date: 27092016

SNAP_ID SQL_ID        PLAN_HASH_VALUE SQL_PROFILE  EXECS  AVG_ETIME    AVG_LIO  AVG_PIO  AVG_ROWS
------- ------------- --------------- ----------- ------ ---------- ---------- -------- ---------
   4308 a9nb52un8wqf4               0                  1        159          0        0         0

That's how a SQL statement can take a long time to complete while consuming 0 logical I/O. This is why I added a second condition to the where clause of the histstats2 script, as shown below:

SELECT 
*
FROM
  (SELECT
    sn.snap_id ,
    plan_hash_value ,
    executions_delta execs ,
TRUNC(elapsed_time_delta/1e6/DECODE(executions_delta, 0, 1, executions_delta)) avg_etime ,
    ROUND(disk_reads_delta      /DECODE(executions_delta,0,1, executions_delta),1) avg_pio ,
    ROUND(buffer_gets_delta     /DECODE(executions_delta,0,1, executions_delta), 1) avg_lio ,
    ROUND(px_servers_execs_delta/DECODE(executions_delta,0,1, executions_delta), 1) avg_px ,
    ROUND(rows_processed_delta  /DECODE(executions_delta,0, 1, executions_delta), 1) avg_rows
  FROM 
    dba_hist_sqlstat st,
    dba_hist_snapshot sn
  WHERE st.snap_id        = sn.snap_id
  AND sql_id              = '&sql_id'
  AND begin_interval_time > to_date('&from_date','ddmmyyyy')
  )
WHERE avg_lio != 0 
     OR (avg_lio   =0 AND avg_etime > 0) 
 ORDER by 1 asc;  
     

This new script (HistStats3.sql), when applied to the two sql_ids we have investigated in this post, gives respectively:

SQL> @HistStats3
Enter value for sql_id: 9sp9wvczrvpty
Enter value for from_date: 27092016

   SNAP_ID PLAN_HASH_VALUE  EXECS  AVG_ETIME  AVG_PIO    AVG_LIO     AVG_PX  AVG_ROWS
---------- --------------- ------ ---------- -------- ---------- ---------- ---------
      4305      2432955788      3          0        0      817.7          0         1
      4305      3724264953      1          0        0       2222          0         1
      4306      3724264953      1          0        0       2222          0         1

SQL> @HistStats3
Enter value for sql_id: a9nb52un8wqf4
Enter value for from_date: 27092016

   SNAP_ID PLAN_HASH_VALUE  EXECS  AVG_ETIME  AVG_PIO    AVG_LIO     AVG_PX  AVG_ROWS
---------- --------------- ------ ---------- -------- ---------- ---------- ---------
      4308               0      1        159        0          0          0         0

It goes without saying that I would be very grateful for feedback about the added where clause. I might have neglected a situation in which both logical I/O and elapsed time equal 0 but which is nevertheless important to keep in the historical statistics.

Last but not least, you might have already noticed the appearance of the AVG_PX column in the above output. That concerns the next article, about ASH/AWR statistics for parallel queries. Stay tuned: I have a nice example from a running system to share with you, in which this column proves to be very helpful. I may also show a version of the script that includes the end_of_fetch_count column, which also turns out to be valuable information when trying to understand the average elapsed time of queries coming from a web service in which a timeout has been implemented.

September 27, 2016

TEMP Table transformation and PQ_SLAVE_mismatch

Filed under: Oracle — hourim @ 7:20 pm

A SQL-Profiled query opted for a different execution plan despite the Note at the bottom of that plan indicating that a SQL Profile had been used. The new plan made the query fail with the classical parallel query error, due to a lack of TEMP space following a massive parallel broadcast distribution:

ORA-12801: error signaled in parallel query server P013
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

Initially, this query failed when it reached 32GB of TEMP space. When woken up, the on-call DBA enlarged the TEMP space but, unfortunately, the next run failed as well, this time after reaching 137GB of TEMP space.

When it was my turn to investigate this issue, instead of continuing to enlarge the TEMP tablespace or changing the parallel distribution method, I decided to figure out why Oracle was refusing to use the SQL Profile and was compiling a new plan at practically every execution. Applying Tanel Poder's nonshared script to the sql_id of the real-world query gives this:

SQL> @nonshared 1b7g55gx40k79
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)...

INST_ID			              : 1
SQL_ID                        : gx28sa7z20btn
ADDRESS                       : 00007FF92115E830
CHILD_ADDRESS                 : 00007FF92115E6B0
CHILD_NUMBER                  : 1
REASON                        : <reason>PQ Slave mismatch(5)</reason>

-----------------
INST_ID			              : 1
SQL_ID                        : gx28sa7z20btn
ADDRESS                       : 00007FF92115E830
CHILD_ADDRESS                 : 00007FF92115E6B0
CHILD_NUMBER                  : 2
REASON                        : <reason>PQ Slave mismatch(5)</reason>
PQ_SLAVE_MISMATCH             :  Y

-----------------
INST_ID			              : 1
SQL_ID                        : gx28sa7z20btn
ADDRESS                       : 00007FF92115E830
CHILD_ADDRESS                 : 00007FF92115E6B0
CHILD_NUMBER                  : 3
REASON                        : <reason>PQ Slave mismatch(5)</reason>
PQ_SLAVE_MISMATCH             :  Y

-----------------
INST_ID			              : 1
SQL_ID                        : gx28sa7z20btn
ADDRESS                       : 00007FF92115E830
CHILD_ADDRESS                 : 00007FF92115E6B0
CHILD_NUMBER                  : 4
REASON                        : <reason>PQ Slave mismatch(5)</reason>
PQ_SLAVE_MISMATCH             :  Y
-----------------
INST_ID			              : 1
SQL_ID                        : gx28sa7z20btn
ADDRESS                       : 00007FF92115E830
CHILD_ADDRESS                 : 00007FF92115E6B0
CHILD_NUMBER                  : 5
REASON                        : <reason>PQ Slave mismatch(5)</reason>
PQ_SLAVE_MISMATCH             :  Y
-----------------

What does this PQ_SLAVE_MISMATCH non-sharing reason mean? Oracle defines it as:

(Y|N) Top-level slave decides not to share cursor

This very short definition seems to indicate that a parallel slave refused to share the execution plan of its Query Coordinator and decided to hard parse its own plan, even though both the QC and the PX slaves were running in the same instance of an 11.2.0.4 release.
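For reference, the same non-sharing reason can also be checked directly against gv$sql_shared_cursor without Tanel's script (a sketch; the REASON column is available from 11.2 onwards):

```sql
-- Each Y flag in gv$sql_shared_cursor explains why a child cursor
-- could not be reused; REASON gives the same information as XML
select inst_id
      ,child_number
      ,pq_slave_mismatch
      ,reason
from   gv$sql_shared_cursor
where  sql_id = 'gx28sa7z20btn';
```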

I spent a couple of minutes simplifying the query until I found the part causing the execution plan mismatch. Having got a clue about what was happening in this real-life query, I engineered the following model, with which I have reproduced the same behaviour in 11.2.0.4 and 12.1.0.1.0. Look at the following setup (in which you will recognize one of Jonathan Lewis's table scripts):

select banner from gv$version where rownum=1;

BANNER
-----------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production

create table t1
as
 with generator as (
      select  --+ materialize
	     rownum id
	  from dual
	  connect by level <=1000
)
select 
    trunc((rownum -1)/2) product_id
   ,mod(rownum-1,2)      flag1
   ,mod(rownum-1,3)      flag2
   ,rownum               n1
   ,lpad(rownum,30)      v1
from
    generator v1
   ,generator v2
 where rownum <= 1e6;
 
create table t2
as
select
     level id
    ,date '2012-06-07' + mod((level-1)*5,10) + interval '5' minute start_date
    ,rpad('xx',10) padding
from
   dual
connect by level <= 1e6;
 
begin
 dbms_stats.gather_table_stats(user, 't1', method_opt => 'for all columns size 1');
 dbms_stats.gather_table_stats(user, 't2', method_opt => 'for all columns size 1');
 end;
 /

And here is the query I will use to reproduce the non-sharing reason:

with got_my_dates 
 as (select /*+ materialize */
          id
		 ,padding
		 ,start_date
	 from 
	      t2
	 where 
	   start_date   > to_date('07/06/2012','dd/mm/yyyy')
	 and start_date <= to_date('10/06/2012','dd/mm/yyyy')
	 )
select
     /*+ parallel(4) */
     t1.*
	,cte.padding
	,cte.start_date
from
     t1
	,got_my_dates cte
where
    t1.product_id = cte.id
and t1.product_id <= 1e3;

A simple remark before starting the experiment: in the real-life query the "with" subquery was automatically materialized by Oracle because it is referenced twice in the main query. This is why the materialize hint I used above might not be absolutely necessary for the behaviour I will be explaining hereinafter to happen.
Let’s also confirm, before starting the experiment, that this query has not been previously parsed and as such is completely unknown:

SQL> select sql_id
    from gv$sql
    where sql_text like '%got_my_dates%'
    and sql_text not like '%v$sql%';

no rows selected

In the following I will execute the above query, get its execution plan, and show how many child cursors it used during this very first execution:

SQL> -- run query

SQL_ID  1b7g55gx40k79, child number 0
-------------------------------------

Plan hash value: 2708956082
-------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Rows  | Bytes |    TQ  |IN-OUT|
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |       |       |        |      |
|   1 |  TEMP TABLE TRANSFORMATION |                            |       |       |        |      |
|   2 |   PX COORDINATOR           |                            |       |       |        |      |
|   3 |    PX SEND QC (RANDOM)     | :TQ10000                   |   994K|    22M|  Q1,00 | P->S |
|   4 |     LOAD AS SELECT         |                            |       |       |  Q1,00 | PCWP |
|   5 |      PX BLOCK ITERATOR     |                            |   994K|    22M|  Q1,00 | PCWC |
|*  6 |       TABLE ACCESS FULL    | T2                         |   994K|    22M|  Q1,00 | PCWP |
|   7 |   PX COORDINATOR           |                            |       |       |        |      |
|   8 |    PX SEND QC (RANDOM)     | :TQ20001                   |  1830 |   134K|  Q2,01 | P->S |
|*  9 |     HASH JOIN              |                            |  1830 |   134K|  Q2,01 | PCWP |
|  10 |      JOIN FILTER CREATE    | :BF0000                    |  1830 | 84180 |  Q2,01 | PCWP |
|  11 |       PX RECEIVE           |                            |  1830 | 84180 |  Q2,01 | PCWP |
|  12 |        PX SEND BROADCAST   | :TQ20000                   |  1830 | 84180 |  Q2,00 | P->P |
|  13 |         PX BLOCK ITERATOR  |                            |  1830 | 84180 |  Q2,00 | PCWC |
|* 14 |          TABLE ACCESS FULL | T1                         |  1830 | 84180 |  Q2,00 | PCWP |
|* 15 |      VIEW                  |                            |   994K|    27M|  Q2,01 | PCWP |
|  16 |       JOIN FILTER USE      | :BF0000                    |   994K|    22M|  Q2,01 | PCWP |
|  17 |        PX BLOCK ITERATOR   |                            |   994K|    22M|  Q2,01 | PCWC |
|* 18 |         TABLE ACCESS FULL  | SYS_TEMP_0FD9D6633_140F243 |   994K|    22M|  Q2,01 | PCWP |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter(("START_DATE">TO_DATE(' 2012-06-07 00:00:00', 'syyyy-mm-dd hh24:mi:ss') 
              AND "START_DATE"<=TO_DATE(' 2012-06-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
   9 - access("T1"."PRODUCT_ID"="CTE"."ID")
  14 - access(:Z>=:Z AND :Z<=:Z)
       filter("T1"."PRODUCT_ID"<=1000)
  15 - filter("CTE"."ID"<=1000)
  18 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"C0"))
 
Note
-----
- Degree of Parallelism is 4 because of hint


SQL> @gv$sql
Enter value for sql_id: 1b7g55gx40k79

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER LAST_LOAD_TIME      EXECUTIONS
------------- --------------- ------------ ------------------- ----------
1b7g55gx40k79      2708956082            0 2016-09-26/07:30:23          1
1b7g55gx40k79      2708956082            1 2016-09-26/07:30:24          0

Notice how a completely new query produces two child cursors during its very first execution. Interestingly, the number of executions seems to indicate that Oracle used child cursor 0, while the executions count of child cursor n°1 has not been incremented.
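For reference, the @gv$sql script used above can be approximated with something like the following (a sketch, not the exact script):

```sql
-- Per-child execution statistics for a given sql_id
select sql_id
      ,plan_hash_value
      ,child_number
      ,last_load_time
      ,executions
from   gv$sql
where  sql_id = '1b7g55gx40k79'
order  by inst_id, child_number;
```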

The reason for these two child cursors is:

SQL> @nonshared 1b7g55gx40k79
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)...

SQL_ID                        : 1b7g55gx40k79
ADDRESS                       : 00007FF92115E830
CHILD_ADDRESS                 : 00007FF92115E6B0
CHILD_NUMBER                  : 0
REASON                        : <reason>PQ Slave mismatch(5)</reason>
CON_ID                        : 1

-----------------

SQL_ID                        : 1b7g55gx40k79
ADDRESS                       : 00007FF92115E830
CHILD_ADDRESS                 : 00007FF9214A40E8
CHILD_NUMBER                  : 1
PQ_SLAVE_MISMATCH             : Y
REASON                        : <reason>PQ Slave mismatch(5)</reason>
CON_ID                        : 1
-----------------

So here we are: exactly the same situation as with the real-world query.

Another remark worth mentioning here is that the execution plan of child cursor n°1 is identical to that of child cursor n°0 shown above, except for this bizarre Note at the bottom:

SQL_ID  1b7g55gx40k79, child number 1
-------------------------------------
with got_my_dates  as (select /*+ materialize */           id    
,padding    ,start_date   from        t2   where     start_date   > 
to_date('07/06/2012','dd/mm/yyyy')   and start_date <= 
to_date('10/06/2012','dd/mm/yyyy')   ) select      /*+ parallel(4) */   
   t1.*  ,cte.padding  ,cte.start_date from      t1  ,got_my_dates cte 
where     t1.product_id = cte.id and t1.product_id <= 1e3
 
Note
-----
- automatic DOP: Computed Degree of Parallelism is 4

I don't clearly see where this automatic DOP is coming from, since the auto DOP parameter is set neither on my personal laptop nor in the real-world application:

SQL> show parameter parallel_degree_policy

PARAMETER_NAME                  TYPE        VALUE
------------------------------- ----------- -------
parallel_degree_policy          string      MANUAL
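One way to put the Note sections of the two children side by side is DBMS_XPLAN.DISPLAY_CURSOR, called once per child (a sketch; the format options can be adjusted at will):

```sql
-- Show the plan plus the Note section for both children of the cursor
select * from table(dbms_xplan.display_cursor('1b7g55gx40k79', 0, 'BASIC +NOTE'));
select * from table(dbms_xplan.display_cursor('1b7g55gx40k79', 1, 'BASIC +NOTE'));
```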

Last but not least, while the query was always requesting 2*DOP (8) parallel servers, Oracle systematically managed to give it 12 (and sometimes 16) parallel servers:

SQL> select
         sql_id
         ,process_name px_slave
     from gv$sql_monitor
     where sql_id = '1b7g55gx40k79'
     and sql_exec_id =16777216
     and  trunc(sql_exec_start) =to_date('27092016','ddmmyyyy')
     and sql_text is null
     order by 2 ;

SQL_ID        PX_SL
------------- -----
1b7g55gx40k79 p000
1b7g55gx40k79 p000
1b7g55gx40k79 p001
1b7g55gx40k79 p001
1b7g55gx40k79 p002
1b7g55gx40k79 p002
1b7g55gx40k79 p003
1b7g55gx40k79 p003
1b7g55gx40k79 p004
1b7g55gx40k79 p005
1b7g55gx40k79 p006
1b7g55gx40k79 p007

12 rows selected.


Global Information
------------------------------
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  1                   
 Session             :  C##MHOURI (7:4495)  
 SQL ID              :  1b7g55gx40k79       
 SQL Execution ID    :  16777216            
 Execution Started   :  09/27/2016 07:29:34 
 First Refresh Time  :  09/27/2016 07:29:34 
 Last Refresh Time   :  09/27/2016 07:29:45 
 Duration            :  11s                 
 Module/Action       :  SQL*Plus/-          
 Service             :  orcl                
 Program             :  sqlplus.exe         
 Fetch Calls         :  3                

Parallel Execution Details (DOP=4 , Servers Allocated=12)

A number of Servers Allocated greater than the number of requested servers is not an innocent piece of information and should always catch your attention.
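The allocated parallel servers of a running statement can also be counted directly from GV$PX_SESSION, grouping by the query coordinator session (a sketch; run it while the query is executing):

```sql
-- Count the PX servers attached to each query coordinator
select qcsid
      ,qcinst_id
      ,count(*) allocated_px_servers
from   gv$px_session
where  sid <> qcsid            -- exclude the coordinator itself
group  by qcsid, qcinst_id;
```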

That said, as you might have already guessed from the title of this post, the simplification of the real-world query shows that the parallel slave refuses to share the QC execution plan because of the materialisation of the Common Table Expression. This is why, when I pre-empted the materialisation of the CTE via the /*+ inline */ hint, the parallel slave shared the execution plan of its QC as shown below:

with got_my_dates
 as (select /*+ inline */
          id
            ,padding
            ,start_date
    from
         t2
    where
      start_date   > to_date('07/06/2012','dd/mm/yyyy')
    and start_date <= to_date('10/06/2012','dd/mm/yyyy')
    )
select
     /*+ parallel(4) */
     t1.*
   ,cte.padding
   ,cte.start_date
from
     t1
   ,got_my_dates cte
where
    t1.product_id = cte.id
and t1.product_id <= 1e3;

1000 rows selected.

SQL> start xpsimp

SQL_ID  4h1qa708b9p3j, child number 0
-------------------------------------
Plan hash value: 2637578939
--------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes |    TQ  |IN-OUT|
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |        |      |
|   1 |  PX COORDINATOR             |          |       |       |        |      |
|   2 |   PX SEND QC (RANDOM)       | :TQ10002 |   681 | 47670 |  Q1,02 | P->S |
|*  3 |    HASH JOIN BUFFERED       |          |   681 | 47670 |  Q1,02 | PCWP |
|   4 |     JOIN FILTER CREATE      | :BF0000  |   681 | 16344 |  Q1,02 | PCWP |
|   5 |      PX RECEIVE             |          |   681 | 16344 |  Q1,02 | PCWP |
|   6 |       PX SEND HYBRID HASH   | :TQ10000 |   681 | 16344 |  Q1,00 | P->P |
|   7 |        STATISTICS COLLECTOR |          |       |       |  Q1,00 | PCWC |
|   8 |         PX BLOCK ITERATOR   |          |   681 | 16344 |  Q1,00 | PCWC |
|*  9 |          TABLE ACCESS FULL  | T2       |   681 | 16344 |  Q1,00 | PCWP |
|  10 |     PX RECEIVE              |          |  1830 | 84180 |  Q1,02 | PCWP |
|  11 |      PX SEND HYBRID HASH    | :TQ10001 |  1830 | 84180 |  Q1,01 | P->P |
|  12 |       JOIN FILTER USE       | :BF0000  |  1830 | 84180 |  Q1,01 | PCWP |
|  13 |        PX BLOCK ITERATOR    |          |  1830 | 84180 |  Q1,01 | PCWC |
|* 14 |         TABLE ACCESS FULL   | T1       |  1830 | 84180 |  Q1,01 | PCWP |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."PRODUCT_ID"="ID")
   9 - access(:Z>=:Z AND :Z<=:Z)
       filter(("ID"<=1000 AND "START_DATE">TO_DATE(' 2012-06-07 00:00:00', 'syyyy-mm-dd
               hh24:mi:ss') 
       AND  "START_DATE"<=TO_DATE(' 2012-06-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
  14 - access(:Z>=:Z AND :Z<=:Z)
       filter(("T1"."PRODUCT_ID"<=1000 AND
       SYS_OP_BLOOM_FILTER(:BF0000,"T1"."PRODUCT_ID")))
 
Note
-----
   - Degree of Parallelism is 4 because of hint
 

Notice below how, now that the CTE is not materialized, the query is using a single child cursor:

SQL> @gv$sql2
Enter value for sql_id: 4h1qa708b9p3j

SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER LAST_LOAD_TIME      EXECUTIONS END_OF_FETCH_COUNT
------------- --------------- ------------ ------------------- ---------- ------------------
4h1qa708b9p3j      2637578939            0 2016-09-27/18:00:54          1                  1

SQL> @nonshared 4h1qa708b9p3j
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)...

SQL_ID                        : 4h1qa708b9p3j
ADDRESS                       : 00007FF921744A58
CHILD_ADDRESS                 : 00007FF921744698
CHILD_NUMBER                  : 0
REASON                        :
CON_ID                        : 1
-----------------

And spot as well that, with the CTE not materialized, the number of requested parallel servers (2*DOP) equals the number of allocated servers as shown below:

SQL> select
        sql_id
       ,process_name px_slave
     from gv$sql_monitor
     where sql_id = '4h1qa708b9p3j'
     and sql_exec_id =16777216
     and trunc(sql_exec_start) = trunc(sysdate)
     and sql_text is null
     order by 2 ;

SQL_ID        PX_SL
------------- -----
4h1qa708b9p3j p000
4h1qa708b9p3j p001
4h1qa708b9p3j p002
4h1qa708b9p3j p003
4h1qa708b9p3j p004
4h1qa708b9p3j p005
4h1qa708b9p3j p006
4h1qa708b9p3j p007

8 rows selected.

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  1                   
 Session             :  C##MHOURI (7:4495)  
 SQL ID              :  4h1qa708b9p3j       
 SQL Execution ID    :  16777216            
 Execution Started   :  09/27/2016 18:00:54 
 First Refresh Time  :  09/27/2016 18:00:54 
 Last Refresh Time   :  09/27/2016 18:00:57 
 Duration            :  3s                  
 Module/Action       :  SQL*Plus/-          
 Service             :  orcl                
 Program             :  sqlplus.exe         
 Fetch Calls         :  3                 

Parallel Execution Details (DOP=4 , Servers Allocated=8)

And for those who can't do without an execution plan, here is the new one:

Plan hash value: 2637578939
--------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes |    TQ  |IN-OUT|
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |        |      |
|   1 |  PX COORDINATOR             |          |       |       |        |      |
|   2 |   PX SEND QC (RANDOM)       | :TQ10002 |   738 | 51660 |  Q1,02 | P->S |
|*  3 |    HASH JOIN BUFFERED       |          |   738 | 51660 |  Q1,02 | PCWP |
|   4 |     JOIN FILTER CREATE      | :BF0000  |   737 | 17688 |  Q1,02 | PCWP |
|   5 |      PX RECEIVE             |          |   737 | 17688 |  Q1,02 | PCWP |
|   6 |       PX SEND HYBRID HASH   | :TQ10000 |   737 | 17688 |  Q1,00 | P->P |
|   7 |        STATISTICS COLLECTOR |          |       |       |  Q1,00 | PCWC |
|   8 |         PX BLOCK ITERATOR   |          |   737 | 17688 |  Q1,00 | PCWC |
|*  9 |          TABLE ACCESS FULL  | T2       |   737 | 17688 |  Q1,00 | PCWP |
|  10 |     PX RECEIVE              |          |  1986 | 91356 |  Q1,02 | PCWP |
|  11 |      PX SEND HYBRID HASH    | :TQ10001 |  1986 | 91356 |  Q1,01 | P->P |
|  12 |       JOIN FILTER USE       | :BF0000  |  1986 | 91356 |  Q1,01 | PCWP |
|  13 |        PX BLOCK ITERATOR    |          |  1986 | 91356 |  Q1,01 | PCWC |
|* 14 |         TABLE ACCESS FULL   | T1       |  1986 | 91356 |  Q1,01 | PCWP |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."PRODUCT_ID"="ID")
   9 - access(:Z>=:Z AND :Z<=:Z)
       filter(("ID"<=1000 AND "START_DATE">TO_DATE(' 2012-06-07 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND 
              "START_DATE"<=TO_DATE(' 2012-06-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
  14 - access(:Z>=:Z AND :Z<=:Z)
       filter(("T1"."PRODUCT_ID"<=1000 AND SYS_OP_BLOOM_FILTER(:BF0000,"T1"."PRODUCT_ID")))
 
Note
-----
- Degree of Parallelism is 4 because of hint

While the initial execution plan uses two DFO trees, the new one manages with a single DFO tree containing 3 DFOs. Having multiple parallelizers (or DFO trees) might sometimes create issues, as in this case of TEMP TABLE transformation and PQ_SLAVE_MISMATCH.
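The number of DFO trees can be read from the cursor's plan itself: in V$SQL_PLAN the OBJECT_NODE column of the PX operations carries the table queue name, and its prefix (:Q1, :Q2, ...) identifies the DFO tree. A sketch, assuming OBJECT_NODE values of the form ':Q1000' as in the plans above:

```sql
-- Count distinct DFO trees (Q1, Q2, ...) used by a cursor
select count(distinct substr(object_node, 1, 3)) dfo_trees
from   v$sql_plan
where  sql_id = '1b7g55gx40k79'
and    child_number = 0
and    object_node like ':Q%';
```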

BOTTOM LINE

Even though this issue doesn't reproduce in 12.1.0.2, for previous Oracle releases watch out for your parallel queries using a temp table transformation. You might be confronted with an execution plan instability due to the PQ_SLAVE_MISMATCH non-sharing reason, where a parallel slave refuses to share the execution plan of its query coordinator. Bear in mind as well that in such a situation even a SQL Profile will not definitely guarantee the same execution plan at each execution.

August 6, 2016

Primary Key non unique unusable index and direct path load:again

Filed under: direct path,Oracle,SQL Server — hourim @ 6:10 am

This combination of a primary key policed by a non-unique unusable index on a direct path loaded table raising error ORA-26026 has occurred again, following a recent upgrade from 11.2.0.3 to 11.2.0.4.

I know that many authors like Randolph Geist and Richard Foote have already written about this subject. I, however, for the sake of my own documentation, decided to summarise this issue in my blog.

The story began when a developer sent me an e-mail saying that he had observed the following error in one of his overnight batch jobs in one of the TEST databases:

ORA-26026 : unique index xxx.t1_non_unique_idx initially in unusable state

Notice how the error message points to a unique index error on a non-unique index. That was, for me, the first clue to what was really happening in the developer's situation. The name of the developer's index in the original error text was not as suggestive about the uniqueness of the index as the one I used in the above error message. The developer's batch job accomplished the following steps:

  • disable all non-unique indexes on the target table
  • parallel direct path load into the target table

The developer said that his job had been running quite smoothly in 11.2.0.3 and started failing with ORA-26026 following a fresh upgrade to 11.2.0.4. Here is the developer's set-up, which you can use and play with at will:

create table t1 as select
         rownum              n1
        ,trunc((rownum-1)/3) n2
        ,mod(rownum,10)      n3
    from dual
    connect by level <= 1e3;

create index t1_non_unique_idx on t1(n1,n2);

alter table t1 add constraint t1_pk primary key (n1) using index;

create unique index t1_pk on t1(n1);

 select index_name, uniqueness
     from user_indexes
     where table_name = 'T1';
INDEX_NAME                     UNIQUENES
------------------------------ ---------
T1_PK                          UNIQUE
T1_NON_UNIQUE_IDX              NONUNIQUE

select
        constraint_name
       ,index_name
    from user_constraints
    where table_name = 'T1'
    and constraint_type = 'P';

CONSTRAINT INDEX_NAME
---------- ---------------------
T1_PK      T1_NON_UNIQUE_IDX

With this set-up in place the primary key constraint is policed by the non-unique index t1_non_unique_idx. The batch then continues by disabling all non-unique indexes and direct path loading into the target table. Something resembling the following piece of SQL code:

create table t2 as select * from t1;
truncate table t1;
alter index t1_non_unique_idx unusable;

Execute the following insert/select piece of code in 11.2.0.3 and you will not encounter any error:

insert /*+ append */ into t1 select * from t2;

Upgrade to 11.2.0.4 (or 12c) and you will immediately hit the ORA-26026:

ERROR at line 1:
ORA-26026: unique index C##MHOURI.T1_NON_UNIQUE_IDX initially in unusable state

In the developer's case, despite the existence of a perfect unique index to cover the primary key, this constraint was enforced via a non-unique index. It seems that the developer created the T1_PK index after he had created the non-unique index and the primary key constraint.

To solve this issue, all I did was change the index enforcing the primary key as follows:

alter table t1 modify constraint t1_pk using index t1_pk;

select
         constraint_name
        ,index_name
    from user_constraints
    where table_name = 'T1'
    and constraint_type = 'P';

CONSTRAINT INDEX_NAME
---------- ----------
T1_PK      T1_PK

insert /*+ append */ into t1 select * from t2;

1000 rows created.

Bottom line: when you direct path load data into a table after having disabled its non-unique indexes, be sure that your primary key is not enforced via one of those disabled non-unique indexes. Otherwise, depending on the Oracle release you are using, your insert will be rejected with the ORA-26026 error.
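A defensive check before such a batch disables its indexes could look like this, reusing the dictionary queries shown above (a sketch against the demo table):

```sql
-- Verify that the PK is policed by a unique index before marking
-- non-unique indexes unusable ahead of a direct path load
select c.constraint_name
      ,c.index_name
      ,i.uniqueness
      ,i.status
from   user_constraints c
join   user_indexes     i on i.index_name = c.index_name
where  c.table_name      = 'T1'
and    c.constraint_type = 'P';
```

If UNIQUENESS comes back NONUNIQUE for the primary key, switch the policing index before the load, as done above.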

July 16, 2016

SQL Server 2016 : parallel DML

Filed under: Oracle — hourim @ 2:16 pm

In contrast to Oracle, where parallel DML has been possible for a long time, SQL Server only made a parallel insert-select possible with the 2016 release. This article aims to illustrate this parallel operation in SQL Server and to show how close the resemblance with Oracle is.

I am first going to create the source and the target tables necessary for the demonstration:

drop table t1;
drop table t2;

create table t1 (id      INT,
                 Nom     VARCHAR(100),
                 Prenom  VARCHAR(100),
                 Ville   VARCHAR(100)
                );

insert into t1(id, Nom, Prenom, Ville)
select TOP 1000000 ROW_NUMBER() over (order by a.name) ROWID,
                   'Bruno',
		   CASE WHEN ROW_NUMBER() over (order by a.name)%2 =1 THEN 'Smith'
		   ELSE 'Mohamed' END,
		   CASE WHEN ROW_NUMBER() over (order by a.name)%10 =1 THEN 'Paris'
		   WHEN ROW_NUMBER() over (order by a.name)%10 =5 THEN 'Lille'
		   WHEN ROW_NUMBER() over (order by a.name)%10 =3 THEN 'Marseille'
		   ELSE 'Londres' END
FROM sys.all_objects a
cross join sys.all_objects b;

create table t2 (id      INT,
                 Nom     VARCHAR(100),
                 Prenom  VARCHAR(100),
                 Ville   VARCHAR(100)
                );

I have created table t1 with 1,000,000 rows and cloned its structure into an empty t2 table. I am now going to insert the content of t1 into t2, hoping that this insert will operate in parallel.

Here’s below the SQL Server version I am using:

select @@version;

Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64)
Copyright (c) Microsoft Corporation  Developer Edition

One of the parallel DML pre-requisites in SQL Server is the compatibility_level parameter which should have a value set to 130. Let’s verify this parameter value before trying the insert/select operation:

select name, compatibility_level from sys.databases where name = 'master';

name	compatibility_level
master	130

If the value of this parameter is not equal to 130 you can set it using the following command (I don’t know the impact of this change on your application so don’t change it without measuring its possible side effects ):

 alter database master set compatibility_level = 130;

Finally I am now ready to launch the insert/select operation and gather its corresponding execution plan:

insert into t2 WITH (TABLOCK)
select * from t1;

[Figure "parallel DML3": graphical execution plan of the parallel insert]

Notice that I have added a WITH (TABLOCK) hint to this insert so that it can operate in parallel. In close comparison with Oracle direct path load, where a lock is automatically acquired on the inserted table (until the transaction is committed), it seems that in SQL Server we need to acquire this lock explicitly with the (TABLOCK) hint.
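Instead of hovering over the graphical plan, the DOP actually used can also be read from the plan cache (a T-SQL sketch; the LIKE filter on the statement text is a loose assumption):

```sql
-- last_dop is exposed in sys.dm_exec_query_stats from SQL Server 2016 on
SELECT qs.last_dop,
       qs.execution_count,
       st.text
FROM   sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE  st.text LIKE '%insert into t2%'
AND    st.text NOT LIKE '%dm_exec_query_stats%';
```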

You might ask how I figured out that the insert/select operation was done in parallel. In fact, the two yellow arrows in the above execution plan indicate the parallel nature of the execution. However, they do not indicate that the DOP (Degree Of Parallelism) is 2. If you want to know the actual DOP used by this operation you have to hover over the first operation (towards the left: INSERT) to see a tooltip showing that the degree of parallelism is 4. However, I still have not found a convenient way to capture a mouse tooltip using Greenshot. When dealing with SQL Server execution plans I very often prefer the free version of the SQL Sentry Plan Explorer, from which I have captured the following part of the above execution plan, where you can see that the number of executions (Actual Executions, or Starts in Oracle terms) equals 4, indicating that the DOP is 4:

[Figure: SQL Sentry Plan Explorer view of parallel_dml.sqlplan]

Notice, by the way, that in contrast to the graphical execution plan, where the INSERT operation doesn't seem to be done in parallel, the SQL Sentry Plan Explorer clearly indicates that the insert was operated in parallel.

If you know how parallel processing is handled in Oracle you will certainly not be disappointed when you start dealing with parallel processing in SQL Server. Almost all the parallel concepts are identical. That is:

  • A maximum of 2 concurrent active DFOs per DFO tree
  • The number of parallel servers (threads), which is 2 * DOP
  • The different data distribution methods between parallel servers
  • etc…

I have mentioned above that one of the pre-requisites for a parallel insert/select operation to be successfully accomplished is the explicit lock of the inserted table. This paved the way for me to check whether the reasons that impede Oracle direct path loads (triggers and foreign keys) can also restrict the parallel insert in SQL Server. Here is the demo, first with a trigger and second with a foreign key implemented on the t2 table:

CREATE TRIGGER t2_A_Ins ON t2
FOR INSERT
AS
begin
    declare @Firstname nvarchar(50)
    set @Firstname = 'Unknown'
END;

truncate table t2;

insert into t2 WITH (TABLOCK)
select * from t1;

[Figure "parallel DML trigger": serial execution plan with the insert trigger in place]
As you can see, the insert trigger annihilates the parallel insert operation in SQL Server.

drop trigger master.t2_A_Ins;
ALTER TABLE t1 ADD CONSTRAINT t1_pk
UNIQUE (id);

ALTER TABLE t2 ADD CONSTRAINT t2_t1_fk
FOREIGN KEY (id) references t1(id);

And now an insert on table t2 having a foreign key

truncate table t2;

insert into t2 WITH (TABLOCK)
select * from t1;

[Figure "Plan with FK": serial execution plan with the foreign key in place]

Again, the existence of a foreign key on the t2 table pre-empted a parallel insert/select operation.
Funnily enough, the resemblance with Oracle holds here too.

Before pushing the "Publish" button I played again with the model, and from time to time I was unable to reproduce exactly the conclusions made in this article about the reasons impeding a parallel insert. I will certainly come back to this article when my investigations are finished.

June 24, 2016

Index Skip Scan : how many columns are skippable?

Filed under: Index — hourim @ 10:49 am

If you want to know what an index skip scan operation is, then Richard Foote's article and Jonathan Lewis's one, which you can find here and here respectively, are two reliable references.

This article deals with a particular query using an index skip scan, with the following Real Time SQL Monitoring (RTSM) report:

Global Information
------------------------------
 Status              :  EXECUTING  
 Instance ID         :  1                   
 Session             :  C##MHOURI (965:52281)  
 SQL ID              :  c2mah5hxau88v       
 SQL Execution ID    :  16777216            
 Execution Started   :  06/15/2016 13:03:15 
 First Refresh Time  :  06/15/2016 13:03:21 
 Last Refresh Time   :  06/15/2016 14:49:32 
 Duration            :  6378s                  
 Module/Action       :  SQL*Plus/-          
 Service             :  orcl                
 Program             :  sqlplus.exe         
 Fetch Calls         :  2280           


Global Stats
================================================================
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
================================================================
|    6385 |    193 |     6192 |   2280 |   2M   | 2M   |  14GB |
================================================================
  
SQL Plan Monitoring Details (Plan Hash Value=517922901)
======================================================================================
| Id |               Operation               | Name  |   Time    | Start  | Activity |
|    |                                       |       | Active(s) | Active |   (%)    |
======================================================================================
|  0 | SELECT STATEMENT                      |       |      6297 |     +67|          |
|  1 |   TABLE ACCESS BY INDEX ROWID BATCHED | T     |      6297 |     +67|      0.52|
|->2 |    INDEX SKIP SCAN                    | T_IDX |      6378 |     +2 |     99.48|
======================================================================================

As you can see, after 6385 seconds the query is still running. The Start Active column, when correlated with the Time Active one, indicates that 99% of this time is spent on the index skip scan operation. We all know that an index skip scan is an indicator of a missing index, or a pointer to a wrongly designed one, but what retained my attention in this case is that this skipped index is a 7-column composite index of which the first 6 columns have been skipped.

Now the question becomes: are those 6 leading columns so repetitive (containing few distinct values) that they can be skipped? Let's first see the size of this index and how its column values are distributed:

SQL> @sizeBysegName
Enter value for segment_name : IDX

SEGMENT_TYPE       TABLESPACE_NAME       SEGMENT_NAME     PARTITION    GB
------------------ --------------------- --------------   ----------  -------
INDEX              IDX_TBS                T_IDX                        28.44
                                                                       -------
Total Segment Size                                                     28.44

SQL> col column_name format a20
SQL> select
  2         i.column_name
  3        ,i.column_position
  4       ,s.num_distinct
  5      from
  6         user_ind_columns i
  7       , user_tab_col_statistics s
  8       where
  9        i.column_name = s.column_name
 10       and   index_name = 'T_IDX'
 11       order by column_position asc;

COLUMN_NAME          COLUMN_POSITION NUM_DISTINCT
-------------------- --------------- ------------
BKR_ID                             1         1337
PRD_ID                             2      3342592
PRD_QTY                            3       197008
IS_PROCESED                        4            2
OPR_PRD_QTY                        5       170576
PRD_FMLY                           6            9
PROCESS_ID                         7            1

7 rows selected.

Interestingly, except for the first column, which is relatively repetitive when compared to the index size, neither the second column, which contains more than 3 million distinct values, nor the third and the fifth, which count about two hundred thousand distinct values each, can be considered repetitive. So why have they been skipped?

Additionally, skipping an index, say idx(a,b,c), on its column c somehow rewrites the original query by adding two predicates on the first two index columns. Something as follows:

From

select
    {lits of columns}
from table
where
   c = :1;

To

select
    {lits of columns}
from table
where
    a in {all distinct value of a}
and b in {all distinct value of b}
and c = :1;

The two added predicates explain why the number of distinct values of the index leading edge columns should be very small for the index skip scan path to be considered.
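To get a feel for why high-NDV leading columns rule out a skip scan, here is a minimal sketch. It is not Oracle internals: it just models a skip scan on idx(a, b, c) as one range scan per distinct combination of the skipped leading columns, reusing the NDV figures from the output above for illustration.

```python
from math import prod

def skip_scan_probes(leading_ndvs):
    """Rough number of index probes a skip scan must perform:
    one probe per distinct combination of the skipped leading columns."""
    return prod(leading_ndvs)

# Repetitive leading columns: the skip scan stays cheap.
print(skip_scan_probes([2, 3]))                # 6 probes

# Leading columns like PRD_ID (millions of distinct values) make the
# probe count explode, which is why the path should not be attractive.
print(skip_scan_probes([1337, 3342592]))       # billions of probes
```

Under this simple model it is immediately visible why the optimizer is expected to consider a skip scan only when the leading columns carry very few distinct values.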

If you want to reproduce this kind of index operation, here’s the model I have engineered:

create table t
as select
       rownum               prd_id
      ,rownum-1             sec_prd_id
      ,trunc((rownum-1)/3)  prd_qty
      ,trunc((rownum-1)/3)  opr_prd_qty
      ,mod(rownum,100)      bkr_id
      ,case mod(rownum,1000000) when 0 then 'x'
                                when 1 then 'y'
                                when 2 then 'z'
                                when 3 then 'a'
                                when 4 then 'b'
                                when 5 then 'c'
                                else 'd' end prd_fmly
      ,case rownum when 1 then 0 else 1 end is_procesed
      ,0                                    process_id
from dual
connect by level <= 1e6;

create index t_idx on t(bkr_id, prd_id, prd_qty, is_procesed
                       ,opr_prd_qty, prd_fmly, process_id);

begin
  dbms_stats.gather_table_stats(user, 't', method_opt => 'for all columns size 1');
end;
/

SQL> select
  2         i.column_name
  3        ,i.column_position
  4       ,s.num_distinct
  5       ,s.histogram
  6      from
  7         user_ind_columns i
  8       , user_tab_col_statistics s
  9       where
 10        i.column_name = s.column_name
 11       and   index_name = 'T_IDX'
 12       order by column_position asc;

COLUMN_NAME          COLUMN_POSITION NUM_DISTINCT HISTOGRAM
-------------------- --------------- ------------ ------------
BKR_ID                             1          100 NONE
PRD_ID                             2      1000000 NONE
PRD_QTY                            3       337344 NONE
IS_PROCESED                        4            2 NONE
OPR_PRD_QTY                        5       337344 NONE
PRD_FMLY                           6            7 NONE
PROCESS_ID                         7            1 NONE

7 rows selected.
SQL> set autotrace on explain stat
SQL>  select /*+ opt_param ('optimizer_index_cost_adj' 20) */
  2             t.*
  3          from t
  4          where
  5              t.prd_id in (5507203,78400086,359798,4565189,9495,12215)
  6          and t.process_id = 0
  7          and t.opr_prd_qty <> 0;

    PRD_ID SEC_PRD_ID    PRD_QTY OPR_PRD_QTY     BKR_ID P IS_PROCESED PROCESS_ID
---------- ---------- ---------- ----------- ---------- - ----------- ----------
     12215      12214       4071        4071         15 d           1          0
      9495       9494       3164        3164         95 d           1          0
    359798     359797     119932      119932         98 d           1          0

---------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |     4 |   120 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T     |     4 |   120 |
|*  2 |   INDEX SKIP SCAN                   | T_IDX |     4 |       |
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T"."PROCESS_ID"=0)
       filter(("T"."PRD_ID"=9495 OR "T"."PRD_ID"=12215 OR "T"."PRD_ID"=359798 OR
              "T"."PRD_ID"=4565189 OR "T"."PRD_ID"=5507203 OR "T"."PRD_ID"=78400086) AND
              "T"."OPR_PRD_QTY"0 AND "T"."PROCESS_ID"=0)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5026  consistent gets
          0  physical reads
          0  redo size
       1167  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          3  rows processed

Not having a clear answer as to why Oracle considered this index operation doesn’t prevent us from solving the issue. As explained above, an index skip scan operation being an indication of a missing index, I then created the following index (bear in mind that the new index takes into account that the original text of the query can’t be changed):

SQL> create index idx_t1_usr1 on t(process_id, prd_id, prd_qty) compress 3;

SQL> select /*+ opt_param ('optimizer_index_cost_adj' 20) */
  2         t.*
  3      from t
  4      where
  5          t.prd_id in (5507203,78400086,359798,4565189,9495,12215)
  6      and t.process_id = 0
  7      and t.opr_prd_qty <> 0;

    PRD_ID SEC_PRD_ID    PRD_QTY OPR_PRD_QTY     BKR_ID P IS_PROCESED PROCESS_ID
---------- ---------- ---------- ----------- ---------- - ----------- ----------
      9495       9494       3164        3164         95 d           1          0
     12215      12214       4071        4071         15 d           1          0
    359798     359797     119932      119932         98 d           1          0

----------------------------------------------------------------------------
| Id  | Operation                            | Name        | Rows  | Bytes |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |             |     4 |   120 |
|   1 |  INLIST ITERATOR                     |             |       |       |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T           |     4 |   120 |
|*  3 |    INDEX RANGE SCAN                  | IDX_T1_USR1 |     5 |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T"."OPR_PRD_QTY"0)
   3 - access("T"."PROCESS_ID"=0 AND ("T"."PRD_ID"=9495 OR "T"."PRD_ID"=12215 OR
              "T"."PRD_ID"=359798 OR "T"."PRD_ID"=4565189 OR "T"."PRD_ID"=5507203 OR
              "T"."PRD_ID"=78400086))

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         18  consistent gets
          7  physical reads
          0  redo size
       1167  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          3  rows processed

January 30, 2016

RAC : Uncached TEMP SPACE

Filed under: Oracle — hourim @ 8:00 am

We parallelized a very big index (171GB) creation using this:

SQL> create UNIQUE index PK_TABLE_XXXX ON TABLE_XXXX
          (COL1
          ,COL2
          ,COL3
          ,COL4
          ,COL5)
     LOCAL
     TABLESPACE TBS_IDX_XXXX
     parallel 8
     NOLOGGING;

SQL> alter index PK_TABLE_XXXX noparallel;

Unfortunately we ran into a TEMP tablespace shortage after a couple of minutes of run time:

Error: ORA-12801
------------------------------
ORA-12801: error signaled in parallel query server P001, instance xxxxxx(1)
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

SQL Plan Monitoring Details (Plan Hash Value=762135660)
=============================================================================
| Id |            Operation            |       Name       |  Rows   | Temp  |
|    |                                 |                  | (Estim) | (Max) |
=============================================================================
|  0 | CREATE INDEX STATEMENT          |                  |         |       |
|  1 |   PX COORDINATOR                |                  |         |       |
|  2 |    PX SEND QC (RANDOM)          | :TQ10000         |      3G |       |
|  3 |     PX PARTITION HASH ALL       |                  |      3G |       |
|  4 |      INDEX BUILD UNIQUE (LOCAL) | PK_TABLE_XXXX    |         |       |
|  5 |       SORT CREATE INDEX         |                  |      3G |   32G | -->
|  6 |        TABLE ACCESS FULL        | TABLE_XXXX       |      3G |       |
=============================================================================

This 32GB of maximum TEMP space looks very odd. I had already worked with this database and I was practically sure that it allowed much more than this limit. So I looked at the v$sort_segment view:

SQL> compute sum Label 'Total Temp Used' of "Space(GB)" on report
SQL> break on report
SQL> select
       *
    from
    (select
            tablespace_name
           ,inst_id
           ,round((total_blocks*8192)/(1024*1024*1024),2) "Space(GB)"
      from
          gv$sort_segment order by 1,2);

TABLESPACE_NAME                    INST_ID  Space(GB)
------------------------------- ---------- ----------
TEMP                                     1      31.25
TEMP                                     2     656.22
                                           ----------
Total Temp Used                                687.47

Notice this 31.25GB of TEMP space in instance 1. It looks closely related to the limit we’ve hit. And, indeed, the index creation script was started from instance 1, as shown in the corresponding Real Time SQL Monitoring report:

Global Information
------------------------------
 Status              :  DONE (ERROR)
 Instance ID         :  1                    --> spot this
 Session             :  xxxxxx(908:33137)
 SQL ID              :  1h8puyf4b3bw7
 SQL Execution ID    :  16777216
 Execution Started   :  01/25/2016 18:02:40
 First Refresh Time  :  01/25/2016 18:02:40
 Last Refresh Time   :  01/25/2016 18:04:07
 Duration            :  87s
 Module/Action       :  SQL Developer/-
 Service             :  xxxxxx_INST1
 Program             :  SQL Developer

This database is a two-node RAC (11.2.0.3.0). Its TEMP tablespace is composed of 22 temporary files, each 32GB in size, for a total available TEMP space of 22*32 = 704GB. Having no clue as to why Oracle had not been able to use the remaining TEMP space from instance 2, I asked to run the same script from instance 2 of the same database:

Global Information
------------------------------
 Status              :  DONE
 Instance ID         :  2   --> second instance
 Session             :  xxxxxx(401:717)
 SQL ID              :  1h8puyf4b3bw7
 SQL Execution ID    :  33554432
 Execution Started   :  01/26/2016 12:02:59
 First Refresh Time  :  01/26/2016 12:03:00
 Last Refresh Time   :  01/26/2016 12:30:07
 Duration            :  1628s
 Module/Action       :  SQL Developer/-
 Service             :  xxxxxx_INST2
 Program             :  SQL Developer       

SQL Plan Monitoring Details (Plan Hash Value=762135660)
=============================================================================
| Id |            Operation            |       Name       |  Rows   | Temp  |
|    |                                 |                  | (Estim) | (Max) |
=============================================================================
|  0 | CREATE INDEX STATEMENT          |                  |         |       |
|  1 |   PX COORDINATOR                |                  |         |       |
|  2 |    PX SEND QC (RANDOM)          | :TQ10000         |      3G |       |
|  3 |     PX PARTITION HASH ALL       |                  |      3G |       |
|  4 |      INDEX BUILD UNIQUE (LOCAL) | PK_TABLE_XXXX    |         |       |
|  5 |       SORT CREATE INDEX         |                  |      3G |   99G |
|  6 |        TABLE ACCESS FULL        | TABLE_XXXX       |      3G |       |
=============================================================================

Notice how the index creation, this time, completes without error in about 27 minutes and consumes 99GB of TEMP space.
Here is the situation of the cached extents in the gv$temp_extent_pool view immediately after the successful index creation:

SQL> compute sum Label 'Total Temp Used' of extents_cached on report
SQL> break on report
SQL> select inst_id
              , file_id
              , extents_cached
              , extents_used
        from gv$temp_extent_pool
        order by 1,2;

   INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1          31994            8
         1          2              4            3
         1          3              0            0
         1          4              0            0
         1          5              3            0
         1          6              0            0
         1          7              1            0
         1          8              0            0
         1          9              0            0
         1         10              0            0
         1         11              0            0
         1         12              0            0
         1         13              0            0
         1         14              0            0
         1         15              0            0
         1         16              0            0
         1         17              0            0
         1         18              0            0
         1         19              0            0
         1         20              0            0
         1         21              0            0
         1         22              0            0
         2          1              0            0
         2          2          31995            0
         2          3          31999            0
         2          4          31999            0
         2          5          31996            0
         2          6          31999            0
         2          7          31998            0
         2          8          31999            0
         2          9          31999            0
         2         10          31999            0
         2         11          31999            0
         2         12          31999            0
         2         13          31999            0
         2         14          31999            0
         2         15          31999            0
         2         16          31999            0
         2         17          31999            0
         2         18          31999            1
         2         19          31999            0
         2         20          31999            0
         2         21          31999            0
         2         22          31999            0
                      --------------
Total Temp                    703973
SQL> select
           inst_id,
           tablespace_name,
           total_blocks,
           used_blocks,
           free_blocks
    from gv$sort_segment;

   INST_ID TABLESPACE_NAME                 TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS
---------- ------------------------------- ------------ ----------- -----------
         1 TEMP                                 4096256        1536     4094720
         2 TEMP                                86012288         128    86012160

We have a TEMP tablespace of 22 temporary files in a two-node RAC configuration. We can point out that in instance 1 only the first temporary file has been used, while in instance 2 many extents have been uniformly allocated during the index creation.
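The extent counts and the space figures line up if we assume 1MB temp extents (an assumption on my part, but one the numbers support), as this short sanity-check sketch shows:

```python
# Extent counts taken from the gv$temp_extent_pool output above;
# assumption: each temp extent is 1MB.
EXTENT_MB = 1

inst1_extents = 31994 + 4 + 3 + 1        # the few non-zero files on node 1
inst2_extents = 703973 - inst1_extents   # the remainder, spread over node 2

# Total cached space should match the 687.47GB reported by gv$sort_segment.
total_gb = round((inst1_extents + inst2_extents) * EXTENT_MB / 1024, 2)
print(total_gb)                          # 687.47

# Node 1 caches barely one file's worth -- the ~31.25GB ceiling we hit.
print(round(inst1_extents * EXTENT_MB / 1024, 2))   # 31.25
```

Node 1 is thus capped at roughly one temporary file of cached extents while node 2 sits on the other ~656GB, which matches the v$sort_segment picture seen earlier.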

SQL> select inst_id, trunc(bytes_cached/1024/1024/1024,2) Gbytes_cached from gv$temp_extent_pool;

   INST_ID GBYTES_CACHED
---------- -------------
         1         31.24
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         1             0
         2             0
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24
         2         31.24

44 rows selected.

Clearly, for a reason I am not aware of, instance 2 is refusing to uncache its temp extents back to instance 1. Thanks to Riyaj Shamsudeen I learnt that this situation could be due to bug 14383007, which requires a patch to be fixed.

January 20, 2016

Natural and Adjusted Hybrid Histogram

Filed under: Oracle,Statistics — hourim @ 7:16 pm

I was going to write a comment on this Jonathan Lewis article and finally decided to turn it into a blog post because it was getting long. In the above-mentioned article a reader was wondering why the bucket size of his modified data set does not obey the minimum bucket size explained by Jonathan. Coincidentally, I am writing a second article on Hybrid histograms for allthingsOracle in which I use my own terminology to define two types of Hybrid histogram: a first type, which I have named ‘’Natural Hybrid’’, is close to Jonathan’s original data set, and a second type, which I have named ‘’Adjusted Hybrid’’, is of the same vein as the reader’s modified data set. By ‘’Natural Hybrid’’ I refer to a data set that doesn’t qualify for a TOP-Frequency histogram because the threshold is greater than the naturally non-adjusted TopNRows. By ‘’Adjusted Hybrid’’ I refer to a data set that initially satisfies the TOP-Frequency threshold but fails to qualify for a TOP-Frequency histogram because Oracle finds, in the middle of the process, that the Adjusted TopNRows is no longer greater than the threshold.

Let’s explain with an example. First, here’s a model that qualifies for a “Natural Hybrid” histogram:

SQL> create table t1 (id number, n1 number);
SQL> start InsT1.sql (see downloadable script at the end)

If we gather a histogram for this data set of 37 distinct values using 20 buckets we will obtain a HYBRID histogram, because the TOP-Frequency threshold accounts for 95 rows while the TOP-20 rows account for only 80 rows, as shown below:

SQL> select round ((20-1)/20 * 100) threshold from dual;

 THRESHOLD
----------
        95
SQL> select
         sum (cnt) TopNRows
    from (select
            n1
           ,count(*) cnt
         from t1
         group by n1
          order by count(*) desc
          )
   where rownum <= 20;

  TOPNROWS
----------
        80

In order to compute the Hybrid histogram information Oracle will, in this case, use the bucket-frequency method explained by Jonathan Lewis. This method honours the minimum bucket size of 5 (though not at the end of the data set) and keeps the initial number of buckets of 20 unchanged. This is what I prefer to label a Natural Hybrid histogram.

Let’s now use the reader’s model:

create table t_jar (id number, n1 number);
insert into t_jar values (1,5 );
insert into t_jar values (1,5 );
insert into t_jar values (1,7 );
insert into t_jar values (1,7 );
insert into t_jar values (1,7 );
insert into t_jar values (1,7 );
insert into t_jar values (1,10 );
insert into t_jar values (1,12 );
insert into t_jar values (1,15 );
insert into t_jar values (1,15 );
insert into t_jar values (1,15 );
insert into t_jar values (1,20 );
commit;

This new data set of 6 distinct values over 12 rows should normally qualify for a TOP-3 Frequency histogram, as it satisfies the threshold formula:

SQL> select round ((3-1)/3 * 12) threshold from dual;

 THRESHOLD
----------
         8
SQL> select
          sum (cnt) TopNRows
     from (select
             n1
            ,count(*) cnt
          from t_jar
          group by n1
          order by count(*) desc
             )
     where rownum <= 3;
  TOPNROWS
----------
         9
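The qualification arithmetic above can be sketched in a few lines. This is an illustration only, not Oracle’s internal code; the counts come from the t_jar inserts (value 5 appears twice, 7 four times, 10 once, 12 once, 15 three times, 20 once):

```python
def top_frequency_check(counts, n_buckets):
    """Return (TopNRows, threshold, qualifies?) for a data set described
    by its per-value frequencies, per the formula used in the article."""
    num_rows  = sum(counts)
    threshold = round((n_buckets - 1) / n_buckets * num_rows)
    top_n     = sum(sorted(counts, reverse=True)[:n_buckets])
    return top_n, threshold, top_n > threshold

# t_jar: 6 distinct values over 12 rows, 3 buckets requested.
print(top_frequency_check([2, 4, 1, 1, 3, 1], 3))   # (9, 8, True)
```

So on first inspection the TOP-3 rows (9) exceed the threshold (8) and the data set appears to qualify.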

However, Oracle will not accept this at face value. It has to check whether the low and high values are among the TOP-3 distinct values. If one of these values (or both) is not in the TOP-3, Oracle will force it into the histogram, exclude the least repetitive value from the TOP-3, adjust the TopNRows, and check again whether these modifications have altered the data set enough that it no longer qualifies for a TOP-Frequency histogram.

Here’s a snippet of the corresponding dbms_stats trace file:

SQL> exec dbms_stats.set_global_prefs ('TRACE', to_char (1+16));

SQL> BEGIN
        dbms_stats.gather_table_stats
          (user, 't_jar', method_opt => 'for columns n1 size 3');
     END;
    /
SQL> exec dbms_stats.set_global_prefs('TRACE', null);
DBMS_STATS:  NNV  NDV  AVG  MMX  HST  EP   RP   NNNP IND  CNDV HSTN HSTR  COLNAME
DBMS_STATS:   Y    Y    Y    Y    Y    Y    Y                        Y    N1
DBMS_STATS: Approximate NDV Options
DBMS_STATS: ACL,TOPN,NIL,NIL,RWID,U,U3U
DBMS_STATS: start processing top n values for column N1
DBMS_STATS: topn sql (len: 415):
DBMS_STATS: +++ select /*+
                       no_parallel(t) no_parallel_index(t)
			   dbms_stats cursor_sharing_exact use_weak_name_resl
                       dynamic_sampling(0) no_monitoring
			 xmlindex_sel_idx_tbl no_substrb_pad
			  */
		   substrb(dump(N1,16,0,64),1,240) val,
                 rowidtochar(rowid) rwid
		  from "XXXX".T_JAR t
		  where rowid in(chartorowid('AAAJ4MAAEAACN4EAAA')
		              ,chartorowid('AAAJ4MAAEAACN4EAAC')
		    	       ,chartorowid('AAAJ4MAAEAACN4EAAI'))
		order by N1;
DBMS_STATS: remove last bucket: Typ=2 Len=2: c1,10 add: Typ=2 Len=2: c1,15
DBMS_STATS: removal_count: 1 total_nonnull_rows: 12 mnb:  3
DBMS_STATS: adjusted coverage: .667

Here’s the data set

SQL> select *
        from
       (select n1, count(1) cnt
         from t_jar
        group by n1
         order by n1);

        N1        CNT
---------- ----------
         5          2 -> low value
         7          4
        10          1
        12          1
        15          3
        20          1 -> high value
6 rows selected.

And here are the TOP-3 rows of the same data set:

SQL> select
          *
     from (select
             n1
            ,count(*) cnt
          from t_jar
          group by n1
          order by count(*) desc)
     where rownum <= 3;

        N1        CNT
---------- ----------
         7          4
        15          3
         5          2
3 rows selected.

Since the high value is not in the TOP-3, it will be forced into the histogram at the cost of excluding the least repetitive TOP-3 value, which is 5 in this case (frequency = 2). But before doing so, Oracle has to check whether, after forcing in this high value, the data set still qualifies for a TOP-Frequency histogram, using the AdjustedTopNRows:

 AdjustedTopNRows = TopNRows - 2 + 1 = 9 - 2 + 1 = 8

The AdjustedTopNRows is no longer greater than the threshold of 8, which means Oracle will finally stop collecting the TOP-Frequency histogram and transform what it has already gathered into a HYBRID histogram, as shown in the same trace file (Trying to convert frequency histogram to hybrid):

DBMS_STATS: adjusted coverage: .667
DBMS_STATS: hist_type in exec_get_topn: 2048 ndv:6 mnb:3
DBMS_STATS: Evaluating frequency histogram for col: N1
DBMS_STATS:  number of values = 4, max # of buckects = 3, pct = 100, ssize = 12
DBMS_STATS:  Trying to convert frequency histogram to hybrid
DBMS_STATS:  > cdn 10, popFreq 4, popCnt 1, bktSize 4, bktSzFrc 0
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 3, mnb 3, ssize 10, min_ssize 12, appr_ndv  TRUE, ndv 4, selNdv 1, selFreq 4, pct 100, avg_bktsize 3, csr.hreq TRUE, normalize TRUE
DBMS_STATS:   Histogram gathering flags: 7
DBMS_STATS:  Accepting histogram
DBMS_STATS: done_hist in process_topn: TRUE  csr.ccnt: 1
DBMS_STATS: Mark column N1 as top N computed
DBMS_STATS: Skip topn computed column N1 numhist: 0

This is what I have labelled an ‘’Adjusted Hybrid’’ histogram: it is collected using a method that does not seem to involve the bucket size and tends to be a top-frequency to hybrid conversion.

SELECT
        (row_number() over(order by ept_nbr)-1) NumBucket
        ,ept_nbr
        ,ept_act_val
        ,rpt_cnt
        ,ept_nbr - (lag(ept_nbr,1,0) over(order by ept_nbr)) "new bucket size"
        ,bucket_size "original bucket_size"
    FROM
        (SELECT
             ah.endpoint_number            ept_nbr
            ,ah.endpoint_actual_value      ept_act_val
            ,lag(ah.endpoint_number,1,0) over(order by ah.endpoint_number) ept_lag
            ,ah.endpoint_repeat_count rpt_cnt
            ,at.sample_size/at.num_buckets bucket_size
         FROM
            user_tab_histograms      ah
           ,user_tab_col_statistics  at
         WHERE ah.table_name  = at.table_name
         AND ah.column_name = at.column_name
         AND ah.table_name  = 'T_JAR'
         AND ah.column_name = 'N1'
       ) ORDER BY ept_nbr;

NUMBUCKET    EPT_NBR EPT_ACT_VA    RPT_CNT new bucket size original bucket_size
---------- ---------- ---------- ---------- --------------- --------------------
         0          2 5                   2               2                    4
         1          6 7                   4               4                    4
         2         10 20                  1               4                    4

In passing, the adjusted coverage mentioned in the above trace file is nothing other than:

AdjustedTopNRows/num_rows = 8/12 = 0.667
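Putting the whole adjustment together as a short sketch (figures from the t_jar example above; a plain illustration of the arithmetic, not dbms_stats code):

```python
# t_jar: the evicted least-popular TOP-3 value (5, frequency 2) leaves,
# the forced high value (20) enters with frequency 1.
num_rows   = 12
threshold  = round((3 - 1) / 3 * num_rows)   # 8
top_n_rows = 9                               # 4 + 3 + 2

adjusted_top_n_rows = top_n_rows - 2 + 1
print(adjusted_top_n_rows)                        # 8
print(adjusted_top_n_rows > threshold)            # False -> fall back to HYBRID
print(round(adjusted_top_n_rows / num_rows, 3))   # 0.667 -- adjusted coverage
```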

January 7, 2016

12c TOP-Frequency and the ’’half least popular value’’

Filed under: Oracle — hourim @ 1:27 pm

Asking Oracle 12c to collect a histogram for a data set with a number of distinct values (NDV) greater than the supplied number of buckets (N) will probably result in a Hybrid histogram. There is, however, a chance that a TOP-Frequency histogram will be preferred to a Hybrid histogram. To qualify for this alternative the data set must fulfil a prerequisite: its TOP N distinct values should occupy a number of rows exceeding a certain threshold. If this prerequisite is satisfied, a kind of frequency histogram is collected for the TOP N values; hence the TOP-Frequency label. The remaining insignificant values are neglected and not captured into the histogram tables. However, if the extreme values of the data set, that is the low and high values, are among the insignificant values, they will not be neglected. Up to two TOP N distinct values will then be selected as victims, excluded from the histogram information and replaced in the TOP N captured values by the low and high values. These victim values are those with the lowest frequency of appearance in the data set. Whether the low (or high) value is naturally present in, or forced into, the TOP N captured values determines the threshold value and the cardinality estimate of the non-popular non-captured values, and that of the low (or high) value as well.

Let’s put all of the above into action:

create table TopFreq3 as
select
    rownum n1
  , case when mod(rownum, 100000)  = 0 then  90
         when mod(rownum, 10000)   = 0 then 180
         when mod(rownum, 1000)    = 0 then  84
         when mod(rownum, 100)     = 0 then 125
         when mod(rownum, 50)      = 2 then   7
         when mod(rownum-1, 80)    = 2 then  22
         when mod(rownum, 10)      = 0 then  19
         when mod(rownum-1, 10)    = 5 then  15
         when mod(rownum-1, 5)     = 1 then  11
         when trunc((rownum -1/3)) < 5  then 25
         when trunc((rownum -1/5)) < 20 then 33
         else 42
    end n2
from dual
connect by level <= 2e2;

SELECT n2, count(*)
FROM TopFreq3
GROUP BY n2
ORDER BY n2;

        N2   COUNT(*)
---------- ----------
         7          4 --> low value
        11         36
        15         20
        19         18
        22          3
        25          3
        33          8
        42        106
       125          2 --> high value

9 rows selected.

As shown above, the engineered data set has 9 distinct values (over 200 rows), of which the TOP 8 seem to qualify for a TOP-Frequency histogram:

undefine nb
undefine num_rows

 SELECT
       &&nb numberOfBuckets
      ,TopNRows
      ,round((1-(1/&&nb))* &&num_rows) Threshold
      ,case
        when (TopNRows - round((1-(1/&&nb))* &&num_rows)) > 0 then 'Qualify For TOP Frequency'
      else
         'Do not qualify For TOP Frequency'
      end TOPFreqDecision
   FROM
   (select
         sum (cnt) TopNRows
     from (select
             n2
           , count(*) cnt
         from TopFreq3
         group by n2
         order by count(*) desc
         )
      where rownum <= &&nb
    );
Enter value for nb: 8
Enter value for num_rows: 200

NUMBEROFBUCKETS   TOPNROWS  THRESHOLD TOPFREQDECISION
--------------- ---------- ---------- --------------------------------
              8        198        175 Qualify For TOP Frequency

Since TOPNROWS is greater than THRESHOLD, a TOP-FREQUENCY histogram will be preferred to a HYBRID histogram, as shown in the following:
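The same qualification arithmetic can be redone by hand in a couple of lines (counts taken from the GROUP BY output above; an illustration only, not Oracle code):

```python
counts    = [4, 36, 20, 18, 3, 3, 8, 106, 2]   # n2 frequencies, 200 rows
n_buckets = 8

threshold  = round((1 - 1 / n_buckets) * sum(counts))
top_n_rows = sum(sorted(counts, reverse=True)[:n_buckets])
print(threshold, top_n_rows)   # 175 198 -> qualifies for TOP-Frequency
```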

begin
  dbms_stats.gather_table_stats
       (user,'TopFreq3'
       ,method_opt =>'for columns n2 size 8');
end;
/

select
    column_name
   ,num_distinct
   ,num_buckets
   ,sample_size
   ,histogram
from
   user_tab_col_statistics
   where table_name = 'TOPFREQ3'
   and column_name  = 'N2';

COLUMN_NAME  NUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE HISTOGRAM
------------ ------------ ----------- ----------- ---------------
N2                      9           8         200 TOP-FREQUENCY

Indeed, as expected, Oracle has gathered a TOP-FREQUENCY histogram for the TOP 8 distinct values, as shown below:

select
       endpoint_actual_value value,
       endpoint_number,
       endpoint_number - (lag(endpoint_number,1,0) 
                         over(order by endpoint_number)) value_count
    from	
       user_tab_histograms
    where
       table_name = 'TOPFREQ3'
    and column_name = 'N2'
    order by
       endpoint_number
    ;

VALUE      ENDPOINT_NUMBER VALUE_COUNT
---------- --------------- -----------
7                        4           4 
11                      40          36
15                      60          20
19                      78          18
25                      81           3
33                      89           8
42                     195         106
125                    196           1 --> high value with a forced frequency 1

8 rows selected.

Notice, by the way, how, even though the high value 125 is insignificant (count = 2), it has been captured into the histogram with, bizarrely, a forced frequency of 1. For the sake of clarity, let’s reprint below how the n2 column is spread in the TopFreq3 table:

        N2   COUNT(*)
---------- ----------
         7          4 --> low value
        11         36
        15         20
        19         18
        22          3 --> first least popular value
        25          3 
        33          8
        42        106
       125          2 --> high value
9 rows selected.

Again, notice how the value 22, which is the first least popular value, has been removed from the TOP 8 to make room for the insignificant high value 125 with a forced frequency of 1. This is how TOP-Frequency seems to work: if the low and/or high value of the data set is naturally present in the TOP-N values then there is no exclusion-substitution. If, however, one of these values (or both) is outside the TOP-N values, it must be included in the histogram information in place of the least popular value.

Let’s now check what cardinality estimate will be calculated for the non-captured values (including the excluded least popular value):

explain plan for select count(1) from TopFreq3 where n2= 22;

select * from table(dbms_xplan.display);

-----------------------------------------------
| Id  | Operation          | Name     | Rows  |
-----------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |
|   1 |  SORT AGGREGATE    |          |     1 |
|*  2 |   TABLE ACCESS FULL| TOPFREQ3 |     4 |
-----------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("N2"=22)

How did Oracle work out this cardinality of 4? In this case Oracle simply uses the following formula:

E-Rows = num_rows * NewDensity

Where NewDensity is:

NewDensity = (sample_size-TopNRows)/(num_distinct-num_buckets)/sample_size
NewDensity = (200-198)/(9-8)/200 = .01

Finally, applying the above-mentioned cardinality formula we get this:

E-Rows = num_rows * NewDensity
E-Rows = 200 * 0.01 = 2

But this is not equal to the correct cardinality estimation of 4. In fact, when one of the extreme values is forced into the histogram information, the threshold (or TopNRows) is slightly adjusted to take into account the exclusion-inclusion swap between the extreme value and the least popular value:

AdjustedTopNRows = TopNRows - (count of least popular value) + 1 (forced frequency of
                                                                  the extreme value)

AdjustedTopNRows = 198 - 3 + 1 = 196

Apply this to the above cardinality estimation formula and you will get the correct estimation as shown below:

NewDensity = (200- 196)/(9-8)/200   = .02
E-Rows     = num_rows * NewDensity  = 200 * 0.02 = 4

And this is clearly backed by the corresponding 10053 trace file (spot the NewDensity and the BktCnt):

SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for TOPFREQ3[TOPFREQ3] 
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#2): 
    NewDensity:0.020000, OldDensity:0.002500 BktCnt:196.000000, PopBktCnt:195.000000, PopValCnt:7, NDV:9
  Column (#2): N2(NUMBER)
    AvgLen: 4 NDV: 9 Nulls: 0 Density: 0.020000 Min: 7.000000 Max: 125.000000
    Histogram: Top-Freq  #Bkts: 196  UncompBkts: 196  EndPtVals: 8  ActualVal: yes
  Table: TOPFREQ3  Alias: TOPFREQ3
    Card: Original: 200.000000  Rounded: 4  Computed: 4.000000  Non Adjusted: 4.000000

Bottom line: this article aims to restrain the enthusiasm of those who use the ‘‘half the least popular value’’ rule to get the selectivity of a non-popular value in a frequency histogram, before they extend that rule to non-popular Top-Frequency values. That extension seems not to be correct; all in all, it seems that a non-popular TOP-FREQUENCY value is treated just like a HYBRID non-popular value.

December 31, 2015

2015 in review

Filed under: Oracle — hourim @ 6:00 pm

This is the 2015 annual report for my blog. I hope I will continue sharing and learning interesting things with you.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 32,000 times in 2015. If it were a concert at Sydney Opera House, it would take about 12 sold-out performances for that many people to see it.

Click here to see the complete report.
