Very often, people get confused about the actual execution time taken by a parallel query. I’ve seen this asked a few times so that I have always thought it deserves a simple blog post. This is then the last article of the series about SQL monitor Duration and Elapsed time information where we’re going to become acquainted with how to get wall clock time of queries executed using parallelism.
SQL monitor and Parallelism
In the first article we learned that SQL monitor Elapsed time doesn’t sum up time spent outside the database by idle sessions. In the second one we outlined that the SQL monitor begins monitoring the query only when this one reaches its execution phase. In this article we will see that,for parallel query, the wall clock time, very often, approaches the Duration information of the SQL monitor report rather than that of the Elapsed time.
Having said that, here’s below a query which I will run using parallelism (t1 being a simple copy of all_objects table):
SQL> select /*+ parallel(4) */ object_type , count(1) from t1 group by object_type; 43 rows selected. Elapsed: 00:00:09.91
As you can see it took almost 10 seconds wall clock time for the above query to complete using the following parallel execution plan:
SQL_ID 08ka9ybgk43v1, child number 0 ------------------------------------- ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | TQ |IN-OUT| PQ Distrib | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | | | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 43 | Q1,01 | P->S | QC (RAND) | | 3 | HASH GROUP BY | | 43 | Q1,01 | PCWP | | | 4 | PX RECEIVE | | 43 | Q1,01 | PCWP | | | 5 | PX SEND HASH | :TQ10000 | 43 | Q1,00 | P->P | HASH | | 6 | HASH GROUP BY | | 43 | Q1,00 | PCWP | | | 7 | PX BLOCK ITERATOR | | 23131 | Q1,00 | PCWC | | |* 8 | TABLE ACCESS FULL| T1 | 23131 | Q1,00 | PCWP | | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 8 - access(:Z>=:Z AND :Z<=:Z) Note ----- - Degree of Parallelism is 4 because of hint
If you want to, or are planning to analyze the above parallel query through the SQL monitor report then pay attention to the following:
SELECT DBMS_SQLTUNE.report_sql_monitor( sql_id => '08ka9ybgk43v1', type => 'TEXT', report_level => 'ALL') AS report FROM dual; Global Information ------------------------------ Status : DONE (ALL ROWS) Instance ID : 1 Session : XXXX(296:2203) SQL ID : 08ka9ybgk43v1 SQL Execution ID : 16777216 Execution Started : 06/28/2018 07:30:52 First Refresh Time : 06/28/2018 07:30:54 Last Refresh Time : 06/28/2018 07:31:02 Duration : 10s Module/Action : SQL*Plus/- Service : YYYYY Program : sqlplus.exe Fetch Calls : 4 Global Stats ========================================================= | Elapsed | Cpu | IO | Application | Other | | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | ========================================================= | 32 | 0.91 | 29 | 1.74 | 0.08 | =========================================================
It is, again, the Duration of the Global Information which gives the wall clock time taken by the query executed via parallelism. The Elapsed Time of the Global Stats section represents time taken by the Query Coordinator and all its parallel slaves as clearly shown in the same SQL monitor report:
Parallel Execution Details (DOP=4 , Servers Allocated=8) ================================================================================= | Name | Type | Server# | Elapsed | Cpu | IO | Application | | | | | Time(s) | Time(s) | Waits(s) | Waits(s) | ================================================================================= | PX Coordinator | QC | | 2.26 | 0.01 | 0.50 | 1.74 | | p000 | Set 1 | 1 | 0.00 | 0.00 | | | | p001 | Set 1 | 2 | 0.00 | 0.00 | | | | p002 | Set 1 | 3 | 0.00 | 0.00 | | | | p003 | Set 1 | 4 | 0.00 | 0.00 | | | | p004 | Set 2 | 1 | 7.38 | 0.20 | 7.17 | | | p005 | Set 2 | 2 | 7.22 | 0.22 | 6.97 | | | p006 | Set 2 | 3 | 7.57 | 0.27 | 7.28 | | | p007 | Set 2 | 4 | 7.32 | 0.21 | 7.10 | | =================================================================================
Based on the information provided by the gv$sql_monitor view I’ve developed the following scripts showing how the query execution time has been distributed between the Query Coordinator and its parallel slaves
-- this gives execution time consumed by parallel slaves SQL> @PXElaps Enter value for sql_id: 08ka9ybgk43v1 Enter value for exec_id: 16777216 SQL_ID SQL_EXEC_ID SID PROCE ELAPSED ------------- ----------- ---------- ----- ---------- 08ka9ybgk43v1 16777216 254 p003 0 08ka9ybgk43v1 16777216 193 p000 0 08ka9ybgk43v1 16777216 214 p001 0 08ka9ybgk43v1 16777216 234 p002 0 08ka9ybgk43v1 16777216 299 p005 7.22 08ka9ybgk43v1 16777216 2 p007 7.32 08ka9ybgk43v1 16777216 277 p004 7.38 08ka9ybgk43v1 16777216 318 p006 7.57 ---------- Total Slaves 29.49 8 rows selected.
-- this gives execution time consumed by Query Coordinator SQL> @QCElaps Enter value for sql_id: 08ka9ybgk43v1 Enter value for exec_id: 16777216 SQL_ID QC_TIME DOP PX_SERVERS_REQUESTED PX_SERVERS_ALLOCATED ------------- ---------- ---------- -------------------- -------------------- 08ka9ybgk43v1 2.26 4 8 8
-- this gives execution time consumed by Query Coordinator and its PX slaves SQL> @QCPXElaps Enter value for sql_id: 08ka9ybgk43v1 Enter value for sql_exec_id: 16777216 SQL_ID SID PROCE ELAPS ------------------------- ---------- ----- ---------- 08ka9ybgk43v1 254 p003 0 08ka9ybgk43v1 2 p007 7.32 08ka9ybgk43v1 214 p001 0 08ka9ybgk43v1 299 p005 7.22 08ka9ybgk43v1 296 QC 2.26 08ka9ybgk43v1 193 p000 0 08ka9ybgk43v1 277 p004 7.38 08ka9ybgk43v1 318 p006 7.57 08ka9ybgk43v1 234 p002 0 ---------- Elapsed time(s) 31.75 9 rows selected.
The devil is in the details
Before you start crying ‘yes, but what about…‘, make sure you understand the following statistics of AWR historical execution of a particular parallel query taken from a running system:
SQL> @HistStats Enter value for sql_id: a8yny7n0x1gj1 Enter value for from_date: 01062018 SNAP_BEGIN PLAN_HASH_VALUE EXECS END_OF_FETCH AVG_ETIME AVG_PIO AVG_LIO AVG_ROWS ------------------------- --------------- -------- ------------ ---------- ---------- ---------- ---------- 20-JUN-18 08.00.58.250 PM 765437158 1 0 665 4708199 4789407 0 25-JUN-18 08.00.43.224 PM 765437158 1 0 773 5446986 5545989 0
At first glance many persons will think that this query took 665 seconds. In effect, we should not consider the above average elapsed time without the fundamental information which is: avg_px
agv_px = px_servers_execs_delta/decode (executions_delta, 0, 1, executions_delta)
This metric indicates whether the query runs parallel or not. If it is not null then the AVG_ETIME represents the cumulated execution time of taken by the Query Coordinator and all its parallel slaves used to run the query concurrently.
Here’s below the same AWR historical execution statistics augmented with the avg_px metric:
@HistStats Enter value for sql_id: a8yny7n0x1gj1 Enter value for from_date: 01052018 SNAP_BEGIN PLAN_HASH_VALUE EXECS END_OF_FETCH AVG_ETIME AVG_PX AVG_LIO ------------------------- --------------- ----- ------------ ---------- ------ -------- 20-JUN-18 08.00.58.250 PM 765437158 1 0 665 8 4789407 25-JUN-18 08.00.43.224 PM 765437158 1 0 773 8 5545989
Thanks to the avg_px supplementary information we know now that this query has used parallelism during its execution and that the AVG_ETIME of 665 seconds (or 773) corresponds to the Elapsed time taken by all parallel servers(8) including the query coordinator. If you want to measure the actual wall clock time taken by this query you have then to use the following approximative arithmetic:
ag_wc_time = trunc(avg_etime/decode(avg_px,0,1,avg_px) ,2) avg_px_time
Which when implemented into the above Histstats script gives this:
@HistStats Enter value for sql_id:a8yny7n0x1gj1 Enter value for from_date: 01052018 SNAP_BEGIN PLAN_HASH_VALUE EXECS END_OF_FETCH AVG_ETIME AVG_PX AVG_WC_TIME ------------------------- --------------- ----- ------------ ---------- ------ ----------- 20-JUN-18 08.00.58.250 PM 765437158 1 0 665 8 166.25 25-JUN-18 08.00.43.224 PM 765437158 1 0 773 8 193.25
Finally we are able to say that, approximatively, this query took about 193 seconds wall clock time during it execution of June the 25th at around 8PM.
Since the above query used parallelism it has been monitored as the following proves:
SQL> select sql_exec_id ,to_char(sql_exec_start, 'dd/mm/yyyy hh24:mi:ss') exec_date from gv$sql_monitor where sql_id = 'a8yny7n0x1gj1'; SQL_EXEC_ID EXEC_DATE ----------- ------------------- 16777225 25/06/2018 20:01:09 SELECT DBMS_SQLTUNE.report_sql_monitor( sql_id => 'a8yny7n0x1gj1', sql_exec_id => 16777225, type => 'TEXT', report_level => 'ALL') AS report FROM dual; Global Information ------------------------------ Status : DONE (ERROR) Instance ID : 1 SQL ID : a8yny7n0x1gj1 SQL Execution ID : 16777225 Execution Started : 06/25/2018 20:01:09 First Refresh Time : 06/25/2018 20:01:09 Last Refresh Time : 06/25/2018 20:04:20 Duration : 191s Global Stats ================================ | Elapsed | Cpu | IO | | Time(s) | Time(s) | Waits(s) | ================================ | 776 | 236 | 538 | ================================
Notice how using AWR historical tables(HistStats.sql) we are able to derive, almost reliably, the actual time taken by a query ran in parallel.
Summary
SQL monitor feature monitors performance of SQL statements while they execute. It is started automatically when a SQL statement runs parallel or serial but has consumed at least 5 seconds either on CPU or on I/O wait. The parsed phased of SQL statements is not monitored. In order to have an idea about the exact duration of the SQL statement execution then we have better to rely on the Duration information rather than on the Elapsed Time of the Global Stats section of the report. This is because Elapsed Time doesn’t include time spent by idle session fetching data for example. And when a SQL statement runs parallel, the Elapsed Time includes execution time of both the Query Coordinator and all its parallel server slaves while the Duration represents the wall clock time of the query that runs both serial and parallel.
SQL scritps
@PxElaps compute sum label 'Total Slaves time' of elapsed on report break on report select sql_id ,sql_exec_id ,sid , process_name , round(elapsed_time/1e6,2) elapsed -- , round(cpu_time/1e6) cpu --, round(concurrency_wait_time/1e6,2) conc -- , round(user_io_wait_time/1e6) IO_Waits from gv$sql_monitor where sql_id = '&sql_id' and sql_exec_id = '&exec_id' and sql_text is null order by round(elapsed_time/1e6,2);
@QCElaps select sql_id , round(elapsed_time/1e6,2) QC_time , px_maxdop DOP , px_servers_requested , px_servers_allocated , round(cpu_time/1e6) cpu --, round(concurrency_wait_time/1e6,2) conc , round(user_io_wait_time/1e6) IO_Waits from gv$sql_monitor where sql_id = '&sql_id' and sql_exec_id = '&exec_id' and sql_text is not null;
@QCPXElaps column label format a25 col sql_id format a25 compute sum label 'Elapsed time(s)' of elaps on report break on report select sql_id ,sid ,decode(process_name, 'ora', 'QC', process_name) process ,round(elapsed_time/1e6,2) elaps from gv$sql_monitor where sql_id = '&sql_id' and sql_exec_id = '&exec_id';
[…] SQL monitor and parallel processing […]
Pingback by SQL Monitor flaw | Mohamed Houri’s Oracle Notes — March 7, 2020 @ 8:13 am |