Tuning an insert/select via dblink

February 25, 2012

Tuning an insert/select via dblink

Filed under: Trouble shooting — hourim @ 11:40 am

Recently an interesting question came up on an Oracle French Forum about a performance problem during an insert/select operation via a dblink. The poster was suffering from the very long time (several hours without any response) the insert was taking. This is why he came to that forum and posted a question which looks like “what do you think if I commit every 1000 rows? Will my insert speed up”? That’s not the right way to follow in the process of troubleshooting an Oracle performance problem. I always remember Cary Millsap mantra “Why guess when you can know”. Fortunately Oracle gives us a lot fo tools and trace events to instrument what going on inside the Oracle database. Here below are the steps of discussion we’ve had (the original poster, others and I) in that forum.

I first have asked the poster to trace his insert select using events 10046


alter session set events '10046 trace name context forever, level 12';

INSERT /*+ append */ INTO my_table
(
champs_1,
champs_2,
champs_3,
champs_4,
champs_5,
champs_6,
champs_7
)
SELECT * FROM la_table_cible@mon_dblink ;

commit ;

alter session set events '10046 trace name context off';

I am not going to stress that I have asked the poster to ensure that his direct path load will not be silently ignored by Oracle in case the table “my_table” has trigger or foreign key integrity constraints implemented on it. I have also asked if there will be subsequent deletes (after the insert) on this table making future direct path load not reusing any freed space made by that delete.

The generated trace file profiled via orasrp profiler shows that 98% of the insert/select response time is spent waiting in the event SQL*Net message from dblink.

Up to this step, we persuaded the poster that his initial guess of committing every 1000 records was not the ultimate solution; but instead he should focus on reducing the volume of data transferred from the distant data base to the local one. The poster sends us the explain plan and Statistics of the select part when this one is executed in the distant data base

Statistics
----------------------------------------------
212        recursive calls
3          db block gets
678993     consistent gets
483744     physical reads
592        redo size
124692665  bytes sent via SQL*Net TO client
1391305    bytes received via SQL*Net FROM client
126437     SQL*Net roundtrips TO/FROM client
17         sorts (memory)
0          sorts (disk)
1896535    rows processed

He is selecting 1,896,535 records via 126,437 SQL*Net roundtrips meaning that he is using the default arraysize which is 15 (1,896,535/126,437). For this, he did 0.358 (678,993/1,896,535) logical read for each selected row which seems to be quite acceptable. But, he is sending a very large amount of data 124MB over a data base link and making a hundred of thousands SQL*Net roundtrips. Note, that he is doing also a fairly large amount of physical read (483,744) when compared to the total number of logical reads (678,993).

I’ve then asked the poster to stop selecting all the rows from the distant data base and to start selecting only necessary rows. I asked also to post the explain plan of this query in attempt to help him tuning the select part of his problematic insert. When he said that he can’t intervene into the distant data base, I have then asked him to manage his insert with a PL/SQL stored procedure as shown below:


CREATE type distant_typ AS object (champs_1 number,champs_2 number,champs_3 number, ... champs_7 date);
2  /

CREATE type distant_typ_tab AS TABLE of distant_typ;
2  /

DECLARE
CURSOR get_data_from_db IS
SELECT
     champs_1,
     champs_2,
     champs_3,
     champs_4,
     champs_5,
     champs_6,
     champs_7
FROM la_table_cible@mon_dblink
;

BEGIN

OPEN get_data_from_db;
LOOP
FETCH get_data_from_db BULK collect INTO distant_typ_tab LIMIT 100;
FOR i in 1..distant_typ_tab.COUNT
LOOP
INSERT INTO INSERT /*+ append */ INTO ma_table
(champs_1,
 champs_2,
 champs_3,
 champs_4,
 champs_5,
 champs_6,
 champs_7
)
SELECT
 champs_1,
 champs_2,
 champs_3,
 champs_4,
 champs_5,
 champs_6,
 champs_7
FROM table (distant_typ_tab);
EXIXT when get_data_from_db%notfound;
END LOOP;
END LOOP;
COMMIT;
END;

And magically the poster comes out and said “fantastic my insert/select now is completing in a very acceptable time”.

Comments (12)

12 Comments »

Hi, can you explain more about why the PL/SQL store procedure provide a better performance

Comment by Jason Zhong — December 19, 2012 @ 5:58 am | Reply
Jason,

There is a crucial issue to tackle when dealing with insert/select via db-link. The select part might involve both the local and the distant databases. By default, Oracle will consider the local database as the driving site and will (1) go the distant database (2) pull as many data as allowed to do by the SQL*Net and the arraysize and (3) join/eliminate data in the local database. When the volume of data to be exchanged between the distant and the local database is very huge it alters considerably the response time of the entire select process. In this case you can simply decide to make the distant database as the default driving site by using the strategic hint /*+ driving_site (..) */.

However, using this hint in an insert/select via db-link is of no effect. The local database will be always considered as the driving site. There are additional details in the following blog article:

Distributed DML

So, in response to your question I could say that when troubleshooting an insert/select via db-link where time is found to be consumed in the data exchange between the local and the distant database and due to the above restrictions on the choice of the driving site, we could walk around this issue by dividing the insert/select into small pieces of Pl/SQL inserts instead of one and unique large insert.

Comment by Houri Mohamed — December 20, 2012 @ 10:57 am | Reply

Using cardinality hint in subselect seems to work to get work of select done on remote db first
Driving hint and other hints are ignored for dml insert

 INSERT /*+ append */  
        INTO MIG_CGD30_TEST       
                SELECT  /*+ cardinality(ZFD 400000) cardinality(CGD 60000000)*/ 
            TRIM (CGD.NUMCPT) AS NUMCPT, TRIM (ZFD.NUMBDC_NEW) AS NUMBDC
              FROM CGD30@DBL_MIG_THALER CGD,
                   ZFD10@DBL_MIG_THALER ZFD,
                   EVD01_ADS_DR3W2  EVD

Comment by Lavallee Alain — February 13, 2013 @ 5:48 pm | Reply

Alain,

When you use the cardinality hint you are changing the arithmetic of the Optimizer. You are telling the CBO that he has to trust you and consider the cardinality (number of rows of the corresponding table) you are supplying it thanks to this hint.

You don’t really have to do that if the statistics reflect the exact reality of the volume of the data and how this data is scattered (distributed) .Based on an adequate statistics the CBO is able to get the best execution plan possible. However, the reality is such sometimes you know your data better than the CBO does and hence you can hint your query as you seem to have succeeded to do in this insert/select via a data base link

But keep an open eye on this. Things might change when data change!!!

Comment by Houri — February 14, 2013 @ 1:31 pm | Reply

the followings error occurred…

LINE/COL ERROR
-------- -----------------------------------------------------------------
10/1     PL/SQL: SQL Statement ignored
10/42    PLS-00321: expression 'DISTANT_TYP_TAB' is inappropriate as the
         left hand side of an assignment statement

11/1     PL/SQL: Statement ignored
11/29    PLS-00302: component 'COUNT' must be declared


My Scripts....

CREATE table ma_Table (champs_1 number,champs_2 number,champs_3 number);
CREATE table ma_Table2 (champs_1 number,champs_2 number,champs_3 number);

CREATE type distant_typ AS object (champs_1 number,champs_2 number,champs_3 number);
/

CREATE type distant_typ_tab AS TABLE of distant_typ;
/

  1  create or replace procedure sys.test
  2  IS
  3  CURSOR get_data_from_db IS
  4  SELECT  champs_1, champs_2, champs_3 FROM sys.ma_table2@local;
  5
  6  BEGIN
  7
  8  OPEN get_data_from_db;
  9  LOOP
 10  FETCH get_data_from_db BULK collect INTO distant_typ_tab LIMIT 100;
 11  FOR i in 1..distant_typ_tab.COUNT
 12  LOOP
 13  INSERT INTO /*+ append */ sys.ma_table
 14  (champs_1,
 15   champs_2,
 16   champs_3
 17  )
 18  SELECT
 19   champs_1,
 20   champs_2,
 21   champs_3
 22  FROM table (distant_typ_tab);
 23  EXIT when get_data_from_db%notfound;
 24  END LOOP;
 25  END LOOP;
 26  COMMIT;
 27  END;
 28  /

* @local : loopback link

Comment by Yeo — February 6, 2014 @ 8:44 am | Reply

Yeo,

I am sorry for that. I haven’t managed to give a reproducible example. So thanks for prompting me to do it

SQL> create table t as select rownum n1 from dual connect by level <= 10;

Table created.

SQL> create type distant_typ AS object(m1 number);
  2  /

Type created.

SQL> create type distant_tab AS table of distant_typ;
  2  /

Type created.

And here below is the code : t is a table I have created in the distant database while type distant_typ are types I have created in the local database (sorry for the confusion)

SQL> select count(1) from t_local;

  COUNT(1)
----------
         0

SQL> DECLARE
    
    CURSOR get_data IS
    SELECT
         distant_typ (a.n1)
    FROM t@xxx.world a;
    
    l_distant_tab distant_tab := distant_tab();
    
   BEGIN
   
   OPEN get_data;
   LOOP
       FETCH get_data BULK collect INTO l_distant_tab LIMIT 5;
       INSERT INTO t_local
           (n1
           )
           SELECT
            m1
           FROM table (l_distant_tab);
    EXIT WHEN  get_data%NOTFOUND;
   END LOOP;
   COMMIT;
   END;
   /

PL/SQL procedure successfully completed.

SQL> select count(1) from t_local;

  COUNT(1)
----------
        10

Comment by Mohamed — February 6, 2014 @ 10:31 am | Reply

[…] https://hourim.wordpress.com/2012/02/25/tuning-an-insertselect-via-dblink/ […]

Pingback by How to Speed up Deletes | abhayus — November 4, 2014 @ 8:51 pm | Reply
Hi Mohamed,

interesting thoughts. But for me this is shifting the problem from a plain SQL Solution to a mixture of PL/SQL and SQL.
I guess this also implies that different memorie areas are used.
Maybe You should spend some words on context switches as well?

Best regards
Jonas

Comment by Jonas Gassenmeyer — December 22, 2016 @ 1:59 pm | Reply
Jonas

This was a real life situation where most of the insert/select time was spent transferring data from the distant to the local database. Since this is an insert there is no way to force a driving site. The driving site is by default the site where the inserted table resides. Of course that whenever possible a plain SQL insert/select is better than a PL/SQL block. But depending on the amount of data that transits between the two database it might reveal better to opt for the PL/SQL option. There is, however, sometimes a pure SQL way to solve performance issue in distributed queries that are due to Oracle going back and forth between the two databases. My comment in Jonathan Lewis article below illustrates this way

Distributed Trap

Best regards
Mohamed Houri

Comment by hourim — December 22, 2016 @ 2:49 pm | Reply
Hello!

For me it is not clear why the PL/SQL solution is faster.
Does it do less roundtrips?
Does it send less data to remote DB?
Or else?

Please clarify the reason why it is a better solution.
Thanks
RobK

Comment by RobK — April 5, 2018 @ 9:53 am | Reply
We should always prefer a SQL insert/select instead of a PL/SQL block with a loop insert end loop. But when it comes to an insert/select via db link sometimes there is performance problem that kicks in because of for example the driving site is always the site where the inserted table is situated. In such a situation a plain SQL insert/select will suffer from this restriction which we can eventually overcome using a PL/SQL block. And the blog example was real life case of such a situation

Best regards
Mohamed

Comment by Mohamed — April 12, 2018 @ 7:22 pm | Reply
I get PL/SQL: ORA-22800: invalid user-defined type distant_typ for this line
SELECT
distant_typ (a.n1)
FROM t@xxx.world a;

Comment by Raja — July 11, 2018 @ 10:34 pm | Reply

RSS feed for comments on this post. TrackBack URI

Mohamed Houri’s Oracle Notes

February 25, 2012