Mohamed Houri’s Oracle Notes

February 25, 2012

Tuning an insert/select via dblink

Filed under: Trouble shooting — hourim @ 11:40 am

Recently an interesting question came up on an Oracle French Forum about a performance problem during an insert/select operation via a dblink. The poster was suffering from the very long time (several hours without any response) the insert was taking. This is why he came to that forum and posted a question which looks like “what do you think if I commit every 1000 rows? Will my insert speed up”?  That’s not the right way to follow in the process of troubleshooting an Oracle performance problem. I always remember Cary Millsap mantra “Why guess when you can know”. Fortunately Oracle gives us a lot fo tools and trace events to instrument what going on inside the Oracle database. Here below are the steps of discussion we’ve had (the original poster, others and I) in that forum.

I first have asked the poster to trace his insert select using events 10046


alter session set events '10046 trace name context forever, level 12';

INSERT /*+ append */ INTO my_table
(
champs_1,
champs_2,
champs_3,
champs_4,
champs_5,
champs_6,
champs_7
)
SELECT * FROM la_table_cible@mon_dblink ;

commit ;

alter session set events '10046 trace name context off';

I am not going to stress that I have asked the poster to ensure that his direct path load will not be silently ignored by Oracle in case the table “my_table” has trigger or foreign key integrity constraints implemented on it. I have also asked if there will be subsequent deletes (after the insert) on this table making future direct path load not reusing any freed space made by that delete.

The generated trace file profiled via orasrp profiler shows that 98% of the insert/select response time is spent waiting in the event SQL*Net message from dblink.

Up to this step, we persuaded the poster that his initial guess of committing every 1000 records was not the ultimate solution; but instead he should focus on reducing the volume of data transferred from the distant data base to the local one. The poster sends us the explain plan and Statistics of the select part when this one is executed in the distant data base

Statistics
----------------------------------------------
212        recursive calls
3          db block gets
678993     consistent gets
483744     physical reads
592        redo size
124692665  bytes sent via SQL*Net TO client
1391305    bytes received via SQL*Net FROM client
126437     SQL*Net roundtrips TO/FROM client
17         sorts (memory)
0          sorts (disk)
1896535    rows processed

He is selecting 1,896,535 records via 126,437 SQL*Net roundtrips meaning that he is using the default arraysize which is 15 (1,896,535/126,437). For this, he did 0.358 (678,993/1,896,535) logical read for each selected row which seems to be quite acceptable. But, he is sending a very large amount of data 124MB over a data base link and making a hundred of thousands SQL*Net roundtrips. Note, that he is doing also a fairly large amount of physical read (483,744) when compared to the total number of logical reads (678,993).

I’ve then asked the poster to stop selecting all the rows from the distant data base and to start selecting only necessary rows. I asked also to post the explain plan of this query in attempt to help him tuning the select part of his problematic insert. When he said that he can’t intervene into the distant data base, I have then asked him to manage his insert with a PL/SQL stored procedure as shown below:


CREATE type distant_typ AS object (champs_1 number,champs_2 number,champs_3 number, ... champs_7 date);
2  /

CREATE type distant_typ_tab AS TABLE of distant_typ;
2  /

DECLARE
CURSOR get_data_from_db IS
SELECT
     champs_1,
     champs_2,
     champs_3,
     champs_4,
     champs_5,
     champs_6,
     champs_7
FROM la_table_cible@mon_dblink
;

BEGIN

OPEN get_data_from_db;
LOOP
FETCH get_data_from_db BULK collect INTO distant_typ_tab LIMIT 100;
FOR i in 1..distant_typ_tab.COUNT
LOOP
INSERT INTO INSERT /*+ append */ INTO ma_table
(champs_1,
 champs_2,
 champs_3,
 champs_4,
 champs_5,
 champs_6,
 champs_7
)
SELECT
 champs_1,
 champs_2,
 champs_3,
 champs_4,
 champs_5,
 champs_6,
 champs_7
FROM table (distant_typ_tab);
EXIXT when get_data_from_db%notfound;
END LOOP;
END LOOP;
COMMIT;
END;

And magically the poster comes out and said “fantastic my insert/select now is completing in a very acceptable time”.

12 Comments »

  1. Hi, can you explain more about why the PL/SQL store procedure provide a better performance

    Comment by Jason Zhong — December 19, 2012 @ 5:58 am | Reply

  2. Jason,

    There is a crucial issue to tackle when dealing with insert/select via db-link. The select part might involve both the local and the distant databases. By default, Oracle will consider the local database as the driving site and will (1) go the distant database (2) pull as many data as allowed to do by the SQL*Net and the arraysize and (3) join/eliminate data in the local database. When the volume of data to be exchanged between the distant and the local database is very huge it alters considerably the response time of the entire select process. In this case you can simply decide to make the distant database as the default driving site by using the strategic hint /*+ driving_site (..) */.

    However, using this hint in an insert/select via db-link is of no effect. The local database will be always considered as the driving site. There are additional details in the following blog article:

    Distributed DML

    So, in response to your question I could say that when troubleshooting an insert/select via db-link where time is found to be consumed in the data exchange between the local and the distant database and due to the above restrictions on the choice of the driving site, we could walk around this issue by dividing the insert/select into small pieces of Pl/SQL inserts instead of one and unique large insert.

    Comment by Houri Mohamed — December 20, 2012 @ 10:57 am | Reply

  3. Using cardinality hint in subselect seems to work to get work of select done on remote db first
    Driving hint and other hints are ignored for dml insert

     INSERT /*+ append */  
            INTO MIG_CGD30_TEST       
                    SELECT  /*+ cardinality(ZFD 400000) cardinality(CGD 60000000)*/ 
                TRIM (CGD.NUMCPT) AS NUMCPT, TRIM (ZFD.NUMBDC_NEW) AS NUMBDC
                  FROM CGD30@DBL_MIG_THALER CGD,
                       ZFD10@DBL_MIG_THALER ZFD,
                       EVD01_ADS_DR3W2  EVD
    

    Comment by Lavallee Alain — February 13, 2013 @ 5:48 pm | Reply

  4. Alain,

    When you use the cardinality hint you are changing the arithmetic of the Optimizer. You are telling the CBO that he has to trust you and consider the cardinality (number of rows of the corresponding table) you are supplying it thanks to this hint.

    You don’t really have to do that if the statistics reflect the exact reality of the volume of the data and how this data is scattered (distributed) .Based on an adequate statistics the CBO is able to get the best execution plan possible. However, the reality is such sometimes you know your data better than the CBO does and hence you can hint your query as you seem to have succeeded to do in this insert/select via a data base link

    But keep an open eye on this. Things might change when data change!!!

    Comment by Houri — February 14, 2013 @ 1:31 pm | Reply

  5. the followings error occurred…

    LINE/COL ERROR
    -------- -----------------------------------------------------------------
    10/1     PL/SQL: SQL Statement ignored
    10/42    PLS-00321: expression 'DISTANT_TYP_TAB' is inappropriate as the
             left hand side of an assignment statement
    
    11/1     PL/SQL: Statement ignored
    11/29    PLS-00302: component 'COUNT' must be declared
    
    
    My Scripts....
    
    CREATE table ma_Table (champs_1 number,champs_2 number,champs_3 number);
    CREATE table ma_Table2 (champs_1 number,champs_2 number,champs_3 number);
    
    CREATE type distant_typ AS object (champs_1 number,champs_2 number,champs_3 number);
    /
    
    CREATE type distant_typ_tab AS TABLE of distant_typ;
    /
    
      1  create or replace procedure sys.test
      2  IS
      3  CURSOR get_data_from_db IS
      4  SELECT  champs_1, champs_2, champs_3 FROM sys.ma_table2@local;
      5
      6  BEGIN
      7
      8  OPEN get_data_from_db;
      9  LOOP
     10  FETCH get_data_from_db BULK collect INTO distant_typ_tab LIMIT 100;
     11  FOR i in 1..distant_typ_tab.COUNT
     12  LOOP
     13  INSERT INTO /*+ append */ sys.ma_table
     14  (champs_1,
     15   champs_2,
     16   champs_3
     17  )
     18  SELECT
     19   champs_1,
     20   champs_2,
     21   champs_3
     22  FROM table (distant_typ_tab);
     23  EXIT when get_data_from_db%notfound;
     24  END LOOP;
     25  END LOOP;
     26  COMMIT;
     27  END;
     28  /
    

    * @local : loopback link

    Comment by Yeo — February 6, 2014 @ 8:44 am | Reply

  6. Yeo,

    I am sorry for that. I haven’t managed to give a reproducible example. So thanks for prompting me to do it

    SQL> create table t as select rownum n1 from dual connect by level <= 10;
    
    Table created.
    
    SQL> create type distant_typ AS object(m1 number);
      2  /
    
    Type created.
    
    SQL> create type distant_tab AS table of distant_typ;
      2  /
    
    Type created.
    

    And here below is the code : t is a table I have created in the distant database while type distant_typ are types I have created in the local database (sorry for the confusion)

    SQL> select count(1) from t_local;
    
      COUNT(1)
    ----------
             0
    
    SQL> DECLARE
        
        CURSOR get_data IS
        SELECT
             distant_typ (a.n1)
        FROM t@xxx.world a;
        
        l_distant_tab distant_tab := distant_tab();
        
       BEGIN
       
       OPEN get_data;
       LOOP
           FETCH get_data BULK collect INTO l_distant_tab LIMIT 5;
           INSERT INTO t_local
               (n1
               )
               SELECT
                m1
               FROM table (l_distant_tab);
        EXIT WHEN  get_data%NOTFOUND;
       END LOOP;
       COMMIT;
       END;
       /
    
    PL/SQL procedure successfully completed.
    
    SQL> select count(1) from t_local;
    
      COUNT(1)
    ----------
            10
    

    Comment by Mohamed — February 6, 2014 @ 10:31 am | Reply

  7. Hi Mohamed,

    interesting thoughts. But for me this is shifting the problem from a plain SQL Solution to a mixture of PL/SQL and SQL.
    I guess this also implies that different memorie areas are used.
    Maybe You should spend some words on context switches as well?

    Best regards
    Jonas

    Comment by Jonas Gassenmeyer — December 22, 2016 @ 1:59 pm | Reply

  8. Jonas

    This was a real life situation where most of the insert/select time was spent transferring data from the distant to the local database. Since this is an insert there is no way to force a driving site. The driving site is by default the site where the inserted table resides. Of course that whenever possible a plain SQL insert/select is better than a PL/SQL block. But depending on the amount of data that transits between the two database it might reveal better to opt for the PL/SQL option. There is, however, sometimes a pure SQL way to solve performance issue in distributed queries that are due to Oracle going back and forth between the two databases. My comment in Jonathan Lewis article below illustrates this way

    Distributed Trap

    Best regards
    Mohamed Houri

    Comment by hourim — December 22, 2016 @ 2:49 pm | Reply

  9. Hello!

    For me it is not clear why the PL/SQL solution is faster.
    Does it do less roundtrips?
    Does it send less data to remote DB?
    Or else?

    Please clarify the reason why it is a better solution.
    Thanks
    RobK

    Comment by RobK — April 5, 2018 @ 9:53 am | Reply

  10. We should always prefer a SQL insert/select instead of a PL/SQL block with a loop insert end loop. But when it comes to an insert/select via db link sometimes there is performance problem that kicks in because of for example the driving site is always the site where the inserted table is situated. In such a situation a plain SQL insert/select will suffer from this restriction which we can eventually overcome using a PL/SQL block. And the blog example was real life case of such a situation

    Best regards
    Mohamed

    Comment by Mohamed — April 12, 2018 @ 7:22 pm | Reply

  11. I get PL/SQL: ORA-22800: invalid user-defined type distant_typ for this line
    SELECT
    distant_typ (a.n1)
    FROM t@xxx.world a;

    Comment by Raja — July 11, 2018 @ 10:34 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a reply to Houri Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud's blog

Just another blog : Databases, Linux and other stuffs

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)