The Oracle Cost Based Optimizer, the MS-SQL Server optimizer and the PostgreSQL query planner cannot use a NESTED LOOP physical operation to execute the FULL OUTER and RIGHT OUTER logical join operations. They all work around the RIGHT OUTER join limitation by switching the inner and the outer row sources so that a LEFT OUTER JOIN can be used. While the first two optimizers turn the FULL OUTER join into a LEFT OUTER join concatenated with an ANTI-join, the PostgreSQL query planner will always use a HASH or MERGE JOIN to perform a FULL OUTER join.
Let’s make this less confusing by starting with the basics. The algorithm of a NESTED LOOP physical operation is:
for each row ro in the outer row source
loop
   for each row ri in the inner row source
   loop
      if ro joins ri
      then
         return current row
   end loop
end loop
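The pseudocode above can be sketched in runnable form. This is a minimal Python illustration, not what any of the three engines actually executes. It assumes the rows of the Model script at the end of the article, with the first element of each tuple playing the role of the join column; the function name `nested_loop_inner_join` is made up for this example.

```python
# Rows taken from the Model section: (join key, payload).
T1 = [(1, "t1x"), (2, "t1y"), (3, "t1z")]
T2 = [(2, "t2x"), (3, "t2y"), (3, "t2yy"), (4, "t2z")]


def nested_loop_inner_join(outer, inner):
    """Return the outer row (t1.*) once per matching inner row."""
    result = []
    for ro in outer:            # for each row in the outer row source
        for ri in inner:        # for each row in the inner row source
            if ro[0] == ri[0]:  # "ro joins ri": equality on the join key
                result.append(ro)
    return result


rows = nested_loop_inner_join(T1, T2)
print(rows)
```

The inner row source is visited once per outer row, which is what makes the choice of the outer (driving) row source so important for this join method.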
1. Oracle 12cR2
A simple execution of the above algorithm can be schematically represented via the following Oracle execution plan:
select /*+ use_nl(t1 t2) */
   t1.*
from t1 inner join t2
on t1.n1 = t2.n1;

----------------------------------------------------------------
| Id  | Operation          | Name   | Starts | E-Rows | A-Rows |
----------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |      1 |        |      3 |
|   1 |  NESTED LOOPS      |        |      1 |      4 |      3 |
|   2 |   TABLE ACCESS FULL| T1     |      1 |      3 |      3 |
|*  3 |   INDEX RANGE SCAN | T2_IDX |      3 |      1 |      3 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N1"="T2"."N1")
As you can see, in accordance with the algorithm, the T2_IDX index was scanned 3 times (Starts = 3, operation id n°3), once for each of the 3 rows of the T1 table (A-Rows = 3, operation id n°2).
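The Starts and A-Rows figures of operation 3 can be reproduced with a small simulation. The sketch below is only an approximation: a Python dictionary stands in for the T2_IDX index, the data comes from the Model script at the end of the article, and the variable names `starts` and `a_rows` are chosen here to mirror the plan columns.

```python
from collections import defaultdict

# Rows taken from the Model section: (join key, payload).
T1 = [(1, "t1x"), (2, "t1y"), (3, "t1z")]
T2 = [(2, "t2x"), (3, "t2y"), (3, "t2yy"), (4, "t2z")]

# Hypothetical stand-in for T2_IDX: join key -> list of T2 rows.
t2_idx = defaultdict(list)
for row in T2:
    t2_idx[row[0]].append(row)

starts = 0  # how many times the inner operation is started
a_rows = 0  # how many rows the inner operation returns in total
for ro in T1:
    starts += 1                      # one index probe per outer row
    matches = t2_idx.get(ro[0], [])  # rows of T2 with the same join key
    a_rows += len(matches)

print(starts, a_rows)  # 3 3
```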
Let’s now try a FULL OUTER join but without any hint:
select
   t1.*
from t1 full outer join t2
on t1.n1 = t2.n1;

--------------------------------------------------
| Id  | Operation             | Name     | Rows  |
--------------------------------------------------
|   0 | SELECT STATEMENT      |          |       |
|   1 |  VIEW                 | VW_FOJ_0 |     4 |
|*  2 |   HASH JOIN FULL OUTER|          |     4 |
|   3 |    TABLE ACCESS FULL  | T1       |     3 |
|   4 |    TABLE ACCESS FULL  | T2       |     4 |
--------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")
So far so good. A HASH JOIN FULL OUTER to honor a full outer join between two tables.
But what if I want to use a NESTED LOOP FULL OUTER join instead of a HASH JOIN FULL OUTER?
select /*+ use_nl(t1 t2) */
   t1.*
from t1 FULL outer join t2
on t1.n1 = t2.n1;

-------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes |
-------------------------------------------------------
|   0 | SELECT STATEMENT     |        |       |       |
|   1 |  VIEW                |        |     6 |   120 |
|   2 |   UNION-ALL          |        |       |       |
|   3 |    NESTED LOOPS OUTER|        |     4 |    40 |
|   4 |     TABLE ACCESS FULL| T1     |     3 |    21 |
|*  5 |     INDEX RANGE SCAN | T2_IDX |     1 |     3 |
|   6 |    NESTED LOOPS ANTI |        |     2 |    12 |
|   7 |     TABLE ACCESS FULL| T2     |     4 |    12 |
|*  8 |     TABLE ACCESS FULL| T1     |     2 |     6 |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T1"."N1"="T2"."N1")
   8 - filter("T1"."N1"="T2"."N1")
What the heck is this execution plan of 8 operations?
Instead of a simple NESTED LOOP FULL OUTER, I got a concatenation of a NESTED LOOPS OUTER and a NESTED LOOPS ANTI join. That’s an interesting transformation performed by the CBO.
Had I tried to reverse engineer the query that sits behind the above execution plan, I would very probably have obtained the following query:
select
   t1.*
from
   t1
  ,t2
where t1.n1 = t2.n1(+)
union all
select
   t2.*
from t2
where not exists (select /*+ use_nl(t2 t1) */
                     null
                  from t1
                  where t1.n1 = t2.n1);

------------------------------------------------------
| Id  | Operation           | Name   | Rows  | Bytes |
------------------------------------------------------
|   0 | SELECT STATEMENT    |        |       |       |
|   1 |  UNION-ALL          |        |       |       |
|   2 |   NESTED LOOPS OUTER|        |     4 |    40 |
|   3 |    TABLE ACCESS FULL| T1     |     3 |    21 |
|*  4 |    INDEX RANGE SCAN | T2_IDX |     1 |     3 |
|   5 |   NESTED LOOPS ANTI |        |     2 |    20 |
|   6 |    TABLE ACCESS FULL| T2     |     4 |    28 |
|*  7 |    TABLE ACCESS FULL| T1     |     2 |     6 |
------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T1"."N1"="T2"."N1")
   7 - filter("T1"."N1"="T2"."N1")
In fact, when I directed Oracle to use a NESTED LOOP to FULL OUTER JOIN T1 and T2, it turned my instruction into:
T1 LEFT OUTER JOIN T2 UNION ALL T2 ANTI JOIN T1
Which is nothing other than:
- select all rows from T1 and T2 provided they join
- add to these rows, rows from T1 that don’t join (LEFT OUTER)
- add to these rows, all rows from T2 that don’t join (ANTI) with rows from T1
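The three steps above can be sketched as two nested loops whose results are concatenated. This is a minimal Python illustration, assuming the rows of the Model script at the end of the article; the helper names `nl_left_outer` and `nl_anti` are invented for this example and do not correspond to any Oracle internals.

```python
# Rows taken from the Model section: (join key, payload).
T1 = [(1, "t1x"), (2, "t1y"), (3, "t1z")]
T2 = [(2, "t2x"), (3, "t2y"), (3, "t2yy"), (4, "t2z")]


def nl_left_outer(outer, inner):
    """Nested loop that preserves outer rows without a match."""
    out = []
    for ro in outer:
        matched = False
        for ri in inner:
            if ro[0] == ri[0]:
                out.append((ro, ri))
                matched = True
        if not matched:              # known as soon as the inner scan ends
            out.append((ro, None))
    return out


def nl_anti(outer, inner):
    """Nested loop that keeps only outer rows with no match at all."""
    return [ro for ro in outer if not any(ro[0] == ri[0] for ri in inner)]


left_part = nl_left_outer(T1, T2)                   # T1 LEFT OUTER JOIN T2
anti_part = [(None, r2) for r2 in nl_anti(T2, T1)]  # T2 ANTI JOIN T1
full_outer = left_part + anti_part                  # UNION ALL

for pair in full_outer:
    print(pair)
```

With this data the left outer part yields four pairs and the anti part yields the single T2 row with key 4, so the concatenation contains every row that a full outer join is expected to return.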
Do you know why Oracle went through all these somewhat complicated gymnastics?
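It did so because I asked it to do an impossible operation: a NESTED LOOP doesn’t support a FULL OUTER join. The loop handles one outer row at a time, so it can emit an outer row that found no match as soon as the inner scan ends, but it cannot know that an inner row has no match until every outer row has been processed.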
It doesn’t support a RIGHT OUTER join either, as shown below:
select /*+ use_nl(t1 t2) */
   t1.*
from t1 RIGHT outer join t2
on t1.n1 = t2.n1;

---------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |
---------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |
|   1 |  NESTED LOOPS OUTER|      |     4 |    40 |
|   2 |   TABLE ACCESS FULL| T2   |     4 |    12 |
|*  3 |   TABLE ACCESS FULL| T1   |     1 |     7 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("T1"."N1"="T2"."N1")
Don’t be confused here. The above RIGHT OUTER JOIN has been turned into a LEFT OUTER JOIN by switching the inner and the outer table. With T2 now placed on the left side of the join, Oracle is able to use a NESTED LOOP to perform a LEFT OUTER JOIN. You will see this clearly explained in the corresponding SQL Server execution plan I will show later in this article.
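The switch of row sources can be illustrated with a short Python sketch. It is only a model of the idea, using the rows of the Model script at the end of the article; `nl_left_outer` is a hypothetical helper written for this example.

```python
# Rows taken from the Model section: (join key, payload).
T1 = [(1, "t1x"), (2, "t1y"), (3, "t1z")]
T2 = [(2, "t2x"), (3, "t2y"), (3, "t2yy"), (4, "t2z")]


def nl_left_outer(outer, inner):
    """Nested loop that preserves outer rows without a match."""
    out = []
    for ro in outer:
        matched = False
        for ri in inner:
            if ro[0] == ri[0]:
                out.append((ro, ri))
                matched = True
        if not matched:
            out.append((ro, None))
    return out


# "t1 RIGHT OUTER JOIN t2" must preserve every T2 row, so T2 drives the loop.
swapped = nl_left_outer(T2, T1)                   # pairs of (t2 row, t1 row or None)
right_outer = [(r1, r2) for (r2, r1) in swapped]  # back to (t1, t2) column order

for pair in right_outer:
    print(pair)
```

The preserved table is always the driving one, which is why the loop can tell, at the end of each inner scan, whether the current preserved row found a partner.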
2. PostgreSQL 10.1
Since there are no hints in PostgreSQL to force a NESTED LOOP join, I will start by disabling the hash and merge join operations as shown below:
postgres=# set enable_mergejoin=false;
SET
postgres=# set enable_hashjoin=false;
SET
And now I am ready to show you how the PostgreSQL query planner turns a right outer join into a left outer join when a NESTED LOOP operation is used:
postgres=# explain
postgres-# select
postgres-#    t1.*
postgres-# from t1 right outer join t2
postgres-# on t1.n1 = t2.n1;
                            QUERY PLAN
-------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..95.14 rows=23 width=42)
   Join Filter: (t1.n1 = t2.n1)
   ->  Seq Scan on t2  (cost=0.00..1.04 rows=4 width=4)
   ->  Materialize  (cost=0.00..27.40 rows=1160 width=42)
         ->  Seq Scan on t1  (cost=0.00..21.60 rows=1160 width=42)
(5 rows)
However, in contrast to Oracle and MS-SQL Server, the PostgreSQL query planner is unable to transform a full outer join into a combination of a NESTED LOOP LEFT OUTER join and an ANTI-join, as the following demonstrates:
explain
select
   t1.*
from t1 full outer join t2
on t1.n1 = t2.n1;
                                QUERY PLAN
--------------------------------------------------------------------------
 Hash Full Join  (cost=10000000001.09..10000000027.27 rows=1160 width=42)
   Hash Cond: (t1.n1 = t2.n1)
   ->  Seq Scan on t1  (cost=0.00..21.60 rows=1160 width=42)
   ->  Hash  (cost=1.04..1.04 rows=4 width=4)
         ->  Seq Scan on t2  (cost=0.00..1.04 rows=4 width=4)
Note in passing that disabling the hash join option (set enable_hashjoin=false) is not an absolute prohibition. Whenever the query planner is unable to find another way to accomplish its work, it will use any available option, even one that has been explicitly disabled.
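The cost of the Hash Full Join above starts at roughly 1e10, which suggests that a disabled join method is not removed from consideration but heavily penalized. The following Python sketch is a toy model of that behaviour, not the planner's actual code: the constant `DISABLE_COST`, the candidate list and every cost except the hash join figure are assumptions made for illustration.

```python
# Penalty assumed from the plan above, where the costs start at roughly 1e10.
DISABLE_COST = 1.0e10

# Hypothetical candidates for "t1 full outer join t2". Only the hash join cost
# (27.27, i.e. 10000000027.27 minus the penalty) is derived from the plan above;
# the other figures are made up.
candidates = [
    {"method": "nested loop", "cost": 50.0, "enabled": True, "supports_full_join": False},
    {"method": "hash join", "cost": 27.27, "enabled": False, "supports_full_join": True},
    {"method": "merge join", "cost": 100.0, "enabled": False, "supports_full_join": True},
]


def effective_cost(candidate):
    """Disabled methods stay eligible but carry a huge cost penalty."""
    penalty = 0.0 if candidate["enabled"] else DISABLE_COST
    return candidate["cost"] + penalty


def choose_full_join_method(candidates):
    # Methods that cannot implement a full outer join are never considered.
    feasible = [c for c in candidates if c["supports_full_join"]]
    return min(feasible, key=effective_cost)


best = choose_full_join_method(candidates)
print(best["method"], round(effective_cost(best), 2))
```

In this model the nested loop is excluded because it cannot perform the join at all, so the cheapest of the penalized methods wins, which matches what the plan shows.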
3. MS-SQL Server 2016
4. Summary
In several, if not all, modern Relational Database Management Systems, the NESTED LOOP operation doesn’t support right outer and full outer joins. Oracle, MS-SQL Server and PostgreSQL turn “T1 right outer join T2” into “T2 left outer join T1” by switching the inner and the outer row sources. Oracle and SQL Server turn a full outer join between T1 and T2 into a T1 left outer join T2 union-all T2 anti-join T1. PostgreSQL will always use a hash or merge join to full outer join T1 and T2.
Model
--@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
-- NESTED LOOP and full/right outer join : Oracle
--@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
drop table t1;
drop table t2;

create table t1(t1_id int , t1_vc varchar2(10));

insert into t1 values (1, 't1x');
insert into t1 values (2, 't1y');
insert into t1 values (3, 't1z');

create index t1_idx on t1(t1_id);

create table t2 (t2_id int, t2_vc varchar(10));

insert into t2 values (2, 't2x');
insert into t2 values (3, 't2y');
insert into t2 values (3, 't2yy');
insert into t2 values (4, 't2z');

create index t2_idx on t2(t2_id);