Mohamed Houri’s Oracle Notes

August 29, 2013

INDEX FULL SCAN , NLS_SORT and ORDER BY

Filed under: Oracle — hourim @ 12:38 pm

If I was to ask you how an INDEX FULL SCAN is operated by the SQL engine you would certainly answer that it will go to the first leaf block of the index and walk down up to the last leaf block in the index key order using typically a db file sequential read.

I have strengthened the words  index key order because it is specifically that operation which interests me here in this article. Each time I see an INDEX FULL SCAN operation in an execution plan, I immediately try to know if the CBO did took an advantage of this typical access to avoid an eventual supplementary ORDER BY operation. There is a bug, under the FIRST_ROWS mode, where an INDEX FULL SCAN is preferred, whatever its cost is, in order to avoid an ORDER BY operation. Hence, in the presence of such index operation, I also try to verify the CBO mode and/or the presence of a where clause such as where rownum=1 which makes, behind the scene, the CBO behaving as if it was running under FIRST_ROWS mode.

Recently an excellent question comes up in a French forum where the Original Poster (OP) was wondering why the CBO was making a wrong decision. Several very good interventions by very nice peoples motivated me to write two articles, the first one related to the relationship that might exist between an INDEX FULL SCAN and an ORDER BY operation while the second article will look on the effect the optimizer_index_cost_adj parameter might have on the choice of a good or wrong execution path.

The  OP query and execution plan  are shown below:

SELECT
     colonne1
FROM matable
GROUP BY colonne1
ORDER BY colonne1 ASC NULLS LAST;
Plan hash value: 2815412565

------------------------------------------------------------------------------------------
| Id  | Operation         | Name                 | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                      |      1 |        |      3 |00:00:21.60 |
|   1 |  SORT ORDER BY    |                      |      1 |      3 |      3 |00:00:21.60 |
|   2 |   HASH GROUP BY   |                      |      1 |      3 |      3 |00:00:21.60 |
|   3 |    INDEX FULL SCAN| MATABLE_PK           |      1 |   2923K|   2928K|00:00:21.99 |
------------------------------------------------------------------------------------------

The query takes more than 20 seconds to complete. And when he instructs the CBO to use a FULL table scans the response time fells down to about 4 seconds

SELECT /*+ NO_INDEX(matable matable_pk) */
colonne1
FROM matable
GROUP BY colonne1
ORDER BY colonne1 ASC NULLS LAST;

-----------------------------------------------------------------------------------------
| Id  | Operation           | Name              | Starts | E-Rows | A-Rows |   A-Time   |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                   |      1 |        |      3 |00:00:04.03 |
|   1 |  SORT ORDER BY      |                   |      1 |      3 |      3 |00:00:04.03 |
|   2 |   HASH GROUP BY     |                   |      1 |      3 |      3 |00:00:04.03 |
|   3 |    TABLE ACCESS FULL| MATABLE           |      1 |   2923K|   2928K|00:00:03.19 |
-----------------------------------------------------------------------------------------

The next blog article will discuss the reason of this wrong execution plan choice.  For this moment, let me just spot with you the duplicate sort operation the OP has got.

------------------------------------------------------------------------------------------
| Id  | Operation         | Name                 | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------
|   1 |  SORT ORDER BY    |                      |      1 |      3 |      3 |00:00:21.60 |
|   2 |   HASH GROUP BY   |                      |      1 |      3 |      3 |00:00:21.60 |
|   3 |    INDEX FULL SCAN| MATABLE_PK           |      1 |   2923K|   2928K|00:00:21.99 |
------------------------------------------------------------------------------------------

An ordered INDEX FULL SCAN (on the leading PK column) access followed by a SORT ORDER BY of this PK column.

Why?

This is the aim of the current blog article.

First let me present the model

CREATE TABLE t
(c1 VARCHAR2(64), c2 CHAR(15), d1 DATE);

INSERT INTO t
  SELECT
       mod(ABS(dbms_random.random),3)+ 1||chr(ascii('Y')) ,
       dbms_random.string('L',dbms_random.value(1,5))||rownum ,
       to_date(TO_CHAR(to_date('01/01/1980','dd/mm/yyyy'),'J') + TRUNC(dbms_random.value(1,11280)),'J')
FROM dual
CONNECT BY level <= 2e6;

ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (c1,c2) USING INDEX;

EXEC dbms_stats.gather_table_stats (USER, 't', CASCADE => true, method_opt => 'FOR ALL COLUMNS SIZE 1');

And now the query on 11.2.0.3.0 – 64bit Production

SQL > SELECT  c1
 2    FROM t
 3    GROUP BY c1
 4   ORDER BY c1 ASC NULLS LAST;

C1
 --------------------------
 1Y
 2Y
 3Y

--------------------------------------
 SQL_ID  0nfhzk4r58zuw, child number 1
 -------------------------------------
 SELECT  c1   FROM t   GROUP BY c1  ORDER BY c1 ASC NULLS LAST

Plan hash value: 2111031280
 -----------------------------------------------------------------------------
 | Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
 -----------------------------------------------------------------------------
 |   0 | SELECT STATEMENT     |      |       |       |  2069 (100)|          |
 |   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |  2069   (5)| 00:00:06 |
 |   2 |   INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
 -----------------------------------------------------------------------------

As I have expected, an ordered INDEX FULL SCAN on the leading primary key column which allows the CBO to avoid the ORDER BY c1 operation as clearly shown by the operation 1 SORT GROUP BY NOSORT

So what is the difference between my model and the OP one? Or more precisely what is the difference between my environment and the OP one? It should exist something that makes the difference. Fortunately the thread was under good hands and someone cleverly asked to get the execution plan with the advanced option thought that his intention was to see the cost. Nevertheless, the advanced option shows that the OP was using a French NLS_SORT parameter.

Hmmmm…

Let me then change my nls_sort to FRENCH and see what happens to my engineered query


SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
nls_sort                             string      BINARY

SQL> alter session set nls_sort=FRENCH;

Session altered.

SQL> SELECT  c1
2    FROM t
3    GROUP BY c1
4   ORDER BY c1 ASC NULLS LAST;

C1
------------------------
1Y
2Y
3Y

SQL_ID  0nfhzk4r58zuw, child number 3
-------------------------------------
SELECT  c1   FROM t   GROUP BY c1  ORDER BY c1 ASC NULLS LAST

Plan hash value: 1760210272

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |  2451 (100)|          |
|   1 |  SORT ORDER BY        |      |     3 |     9 |  2451  (20)| 00:00:07 |
|   2 |   SORT GROUP BY NOSORT|      |     3 |     9 |  2451  (20)| 00:00:07 |
|   3 |    INDEX FULL SCAN    | T_PK |  2000K|  5859K|  2069   (5)| 00:00:06 |
------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
3 - SEL$1 / T@SEL$1

Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX(@"SEL$1" "T"@"SEL$1" ("T"."C1" "T"."C2"))
END_OUTLINE_DATA
*/

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) NLSSORT("C1",'nls_sort=''FRENCH''')[2000],"C1"[VARCHAR2,256]
2 - (#keys=1) "C1"[VARCHAR2,256]
3 - "C1"[VARCHAR2,256] 

The column projection gives an interesting information on what’s going on here (nls_sort= french)


Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) NLSSORT("C1",'nls_sort=''FRENCH''')[2000],"C1"[VARCHAR2,256]

On contrast to the situation where my column c1 would have been declared as of a NUMBER data type, the nls_sort parameter value would not have played any effect as shown below:


SQL> describe t1

Name                            Null?    Type
------------------------------- -------- ----------
1      C1                              NOT NULL NUMBER
2      C2                              NOT NULL CHAR(15)
3      D1                                       DATE

SQL> show parameter nls_sort

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----------
nls_sort                             string      FRENCH

SQL>SELECT  c1
2    FROM t1
3    GROUP BY c1
4   ORDER BY c1 ASC NULLS LAST;

C1
----------
1
2
3

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |       |       |  2105 (100)|          |
|   1 |  SORT GROUP BY NOSORT|       |  1754K|    21M|  2105   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T1_PK |  1754K|    21M|  2105   (5)| 00:00:06 |
------------------------------------------------------------------------------

SQL> alter session set nls_sort = BINARY;

Session altered.

SQL> SELECT  c1
2    FROM t1
3    GROUP BY c1
4   ORDER BY c1 ASC NULLS LAST;

C1
----------
1
2
3

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |       |       |  2105 (100)|          |
|   1 |  SORT GROUP BY NOSORT|       |  1754K|    21M|  2105   (5)| 00:00:06 |
|   2 |   INDEX FULL SCAN    | T1_PK |  1754K|    21M|  2105   (5)| 00:00:06 |
------------------------------------------------------------------------------

Footnote: When you see in your execution plan two ordered operations like and INDEX FULL SCAN followed by an ORDER BY on the leading index column then check the nls_sort parameter. It might be due to the difference of the session nls_sort parameter and the sort parameter used internally by Oracle when reading the INDEX FULL SCAN keys.

3 Comments »

  1. Salut Mohamed,
    Très bonne initiative de ta part suite à cette intéressante discussion sur DVP !
    Et puis j’aime bien ton blog, tant pour l’esthétique que pour le contenu.
    Bonne continuation,
    Pacmann

    Comment by Pacmann — August 29, 2013 @ 3:26 pm | Reply

  2. Oh, by the way, just a precision : “It might be due to the difference of the session nls_sort parameter and the sort parameter used internally by Oracle when reading the INDEX FULL SCAN keys.”

    => It happens whenever NLS_SORT is not BINARY, since index entries are always binary sorted :

    http://docs.oracle.com/cd/B19306_01/server.102/b14237/initparams130.htm

    “Setting NLS_SORT to anything other than BINARY causes a sort to use a full table scan, regardless of the path chosen by the optimizer. BINARY is the exception because indexes are built according to a binary order of keys.[…]

    Comment by Pacmann — August 29, 2013 @ 3:39 pm | Reply

  3. Pacmann,

    Merci pour tes encouragements ainsi que pour le lien qui confirme mon ”Footenote”. J’essaierai de résumer la deuxième partie assez rapidement si le temps me le permet.

    Bien Cordialement

    Comment by hourim — August 29, 2013 @ 5:49 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

Tony's Oracle Tips

Tony Hasler's light hearted approach to learning about Oracle

Richard Foote's Oracle Blog

Focusing Specifically On Oracle Indexes, Database Administration and Some Great Music

Hatem Mahmoud Oracle's blog

Just another Oracle blog : Database topics and techniques

Mohamed Houri’s Oracle Notes

Qui se conçoit bien s’énonce clairement

Oracle Diagnostician

Performance troubleshooting as exact science

Raheel's Blog

Things I have learnt as Oracle DBA

Coskan's Approach to Oracle

What I learned about Oracle

So Many Oracle Manuals, So Little Time

“Books to the ceiling, Books to the sky, My pile of books is a mile high. How I love them! How I need them! I'll have a long beard by the time I read them”—Lobel, Arnold. Whiskers and Rhymes. William Morrow & Co, 1988.

EU Careers info

Your career in the European Union

Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Oracle Scratchpad

Just another Oracle weblog

OraStory

Dominic Brooks on Oracle Performance, Tuning, Data Quality & Sensible Design ... (Now with added Sets Appeal)

%d bloggers like this: