Thursday, June 2, 2011

Cardinality and the in clause

I was running a query that had an in clause, and the cardinality turned out to be wrong.. After investigating, I found that the optimizer was double applying the selectivity of the index.

Here is my example

select * from system.tab1 a,system.tab2 b
where a.col1 IN ('WRH$_PARAMETER_PK','WRH$_SEG_STAT','WRH$_DB_CACHE_ADVICE')
and col1=col2;

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 54186084

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |     3 |   162 |    14   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                  |         |       |       |            |          |
|   2 |   NESTED LOOPS                 |         |     3 |   162 |    14   (0)| 00:00:01 |
|   3 |    INLIST ITERATOR             |         |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID| TAB1    |     3 |    75 |     7   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN         | TAB1_PK |     3 |       |     4   (0)| 00:00:01 |
|*  6 |    INDEX RANGE SCAN            | TAB2_IX |     1 |       |     2   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID  | TAB2    |     1 |    29 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."COL1"='WRH$_DB_CACHE_ADVICE' OR "A"."COL1"='WRH$_PARAMETER_PK'
              OR "A"."COL1"='WRH$_SEG_STAT')
   6 - access("COL1"="COL2")
       filter("COL2"='WRH$_DB_CACHE_ADVICE' OR "COL2"='WRH$_PARAMETER_PK' OR
              "COL2"='WRH$_SEG_STAT')

23 rows selected.


Notice the cardinality of 3 rows.   What made me wonder, is when I used just an '=' sign

select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_DB_CACHE_ADVICE'
and col1=col2;

------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1964798218

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |    26 |  1404 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |    26 |  1404 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TAB1    |     1 |    25 |     2   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | TAB1_PK |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| TAB2    |    26 |   754 |    21   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | TAB2_IX |    26 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"='WRH$_DB_CACHE_ADVICE')
   5 - access("COL2"='WRH$_DB_CACHE_ADVICE')



Notice the cardinality is 26 ?  why is the optimizer changing the cardinality from 26 to 3, when I include an in clause with one other value ? I would expect the cardinality to be 78.  In fact when I use a union it comes back right.

select * from system.tab1 a,system.tab2 b

where col1 ='WRH$_DB_CACHE_ADVICE'
and col1=col2
union
select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_SEG_STAT'
and col1=col2
union
select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_PARAMETER_PK'
and col1=col2;

------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2805776637

---------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                    |    77 |  7014 |    71  (68)| 00:00:01 |
|   1 |  SORT UNIQUE                       |                    |    77 |  7014 |    71  (68)| 00:00:01 |
|   2 |   UNION-ALL                        |                    |       |       |            |          |
|   3 |    NESTED LOOPS                    |                    |    26 |  1404 |    23   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID    | TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN             | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID    | TAB2               |    26 |   754 |    21   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN              | TAB2_IX            |    26 |       |     2   (0)| 00:00:01 |
|   8 |    NESTED LOOPS                    |                    |       |       |            |          |
|   9 |     NESTED LOOPS                   |                    |    51 |  5610 |    46   (0)| 00:00:01 |
|  10 |      VIEW                          | VW_JF_SET$623BBB07 |     2 |   162 |     4   (0)| 00:00:01 |
|  11 |       SORT UNIQUE                  |                    |     2 |    50 |     4  (50)| 00:00:01 |
|  12 |        UNION-ALL                   |                    |       |       |            |          |
|  13 |         TABLE ACCESS BY INDEX ROWID| TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|* 14 |          INDEX UNIQUE SCAN         | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|  15 |         TABLE ACCESS BY INDEX ROWID| TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|* 16 |          INDEX UNIQUE SCAN         | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|* 17 |      INDEX RANGE SCAN              | TAB2_IX            |    26 |       |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS BY INDEX ROWID    | TAB2               |    26 |   754 |    21   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("COL1"='WRH$_PARAMETER_PK')
   7 - access("COL2"='WRH$_PARAMETER_PK')
  14 - access("COL1"='WRH$_DB_CACHE_ADVICE')
  16 - access("COL1"='WRH$_SEG_STAT')
  17 - access("ITEM_1"="COL2")




So I ran a 10053 trace and this is what I found.

1) the table (tab2) has 1280422 rows with 49,840 distinct values

Table Stats::

Table: TAB2 Alias: B
#Rows: 1280422 #Blks: 5972 AvgRowLen: 29.00 ChainCnt: 0.00
Column (#1): COL2(
AvgLen: 23 NDV: 49840 Nulls: 0 Density: 0.000020
Index Stats::
Index: TAB2_IX Col#: 1
LVLS: 2 #LB: 5787 #DK: 49840 LB/K: 1.00 DB/K: 17.00 CLUF: 884604.00
 
2) oracle takes this cardinality and computes it for the table access (1,280,422/49,840) * 3 = 77
 
SINGLE TABLE ACCESS PATH



Single Table Cardinality Estimation for TAB2[B]
Table: TAB2 Alias: B
Card: Original: 1280422.000000 Rounded: 77 Computed: 77.07 Non Adjusted: 77.07
Access Path: TableScan
 
 
 
 
3) It then takes the cardinality of the first table (tab1) the cardinality for the second table (tab2), and applies the selectivity.
 
Join Card: 2.964306 = = outer (3.000000) * inner (77.071950) * sel (0.012821)
 
Here is my example and my 10053 trace


No comments:

Post a Comment