Well, I have often written about cardinality feedback, and I have been striving to find out more detail on how it works, how to determine why it was used, and how to determine when it was used.
I have ran through a series of tests that answers a lot of these questions. There is some information available, but it isn't complete. Here a couple of good starting points..
They were even able to recreate some examples. Well
here is my example.
First, my dataset. I used the zipcode database you can find
here. I used a sqldr script you can find
here, to load a table structure you can find
here.
First I loaded the data (about 74,000 rows), created a couple of indexes and ran a query.
here are the indexes
CREATE INDEX "SYSTEM"."INDEX1" ON "SYSTEM"."TEST_DATA"
( "ZIPCODE" )
TABLESPACE "SYSTEM" ;
CREATE INDEX "SYSTEM"."INDEX2" ON "SYSTEM"."TEST_DATA"
( "CITY" )
TABLESPACE "SYSTEM" ;
CREATE INDEX "SYSTEM"."INDEX3" ON "SYSTEM"."TEST_DATA"
( "COUNTY" )
TABLESPACE "SYSTEM" ;
select /* colorme */ * from test_data where city='ROCHESTER' and county='MONROE';
There are individual indexes on both city and county, and there is a relationship between the 2 items. This query will return 44 rows of data.
First output is the information from v$sql_shared_cursor. Notice the column "use_feedback_stats". I am also supplying the information from v$sqlstats for this query. Here is the output after the first execution.
v$SQL_SHARED_CURSOR
SQL_ID CHILD_NUMBER USE_FEEDBACK_STATS
------------- ---------------------- ------------------
33x0cazrmgrgy 0 Y
V$SQLSTATS
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED
------------- ---------------------- ---------------------- ---------------------- ----------------------
33x0cazrmgrgy 757026858 1 551 44
Notice that the column "use_feedback_stats" is set to 'Y" indicating that the query is marked for cardinality feedback to determine if the first execution is accurate enough. I also ran a 10053 trace. You can see from the output the query is rewritten to be executed with dynamic sampling.
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), NVL(SUM(C3),0), NVL(SUM(C4),0)
FROM
(SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("TEST_DATA") FULL("TEST_DATA") NO_PARALLEL_INDEX("TEST_DATA") */ 1 AS C1,
CASE WHEN "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE' THEN 1 ELSE 0 END AS C2,
CASE WHEN "TEST_DATA"."COUNTY"='MONROE' THEN 1 ELSE 0 END AS C3,
CASE WHEN "TEST_DATA"."CITY"='ROCHESTER' THEN 1 ELSE 0 END AS C4
FROM "BGRENN"."TEST_DATA" SAMPLE BLOCK (4.972376 , 1) SEED (1) "TEST_DATA") SAMPLESUB
Here is the output from explain plan
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
Plan hash value: 757026858
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 1272 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| TEST_DATA | 8 | 1272 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | INDEX2 | 121 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COUNTY"='MONROE')
2 - access("CITY"='ROCHESTER')
Note
-----
- dynamic sampling used for this statement (level=2)
Second output is the information from v$sql_shared_cursor (again). Notice the column "use_feedback_stats" is set to 'N'
SQL_ID CHILD_NUMBER USE_FEEDBACK_STATS
------------- ---------------------- ------------------
33x0cazrmgrgy 0 Y
33x0cazrmgrgy 1 N
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED
------------- ---------------------- ---------------------- ---------------------- ----------------------
33x0cazrmgrgy 757026858 2 572 88
I ran a 10053 trace again (as before). You can see in this case the results from the first execution (with 44 rows), is passed into the second execution using the hint "opt_estimate". You can also see what parts of the query are utilizing cardinality feedback to change the query.
SELECT /*+ OPT_ESTIMATE (TABLE "TEST_DATA" ROWS=44.000000 ) OPT_ESTIMATE (INDEX_SCAN "TEST_DATA" "INDEX2" MIN=69.000000 )
OPT_ESTIMATE (INDEX_FILTER "TEST_DATA" "INDEX2" ROWS=69.000000 ) */
"TEST_DATA"."ZIPCODE" "ZIPCODE",
"TEST_DATA"."LAT" "LAT",
"TEST_DATA"."CITY" "CITY",
"TEST_DATA"."STATE" "STATE",
"TEST_DATA"."COUNTY" "COUNTY",
"TEST_DATA"."TYPE" "TYPE",
"TEST_DATA"."PREFERRED" "PREFERRED",
"TEST_DATA"."WORLDREGION" "WORLDREGION",
"TEST_DATA"."COUNTRY" "COUNTRY",
"TEST_DATA"."LOCATIONTEXT" "LOCATIONTEXT",
"TEST_DATA"."LOCATION" "LOCATION",
"TEST_DATA"."POPULATION" "POPULATION",
"TEST_DATA"."HOUSINGUNITS" "HOUSINGUNITS"
,"TEST_DATA"."INCOME" "INCOME",
"TEST_DATA"."LANDAREA",
"LANDAREATEST_DATA"."WATERAREA" "WATERAREA",
"TEST_DATA"."DECOMMISIONED" "DECOMMISIONED",
"TEST_DATA"."MILITARYRESTRICTIONCODES" "MILITARYRESTRICTIONCODES"
FROM "BGRENN"."TEST_DATA" "TEST_DATA"
WHERE "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE'
I ran this query until the v$sql_shared_cursor came back clean, and showed a "n" indicating that dynamic feedback was no longer being considered. There was information from 2 hard parses in the 10053 file.
SQL_ID CHILD_NUMBER USE_FEEDBACK_STATS
------------- ---------------------- ------------------
33x0cazrmgrgy 0 Y
33x0cazrmgrgy 1 N
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED
------------- ---------------------- ---------------------- ---------------------- ----------------------
33x0cazrmgrgy 757026858 3 593 132
This answered my first question. How do you tell what caused the cardinality feedback ? You run a 10053, and find the OPT_ESTIMATE cardinality hints that adjust the cardinility and reparse the query.
/***************************************/
For my second test, I used the same result set, and ran statistics against the data. After running the query from the first test, I found that cardinality was not being utilzed, so I changed my query.
select /* colorme */ * from test_data where city='ROCHESTER' and county='MONROE1' and state='NY';
This returns no rows of data, but the optimizer can't figure this out.
First execution you can see that cardinality feedback was marked for the statement, and notice the plan_hash_value of 7570268858. In this example I've also included the output from v$SQLSTATS_PLAN_HASH. This information was added after reading the post from Lisa G.
SQL_ID CHILD_NUMBER USE_FEEDBACK_STATS
------------- ---------------------- ------------------
88jhv4mryj58p 0 Y
V$SQLSTATS
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED
------------- ---------------------- ---------------------- ---------------------- ----------------------
88jhv4mryj58p 757026858 1 146 0
v$SQLSTATS_PLAN_HASH
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED
------------- ---------------------- ---------------------- ---------------------- ----------------------
88jhv4mryj58p 757026858 1 146 0
Here is the output from the 10053 trace.
SELECT
"TEST_DATA"."ZIPCODE" "ZIPCODE",
"TEST_DATA"."LAT" "LAT",
"TEST_DATA"."CITY" "CITY",
"TEST_DATA"."STATE" "STATE",
"TEST_DATA"."COUNTY" "COUNTY",
"TEST_DATA"."TYPE" "TYPE",
"TEST_DATA"."PREFERRED" "PREFERRED",
"TEST_DATA"."WORLDREGION" "WORLDREGION",
"TEST_DATA"."COUNTRY" "COUNTRY",
"TEST_DATA"."LOCATIONTEXT" "LOCATIONTEXT",
"TEST_DATA"."LOCATION" "LOCATION",
"TEST_DATA"."POPULATION" "POPULATION",
"TEST_DATA"."HOUSINGUNITS" "HOUSINGUNITS",
"TEST_DATA"."INCOME" "INCOME",
"TEST_DATA"."LANDAREA",
"LANDAREATEST_DATA"."WATERAREA" "WATERAREA",
"TEST_DATA"."DECOMMISIONED" "DECOMMISIONED",
"TEST_DATA"."MILITARYRESTRICTIONCODES" "MILITARYRESTRICTIONCODES"
FROM "BGRENN"."TEST_DATA" "TEST_DATA"
WHERE "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE1'
Here is the first plan
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
Plan hash value: 757026858
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 120 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| TEST_DATA | 1 | 120 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | INDEX2 | 2 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COUNTY"='MONROE1' AND "STATE"='NY')
2 - access("CITY"='ROCHESTER')
Second execution, notice that cardinality feedback will not be used on subsequent executions.
SQL_ID CHILD_NUMBER USE_FEEDBACK_STATS
------------- ---------------------- ------------------
88jhv4mryj58p 0 Y
88jhv4mryj58p 1 N
V$SQLSTATS
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED
------------- ---------------------- ---------------------- ---------------------- ----------------------
88jhv4mryj58p 2704719303 2 148 0
v$SQLSTATS_PLAN_HASH
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED
------------- ---------------------- ---------------------- ---------------------- ----------------------
88jhv4mryj58p 2704719303 1 2 0
88jhv4mryj58p 757026858 1 146 0
Here is the output from the 10053 trace showing the feedback from the first execution using the OPT_ESTIMATE hint again.
SELECT /*+ OPT_ESTIMATE (INDEX_SCAN "TEST_DATA" "INDEX2" MIN=69.000000 )
OPT_ESTIMATE (INDEX_FILTER "TEST_DATA" "INDEX2" ROWS=69.000000 ) */
"TEST_DATA"."ZIPCODE" "ZIPCODE",
"TEST_DATA"."LAT" "LAT",
"TEST_DATA"."CITY" "CITY",
"TEST_DATA"."STATE" "STATE",
"TEST_DATA"."COUNTY" "COUNTY",
"TEST_DATA"."TYPE" "TYPE",
"TEST_DATA"."PREFERRED" "PREFERRED",
"TEST_DATA"."WORLDREGION" "WORLDREGION",
"TEST_DATA"."COUNTRY" "COUNTRY",
"TEST_DATA"."LOCATIONTEXT" "LOCATIONTEXT",
"TEST_DATA"."LOCATION" "LOCATION",
"TEST_DATA"."POPULATION" "POPULATION",
"TEST_DATA"."HOUSINGUNITS" "HOUSINGUNITS",
"TEST_DATA"."INCOME" "INCOME",
"TEST_DATA"."LANDAREA",
"LANDAREATEST_DATA"."WATERAREA" "WATERAREA",
"TEST_DATA"."DECOMMISIONED" "DECOMMISIONED",
"TEST_DATA"."MILITARYRESTRICTIONCODES" "MILITARYRESTRICTIONCODES"
FROM "BGRENN"."TEST_DATA" "TEST_DATA"
WHERE "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE1'
But notice something else... notice the plan has value changed from 757026858 to 2704719303. Fine right ? Cardinality feedback did it's thing, but also notice this is the output from v$sqlstats. Huh ?? The history for executions of the plan (1 for the first plan, and 1 for the second), got summarized into the second plan.
here is the second plan
Plan hash value: 2704719303
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 6 (100)| |
|* 1 | TABLE ACCESS BY INDEX ROWID| TEST_DATA | 1 | 120 | 6 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | INDEX3 | 38 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CITY"='ROCHESTER')
2 - access("COUNTY"='MONROE1')
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Note
-----
- cardinality feedback used for this statement
What does all this mean ????
1) the 10053 trace will reveal where cardinality feedback was used by finding the OPT_ESTIMATE hint.
2) the v$sql_session_cursor view will show you where cardinality feedback was used.
3) There is a new view for looking at the multiple plans created for cardinality feedback called v$SQLSTATS_PLAN_HASH.
And in looking at the DBA_HIST_SQLSTAT, I do see both plans.
SQL_ID PLAN_HASH_VALUE PARSE_CALLS_DELTA BUFFER_GETS_DELTA ROWS_PROCESSED_DELTA
------------- ---------------------- ---------------------- ---------------------- ----------------------
88jhv4mryj58p 757026858 1 146 0
88jhv4mryj58p 2704719303 1 2 0
Another observation I will would make is that when you multiple columns (and you aren't using a composite index), oracle is likely to utilyze cardinality feedback to ensure the cardinality is correct.
Finally, I did try the same example with bind variables (instead of literals). I found that Cardinality Feedback did not kick in for my example. Bind variables must tell the optimizer to pick a "generically" better plan, and not look around.
This is all on version 11.2.0.2
I have included a query I'm using to point out where cardinality feedback has been utilized, and sorted the queries by the affect they have on performance, so I can assess the gains (or losses) associated with the use of this feature. Overall my databases have show remarkable improvement, and those fiew cases where there is a regression, the sql_id is highligted so I can create a profile.
I have to note that this query has become invaluable to identify the top queries affected by cardinality feedback. I have used it to identify where extended stats may help, and I have been able to determine, that cardinality feedback is indeed a good thing most of the time. A lot of this detail is lost in the AWR reports.