Thursday, August 18, 2011

Hyperthreading

I am am working in standing up a cluster of 2 database servers that look an aweful lot like an exadata 2-8 (only without the infiniband)..  It is a 2 node cluster of 8 socket, 8 core processrs with 256g of memory.

We started with a standard configuration and hyperthreading was turned on.. This means that Oracle sees the 64 cpu, as 128 cpus.  We were getting periodic reboots every Saturday morning.. The culprit was finally found as the analyze.. Why ??? because we have a 2 node rac, with 128 CPU's per, the degree of parallelization chosen was 512 !  With an analyze running with 512  parallel processors we were running out of memory.. The final diagnoses was to change parallel_max_servers to 64 to cut back on the parallel servers utilized. The second recommendation from Oracle was to turn off Hyperthreading.

We all said Huh ?  Hyperthreading gives you 2x the throughput because it doubles your CPU's right ???

Well lets' see.

I took my 2 node cluster and split it in 1/2  First half (A) is non-hyperthreaded, second half (B) is hyperthreaded.

I then took swingbench and threw a workload at it with lots of logical I/O.


Here is the benchmark from (A)   non-hyperthreaded

Notice we are getting 2062 transactions per minute.



 OK.. Lets see what happens with (B) hyperthreading.

Notice we are getting 1,685 transactions per minute. 


LESS wow lets look at our AWR compare of both nodes (same period) and see what the database says.


Notice  that the gets per execution match for both nodes, and the rows processed. The difference is in the exec time(ms) per exectuion, CPU time and executions.

IT is faster non-hyperthreaded.   The CPU's are doing more work with hypertrheading off, and pushing more work through.

OK... I've shown that for a non-saturating workload (under 64 active processes) on a 64 CPU box, hyperthreading off is faster.

Now let's saturate !!

First hyperthreading off.

Notice I can get the throughput up to 7,489 transactions per minute.

Now lets check the second node with hyperthreading on.

Notice we are getting 6552.. Darn !!


It looks like hypthreading isn't helping us eithor !! 

 
Now lets check the AWR for the 2 nodes.


Notice that once again the rows processed match up and the gets/execution match.

The number of executions is greater, the execution time is less, and the CPU time is way less for for hyperthreading off.


For a Logical I/O intensive workload, it looks like Hyperthreading is not as efficient.

I would love to hear any other experiences/opinions.

I just wanted to add on to this a thought as to why hypertrheading is such a big deal... Virtualization.  With 2x as many "cpus", you can fine tune the CPU a lot more.. You can carve up the server into smaller pieces.








Monday, August 1, 2011

Cardinality Feedback detailed example

Well, I have often written about cardinality feedback, and I have been striving to find out more detail on how it works, how to determine why it was used, and how to determine when it was used.

I have ran through a series of tests that answers a lot of these questions.  There is some information available, but  it isn't complete.  Here a couple of good starting points..


They were even able to recreate some examples. Well here is my example.

First, my dataset. I used the zipcode database you can find here.  I used a sqldr script you can find here, to load a table structure you can find here.

First I loaded the data (about 74,000 rows), created a couple of indexes and ran a query.

here are the indexes
 CREATE INDEX "SYSTEM"."INDEX1" ON "SYSTEM"."TEST_DATA"
    (  "ZIPCODE" )
    TABLESPACE "SYSTEM" ;
  
 CREATE INDEX "SYSTEM"."INDEX2" ON "SYSTEM"."TEST_DATA"
    (  "CITY" )
     TABLESPACE "SYSTEM" ;
  
 CREATE INDEX "SYSTEM"."INDEX3" ON "SYSTEM"."TEST_DATA"
    (  "COUNTY" )
     TABLESPACE "SYSTEM" ;  

select /* colorme */ * from test_data where city='ROCHESTER' and county='MONROE';



There are individual indexes on both city and county, and there is a relationship between the 2 items.  This query will return 44 rows of data.

First output is the information from v$sql_shared_cursor.  Notice the column "use_feedback_stats".  I am also supplying the information from v$sqlstats for this query.  Here is the output after the first execution.

v$SQL_SHARED_CURSOR
SQL_ID CHILD_NUMBER USE_FEEDBACK_STATS ------------- ---------------------- ------------------ 33x0cazrmgrgy 0 Y
V$SQLSTATS
SQL_ID PLAN_HASH_VALUE PARSE_CALLS BUFFER_GETS ROWS_PROCESSED ------------- ---------------------- ---------------------- ---------------------- ---------------------- 33x0cazrmgrgy 757026858 1 551 44

Notice that the column "use_feedback_stats" is set to 'Y" indicating that the query is marked for cardinality feedback to determine if the first execution is accurate enough.  I also ran a 10053 trace.  You can see from the output the query is rewritten to be executed with dynamic sampling.

SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), NVL(SUM(C3),0), NVL(SUM(C4),0) 
FROM 
(SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("TEST_DATA") FULL("TEST_DATA") NO_PARALLEL_INDEX("TEST_DATA") */ 1 AS C1, 
CASE WHEN "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE' THEN 1 ELSE 0 END AS C2, 
CASE WHEN "TEST_DATA"."COUNTY"='MONROE' THEN 1 ELSE 0 END AS C3, 
CASE WHEN "TEST_DATA"."CITY"='ROCHESTER' THEN 1 ELSE 0 END AS C4 
FROM "BGRENN"."TEST_DATA" SAMPLE BLOCK (4.972376 , 1) SEED (1) "TEST_DATA") SAMPLESUB



Here is the output from explain plan

 PLAN_TABLE_OUTPUT                                                    
  
 -------------------------------------------------------------------------------------------------------------------------
   Plan hash value: 757026858                                                
   -----------------------------------------------------------------------------------------                
   | Id | Operation          | Name   | Rows | Bytes | Cost (%CPU)| Time   |                
   -----------------------------------------------------------------------------------------                
   |  0 | SELECT STATEMENT      |      |   8 | 1272 |   3  (0)| 00:00:01 |                
   |* 1 | TABLE ACCESS BY INDEX ROWID| TEST_DATA |   8 | 1272 |   3  (0)| 00:00:01 |                
   |* 2 |  INDEX RANGE SCAN     | INDEX2  |  121 |    |   1  (0)| 00:00:01 |                
   -----------------------------------------------------------------------------------------                
   Predicate Information (identified by operation id):                                   
   ---------------------------------------------------                                  
     1 - filter("COUNTY"='MONROE')                                             
     2 - access("CITY"='ROCHESTER')                                            
  
 Note                                                           
   -----                                                          
     - dynamic sampling used for this statement (level=2)                                 
  
Second output is the information from v$sql_shared_cursor (again).  Notice the column "use_feedback_stats" is set to 'N'
SQL_ID        CHILD_NUMBER           USE_FEEDBACK_STATS 
------------- ---------------------- ------------------ 
33x0cazrmgrgy 0                      Y                  
33x0cazrmgrgy 1                      N                  

SQL_ID        PLAN_HASH_VALUE        PARSE_CALLS            BUFFER_GETS            ROWS_PROCESSED         
------------- ---------------------- ---------------------- ---------------------- ---------------------- 
33x0cazrmgrgy 757026858              2                      572                    88                     



I ran a 10053 trace again (as before).  You can see in this case the results from the first execution (with 44 rows), is passed into the second execution using the hint "opt_estimate". You can also see what parts of the query are utilizing cardinality feedback to change the query.


SELECT /*+ OPT_ESTIMATE (TABLE "TEST_DATA" ROWS=44.000000 ) OPT_ESTIMATE (INDEX_SCAN "TEST_DATA" "INDEX2" MIN=69.000000 ) 
OPT_ESTIMATE (INDEX_FILTER "TEST_DATA" "INDEX2" ROWS=69.000000 ) */ 
"TEST_DATA"."ZIPCODE" "ZIPCODE",
"TEST_DATA"."LAT" "LAT",
"TEST_DATA"."CITY" "CITY",
"TEST_DATA"."STATE" "STATE",
"TEST_DATA"."COUNTY" "COUNTY",
"TEST_DATA"."TYPE" "TYPE",
"TEST_DATA"."PREFERRED" "PREFERRED",
"TEST_DATA"."WORLDREGION" "WORLDREGION",
"TEST_DATA"."COUNTRY" "COUNTRY",
"TEST_DATA"."LOCATIONTEXT" "LOCATIONTEXT",
"TEST_DATA"."LOCATION" "LOCATION",
"TEST_DATA"."POPULATION" "POPULATION",
"TEST_DATA"."HOUSINGUNITS" "HOUSINGUNITS"
,"TEST_DATA"."INCOME" "INCOME",
"TEST_DATA"."LANDAREA",
"LANDAREATEST_DATA"."WATERAREA" "WATERAREA",
"TEST_DATA"."DECOMMISIONED" "DECOMMISIONED",
"TEST_DATA"."MILITARYRESTRICTIONCODES" "MILITARYRESTRICTIONCODES" 
FROM "BGRENN"."TEST_DATA" "TEST_DATA" 
WHERE "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE'


I ran this query until the v$sql_shared_cursor came back clean, and showed a "n" indicating that dynamic feedback was no longer being considered.  There was information from 2 hard parses in the 10053 file.
SQL_ID        CHILD_NUMBER           USE_FEEDBACK_STATS 
------------- ---------------------- ------------------ 
33x0cazrmgrgy 0                      Y                  
33x0cazrmgrgy 1                      N                  

SQL_ID        PLAN_HASH_VALUE        PARSE_CALLS            BUFFER_GETS            ROWS_PROCESSED         
------------- ---------------------- ---------------------- ---------------------- ---------------------- 
33x0cazrmgrgy 757026858              3                      593                    132                    


This answered my first question. How do you tell what caused the cardinality feedback ? You run a 10053, and find the OPT_ESTIMATE cardinality hints that adjust the cardinility and reparse the query.


/***************************************/

For my second test, I used the same result set, and ran statistics against the data.  After running the query from the first test, I found that cardinality was not being utilzed, so I changed my query.



select /* colorme */ * from test_data where city='ROCHESTER' and county='MONROE1' and state='NY';

This returns no rows of data, but the optimizer can't figure this out.

First execution you can see that cardinality feedback was marked for the statement, and notice the plan_hash_value of 7570268858.  In this example I've also included the output from v$SQLSTATS_PLAN_HASH. This information was added after reading the post from Lisa G.

SQL_ID        CHILD_NUMBER           USE_FEEDBACK_STATS 
------------- ---------------------- ------------------ 
88jhv4mryj58p 0                      Y                  

V$SQLSTATS

SQL_ID        PLAN_HASH_VALUE        PARSE_CALLS            BUFFER_GETS            ROWS_PROCESSED         
------------- ---------------------- ---------------------- ---------------------- ---------------------- 
88jhv4mryj58p 757026858              1                      146                    0                      

v$SQLSTATS_PLAN_HASH

SQL_ID        PLAN_HASH_VALUE        PARSE_CALLS            BUFFER_GETS            ROWS_PROCESSED         
------------- ---------------------- ---------------------- ---------------------- ---------------------- 
88jhv4mryj58p 757026858              1                      146                    0                      



Here is the output from the 10053 trace.


SELECT 
    "TEST_DATA"."ZIPCODE" "ZIPCODE",
    "TEST_DATA"."LAT" "LAT",
    "TEST_DATA"."CITY" "CITY",
    "TEST_DATA"."STATE" "STATE",
    "TEST_DATA"."COUNTY" "COUNTY",
    "TEST_DATA"."TYPE" "TYPE",
    "TEST_DATA"."PREFERRED" "PREFERRED",
    "TEST_DATA"."WORLDREGION" "WORLDREGION",
    "TEST_DATA"."COUNTRY" "COUNTRY",
    "TEST_DATA"."LOCATIONTEXT" "LOCATIONTEXT",
    "TEST_DATA"."LOCATION" "LOCATION",
    "TEST_DATA"."POPULATION" "POPULATION",
    "TEST_DATA"."HOUSINGUNITS" "HOUSINGUNITS",
    "TEST_DATA"."INCOME" "INCOME",
    "TEST_DATA"."LANDAREA",
    "LANDAREATEST_DATA"."WATERAREA" "WATERAREA",
    "TEST_DATA"."DECOMMISIONED" "DECOMMISIONED",
    "TEST_DATA"."MILITARYRESTRICTIONCODES" "MILITARYRESTRICTIONCODES" 
FROM "BGRENN"."TEST_DATA" "TEST_DATA" 
WHERE "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE1'
Here is the first plan

 PLAN_TABLE_OUTPUT                                                    
  
 -------------------------------------------------------------------------------------------------------------------------
  
 Plan hash value: 757026858                                                
  
 -----------------------------------------------------------------------------------------                
  
 | Id | Operation          | Name   | Rows | Bytes | Cost (%CPU)| Time   |                
 -----------------------------------------------------------------------------------------                
 |  0 | SELECT STATEMENT      |      |   1 |  120 |   3  (0)| 00:00:01 |                
 |* 1 | TABLE ACCESS BY INDEX ROWID| TEST_DATA |   1 |  120 |   3  (0)| 00:00:01 |                
 |* 2 |  INDEX RANGE SCAN     | INDEX2  |   2 |    |   1  (0)| 00:00:01 |                
 -----------------------------------------------------------------------------------------                
 Predicate Information (identified by operation id):                                   
 ---------------------------------------------------                                   
   1 - filter("COUNTY"='MONROE1' AND "STATE"='NY')                                    
   2 - access("CITY"='ROCHESTER')                                            
  
Second execution, notice that cardinality feedback will not be used on subsequent executions.

SQL_ID        CHILD_NUMBER           USE_FEEDBACK_STATS 
------------- ---------------------- ------------------ 
88jhv4mryj58p 0                      Y                  
88jhv4mryj58p 1                      N                  

V$SQLSTATS

SQL_ID        PLAN_HASH_VALUE        PARSE_CALLS            BUFFER_GETS            ROWS_PROCESSED         
------------- ---------------------- ---------------------- ---------------------- ---------------------- 
88jhv4mryj58p 2704719303             2                      148                    0                      

v$SQLSTATS_PLAN_HASH

SQL_ID        PLAN_HASH_VALUE        PARSE_CALLS            BUFFER_GETS            ROWS_PROCESSED         
------------- ---------------------- ---------------------- ---------------------- ---------------------- 
88jhv4mryj58p 2704719303             1                      2                      0                      
88jhv4mryj58p 757026858              1                      146                    0                      


Here is the output from the 10053 trace showing the feedback from the first execution using the OPT_ESTIMATE hint again.

SELECT /*+ OPT_ESTIMATE (INDEX_SCAN "TEST_DATA" "INDEX2" MIN=69.000000 ) 
           OPT_ESTIMATE (INDEX_FILTER "TEST_DATA" "INDEX2" ROWS=69.000000 ) */ 
    "TEST_DATA"."ZIPCODE" "ZIPCODE",
    "TEST_DATA"."LAT" "LAT",
    "TEST_DATA"."CITY" "CITY",
    "TEST_DATA"."STATE" "STATE",
    "TEST_DATA"."COUNTY" "COUNTY",
    "TEST_DATA"."TYPE" "TYPE",
    "TEST_DATA"."PREFERRED" "PREFERRED",
    "TEST_DATA"."WORLDREGION" "WORLDREGION",
    "TEST_DATA"."COUNTRY" "COUNTRY",
    "TEST_DATA"."LOCATIONTEXT" "LOCATIONTEXT",
    "TEST_DATA"."LOCATION" "LOCATION",
    "TEST_DATA"."POPULATION" "POPULATION",
    "TEST_DATA"."HOUSINGUNITS" "HOUSINGUNITS",
    "TEST_DATA"."INCOME" "INCOME",
    "TEST_DATA"."LANDAREA",
    "LANDAREATEST_DATA"."WATERAREA" "WATERAREA",
    "TEST_DATA"."DECOMMISIONED" "DECOMMISIONED",
    "TEST_DATA"."MILITARYRESTRICTIONCODES" "MILITARYRESTRICTIONCODES" 
FROM "BGRENN"."TEST_DATA" "TEST_DATA" 
WHERE "TEST_DATA"."CITY"='ROCHESTER' AND "TEST_DATA"."COUNTY"='MONROE1'
But notice something else... notice the plan has value changed from 757026858 to 2704719303.  Fine right ? Cardinality feedback did it's thing, but also notice this is the output from v$sqlstats.  Huh ?? The history for executions of the plan (1 for the first plan, and 1 for the second), got summarized into the second plan.

here is the second plan
 Plan hash value: 2704719303                                                     
  
 -----------------------------------------------------------------------------------------                      
 | Id | Operation          | Name   | Rows | Bytes | Cost (%CPU)| Time   |                      
 -----------------------------------------------------------------------------------------                      
 |  0 | SELECT STATEMENT      |      |    |    |   6 (100)|     |                      
 |* 1 | TABLE ACCESS BY INDEX ROWID| TEST_DATA |   1 |  120 |   6  (0)| 00:00:01 |                      
 |* 2 |  INDEX RANGE SCAN     | INDEX3  |  38 |    |   1  (0)| 00:00:01 |                      
 -----------------------------------------------------------------------------------------                      
  
 Predicate Information (identified by operation id):                                         
 ---------------------------------------------------  
   1 - filter("CITY"='ROCHESTER')                                                  
   2 - access("COUNTY"='MONROE1')                                                  
 PLAN_TABLE_OUTPUT                                                          
 ------------------------------------------------------------------------------------------------------------------------------------
  
 Note                                                                
 -----                                                                
   - cardinality feedback used for this statement         
What does all this mean ????

1) the 10053 trace will reveal where cardinality feedback was used by finding the OPT_ESTIMATE hint.
2) the v$sql_session_cursor view will show you where cardinality feedback was used.
3) There is a new view for looking at the multiple plans created for cardinality feedback called v$SQLSTATS_PLAN_HASH.

And in looking at the DBA_HIST_SQLSTAT, I do see both plans.

SQL_ID        PLAN_HASH_VALUE        PARSE_CALLS_DELTA      BUFFER_GETS_DELTA      ROWS_PROCESSED_DELTA   
------------- ---------------------- ---------------------- ---------------------- ---------------------- 
88jhv4mryj58p 757026858              1                      146                    0                      
88jhv4mryj58p 2704719303             1                      2                      0                      


Another observation I will would make is that when you multiple columns (and you aren't using a composite index), oracle is likely to utilyze cardinality feedback to ensure the cardinality is correct.

Finally, I did try the same example with bind variables (instead of literals).  I found that Cardinality Feedback did not kick in for my example.  Bind variables must tell the optimizer to pick a "generically" better plan, and not look around.

This is all on version 11.2.0.2


I have included a query I'm using to point out where cardinality feedback has been utilized, and sorted the queries by the affect they have on performance, so I can assess the gains (or losses) associated with the use of this feature. Overall my databases have show remarkable improvement, and those fiew cases where there is a regression, the sql_id is highligted so I can create a profile.


I have to note that this query has become invaluable to identify the top queries affected by cardinality feedback.  I have used it to identify where extended stats may help, and I have been able to determine, that cardinality feedback is indeed a good thing most of the time.  A lot of this detail is lost in the AWR reports.





select  query1.sql_id,
query1.plan_hash_value phv1,
query2.plan_hash_value phv2, 
query1.executions exec1,
query2.executions exec2,
trunc((query1.avg_elapsed_time)/1000000,2) aet,
trunc(( query2.avg_elapsed_time)/1000000,2) aet_card,
trunc(( query1.avg_elapsed_time - query2.avg_elapsed_time)/1000000,2) aet_diff,
abs(trunc(( (query1.avg_elapsed_time - query2.avg_elapsed_time)* query2.executions)/1000000,2)) abs_diff,
trunc( ((query1.avg_elapsed_time - query2.avg_elapsed_time) * query2.executions)/1000000,2) tot_diff,
trunc( ((query1.avg_elapsed_time - query2.avg_elapsed_time)/ query1.avg_elapsed_time) * 100,1) card_feedback_perc_difference,query1.sql_text
  from 
(select sql_id,plan_hash_value,last_active_time,executions,(rows_processed/executions) rows_processed,
(elapsed_time/executions) avg_elapsed_time,(cpu_time/executions) avg_cpu_time,(buffer_gets/executions) avg_buffer_gets,
sql_text from V$SQLSTATS_PLAN_HASH a
where exists (select plan_hash_value  from V$SQLSTATS_PLAN_HASH b  where a.sql_id=b.sql_id and a.plan_hash_value <> b.plan_hash_value and executions > 0)
and executions > 0
order by sql_id,last_active_time) query1,
(select sql_id,plan_hash_value,last_active_time,executions,(rows_processed/executions) rows_processed,
(elapsed_time/executions) avg_elapsed_time,(cpu_time/executions) avg_cpu_time,(buffer_gets/executions) avg_buffer_gets,
sql_text from V$SQLSTATS_PLAN_HASH a
where exists (select plan_hash_value  from V$SQLSTATS_PLAN_HASH b  where a.sql_id=b.sql_id and a.plan_hash_value <> b.plan_hash_value and executions > 0)
and executions > 0
order by sql_id,last_active_time) query2
where query1.sql_id=query2.sql_id
and query1.last_active_time < query2.last_active_time 
and exists (select 1 from V$SQL_SHARED_CURSOR c where query1.sql_id=c.sql_id and c.use_feedback_stats='Y') 
--and query1.avg_elapsed_time < query2.avg_elapsed_time
order by abs_diff desc