Friday, October 5, 2012

Analyzing query cost example

Well,
  Here is the issue I've been dealing with.  The query cost doesn't stay consistent, and I was wondering whether a SQL profile would keep it stable, and whether that would make things better or worse.

  I'm sure you have all run into this.  For some reason your cardinality can't be estimated well, the optimizer changes plans on you, and you want to know if a profile will help or hurt.

  The good news is I had the perfect test case.  The query was part of a data load, and there was a driving table with a varying number of rows in it.  This is how I went about analyzing it.

First I created a table to store the results.

create table bgrenn.mytable
 (row_count        number,
  plan_hash_value  varchar2(15),
  cost             number);




Next I took a copy of my driving table (a full-size copy, BGRENN.TMP_DRIVERB). The block below uses it to build a smaller copy (TMP_DRIVERT) of steadily increasing size, gathers stats, explains the query, and records the plan hash value and cost at each size.



declare

  v_statement_id    varchar2(10);
  v_plan_hash_value varchar2(15);
  v_cost            number;
  v_statement       varchar2(4000);

begin

for row_count in 0..100000 loop

  -- rebuild the driver table at the next size; ignore the error from
  -- the drop on the first pass, when the table doesn't exist yet
  begin
    execute immediate 'drop table BGRENN.TMP_DRIVERT purge';
  exception
    when others then null;
  end;
  execute immediate 'create table BGRENN.TMP_DRIVERT as select * from BGRENN.TMP_DRIVERB where rownum <' || to_char(row_count,'9999999');

  dbms_stats.gather_table_stats('BGRENN','TMP_DRIVERT');

  v_statement_id := to_char(row_count,'99999');

  v_statement := 'explain plan SET STATEMENT_ID = ''' || v_statement_id || '''  for ' ||
  'select   * ' ||
  ' from BGRENN.TMP_DRIVERT TMP_DRIVERA  ' ||
  ' INNER JOIN BGRENN.TABA TABA ON TMP_DRIVERA.ID=TABA.ID  ' ||
  ' INNER JOIN BGRENN.TABB ON TABA.GCC_ID=TABA.ID AND TABA.LOCN <> 8031431  ' ||
  ' INNER JOIN BGRENN.TABC TABC ON TABA.ID=TABC.ID AND TABC.id = 1168583  ' ||
  ' INNER JOIN BGRENN.TABD TABD ON TABA.ID=TABD.ID  ' ||
  ' INNER JOIN BGRENN.TABE TABE ON TABD.ID=TABE.ID  ' ||
  ' INNER JOIN BGRENN.TABF TABF ON TABD.ID=TABF.ID  ' ||
  ' INNER JOIN BGRENN.TABG TABG ON TABF.ID=TABG.ID  ' ||
  ' INNER JOIN BGRENN.TABH TABH ON TABG.ID=TABH.ID AND TABH.SEQ_NBR < 500  ' ||
  ' INNER JOIN BGRENN.TABI ON TABC.ID=ID  ' ||
  ' INNER JOIN BGRENN.TABJ TABJ ON TABC.ID=TABJ.ID  ' ||
  ' INNER JOIN BGRENN.TABK TABK ON TABJ.ID=TABK.ID and TABK.id in ( 1221589, 1219009, 1191882, 1221590, 1171956)  ' ||
  ' LEFT OUTER JOIN ERD.TABL TABL ON TABH.ID=TABL.ID  ' ||
  ' LEFT OUTER JOIN ERD.TABM TABM ON TABE.ID=TABM.ID  ' ||
  ' where (1=1)';

  dbms_output.put_line(v_statement);

  execute immediate v_statement;

  -- the first line of dbms_xplan.display is "Plan hash value: nnnnnnnnnn",
  -- so the hash starts at position 18; the cost comes from the top plan step (id=0)
  SELECT substr(plan_table_output,18,12) into v_plan_hash_value FROM TABLE(dbms_xplan.display(statement_id => v_statement_id)) where rownum <2;
  select cost into v_cost from plan_table where id=0 and rownum<2 and statement_id=v_statement_id;

  insert into bgrenn.mytable values(row_count,v_plan_hash_value,v_cost);

  delete from plan_table where statement_id=v_statement_id;
  commit;
end loop;

end;
/




This produced a set of rows in the table, one per driver-table size, with the plan hash value and cost for each.
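To get a quick feel for how stable the plan was across sizes, a summary like this is useful (a sketch against the bgrenn.mytable results table created above):

select plan_hash_value,
       count(*)  runs,      -- how many row counts produced this plan
       min(cost) min_cost,
       max(cost) max_cost
from   bgrenn.mytable
group by plan_hash_value
order by min(cost);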

I then copied the results table, installed a SQL profile, and reran the loop.
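For reference, a profile is typically installed by accepting a SQL Tuning Advisor recommendation (a minimal sketch; the task and profile names here are hypothetical):

begin
  dbms_sqltune.accept_sql_profile(
    task_name => 'driver_tuning_task',   -- hypothetical advisor task name
    name      => 'driver_query_profile'  -- hypothetical profile name
  );
end;
/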

After joining the 2 result tables on row count (a sketch of that join is below), I created an R program and analyzed the results.
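A minimal sketch of that join, assuming the second run was saved in a copy called bgrenn.mytable_profile (a hypothetical name):

select a.row_count,
       a.cost orig_cost,
       b.cost new_cost
from   bgrenn.mytable         a,
       bgrenn.mytable_profile b   -- hypothetical copy holding the profiled run
where  a.row_count = b.row_count
order by a.row_count;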
Here is the program.


# load library psych for the describe() function
library(psych)

# read the joined query output
query_data <- read.table("c:/r/data/query_output.txt", header=T)

# summary statistics for each variable
describe(query_data)

# plot original (red) vs. profiled (green) cost against row count
pdf("c:/r/data/querydata.pdf")
plot(query_data$ROW_COUNT, query_data$orig_cost, type='l', col="red")
lines(query_data$ROW_COUNT, query_data$new_cost, type='l', col="green")
dev.off()


And here is the output.  The red line is the original plan, and the green is the plan with the profile. I can see that the cost of the profiled plan remains more consistent, and it is probably the better choice.

Wednesday, September 5, 2012

"enq: CR - block range reuse ckpt" and the recycle bin

I decided to write this blog entry because we ran into an issue over the weekend.  A process that normally runs fine was running slow.. "like molasses" was the comment from one of the folks.  After digging into an AWR report I found one of the top waits for the SQL was


enq: CR - block range reuse ckpt

OK.. what is that, and why?   I didn't find a whole lot, except that it indicated a session waiting on a checkpoint of a range of blocks before they can be reused.. hmmm..

I looked further in the AWR and saw the top Segment for Logical reads was "RECYCLEBIN$"

and one of the top queries was:

"delete from RecycleBin$ where bo=:1"


Well, since the process was finished, I did my own test.. I created a tablespace and created an object in it.. dropped the object, added rows to the table, then dropped the table.. over and over again, until the number of objects in the tablespace remained constant.  I then created the table again and let it grow (so the database would have to purge the recycle bin to get space)... And what I saw in the top 5 wait events was, again.....

enq: CR - block range reuse ckpt

I wanted to document this for others that may hit this.. If a search for this wait event brought you to my blog, please check your recycle bin and make sure that it isn't cleaning itself out to make room, causing this wait event ...
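A quick way to check (a sketch; the dba_ view needs DBA privileges):

-- how much is sitting in the recycle bin, by tablespace
select ts_name, count(*) objects, sum(space) blocks
from   dba_recyclebin
group by ts_name;

-- if the recycle bin is the culprit, purging it manually
-- avoids the on-demand cleanout during your load
purge dba_recyclebin;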




Tuesday, September 4, 2012

Recognize the magic optimizer numbers

Well, I figured I'd document some of the magic numbers that the optimizer uses, to help me remember them and to help others. The background is simple.

I was looking through a query that was running for a long, long time, and the cardinality looked wrong.  I knew the developers were using a table operation (looping over a LOB that was treated like a table).

The cardinality estimate for the step was 8168, and I thought hmmmm, I've seen that before when dynamic sampling didn't happen.  Well, after some digging I came across this page: Cardinality

The page contained the handy chart below...  These are important numbers to remember, because when you see a cardinality matching this chart it is probably because the optimizer couldn't estimate the correct cardinality and couldn't dynamically sample.  Below is a snippet from the query I was investigating. Notice the cardinality on the first line.


0 0 0  COLLECTION ITERATOR PICKLER FETCH PARSE_DYNAMIC_COLS (cr=0 pr=0 pw=0 time=0 us cost=29 size=16336 card=8168)
0 0 0  HASH JOIN RIGHT OUTER (cr=0 pr=0 pw=0 time=0 us cost=8757 size=233200 card=100)
0 0 0  VIEW (cr=0 pr=0 pw=0 time=0 us cost=8614 size=14 card=1)
0 0 0  HASH UNIQUE (cr=0 pr=0 pw=0 time=0 us cost=8614 size=2069 card=1)
0 0 0  FILTER (cr=0 pr=0 pw=0 time=0 us)
0 0 0  NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us)
0 0 0  NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us cost=8613 size=2069 card=1)
0 0 0  HASH JOIN (cr=0 pr=0 pw=0 time=0 us cost=8612 size=2044 card=1)


Default cardinality for database objects

The following table shows the estimated cardinalities (using an 8K block size) of various objects which have had no statistics generated for them:

Object Type                                  Estimated Cardinality
-------------------------------------------  ---------------------
Heap Table                                   82
Global Temporary Table                       8168
Index-Organized Table                        1
System Generated Materialized View           8168
  (such as the output of the TABLE operator)
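You can reproduce the 8168 yourself (a sketch using the built-in sys.odcinumberlist collection type; with an 8K block size the optimizer has no statistics for the collection, so it falls back to the default):

explain plan for
select * from table(sys.odcinumberlist(1,2,3)) t;

select * from table(dbms_xplan.display);

-- the Rows column for the COLLECTION ITERATOR step shows 8168, not 3.
-- A cardinality hint (e.g. /*+ cardinality(t 3) */) is one way to override it.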

Saturday, September 1, 2012

Exadata sizing updated for 3tb drives 1/2 rack SATA

OK, I now have a new Exadata coming in that has 3tb drives, and the first question asked is.. how much disk do I have to configure on it?  Well, I'm going to expand on a previous entry I did on sizing.

1/2 Rack. Sata drives. normal redundancy

This means we have
  • 7 storage cells
  • Each storage cell contains 12 disks
  • Each disk is 3tb (which is about 2.794tb usable)  *** This is calculated using base 1024
  • The first 2 disks in each storage cell have 29.103g already partitioned for the OS (which is mirrored).
  • The remaining 10 disks in each cell contribute a matching 29.103g slice to DBFS.
Given this, I am going to calculate the total raw disk available, then subtract out the 29.103g slices (for the OS and DBFS).

First, 12 disks * 7 cells * 2.794tb = 234.696tb of total raw storage.
Subtract out 29g * 2 disks * 7 cells = 406g     ----- OS
Subtract out 29g * 10 disks * 7 cells = 2.03tb  ----- DBFS
Available raw is 234.696 - 2.436 = 232.26tb

Now, I said we were running normal redundancy.. this means that we lose 1/2.

DBFS = 1.015tb
OS   = 29g
Remaining for Data and Reco = 116.13tb

But of course we need to account for a cell being offline.  This takes out 1/7 of the storage.

DBFS            === .870tb   (29g * 10 disks * 6 cells)/2
Everything else === 99.54tb  (2.765tb * 12 disks * 6 cells)/2

So now we have 99.54tb of raw storage available for Data and Reco.

This is now easy to figure out.. you really have about 100tb of storage (with normal redundancy) to split between Data and Reco.

Now a full rack is easy to do (14 cells, minus one offline, leaves 13):

(2.765tb * 12 disks * 13 cells) / 2 = 215.67tb
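If you'd rather let the database do the arithmetic, here is the same math as a query (a sketch using the per-disk numbers worked out above):

-- usable tb for DATA+RECO: normal redundancy (/2), one cell offline
select round(2.765 * 12 * 6  / 2, 2) half_rack_tb,   -- 7 cells - 1
       round(2.765 * 12 * 13 / 2, 2) full_rack_tb    -- 14 cells - 1
from dual;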

Tuesday, July 31, 2012

What extended stats do I have on my database?

I've been starting to work with extended statistics to help the optimizer find the best plan. This is a great feature that is outlined by @sqlmaria (Maria Colgan) here.
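For background, a column-group extended statistic is created with dbms_stats.create_extended_stats (a sketch; the table and column names are hypothetical):

-- create the extension; the function returns its system-generated name
select dbms_stats.create_extended_stats('BGRENN','TAB1','(COL1,COL2)')
from dual;

-- regather table stats so the new column group gets statistics
exec dbms_stats.gather_table_stats('BGRENN','TAB1')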

But once you create extended statistics, how do you know what is there?  I wrote this query to find out what function-based indexes and extended statistics exist, and what their definitions are.

Here is my script.

column table_owner alias "owner" format a15
column table_name alias  "Table Name" format  a30
column function_index alias  "F Index" format  a8
column Index_name  alias  "Index Name"  format a30
column data_default alias  "Definition"  format a50
set pagesize 1000
select table_owner,
         table_name,
        nvl2(index_name,'YES','NO') function_index,
        index_name,
        data_default
        from
        (
select owner table_owner,table_name,
(select distinct index_name from dba_ind_columns b where a.column_name=b.column_name and a.owner=b.index_owner and a.table_name=b.table_name) index_name
,data_default
-- ,     DBMS_LOB.SUBSTR( to_lob(data_default),100,1)
 from dba_tab_cols a
  where virtual_column='YES' and hidden_column='YES'  and (owner not in ('SYS','WMSYS','XDB','SYSMAN','MDSYS','EXFSYS','PR_MDS') and owner not like 'APEX_%')
  )
order by table_owner,table_name;


and this is what the output looks like..

TABLE_OWNER     TABLE_NAME                     FUNCTION INDEX_NAME                     DATA_DEFINITION
--------------- ------------------------------ -------- ------------------------------ --------------------------------------------------
BGRENN          TAB_SCHR_PERD                      NO                                  COALESCE("COL1","COL2")
BGRENN          TAB2                              YES   IDX_TAB2                       "COL1"||' '||"COL2"
BGRENN          TAB3                               NO                                  COALESCE("COL1","COL2")
BGRENN          TAB4                              YES   IDX_TAB4                       COALESCE("COL1","COL2",0)
BGRENN          TAB4                              YES   IDX_TAB4                       COALESCE("COL3",0)
BGRENN          TAB5                              YES   IDX_TAB5                       COALESCE("COL1","COL2",0)
BGRENN          TAB6                              YES   IDX_TAB6                       NVL("COL1",(-1))
BGRENN          TAB6                              YES   IDX_TAB6                       NVL("COL2",(-1))
BGRENN          TAB6                              YES   IDX_TAB6                       NVL("COL3",(-1))
BGRENN          TAB6                              YES   IDX_TAB6                       NVL("COL4",'x')
BGRENN          TAB7                              YES   IDX_COMPOSITE                  "COL1"
BGRENN          TAB7                              YES   IDX_COMPOSITE                  "COL3"

Notice the FUNCTION column. This is "YES" or "NO" depending on whether the row is a function-based index or just extended statistics.

This should help you tell where the extended statistics are in your database.

Sunday, July 15, 2012

Exadata tips

I wanted to write up an Exadata tip that I learned.

Background: I wanted to do a simple "select count(1) from mytable".  mytable has a primary key on it.  The count seemed to be taking a long time for an Exadata.

First, the "select count(1) from mytable". You can see that it uses an index storage fast full scan.  The top wait event is "cell multiblock physical read".  The query does 2 million disk reads in 3 minutes 28 seconds.

But this seems slow...



select count(1)
from
     MYTABLE MYTABLE

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     56.46     208.27    2030378    2031797         20           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     56.46     208.27    2030378    2031797         20           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=2031797 pr=2030378 pw=0 time=208276577 us)
 592848893  592848893  592848893   INDEX STORAGE FAST FULL SCAN PK_MYTABLE (cr=2031797 pr=2030378 pw=0 time=245927651 us cost=523897 size=0 card=572441788)(object id 312310)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  library cache lock                              4        0.00          0.00
  Disk file operations I/O                       42        0.00          0.00
  library cache pin                               2        0.00          0.00
  SQL*Net message to client                       2        0.00          0.00
  cell single block physical read                15        0.00          0.00
  cell list of blocks physical read               2        0.00          0.00
  cell multiblock physical read               15959        0.26        152.20
  latch: object queue header operation            1        0.00          0.00
  SQL*Net message from client                     2       10.38         10.38
********************************************************************************



Next I did a full table scan (FTS):

select /*+full(MYTABLE) */ count(1) from MYTABLE MYTABLE;

You can see this did 2.2 million disk reads (more than the index scan), but the top wait event is now SQL*Net message from client. With the "cell smart table scan" there were very few waits, and each wait was much shorter.



select /*+full(MYTABLE) */ count(1)
from
     MYTABLE MYTABLE  

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     65.11      66.48    2224642    2225028         21           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     65.11      66.48    2224642    2225028         21           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=2225028 pr=2224642 pw=0 time=66486729 us)
 592848893  592848893  592848893   PARTITION RANGE ALL PARTITION: 1 33 (cr=2225028 pr=2224642 pw=0 time=140325566 us cost=533066 size=0 card=572441788)
 592848893  592848893  592848893    TABLE ACCESS STORAGE FULL MYTABLE PARTITION: 1 33 (cr=2225028 pr=2224642 pw=0 time=54479242 us cost=533066 size=0 card=572441788)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  Disk file operations I/O                      129        0.00          0.00
  gc current block 2-way                         65        0.00          0.00
  enq: KO - fast object checkpoint              101        0.00          0.02
  reliable message                               33        0.09          0.30
  gc current block 3-way                         13        0.00          0.00
  cell smart table scan                        1403        0.03          1.01
  gc cr block 3-way                               1        0.00          0.00
  gc current grant busy                          18        0.00          0.00
  gc cr block 2-way                              17        0.00          0.00
  gc cr multi block request                       5        0.00          0.00
  cell single block physical read                11        0.00          0.00
  cell list of blocks physical read               2        0.00          0.00
  gc cr grant 2-way                               3        0.00          0.00
  SQL*Net message from client                     2        9.62          9.62






Bottom line: if you want to do a count on a table, use the "FULL" hint.  Exadata is built for table scans, and this example shows that.

It should also make you rethink when to use indexes in an application; you can see they can hurt you in some cases.
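A quick way to confirm a scan was actually offloaded is to check your session's smart-scan statistics (a sketch; these cell statistics exist on Exadata):

select n.name, round(s.value/1024/1024) mb
from   v$statname n, v$mystat s
where  n.statistic# = s.statistic#
and    n.name in ('cell physical IO bytes eligible for predicate offload',
                  'cell physical IO interconnect bytes returned by smart scan');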

Friday, July 6, 2012

What happened to my sql (sql_id) ?


While finishing up a few things, I ran across a query that wasn't playing nicely. It had 4 different plans over the course of the last couple of days, and I wanted to see what happened.. so I came up with the nifty query below.  If you plug in a sql_id, it will go through the AWR history and return, ordered by date last executed, the plans grouped by plan_hash_value. Within each plan_hash_value it will give you the objects in the plan and when they were last analyzed.  From this you should see which plans are good, when they were last executed, and whether anything was analyzed that changed the plan.


set linesize 160
set pagesize 1000
break on plan_hash_value skip 1 nodup  on last_executed skip 1 nodup  on avg_exec_time skip 1
select object_owner ||'.'|| object_name object_name,
object_type,
a.plan_hash_value,
case object_type
  when 'INDEX' then (select last_analyzed from dba_indexes b where owner=object_owner and index_name=object_name)
  when 'TABLE'  then (select last_analyzed from dba_tables b where owner=object_owner and table_name=object_name)
 else null 
end last_analyzed,
 to_char((select max(end_interval_time) from dba_hist_snapshot b,
                                           dba_hist_sqlstat c 
                            where c.sql_id=a.sql_id and 
                                  c.plan_hash_value=a.plan_hash_value and 
                              b.snap_id=c.snap_id),'mm/dd/yy hh24:mi') last_Executed,
-- elapsed_time_delta is in microseconds; divide by 1e6 for seconds,
-- and guard against executions_delta summing to 0
to_char((select sum(elapsed_time_delta)/nullif(sum(executions_delta),0)
           from dba_hist_sqlstat d
          where d.sql_id=a.sql_id
            and d.plan_hash_value=a.plan_hash_value)/1000000,'999.99') avg_exec_time
 from DBA_HIST_SQL_PLAN  a
where a.SQL_ID='gbug7dg8adhgh'
 and object_type in ('INDEX','TABLE')
order by last_executed desc ,a.plan_hash_value , last_analyzed desc;


Here is an example of the output


              OBJECT_NAME                                 OBJECT_TYPE          PLAN_HASH_VALUE LAST_ANALYZED       LAST_EXECUTED  AVG_EXE
-------------------------------------------------------------- -------------------- --------------- ------------------- -------------- -------
MY_SCHEMA.SNP_CDC_SUBS                                     TABLE                     2518369181 2012-07-06 09:25:15 07/06/12 10:00  791.37
MY_SCHEMA.SNP_CDC_SUBS                                     TABLE                                2012-07-06 09:25:15
MY_SCHEMA.D$TAB_REG                                        TABLE                                2012-07-06 09:25:06
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
ERD.DIM_TABS_COMP_PLCY_AGMT                                TABLE                                2012-07-06 00:39:33
MY_SCHEMA.TAB_COMP_PLCY_AGMT                               TABLE                                2012-05-25 09:20:39
MY_SCHEMA.IDX_WCPA_AGMT_ID                                 INDEX                                2012-05-25 09:20:39
MY_SCHEMA.TAB_COMP_PLCY_ST_CLSF_VT                         TABLE                                2012-05-15 18:49:43
MY_SCHEMA.TAB_COMP_PLCY_ST_CLSF                            TABLE                                2012-05-15 18:49:43
MY_SCHEMA.WPTD_COMP_PER_TAX                                TABLE                                2012-05-15 18:39:30
MY_SCHEMA.TAB_REG                                          TABLE                                2012-05-15 18:31:18
MY_SCHEMA.CO_TAB                                           TABLE                                2012-05-15 18:27:50
MY_SCHEMA.TAB_PAYR                                         TABLE                                2012-05-15 18:26:47
MY_SCHEMA.AGMT_REG                                         TABLE                                2012-05-15 18:21:09



MY_SCHEMA.SNP_CDC_SUBS                                     TABLE                     1903861587 2012-07-06 09:25:15 07/06/12 09:00  882.94
MY_SCHEMA.SNP_CDC_SUBS                                     TABLE                                2012-07-06 09:25:15
MY_SCHEMA.D$TAB_REG                                        TABLE                                2012-07-06 09:25:06
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
MY_SCHEMA.J$TAB_REG                                        TABLE                                2012-07-06 09:24:50
ERD.DIM_TABS_COMP_PLCY_AGMT                                TABLE                                2012-07-06 00:39:33
MY_SCHEMA.TAB_COMP_PLCY_AGMT                               TABLE                                2012-05-25 09:20:39
MY_SCHEMA.IDX_WCPA_AGMT_ID                                 INDEX                                2012-05-25 09:20:39
MY_SCHEMA.TAB_COMP_PLCY_ST_CLSF_VT                         TABLE                                2012-05-15 18:49:43
MY_SCHEMA.TAB_COMP_PLCY_ST_CLSF                            TABLE                                2012-05-15 18:49:43
MY_SCHEMA.WPTD_COMP_PER_TAX                                TABLE                                2012-05-15 18:39:30
MY_SCHEMA.TAB_REG                                          TABLE                                2012-05-15 18:31:18
MY_SCHEMA.CO_TAB                                           TABLE                                2012-05-15 18:27:50
MY_SCHEMA.TAB_PAYR                                         TABLE                                2012-05-15 18:26:47
MY_SCHEMA.AGMT_REG                                         TABLE                                2012-05-15 18:21:09