Friday, October 5, 2012

Anayzing query Cost example

Well,
  Here is the issue I've been dealing with.  The query cost doesn't stay consistent, and I was wondering if a profile would help keep it stable, and if it keeps things better or worse.

  I'm sure you have all run into this.  For some reason your cardinality can't be well estimated, the optimizer changes plans on you, and you want to know if a profile will help or hurt.

  The good news I had the perfect test case.  The query was part of a data load, and there was a driving table with a varying number of rows in it.  This is how I went about analyzing.

First I created a table to store the results.

create table bgrenn.mytable
 (ROW_COUNT                                          NUMBER,
 PLAN_HASH_VALUE                                    VARCHAR2(15)
 COST                                               number);




Next I took a copy of my driving table ( a full size table). I used this to create a smaller copy of the table.



declare


row_count number:= 0;
v_statement_id varchar2(10);
v_plan_hash_value varchar2(15);
v_cost number;
v_statement varchar2(4000);

begin

for row_count in 0..100000  loop


execute immediate 'drop table BGRENN.TMP_DRIVER purge';
execute immediate 'create table   BGRENN.TMP_DRIVERT as select * from  BGRENN.TMP_DRIVERB where rownum<' ||  to_char(row_count,'9999999');

dbms_stats.gather_table_stats('BGRENN','TMP_DRIVERT');

v_statement_id := to_char(row_count,'99999');

v_statement := 'explain plan SET STATEMENT_ID = ''' || v_statement_id || '''  for ' ||
'select   * ' ||
' from BGRENN.TMP_DRIVERT TMP_DRIVERA  ' ||
' INNER JOIN BGRENN.TABA TABA ON TMP_DRIVERA.ID=TABA.ID  ' ||
' INNER JOIN BGRENN.TABB ON TABA.GCC_ID=TABA.ID AND TABA.LOCN <> 8031431  ' ||
' INNER JOIN BGRENN.TABC TABC ON TABA.ID=TABC.ID AND TABC.id = 1168583  ' ||
' INNER JOIN BGRENN.TABD TABD ON TABA.ID=TABD.ID  ' ||
' INNER JOIN BGRENN.TABE TABE ON TABD.ID=TABE.ID  ' ||
' INNER JOIN BGRENN.TABF TABF ON TABD.ID=TABF.ID  ' ||
' INNER JOIN BGRENN.TABG TABG ON TABF.ID=TABG.ID  ' ||
' INNER JOIN BGRENN.TABH TABH ON TABG.ID=TABH.ID AND TABH.SEQ_NBR < 500  ' ||
' INNER JOIN BGRENN.TABI ON TABC.ID=ID  ' ||
' INNER JOIN BGRENN.TABJ TABJ ON TABC.ID=TABJ.ID  ' ||
' INNER JOIN BGRENN.TABK TABK ON TABJ.ID=TABK.ID and TABK.id in ( 1221589, 1219009, 1191882, 1221590, 1171956)  ' ||
' LEFT OUTER JOIN ERD.TABL TABL ON TABH.ID=TABL.ID  ' ||
' LEFT OUTER JOIN ERD.TABM TABM ON TABE.ID=TABM.ID  ' ||
' where (1=1)';

dbms_output.put_line(v_statement);

execute immediate v_statement;



SELECT substr(plan_table_output,18,12) into v_plan_hash_value  FROM TABLE(dbms_xplan.display(statement_id => v_statement_id)) where rownum <2;
select cost into v_cost from plan_table where id=0 and rownum<2 and statement_id=v_statement_id;

insert into bgrenn.mytable values(row_count,v_plan_hash_value,v_cost);

delete from plan_table where statement_id=v_statement_id;
commit;
end loop;

end;
/                                            number);




This produced a set of rows in the table with the cost.

I then copied the table, installed a profile and reran.

After joining the 2 tables on row count I created an "r" program and analyzed the results.
Here is the program.


# open Libarary psych for functions
library(psych)

#  open file
query_data <- read.table("c:/r/data/query_output.txt", header=T)

#what are the variables
describe (query_data)


pdf("c:/r/data/querydata.pdf")
plot(query_data$ROW_COUNT,query_data$orig_cost,type='l',col="red")
lines(query_data$ROW_COUNT,query_data$new_cost,type='l',col="green")
dev.off()


And here is the output ..  The red is the orginal plan, and the green is the plan with the profile. I can see that the cost of the profile plan remains more consistent. and is probably a better choice.