Hi all,
I have been working all week on trying to figure out why a query went to hell when we partitioned the tables. I dug into it, and found one good fix.. But I can't implement it.
The detail on what happened in my last post.. Keep in mind I found that issue, but working through this one, and moving the bottleneck.
Here is the problem.. They are joining to a GTT (global temporary table), but they are using a function on the column in the table. ARGH.. They are making it impossible for the optimizer to find the best plan.
Here is an example of what's happening...
First here the GTT I have been working all week on trying to figure out why a query went to hell when we partitioned the tables. I dug into it, and found one good fix.. But I can't implement it.
The detail on what happened in my last post.. Keep in mind I found that issue, but working through this one, and moving the bottleneck.
Here is the problem.. They are joining to a GTT (global temporary table), but they are using a function on the column in the table. ARGH.. They are making it impossible for the optimizer to find the best plan.
Here is an example of what's happening...
CREATE GLOBAL TEMPORARY TABLE my_temp_table (
tmp_strt_dt date,
tmp_end_dt date
) ON COMMIT preserve ROWS;
Here is table and lets load 128 rows of data into it.
create table test_table
( strt_dt date,
end_dt date,
col1 varchar(1));
insert into test_table values(sysdate-1000,sysdate+1000,'Y');
insert into test_table select * from test_table;
insert into test_table select * from test_table;
insert into test_table select * from test_table;
insert into test_table select * from test_table;
insert into test_table select * from test_table;
insert into test_table select * from test_table;
insert into test_table select * from test_table;
commit;
Now lets insert into the temporary table, and analyze both tables.
insert into my_temp_table values(sysdate,sysdate);
exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'MY_TEMP_TABLE',estimate_percent=>null, cascade=>true, method_opt=> 'FOR ALL COLUMNS SIZE 1');
exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'TEST_TABLE',estimate_percent=>null, cascade=>true, method_opt=> 'FOR ALL COLUMNS SIZE 1');
Now for my query ..
select * from my_temp_table ,test_table
where "END_DT">=TRUNC("TMP_STRT_DT") AND
"STRT_DT"<=TRUNC("TMP_END_DT");
and the explain plan.. Notice the cardinality of 1, though there are 128 rows that match
Plan hash value: 1231029307
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 34 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| MY_TEMP_TABLE | 1 | 16 | 2 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| TEST_TABLE | 1 | 18 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("END_DT">=TRUNC(INTERNAL_FUNCTION("TMP_STRT_DT")) AND
"STRT_DT"<=TRUNC(INTERNAL_FUNCTION("TMP_END_DT")))
So what to do ??? I removed the trunc function, and the cardinality was right...
select * from my_temp_table ,test_table
where "END_DT">="TMP_STRT_DT" AND
"STRT_DT"<="TMP_END_DT";
Plan hash value: 1231029307
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 128 | 4352 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 128 | 4352 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| MY_TEMP_TABLE | 1 | 16 | 2 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| TEST_TABLE | 128 | 2304 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("END_DT">="TMP_STRT_DT" AND "STRT_DT"<="TMP_END_DT")
Unfortunately, I can't change the code.. How do I get the optimizer to get the right cardinality ?? Function based indexes to the rescue. Here is what I did.. First create the indexes on the 2 columns.
create index my_temp_table_fbi1 on my_temp_table(TRUNC("TMP_STRT_DT"));
create index my_temp_table_fbi2 on my_temp_table(TRUNC("TMP_END_DT"));
Next insert into the table, and gather stats.. Notice that I am using "hidden" column clause.
insert into my_temp_table values(sysdate,sysdate);
exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'MY_TEMP_TABLE',estimate_percent=>null, cascade=>true, method_opt=> 'FOR ALL HIDDEN COLUMNS SIZE 1');
Now to run my query and look at the cardinality.
elect * from my_temp_table ,test_table
where "END_DT">=TRUNC("TMP_STRT_DT") AND
"STRT_DT"<=TRUNC("TMP_END_DT");
Plan hash value: 1231029307
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 128 | 6400 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 128 | 6400 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| MY_TEMP_TABLE | 1 | 32 | 2 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| TEST_TABLE | 128 | 2304 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("END_DT">=TRUNC(INTERNAL_FUNCTION("TMP_STRT_DT")) AND
"STRT_DT"<=TRUNC(INTERNAL_FUNCTION("TMP_END_DT")))
Notice that the index is not used for the query plan, but by having the index, and gathering statistics, the optimizer is able to figure out the correct cardinality even though a function is used for the column. Problem solved without changing the query.
As always, you can find my script here
in 11g, you can do it without the function based index:
ReplyDelete..method_opt=> 'FOR COLUMNS TRUNC(TMP_STRT_DT))SIZE 1, (TRUNC(TMP_END_DT))SIZE 1' ..
Jan
Jan,
ReplyDeleteI appreciate the suggestion. Yes this would work if the table were a "real" table rather than a GTT. Unfortunately this is not supported with GTT's
exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'MY_TEMP_TABLE',estimate_percent=>null, cascade=>true, method_opt=> 'FOR COLUMNS (TRUNC(TMP_STRT_DT)) SIZE 1, (TRUNC(TMP_END_DT)) SIZE 1');
ERROR at line 1:
ORA-20000: Unable to create extension: not supported for a temporary table