Saturday, December 17, 2011

FTS vs Index scan on Exadata

I loaded up some data on my exadata, and created a pimary key on the table. The table has about 1.8 Billion rows, and is about 186g of space.  I decided to do a select count on the table to see how many rows were really there.

select count(1) from sp.wxyz_detl;

The query took about 5 minutes and 36 seconds (336 seconds) .

I was astonished !  186g took almost 6 minutes ? Well on an exadata that seemed really slow.  I took a closer look at the plan.

-----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |       |   653K(100)|          |       |       |
|   1 |  SORT AGGREGATE        |                    |     1 |            |          |       |       |
|   2 |   PARTITION RANGE ALL  |                    |  1850M|   653K  (6)| 02:10:42 |     1 |    29 |
|   3 |    INDEX FAST FULL SCAN| PIDX_WXYZ_WRKR_ID  |  1850M|   653K  (6)| 02:10:42 |     1 |    29 |
-----------------------------------------------------------------------------------------------------




Well that looked good.  INDEX FAST FULL SCAN should offload to the cells right ? it says "full scan" what could be simpler..

I looked deeper at the V$SQLSTATS table, but it didn't show what I expected.

DISK_READS BUFFER_GETS EXECUTIONS CPU_TIME            ELAPSED_TIME   CELL_OFFLOAD_ELIG_BYTES PHYSICAL_READ_BY  IO_CELL_UNC_bytes OFFLOAD_RETURNED
---------- ----------- ---------- ---------------- ---------------- ------------------------ ----------------   ----------------   ----------------
 3,165,962   3,172,055          1      112,050,965      366,230,300                0            25,935,560,704                0                0


The Offload_eligable bytes is 0, and the IO_CELL_UNCOMPRESSED_BYTES is 0.

Huh ?

Now I figured I would force a full table scan and see what happens


select /*+ full(t) */ count(1) from spr.wxyz_detl t;


-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |  1958K(100)|          |       |       |
|   1 |  SORT AGGREGATE             |           |     1 |            |          |       |       |
|   2 |   PARTITION RANGE ALL       |           |  1850M|  1958K  (3)| 06:31:40 |     1 |    29 |
|   3 |    TABLE ACCESS STORAGE FULL| WXYZ_DETL |  1850M|  1958K  (3)| 06:31:40 |     1 |    29 |
-------------------------------------------------------------------------------------------------



Looks like the cost went up, the expected elapsed time went up.. this looks like a worse plan to the optimizer, but here are the stats from v$sqlstats.

DISK_READS BUFFER_GETS EXECUTIONS CPU_TIME         ELAPSED_TIME     CELL_OFFLOAD_ELIG_BYTES PHYSICAL_READ_BY  IO_CELL_UNC_bytes  OFFLOAD_RETURNED
---------- ----------- ---------- ---------------- ---------------- ------------------------ ---------------- ---------------- ----------------
 8,684,661   8,685,010          1       36,904,390       52,623,383           71,144,235,008   71,144,742,912   71,324,041,216   25,825,191,648




Wow, look at that.. Using the FULL hint caused the cell offload eligability to change, and the elapsed time is now 52 seconds.  Big change and exactly what I wanted.

I did some searching and came up with this blog with a similar issue, but the key was a reverse lookup.

http://martincarstenbach.wordpress.com/2011/08/22/why-is-my-exadata-smart-scan-not-offloading/


This is a very simple schema (one table, one PK).. The PK was necessary because we are using Golden Gate to insert the data, and we don't want duplicates.

1 comment: