Wednesday, October 6, 2010

Exadata and indexes

This has been a very interesting topic around my shop.. Some people say that you can get rid of all you indexes, some people say no..

Well first lets look at why you have indexes and rule out those as removal candidates.

1) Indexes that support Primary keys. Gotta keep those right ?

2) Indexes that support RI to avoid locking. OLTP ? Gotta keep those.

For a lot of OLTP applications, just the 2 above criteria is enough to keep most of your indexes in play. But what about everything else.

Here is what I've been seeing. The exadata can scan like crazy, but there is a limit (20.8 on a full rack, do the math for your configuration). If you have a FTS on a table containing 50G, you can see that you utilizing ALL I/O for almost 3 seconds. if you have any concurrency, you can imagin what happens.

So in my mind the answer is to keep indexes where they can significantly limit the data access.
Concurrency.

Now that I've had a few beers, and few cups of coffee, I've had time to arrange brain cells in the right trays.. This is what I've found on Concurency with a table doing a FTS.

First.. single query.. 33g. 1/2 rack does 10.4g/second as advertised.. the single query doing a FTS runs in 3.3 seconds (or so).

Now scale up to 10 processes.. The 10 processes all scour 33.g gig apiece, the time goes up. The secret is to cut down the I/O requests at the DB layer to limit the data scoured.

Monday, October 4, 2010

Concurrency on the Exadata

Now that I have some benchmarks, I'm starting to delve into some testing to find out how it scales up.. I started with a large table 200+ million rows.

My base query did a FTS and returned one row of data.


1 execution runs in 3 seconds (DOP 32).

Once I scale up to 100 simultaneus executions, it runs longer, but I can't figure out the average execution time (parallel query skews the numbers).

In looking at the resource usage for both the database nodes and the storage nodes, I found the database nodes are almost Idle, and the storage nodes (7 of them) are producing about 10g of data/second. The cpu usage is about 7% user and 30% wait. When looking at the AWR information, all the time is still going to I/O waits. 399 seconds out of the 444 seconds are I/O wait times.. It appears that the Exadata does fantastic for a single query.. Once you execute that single query 100 times simultaneously, the times start to slow down.

I'm going to do more experients to see how I can get it to scale up nicer :)

Friday, October 1, 2010

Exadata storage Software

How else should I spend a Friday night, other than drinking hard cider, and running performance numbers on the Exadata storage software.

This is my dive into the storage software and WOW is it impressive.. I am selecting from a 200+ million row table (no indexes).. Without storage software it takes 3 minutes to scan the whole table.. really impressive. Then with the storage indexing it takes 30 seconds to come back with a distinct column value. 6 times faster..

Then I was really impressed when I used a Unique key lookup.. No index, it took 8 seconds to find the data, compared to 189 seconds. 23x faster with the Storage software.

Next I made the table parallel 64.. Now it comes back in 3 seconds (no storage software), and 1 second with storage software.. Unbelievable numbers.

One of the first things I noticed is that the Exadata makes you rethink your redo log sizes. When loading data a lot of my waits are waiting on the redo to flush out because it is so small.

All in all the storage software looks pretty impressive.

Saturday, September 25, 2010

Rear view of an Exadata


Why HCC is exadata only

First the Physics.

The SAS drives spinning at 15k rpm's can produce 200m of data/second.Hypbrid Columnar compression get's on average 50x compression rate.200m x 50x = 10g of data PER DISK is read.

There are 100+ disks that can be read. This causes 2 issues.

a) The data is actually compressed/uncompressed at the storage tier. All the CPU's in the storage servers are utilized to make this happen.. Only the exadata can take advantage of the storage CPU's through the storage software

b) The data that is uncompressed is huge.. The disk can return 20.8g of data per second, but if you do get 50x compression, you are now trying to work with 1tb of disk/second..

Even if you are running infiniband, the system can't handle this volume of data.. The predicate elimination, and column eliminate will limit the data returned from the storage tier, making the processes of the data possible.
Without the storage software along with the CPU's at the storage level uncompressing data AND eliminating data, it is impossible to process the volume of data produced from HCC.

Monday, July 5, 2010

DBreplay AKA RAT (real application testing)

I have been spending a lot of time lately working with DBReplay.. This is an AWESOME arrow to have in your quiver to test upgrading to 11g.

I have made some intersting observations I want to share..

  • When you start your replay, remember that your cache is "cold", if you started your capture with a warm cache, it will take a bit for the cache to catch up.
  • If you started your capture during processing, you will see a lot of divergence.. This is normal
  • A single query taking longer can make a huge impact on the replay (I will explain in detail below).
  • Replaying twice in a row will not have the same results, but the results should be withing 5% of each other.

So while does a single query make sunch a huge impact ??? It all has to do with how the replay synch all the workload. You may have 100 or more different sessions all acting independently, but Oracle keeps it all in synch by SCN number.. This is good right ? Well, yes, but it can affect the replay. What happens if a query that usually takes .04 seconds, and is followed by an update, takes 5 minutes ? Well the replay gets held up by 5 minutes, because the SCN doesn't move until the query is finished.. Multiply that by12 executions, and you've lost an hour out of your replay.. WOW.

The best suggestion I have is to look at the DB time, the CPU time, and the reads. You may find that "overall" the replay used less DB time, even though it took longer to replay.