Oracle Database Architecture blog sharing my experiences as an Oracle architect.
Saturday, September 25, 2010
Why HCC is Exadata only
SAS drives spinning at 15k rpm can each produce about 200 MB of data per second. Hybrid Columnar Compression gets, on average, a 50x compression rate. 200 MB x 50 = 10 GB of uncompressed data per second being read PER DISK.
There are 100+ disks that can be read. This causes 2 issues.
a) The data is actually compressed/uncompressed at the storage tier. All the CPUs in the storage servers are utilized to make this happen. Only Exadata can take advantage of the storage CPUs, through the storage software.
b) The data that is uncompressed is huge. The disks together can return 20.8 GB of data per second, but if you do get 50x compression, you are now trying to work with about 1 TB of data per second.
Even if you are running InfiniBand, the system can't handle this volume of data. Predicate elimination and column elimination limit the data returned from the storage tier, making the processing of the data possible.
Without the storage software, along with the CPUs at the storage level uncompressing data AND eliminating data, it is impossible to process the volume of data produced from HCC.
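The arithmetic behind those figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the inputs are the estimates quoted in this post, the variable names are mine, and I am assuming decimal units (1 GB = 1000 MB) to match the rounding above.

```python
# Back-of-the-envelope check of the numbers in this post.
# All inputs are the post's own estimates, not measurements.
MB_PER_GB = 1000  # assumption: decimal units, to match the post's rounding

disk_scan_mb_s = 200         # per-disk scan rate quoted above
hcc_ratio = 50               # compression ratio assumed in the post
aggregate_scan_gb_s = 20.8   # combined scan rate of all disks, as quoted

# Uncompressed data represented by one disk's scan stream
per_disk_logical_gb_s = disk_scan_mb_s * hcc_ratio / MB_PER_GB

# Number of disks the 20.8 GB/s figure implies at 200 MB/s each
implied_disks = aggregate_scan_gb_s * MB_PER_GB / disk_scan_mb_s

# Uncompressed data represented by all disks scanning at once
total_logical_gb_s = aggregate_scan_gb_s * hcc_ratio

print(f"per disk, uncompressed:  {per_disk_logical_gb_s:.0f} GB/s")
print(f"disks implied:           {implied_disks:.0f}")
print(f"all disks, uncompressed: {total_logical_gb_s:.0f} GB/s (roughly 1 TB/s)")
```

The point of the exercise is the last line: at that ratio, the uncompressed stream is far larger than anything that could be shipped to the database servers, which is why the filtering has to happen at the storage tier.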
Monday, July 5, 2010
DB Replay AKA RAT (Real Application Testing)
I have made some interesting observations I want to share.
- When you start your replay, remember that your cache is "cold". If you started your capture with a warm cache, it will take a bit for the cache to catch up.
- If you started your capture during processing, you will see a lot of divergence. This is normal.
- A single query taking longer can make a huge impact on the replay (I will explain in detail below).
- Replaying twice in a row will not give the same results, but the results should be within 5% of each other.
So why does a single query make such a huge impact? It all has to do with how the replay syncs the workload. You may have 100 or more different sessions all acting independently, but Oracle keeps them all in sync by SCN. This is good, right? Well, yes, but it can affect the replay. What happens if a query that usually takes .04 seconds, and is followed by an update, takes 5 minutes? Well, the replay gets held up by 5 minutes, because the SCN doesn't move until the query is finished. Multiply that by 12 executions, and you've lost an hour out of your replay. WOW.
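To put a number on that, here is a small Python sketch of the worst case. It is a toy model and not how Oracle implements synchronization: it simply assumes every slow execution sits in front of a commit the rest of the replay has to wait for, so the delays add up. The function name is made up for illustration.

```python
def replay_stall_minutes(capture_secs, replay_secs, executions):
    """Extra wall-clock minutes lost when each execution holds up the replay.

    Toy assumption: every execution is on the critical path, so the
    per-execution slowdown is simply multiplied by the execution count.
    """
    return (replay_secs - capture_secs) * executions / 60

# 0.04 s at capture time, 5 minutes (300 s) at replay time, 12 executions
stall = replay_stall_minutes(0.04, 300, 12)
print(f"replay falls behind by about {stall:.0f} minutes")
```

In a real replay some of those waits would overlap with other work, so treat this as an upper bound rather than a prediction.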
The best suggestion I have is to look at the DB time, the CPU time, and the reads. You may find that "overall" the replay used less DB time, even though it took longer to replay.
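A comparison like that is easy to script once you have the totals from the capture and the replay. The sketch below is only an illustration: the metric names and every number in it are made up, and where you get the real values from depends on your own reports.

```python
def pct_change(capture, replay):
    """Percentage change from capture to replay (negative = replay used less)."""
    return (replay - capture) / capture * 100

def compare(capture_stats, replay_stats):
    """Percentage change for every metric present in the capture stats."""
    return {name: pct_change(capture_stats[name], replay_stats[name])
            for name in capture_stats}

# Made-up numbers, purely for illustration
capture = {"elapsed_s": 3600, "db_time_s": 9000,
           "cpu_time_s": 4000, "physical_reads": 2_000_000}
replay = {"elapsed_s": 4200, "db_time_s": 8100,
          "cpu_time_s": 3800, "physical_reads": 1_500_000}

for name, delta in compare(capture, replay).items():
    print(f"{name:>15}: {delta:+.1f}%")
```

With these sample values the replay takes longer on the clock while using less DB time, less CPU, and fewer reads, which is exactly the situation described above.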
Saturday, March 13, 2010
Direct Path Reads
From what I can figure out, they kick in for full table scans (FTS), and only in certain cases. You can turn them off with event 10949. They are affected by the size of the buffer cache, how big the table is, and how many of the table's blocks are already in memory. Oracle has a "secret sauce" to determine when they kick in. I haven't had any luck finding the "secret sauce", just information from others on what the ingredients are, not the amounts of each.
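For reference, event 10949 is normally set at the session level with Oracle's generic numeric-event syntax. The Python helper below only builds the statement text. The function name is made up, level 1 is my assumption (it is the level I have usually seen quoted), and actually running the statement through a database driver is left out. Try it on a test system first.

```python
def event_10949_sql(turn_off_direct_reads=True, level=1):
    """Build the ALTER SESSION statement that sets or clears event 10949.

    Setting the event is what turns serial direct path reads OFF;
    clearing it returns the session to the default behavior.
    `level=1` is an assumption, not something confirmed in this post.
    """
    if turn_off_direct_reads:
        return ("ALTER SESSION SET EVENTS "
                f"'10949 trace name context forever, level {level}'")
    return "ALTER SESSION SET EVENTS '10949 trace name context off'"

print(event_10949_sql(True))
print(event_10949_sql(False))
```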
There are advantages to them:
- They can be much faster than conventional buffered reads (db file scattered read, the multiblock read normally seen for a full table scan)
- They won't age blocks out of memory that you might want to keep
Disadvantages:
- If the same query is executed again, the blocks must be read from disk again.
- Running a FTS on a table in multiple sessions won't share blocks, and can cause a large amount of I/O.
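The second disadvantage is easy to see with a toy model. The Python sketch below counts physical block reads for several sessions scanning the same table. It is deliberately idealized: it assumes that in the buffered case the whole table fits in the buffer cache and stays there, and the table size is a made-up number.

```python
def blocks_read_from_disk(table_blocks, sessions, direct_path):
    """Toy model: physical block reads for `sessions` full scans of one table."""
    if direct_path:
        # Each session reads into its own private memory; nothing is shared.
        return table_blocks * sessions
    # Idealized buffered case: the first scan loads the cache, the rest reuse it.
    return table_blocks

table_blocks = 1_000_000  # hypothetical table size in blocks
for sessions in (1, 4, 16):
    direct = blocks_read_from_disk(table_blocks, sessions, direct_path=True)
    buffered = blocks_read_from_disk(table_blocks, sessions, direct_path=False)
    print(f"{sessions:>2} sessions: direct={direct:>10,} buffered={buffered:>10,}")
```

Real systems land somewhere in between, since a large table rarely stays fully cached, but the direct path I/O does grow with the number of sessions.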
I also found that an object is much more likely to use direct path reads immediately after startup, since there are no blocks currently cached (thanks Jonathan Lewis for this tidbit!).
Here is as much additional information as I can find on this topic:
- http://oraclue.com/2009/07/17/direct-path-reads-and-serial-table-scans-in-11g/
- http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-reads-what-is.html
- http://dioncho.wordpress.com/2009/07/21/disabling-direct-path-read-for-the-serial-full-table-scan-11g/
I also started asking some Exadata experts how direct path reads relate to Exadata. It seems that with Exadata, with caching at both the DB layer and the storage layer, along with the InfiniBand connection to storage, direct path reads are not an issue. Exadata caches at enough layers that a direct path read from the flash cache on the storage is fast enough to make it worthwhile. It's almost as if Exadata was made to best utilize direct path reads, and other storage configurations will suffer from bottlenecking... Coincidence? You decide.
I will let you all know how I make out with them.