Wednesday, October 27, 2010

Concurrency and parallelism

I've had a lot of discussions with some very "seasoned" professions on how to handle a high level concurrency. Most of these professions point to the new features of 11gr2 and and parallization..

True 11gr2 has added a new way of handling degree of parallization.

There are some new parameters

parallel_degree_policy
PARALLEL_MIN_TIME_THRESHOLD

These control how parallism is handled.. They can be used to actually create a funnel to ensure the system isn't flooded.

The problem is all this, is that parallelism has a price.. Take a small efficient query, and turn on these parameters.. Guess what happens when you ramp up and run 500 of the same query concurrently ? You see much lower throughput (I've seen as much as 10x lower throughput). Why ?? The overhead of parallel query can be quite high, and can consume more time than even CPU in your AWR report.



Event Waits Time(s) (ms) time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: JX - SQL statement queue 71 7,815 1.E+05 99.3 Scheduler
DB CPU 96 1.2
PX Nsq: PQ load info query 46 9 201 .1 Other
enq: RD - RAC load 90 2 20 .0 Other
PX Deq: reap credit 152,105 1 0 .0 Other


The moral of the story is, parallism is good for longer queries.. For shorter queries your milege may vary

Adaptive Direct Path Reads

This is one of my favorite topics, as it keeps coming up.

Well as you read the title, you are probably going huh ?? Until you upgrade to 11g, you probably just think direct reads are for reading temp only.. Well it all changes with 11g. Full table scans of large tables turn from "db scattered read" to "direct path read"..


What does this mean ?? well the good news is it runs about 4 times faster than the old fasion reads.. How does it do this ?? It bypasses the SGA.. Is this good ? probably.. especially for full table scans.. it doesn't force anything out of your buffer cache, and you get faster reads. The only concern I would have is if you wanted to read a lot of data into memory, I don't know how to burn things into cache. I tried to disable it to do timings, but no luck. Even with the optimizer set back to 10g it still does direct path reads.

Here is some great information on it.

http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-reads-what-is.html

http://shallahamer-orapub.blogspot.com/2010/01/mystery-surrounding-11g-and-direct.html

I've had some bad experiences with Direct path reads, and concurrency. If you think about lots and lots concurrent sessions, doing direct path reads you start to imagine how this can reek havoc. None of these sessions share the results of the read, and they all independently read disk blocks.. Eventually you run out of runway for I/O.

Happy tuning.

previous blog posts

If anyone is interested in my previous blog posts, you can find them at

http://www.unyoug.com/forums/viewforum.php?f=1

Monday, October 18, 2010

Infiniband coming to a town near you

As the Exadata takes off (I've heard it is REALLY taking off), more and more vendors are opening their eyes to the bottleneck of I/O.

You are going to see some other solutions come to the market that are infiniband based. You are also going to see more solutions like the Storage Cell's.. Just look at the IBM XIV ! they have storage cells just like the exadata with large caches. The concept is catching on..

But what does the Exadata have besides the obvious ? It has the ability to parallelize the I/O at the storage level.. I'm sure you going yea. I knew that.. but think about it.

a) Exadata.. run a non-parallel query that does a FTS on a 5tb table. You will marshall all the resources of the I/O from a single query on a single NODE

b) XXXXXX.. Run a non-parallel query, and you only will be able to marshall all the I/O that the single CPU can handle.

Sure you can build an Exadata like solution, but in order to utilze the power of the storage/infiniband subsystem you need to parallelize across multiple CPU's.. This uses a lot of CPU's, and parallization might not be the best plan for all queries.

In my mind, this is big bonus of the exadata.. Parallize or not, you can do FTS's at 20.8g/s

Wednesday, October 6, 2010

Exadata and indexes

This has been a very interesting topic around my shop.. Some people say that you can get rid of all you indexes, some people say no..

Well first lets look at why you have indexes and rule out those as removal candidates.

1) Indexes that support Primary keys. Gotta keep those right ?

2) Indexes that support RI to avoid locking. OLTP ? Gotta keep those.

For a lot of OLTP applications, just the 2 above criteria is enough to keep most of your indexes in play. But what about everything else.

Here is what I've been seeing. The exadata can scan like crazy, but there is a limit (20.8 on a full rack, do the math for your configuration). If you have a FTS on a table containing 50G, you can see that you utilizing ALL I/O for almost 3 seconds. if you have any concurrency, you can imagin what happens.

So in my mind the answer is to keep indexes where they can significantly limit the data access.
Concurrency.

Now that I've had a few beers, and few cups of coffee, I've had time to arrange brain cells in the right trays.. This is what I've found on Concurency with a table doing a FTS.

First.. single query.. 33g. 1/2 rack does 10.4g/second as advertised.. the single query doing a FTS runs in 3.3 seconds (or so).

Now scale up to 10 processes.. The 10 processes all scour 33.g gig apiece, the time goes up. The secret is to cut down the I/O requests at the DB layer to limit the data scoured.

Monday, October 4, 2010

Concurrency on the Exadata

Now that I have some benchmarks, I'm starting to delve into some testing to find out how it scales up.. I started with a large table 200+ million rows.

My base query did a FTS and returned one row of data.


1 execution runs in 3 seconds (DOP 32).

Once I scale up to 100 simultaneus executions, it runs longer, but I can't figure out the average execution time (parallel query skews the numbers).

In looking at the resource usage for both the database nodes and the storage nodes, I found the database nodes are almost Idle, and the storage nodes (7 of them) are producing about 10g of data/second. The cpu usage is about 7% user and 30% wait. When looking at the AWR information, all the time is still going to I/O waits. 399 seconds out of the 444 seconds are I/O wait times.. It appears that the Exadata does fantastic for a single query.. Once you execute that single query 100 times simultaneously, the times start to slow down.

I'm going to do more experients to see how I can get it to scale up nicer :)