Sunday, July 24, 2011

dbreplay vs SPA (cage match).


DBreplay vs SPA (SQL Performance Analyzer)




DBreplay
– This product is utilized through the following steps (a rough sketch of the corresponding API calls follows the list):



  • Capture a production workload during a critical time. All Statements are captured and stored.


  • Copy (or mount) the capture files on your “replay” server.


  • Preprocess the workload


  • Restore your database, up to the point in time of the capture (the capture can give you the exact SCN needed).


  • Replay the workload on the copy of production


  • Compare the replay to the original capture statistics (through AWR). You can also compare different runs to each other.
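For reference, here is roughly what those steps look like using the DBMS_WORKLOAD_CAPTURE and DBMS_WORKLOAD_REPLAY packages (a minimal sketch; the capture name and directory are made up, and I'm leaving out filters and the wrc client options):

-- On the source system, bracket the capture window
exec DBMS_WORKLOAD_CAPTURE.START_CAPTURE(name => 'PEAK_CAPTURE', dir => 'CAPTURE_DIR');
--   ... let the production workload run ...
exec DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE;

-- On the replay system, after copying the capture files and restoring the database
exec DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir => 'CAPTURE_DIR');
exec DBMS_WORKLOAD_REPLAY.INITIALIZE_REPLAY(replay_name => 'PEAK_REPLAY', replay_dir => 'CAPTURE_DIR');
exec DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY;
-- Start one or more wrc replay clients at the OS level, then:
exec DBMS_WORKLOAD_REPLAY.START_REPLAY;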


Pros


  • The entire workload is replayed with the same data set as your source.


  • GTTs (global temporary tables) are primed properly.


  • The workload is the same as production (the top executed SQL statements are executed multiple times).


  • Cardinality feedback, and other optimizer features whose behavior changes across executions, should be exercised.


Cons


  • Capturing a production workload, and preparing a copy of production is a huge effort.


  • This only tests the workload that occurs during the capture. If there are different workload windows (e.g. batch vs. online), you must capture the different workloads and restore to match.


  • You cannot touch the SQL or anything else that is part of the capture. If you receive errors (like ORA-4031), you must correct them before continuing.



SPA (SQL Performance Analyzer)
– This product is utilized through the following steps (a rough sketch of the API calls follows the list):



  • A SQL tuning set is created on the source system (this contains the SQL, the plans, and the execution statistics).


  • The SQL tuning set is exported from the old system and imported into the new system.


  • A SPA job is created that will either compare the statistics with the original execution, or compare statistics by executing on both servers
    (through database links).


  • Information on the SQL statements is reported, prioritized by their effect on the workload.
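To make the steps concrete, here is a rough sketch of the packaged calls involved (the tuning set and task names are made up, and the exact options will vary; this is not a full recipe):

-- On the source: create a SQL tuning set and load it from the cursor cache
exec DBMS_SQLTUNE.CREATE_SQLSET(sqlset_name => 'STS_PROD');
exec DBMS_SQLTUNE.CAPTURE_CURSOR_CACHE_SQLSET(sqlset_name => 'STS_PROD', time_limit => 3600, repeat_interval => 300);

-- On the test system, after transporting the tuning set: create and run the SPA task
variable tname varchar2(64)
exec :tname := DBMS_SQLPA.CREATE_ANALYSIS_TASK(sqlset_name => 'STS_PROD', task_name => 'SPA_CAGE_MATCH');
exec DBMS_SQLPA.EXECUTE_ANALYSIS_TASK(task_name => 'SPA_CAGE_MATCH', execution_type => 'CONVERT SQLSET',      execution_name => 'before');
exec DBMS_SQLPA.EXECUTE_ANALYSIS_TASK(task_name => 'SPA_CAGE_MATCH', execution_type => 'TEST EXECUTE',        execution_name => 'after');
exec DBMS_SQLPA.EXECUTE_ANALYSIS_TASK(task_name => 'SPA_CAGE_MATCH', execution_type => 'COMPARE PERFORMANCE', execution_name => 'compare');

-- Pull the comparison report
select DBMS_SQLPA.REPORT_ANALYSIS_TASK('SPA_CAGE_MATCH', 'TEXT', 'TYPICAL', 'SUMMARY') from dual;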


PROS


  • You don’t need to have the database synched up, as long as it is somewhat representative.


  • You can re-execute against the source system to compare executions.


  • A report is created comparing each sql execution


CONS


  • Does not work with GTT (global temporary tables), since they cannot be primed to run this process


  • The database test set most likely does not match source, so you need to interpret most of the data.


  • You do not see the interaction of SQL statements.



Conclusion – DBReplay is the preferred tool to test performance differences when making an infrastructure change. If GTTs are utilized, then DBReplay is the only tool that can really give you useful data. If DBReplay cannot be used, or to supplement DBReplay, SPA is also a useful tool. You can also use SPA to better tune specific SQL (utilize DBReplay to identify the SQL, and SPA to deep-dive it).






Monday, July 11, 2011

DBreplay why are you messing with my sequences ??

Well,
  I have been playing with dbreplay, trying to re-run a production workload. To capture the workload, I don't have the luxury of bouncing the database. I start the capture, then do a restore of production that is "roughly right": a point-in-time recovery close to the capture start.

Well, I've been fighting an issue for the last couple of days. I run a replay, and I find a lot of my sessions getting hung up, and I have sequences whose current values are BEFORE the values they had at the start of the replay. At first I thought the flashback somehow messed up my sequences, but now I know it is the replay.

Part of what the "prepare clients" process does is this SQL step:

 SELECT MAX(FIRST_VALUE), MAX(LAST_VALUE), MIN(FIRST_VALUE), MIN(LAST_VALUE),
        SEQ_BOW, SEQ_NAME, INCREMENT_BY, CYCLE_FLAG, CACHE_SIZE
   FROM WRR$_REPLAY_SEQ_DATA R, DBA_SEQUENCES S
  WHERE R.SEQ_BOW = S.SEQUENCE_OWNER
    AND R.SEQ_NAME = S.SEQUENCE_NAME
  GROUP BY SEQ_NAME, SEQ_BNM, SEQ_BOW, INCREMENT_BY, CYCLE_FLAG, CACHE_SIZE

This step looks to see if the current value of the sequence is out of sync with the value when the capture was done. If so, it resyncs the sequence. Sweet, eh? But maybe not what you wanted. This is what is actually being executed (when you run a trace):

 SELECT "OE"."CHANGE_TRACKING_SEQ".NEXTVAL FROM dual
 ALTER SEQUENCE "OE"."CHANGE_TRACKING_SEQ" MAXVALUE 1E27 MINVALUE -1E26 INCREMENT BY -789390 NOCACHE
 SELECT "OE"."CHANGE_TRACKING_SEQ".NEXTVAL FROM dual
 ALTER SEQUENCE "OE"."CHANGE_TRACKING_SEQ" INCREMENT BY 1 NOCACHE

As you can see, it takes the sequence, alters it to subtract on the next call, executes the NEXTVAL, then resets the increment with another ALTER. Pretty slick, but very hard to find.

Thank goodness for Logminer to help point me in the right direction as to who was altering my sequence.

Here is some additional advice. To check which sequences have been "reset", look at LAST_DDL_TIME in DBA_OBJECTS. Capture the list of what's changed, and then use flashback to get the original DDL.
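Something like this gets you the list of reset sequences (my own query, not part of the replay tooling; plug in your replay start time):

select owner, object_name, last_ddl_time
  from dba_objects
 where object_type = 'SEQUENCE'
   and last_ddl_time > to_date('&replay_start','YYYY-MM-DD HH24:MI:SS')
 order by last_ddl_time;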

My process is now prepare clients, reset sequences, then release the clients.

Search words. database replay sequences out of order reset

Friday, July 8, 2011

Flashback and Sequences

I just ran into a situation using flashback and dbreplay. 

See my next post on DBReplay; this was the culprit.

Saturday, June 18, 2011

Exadata presentations

Well, I'm finally getting my Exadata presentations up on my blog. I will give a synopsis of each so you can decide if you want to look through them. I also included some screenshots that are meaningless without some background.

Hardware  (download)

My conclusion on hardware is that the Exadata is merely off-the-shelf hardware. Yes, putting the hardware together in the Exadata configuration (with storage cells and InfiniBand) does greatly improve performance over most "normal" configurations that use arrays (Hitachi, IBM, EMC, etc.) over Fibre Channel.

You can build your own server/storage that is even faster using SSD over InfiniBand, and you can customize it, balancing the storage and database for YOUR needs.

Software (download)

This is where the solution shines. As I said above, you can build the hardware yourself, but the gain is in the software. In my test case I took a 276GB table with 1.7 billion rows and scanned it in under a second by combining HCC, storage indexes, and flash cache. Pretty incredible (it only used 6GB of disk space too).

However, the more OLTP-like your application is (so it is less likely to be HCC compressed and scanned), the less gain you see with this solution.

Enjoy !

Sunday, June 5, 2011

indexing HCC partitions

I have been playing with what happens when you index HCC compressed data. If you follow the recommended strategy, you will partition your history tables and end up with some data uncompressed, some OLTP compressed, and some in different stages of HCC compression. What if you use an index lookup that spans all these levels of compression?

First I took a table that started at 176g of data (uncompressed).

OLTP compression took it down to 76G

HCC query compression took it down to 6G.
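For anyone who wants to reproduce the test, this is roughly the syntax for the different copies (a sketch with made-up names; I'm not saying which HCC level the original test used, QUERY LOW vs QUERY HIGH):

create table hist_oltp compress for oltp       as select * from hist_uncomp;
create table hist_hcc  compress for query high as select * from hist_uncomp;

select segment_name, round(bytes/1024/1024/1024,1) gb
  from user_segments
 where segment_name in ('HIST_UNCOMP','HIST_OLTP','HIST_HCC');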

Now I indexed the HCC copy. My index (on a single column) ended up being 40G.

Now I queried the data using the index. The first execution was longer than querying the unindexed data (with storage indexes working their magic). The second execution is much, much faster.

The lesson I learned is that HCC really helps with storage, but start indexing the HCC data and you end up using a lot more storage again. Also, unindexed lookups can be faster than indexed ones (until the index blocks are cached in the SGA).

Interesting information to help plan what happens when I go to query across all flavors of compression.

Friday, June 3, 2011

UNYOUG meeting 6/10 In Buffalo.

Yes, it's once again time for my shameless plug for the Upstate NY Oracle Users Group meeting.  This time it's in Buffalo at Buff State.

I will be presenting (again) on Exadata Part II (software).

Val Bedard will be presenting on Golden Gate (Val ROCKS !!)

Cristophe will be presenting on ODI.  Should be a great day.

The information can be found here.

Thursday, June 2, 2011

Cardinality and the in clause

I was running a query that had an IN clause, and the cardinality turned out to be wrong. After investigating, I found that the optimizer was double-applying the selectivity of the index.

Here is my example

select * from system.tab1 a,system.tab2 b
where a.col1 IN ('WRH$_PARAMETER_PK','WRH$_SEG_STAT','WRH$_DB_CACHE_ADVICE')
and col1=col2;

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 54186084

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |     3 |   162 |    14   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                  |         |       |       |            |          |
|   2 |   NESTED LOOPS                 |         |     3 |   162 |    14   (0)| 00:00:01 |
|   3 |    INLIST ITERATOR             |         |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID| TAB1    |     3 |    75 |     7   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN         | TAB1_PK |     3 |       |     4   (0)| 00:00:01 |
|*  6 |    INDEX RANGE SCAN            | TAB2_IX |     1 |       |     2   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID  | TAB2    |     1 |    29 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."COL1"='WRH$_DB_CACHE_ADVICE' OR "A"."COL1"='WRH$_PARAMETER_PK'
              OR "A"."COL1"='WRH$_SEG_STAT')
   6 - access("COL1"="COL2")
       filter("COL2"='WRH$_DB_CACHE_ADVICE' OR "COL2"='WRH$_PARAMETER_PK' OR
              "COL2"='WRH$_SEG_STAT')

23 rows selected.


Notice the cardinality of 3 rows. What made me wonder is what happened when I used just an '=' sign:

select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_DB_CACHE_ADVICE'
and col1=col2;

------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1964798218

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |    26 |  1404 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |    26 |  1404 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TAB1    |     1 |    25 |     2   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | TAB1_PK |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| TAB2    |    26 |   754 |    21   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | TAB2_IX |    26 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"='WRH$_DB_CACHE_ADVICE')
   5 - access("COL2"='WRH$_DB_CACHE_ADVICE')



Notice the cardinality is 26? Why is the optimizer changing the cardinality from 26 to 3 when I include an IN clause with two other values? I would expect the cardinality to be 78. In fact, when I use a UNION it comes back right.

select * from system.tab1 a,system.tab2 b

where col1 ='WRH$_DB_CACHE_ADVICE'
and col1=col2
union
select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_SEG_STAT'
and col1=col2
union
select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_PARAMETER_PK'
and col1=col2;

------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2805776637

---------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                    |    77 |  7014 |    71  (68)| 00:00:01 |
|   1 |  SORT UNIQUE                       |                    |    77 |  7014 |    71  (68)| 00:00:01 |
|   2 |   UNION-ALL                        |                    |       |       |            |          |
|   3 |    NESTED LOOPS                    |                    |    26 |  1404 |    23   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID    | TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN             | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID    | TAB2               |    26 |   754 |    21   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN              | TAB2_IX            |    26 |       |     2   (0)| 00:00:01 |
|   8 |    NESTED LOOPS                    |                    |       |       |            |          |
|   9 |     NESTED LOOPS                   |                    |    51 |  5610 |    46   (0)| 00:00:01 |
|  10 |      VIEW                          | VW_JF_SET$623BBB07 |     2 |   162 |     4   (0)| 00:00:01 |
|  11 |       SORT UNIQUE                  |                    |     2 |    50 |     4  (50)| 00:00:01 |
|  12 |        UNION-ALL                   |                    |       |       |            |          |
|  13 |         TABLE ACCESS BY INDEX ROWID| TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|* 14 |          INDEX UNIQUE SCAN         | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|  15 |         TABLE ACCESS BY INDEX ROWID| TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|* 16 |          INDEX UNIQUE SCAN         | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|* 17 |      INDEX RANGE SCAN              | TAB2_IX            |    26 |       |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS BY INDEX ROWID    | TAB2               |    26 |   754 |    21   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("COL1"='WRH$_PARAMETER_PK')
   7 - access("COL2"='WRH$_PARAMETER_PK')
  14 - access("COL1"='WRH$_DB_CACHE_ADVICE')
  16 - access("COL1"='WRH$_SEG_STAT')
  17 - access("ITEM_1"="COL2")




So I ran a 10053 trace and this is what I found.

1) The table (TAB2) has 1,280,422 rows with 49,840 distinct values.

Table Stats::

Table: TAB2 Alias: B
#Rows: 1280422 #Blks: 5972 AvgRowLen: 29.00 ChainCnt: 0.00
Column (#1): COL2(
AvgLen: 23 NDV: 49840 Nulls: 0 Density: 0.000020
Index Stats::
Index: TAB2_IX Col#: 1
LVLS: 2 #LB: 5787 #DK: 49840 LB/K: 1.00 DB/K: 17.00 CLUF: 884604.00
 
2) Oracle takes this cardinality and computes it for the table access: (1,280,422 / 49,840) * 3 ≈ 77.
 
SINGLE TABLE ACCESS PATH



Single Table Cardinality Estimation for TAB2[B]
Table: TAB2 Alias: B
Card: Original: 1280422.000000 Rounded: 77 Computed: 77.07 Non Adjusted: 77.07
Access Path: TableScan
 
 
 
 
3) It then takes the cardinality of the first table (TAB1), the cardinality of the second table (TAB2), and applies the join selectivity.
 
Join Card: 2.964306 = outer (3.000000) * inner (77.071950) * sel (0.012821)
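Working the numbers by hand (my arithmetic, not from the trace):

  rows per key in TAB2:        1,280,422 / 49,840      ≈  25.7
  inner cardinality (3 keys):  25.7 * 3                ≈  77.07
  join cardinality:            3 * 77.07 * 0.012821    ≈  2.96  (rounded to 3)

  what I expected:             3 keys * 25.7 rows/key  ≈  77

The join selectivity (0.012821) is applied on top of an inner cardinality that already reflects the IN-list filter, which is how the estimate gets knocked back down to roughly 1 row per outer row.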
 
Here is my example and my 10053 trace


Wednesday, May 25, 2011

ORA-27303: additional information: requested interface xxx.xxx.xxx.xxx failed bind. Check output from ifconfig command

Finally on the exadata, and I make a simple mistake in the login script. 

First I log on to the Exadata, and I can start and stop the default database. Everything looks fine. I then go to create my own database and I get this message:

ORA-27504: IPC error creating OSD context

ORA-27300: OS system dependent operation:bind_fail failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvifconf
ORA-27303: additional information: requested interface xx.xx.xxx.xxx failed bind. Check output from ifconfig command

Huh? We are also having issues transferring files to the machine (scp's are stalling), so this looks like a network issue, right? Check output from ifconfig, bind failed for an IP address... all network-related clues.

Well, the issue turned out to be ORA_CRS_HOME being set incorrectly. Once I set it correctly, I was able to start up a new database and start my database duplicate process.

I just wanted to pass this along, and hopefully help someone else who gets this cryptic message.




Tuesday, May 24, 2011

Extended RAC vs Golden Gate


 Every so often this question comes up... "Why don't we do extended RAC ?"

 

 
Well here is some of my answer as to what we need to consider.

 
First, Extended RAC is setting up a RAC cluster across datacenters. Each datacenter has its own independent storage array. Some nodes are placed in one datacenter (A) and some are placed in the other (B). To keep it simple and for redundancy, let's say there are 2 nodes in each datacenter (4 total). These 4 nodes are all part of the same cluster, and they all share the same global cache. Disk writes are mirrored across datacenters. This means that all writes are synchronous, and each write must be acknowledged on both arrays (in both datacenters). Dark fiber is a must to accomplish this.

  
Issues? There are a number of concerns this brings up.

 

 
1) The global cache is shared across the datacenters. Any latency is felt by any processing that requires sharing data between nodes in opposite datacenters. This can cause some performance degradation.

 
2) Writes are synchronous across datacenters. This can increase the write time for any disk writes.

 
3) You are not protected against logical corruption, upgrade outages etc.

 
Best practice is to also have a physical standby for HA and to allow rolling patching and upgrades (transient logical standby, etc.).

 

 
In the end you have twice as much equipment, twice as much storage, and a much more complex system than just keeping a physical standby database. It is even recommended that your quorum be kept in a third datacenter.

 
The cost doubles, and the complexity doubles. The question you need to ask is.. is it worth all this ?

 
Another option is to utilize GoldenGate. GoldenGate gives you similar flexibility by going active-active across datacenters. You still need to have a physical standby, but the advantage of GG over Extended RAC is that the physical does not have to be the same class of server or storage ($$). The whole point of the physical is to keep the changes so that they can be sent to the other active cluster. The standby cluster never becomes a client-available primary.

 

 
I would summarize the 2 choices as follows:

 

 
Extended RAC

 
PROS
  • Guaranteed synchronous write across datacenters.

 CONS

 
  • Less availability for any database changes requiring shutdown (parameter or patching).
  • Greater latency when data is sent between datacenters.
  • BCP requires full size footprint
 
Golden Gate

 

 
PROS

  •  Database available for any database changes requiring downtime.
  •  Application releases can be applied in a rolling fashion
  •  Smaller footprint for BCP servers
  •  Less latency for reads and commits (though there is latency before data is visible on the other side)

 
CONS

 
  • Database updates are asynchronous. There will be a delay before an update is visible in the other datacenter (< 2 seconds).

 

 
Neither one of these will guarantee 99.999% uptime, but GG comes a lot closer. With Extended RAC there is more planned downtime because it is a single database.
As you can see, the decision should be driven by how important it is for the application to have the data immediately available for read across datacenters. If you can tolerate the < 2 second latency, GG is a better product. If the application can't tolerate the latency, Extended RAC is the only viable solution.

Oracle and networking latency with small packets

Recently I have run into some performance issues, and trying to track down the cause has been a challenge. I wanted to share a valuable lesson that I learned.

It started like any other performance issue. The phone rings, and the database is slow. Why do they think the database is slow? Because the app isn't very busy, and the processing isn't moving along at the expected pace. This is a new app, and they are in beta. The process goes along as it always does. Check the database for any bottlenecks or performance issues. The database all looks good. The queries are all executing in milliseconds on the new hardware. The application looks good. The process executes in milliseconds, then calls out to the database for the next query. Everything looks good: the database is fast, the application is fast. The issue goes to the networking group, and the network group concludes that we have a nice 1ge link between the 2 servers. The network isn't breaking a sweat. The issue continues.

So what was the issue? It was the number of hops, and the distance of those hops, between the application and the database server. It turns out that this is a chatty app, and it sends a lot of small packets back and forth between the database server and the app server.
To make matters worse, the application was just uplifted to new hardware for both the database, and the application.  This uplift caused the applications to be less of a bottleneck, pushing the bottleneck to the network.

The issue turned out to be network latency. Not a lot of latency (just a couple of milliseconds), but enough to be noticeable for a very chatty application. It becomes more and more noticeable as servers become faster. Now that queries run in fractions of a millisecond, the network is popping up as more of a bottleneck.

The lesson I want to pass on is that I would highly suggest you measure the network latency and know how much of an impact it has, especially if you are going across datacenters, or across many networks in the same datacenter.

My test was a simple one..

1) Create a script full of "select 'x' from dual;" statements.

2) Put a !date call at the beginning and end of the script.

3) Run this on a client on your network and compare the difference between the begin and the end time.

4) Run this for multiple scenarios. I even ran it over IPC to find out what the network overhead is. (A sketch of the script follows.)
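A minimal sketch of the kind of script I used (the file name and the 10,000 round-trip count are arbitrary; each statement is one network round trip):

-- build a throwaway script of 10,000 single-row selects
set heading off pagesize 0 feedback off termout off
spool latency_test.sql
select 'select ''x'' from dual;' from dual connect by level <= 10000;
spool off
set termout on

-- time it
host date
set termout off
@latency_test.sql
set termout on
host date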


Knowing the expected latency as you go across your datacenter(s) is useful to find out where that missing time is going. So often it is blamed on the DBA (database), and doing this kind of check lets you show what isn't the database.

Thursday, April 28, 2011

Cardinality Feedback recommendations

Here are the recommendations I have for working nicely with cardinality feedback.

1) Make sure your statistics are accurate.

2) Try to use histograms where necessary for cardinality.

3) Avoid functions on columns or anything that forces oracle to estimate cardinality.

4) Test a query more than once.  Cardinality feedback doesn't show up until the second execution

5) Don't necessarily trust "explain plan for" or any of the tools that give you the plan based on the first execution. The first plan may change.

6) Use function-based indexes on expressions (like the decode expression in my previous post), and gather statistics on the function-based index (a sketch follows this list).

7) Create a histogram for indexes with composite keys, especially when the cardinality isn't as expected. The perfect example of this is a composite index on the columns "Birth Month" and "Astrological Sign". There are only so many valid combinations. Without a histogram, Oracle assumes the columns are independent (effectively a Cartesian product of the distinct values) and estimates cardinality based on that.
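A rough sketch of items 6 and 7 (table and column names are made up; the column-group syntax in the second part is my own 11g suggestion for the composite-key case, not something from the list above):

-- item 6: index the expression, then gather stats so the hidden expression column gets statistics
create index people_decode_ix on people (decode(owner_col, null, 1, 2));
exec dbms_stats.gather_table_stats(user, 'PEOPLE', method_opt => 'FOR ALL HIDDEN COLUMNS SIZE AUTO', cascade => TRUE);

-- item 7: a column group with a histogram for the correlated columns
select dbms_stats.create_extended_stats(user, 'PEOPLE', '(birth_month, astro_sign)') from dual;
exec dbms_stats.gather_table_stats(user, 'PEOPLE', method_opt => 'FOR COLUMNS (birth_month, astro_sign) SIZE 254');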

I think it all comes down to giving Oracle as much information as possible, and then making sure you have tested the query a couple of times.

Often the development folks don't realize these things.  They find "cheats" that only work the first time.  They often rely on the explain plan for the first execution.

It is up to us in the DBA realm to educate developers on how things have changed with this feature.

Wednesday, April 27, 2011

Cardinality Feedback Example

I have been fighting cardinality feedback issues all week, and I wanted to document an example of cardinality feedback in action.

First, let's build my test case; it can be found here

drop user testcard cascade;
create user testcard identified by testcard;
grant dba to testcard;
create TABLE testcard.object_sample1 as select * from dba_objects where 1=2;
create TABLE testcard.object_sample2 as select * from dba_objects where 1=2;

Insert into testcard.object_sample1 (select * from dba_objects where owner in ('SYS','TESTCARD') );
Insert into testcard.object_sample1 (select * from testcard.object_sample1 );
Insert into testcard.object_sample1 (select * from testcard.object_sample1 );
Insert into testcard.object_sample2 (select * from dba_objects where owner in ('SYS','TESTCARD') );

create index testcard.idx1_object_sample2 on testcard.object_sample2(owner);
create index testcard.idx1_object_sample1 on testcard.object_sample1(owner);

EXECUTE DBMS_STATS.GATHER_TABLE_STATS( ownname => 'TESTCARD', tabname => 'OBJECT_SAMPLE1', method_opt => 'FOR ALL COLUMNS SIZE 1',granularity => 'ALL',cascade => TRUE,degree => DBMS_STATS.DEFAULT_DEGREE);

EXECUTE DBMS_STATS.GATHER_TABLE_STATS( ownname => 'TESTCARD', tabname => 'OBJECT_SAMPLE2', method_opt => 'FOR ALL COLUMNS SIZE 1',granularity => 'ALL',cascade => TRUE,degree => DBMS_STATS.DEFAULT_DEGREE);


Now let's take a look at what we have:

select owner,count(1) from testcard.object_sample1 group by owner;

OWNER                            COUNT(1)
------------------------------ ----------
TESTCARD                                8
SYS                                126620

select owner,count(1) from testcard.object_sample2 group by owner;

OWNER                            COUNT(1)
------------------------------ ----------
TESTCARD                                2
SYS                                 31655


Now here is the query

select * from testcard.object_sample1 a,
testcard.object_sample2 b
where a.owner='SYS' and a.owner is not null and a.object_type='SSSSS'
and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner
 
Notice the "is not null" clause, and notice the cardinality. Finally, notice I did not create a histogram for the columns. If you run this query for an owner of 'TESTCARD', it should return 2 rows, and if you run it for 'SYS' it should scan 126,000+ rows.
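Since the stats above were gathered with SIZE 1, there should be no histograms; a quick way to confirm (my own check, not part of the test case):

select column_name, num_distinct, histogram
  from dba_tab_col_statistics
 where owner = 'TESTCARD'
   and table_name = 'OBJECT_SAMPLE1';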
 
Well, the developers are running it for 'TESTCARD', and it is doing a full table scan. They want to use the index on OWNER. They find a cheat and rewrite the query. This fools the optimizer into thinking the index is a better bet since a function is used (the decode function).
 
Below is the query with the "is not null" and the "decode" and you can see the 2 different paths.
 

"A.OWNER IS NOT NULL"
 
EXPLAIN PLAN
SET STATEMENT_ID = 'abc' FOR
select * from testcard.object_sample1 a,
testcard.object_sample2 b
where a.owner='SYS' and a.owner is not null and a.object_type='SSSSS'
and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner;
set pagesize 25
set linesize 121
set long 1000000
col xplan format a100
SELECT dbms_xplan.display_plan(statement_id => 'abc') AS XPLAN
FROM dual;
 


 Plan Hash Value  : 3876965040

-------------------------------------------------------------------------------------
| Id  | Operation            | Name           | Rows   | Bytes    | Cost | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                | 294519 | 56547648 |  650 | 00:00:08 |
| * 1 |   HASH JOIN          |                | 294519 | 56547648 |  650 | 00:00:08 |
| * 2 |    TABLE ACCESS FULL | OBJECT_SAMPLE2 |    384 |    36864 |  136 | 00:00:02 |
| * 3 |    TABLE ACCESS FULL | OBJECT_SAMPLE1 |   1535 |   147360 |  513 | 00:00:07 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
* 2 - filter("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS')
* 3 - filter("A"."OBJECT_TYPE"='SSSSS' AND "A"."OWNER"='SYS')



"DECODE(A.OWNER,NULL,1,2)=2"
EXPLAIN PLAN

SET STATEMENT_ID = 'cde' FOR
select * from testcard.object_sample1 a,
testcard.object_sample2 b
where a.owner='SYS' and decode(a.owner,null,1,2)=2 and a.object_type='SSSSS'
and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner ;
set pagesize 25
set linesize 121
set long 1000000
col xplan format a100
SELECT dbms_xplan.display_plan(statement_id => 'cde') AS XPLAN
FROM dual;



----------------------------------------------------------------------------------------------------
 Plan Hash Value  : 2829654189

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                | Rows | Bytes  | Cost | Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                     | 2945 | 565440 |  279 | 00:00:04 |
| * 1 |   HASH JOIN                    |                     | 2945 | 565440 |  279 | 00:00:04 |
| * 2 |    TABLE ACCESS BY INDEX ROWID | OBJECT_SAMPLE1      |   15 |   1440 |  142 | 00:00:02 |
| * 3 |     INDEX RANGE SCAN           | IDX1_OBJECT_SAMPLE1 |  629 |        |  133 | 00:00:02 |
| * 4 |    TABLE ACCESS FULL           | OBJECT_SAMPLE2      |  384 |  36864 |  136 | 00:00:02 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
* 2 - filter("A"."OBJECT_TYPE"='SSSSS')
* 3 - access("A"."OWNER"='SYS')
* 3 - filter(DECODE("A"."OWNER",NULL,1,2)=2)
* 4 - filter("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS')


So as you can see, the decode plan is a better plan for an owner of 'TESTCARD'. The development folks then rewrite their query with this "cheat", thinking they beat the optimizer...


But cardinality feedback kicks in, when the query is executed with an owner of 'SYS'.

Below is the output of display_cursor. Notice the 2 plans, and notice that cardinality feedback was used to change the plan from an index lookup back to a FTS.


select * from testcard.object_sample1 a,testcard.object_sample2 b where a.owner='SYS' and decode(a.owner,null,1,2)=2 and a.object_type='SSSSS' and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner;


PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
SQL_ID  91ugxuy8jc3g4, child number 0
-------------------------------------
 select * from testcard.object_sample1 a,
testcard.object_sample2 b  where a.owner='SYS' and
decode(a.owner,null,1,2)=2  and a.object_type='SSSSS'       and
b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner

Plan hash value: 2829654189

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |       |       |   279 (100)|          |
|*  1 |  HASH JOIN                   |                     |  2945 |   552K|   279   (1)| 00:00:04 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| OBJECT_SAMPLE1      |    15 |  1440 |   142   (1)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | IDX1_OBJECT_SAMPLE1 |   629 |       |   133   (1)| 00:00:02 |
|*  4 |   TABLE ACCESS FULL          | OBJECT_SAMPLE2      |   384 | 36864 |   136   (0)| 00:00:02 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
   1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
   2 - filter("A"."OBJECT_TYPE"='SSSSS')
   3 - access("A"."OWNER"='SYS')
       filter(DECODE("A"."OWNER",NULL,1,2)=2)
   4 - filter(("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS'))






select * from testcard.object_sample1 a,testcard.object_sample2 b where a.owner='SYS' and decode(a.owner,null,1,2)=2 and a.object_type='SSSSS' and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner;

select * from table(dbms_xplan.display_cursor('07prx7pkch6tr',null,'typical +peeked_binds'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
SQL_ID  91ugxuy8jc3g4, child number 0
-------------------------------------
 select * from testcard.object_sample1 a,
testcard.object_sample2 b  where a.owner='SYS' and
decode(a.owner,null,1,2)=2  and a.object_type='SSSSS'       and
b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner

Plan hash value: 2829654189

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |       |       |   279 (100)|          |
|*  1 |  HASH JOIN                   |                     |  2945 |   552K|   279   (1)| 00:00:04 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| OBJECT_SAMPLE1      |    15 |  1440 |   142   (1)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | IDX1_OBJECT_SAMPLE1 |   629 |       |   133   (1)| 00:00:02 |
|*  4 |   TABLE ACCESS FULL          | OBJECT_SAMPLE2      |   384 | 36864 |   136   (0)| 00:00:02 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
   1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
   2 - filter("A"."OBJECT_TYPE"='SSSSS')
   3 - access("A"."OWNER"='SYS')
       filter(DECODE("A"."OWNER",NULL,1,2)=2)
   4 - filter(("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS'))

SQL_ID  91ugxuy8jc3g4, child number 1
-------------------------------------
 select * from testcard.object_sample1 a,
testcard.object_sample2 b  where a.owner='SYS' and
decode(a.owner,null,1,2)=2  and a.object_type='SSSSS'       and
b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner

Plan hash value: 1738482415

-------------------------------------------------------------------------------------
| Id  | Operation          | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                |       |       |   650 (100)|          |
|*  1 |  HASH JOIN         |                |  2945 |   552K|   650   (1)| 00:00:08 |
|*  2 |   TABLE ACCESS FULL| OBJECT_SAMPLE1 |    15 |  1440 |   513   (1)| 00:00:07 |
|*  3 |   TABLE ACCESS FULL| OBJECT_SAMPLE2 |   384 | 36864 |   136   (0)| 00:00:02 |

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND
              "A"."OWNER"="B"."OWNER")
   2 - filter(("A"."OBJECT_TYPE"='SSSSS' AND "A"."OWNER"='SYS' AND
              DECODE("A"."OWNER",NULL,1,2)=2))
   3 - filter(("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS'))

Note
-----
   - cardinality feedback used for this statement


This should be enough to play with and see how cardinality feedback can kick in to dynamically change the plan on you.

NOTE: I've played further, and this is what I could find. Anyone correct me if any of these are wrong.

1) The v$sqlstats view only showed one plan_hash_value even though the plan changed.


2) The plan didn't change immediately if I executed one run right after another. When I put in a sleep, the plan changed (there must be some sort of delay before cardinality feedback kicks in).

3) Once locked into the new plan, it stays with it.

4) Adaptive Cursor sharing only kicks in if a histogram is used, or if the bind values fall outside the high/low values.

5) Oracle avoids using plans that are suspect (i.e. in my example with decode).

6) Oracle tends to use cardinality feedback more when the plan is "suspect" because of some sort of rough estimation.

7) Using dynamic sampling causes Oracle to use cardinality feedback more.



Cardinality feedback seems to be more of a weighting thing. Oracle tries to look at the "situation", and the less reliable the estimate is likely to be, the more likely it is to use cardinality feedback to choose another plan.






Friday, April 22, 2011

cardinality feedback

My big challenge this week was an 11g database that wouldn't behave. I'm sure you've been there: getting frustrated by a database that doesn't seem to perform right.

I can tell you that challenge has gotten even bigger in 11g. A new feature was added called "cardinality feedback". The idea is simple: Oracle comes up with a plan, including cardinality for each step, and then executes the plan. If, after executing the plan, the optimizer believes that the actual cardinality is out of line with the estimated cardinality, the optimizer will pick a new plan.

Sounds like a great feature, right? But what if you have issues with histograms, and your actual cardinality is lopsided? Oracle may choose a plan, run against the lopsided data, and decide the plan is no good. Oracle morphs the plan for you. No charge, no control. It gets frustrating.

The only quick fix I've found is the COE_PROFILE script that is part of the SQLT tuning kit available from MOS.

There is a quick way to test this behavior (if you think it's occurring): use this undocumented parameter.

-- Turn on cardinality feedback

alter session set "_optimizer_use_feedback" = true;

-- Turn off (disable) cardinality feedback
alter session set "_optimizer_use_feedback" = false;

Tuesday, April 19, 2011

HCC compression

In my free time this week I have been playing on the Exadata. The first thing I wanted to determine was the compression ratios for my data, and the select performance.

One of the first observations I made was how HCC saves you on selects. I took a 158G table from another system and loaded it up in HCC query mode (I loaded it direct). The table is now 6G in size. WOW, but that didn't impress me as much as my select.

First I selected from the table with storage indexes off. 2.02 seconds. 2.02 seconds? 1.7 billion rows that was 158G of data. Wow.

Next I selected from the table with storage indexes on. It takes 9.02 seconds. The extra 7 seconds was the Exadata building the storage indexes. Then I selected again: 0.30 seconds. No database indexes.
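One way to see how much the storage indexes saved for a given session (my own check; the statistic names assume 11.2 on Exadata):

select n.name, round(s.value/1024/1024/1024,2) gb
  from v$statname n, v$mystat s
 where n.statistic# = s.statistic#
   and n.name in ('cell physical IO bytes eligible for predicate offload',
                  'cell physical IO interconnect bytes returned by smart scan',
                  'cell physical IO bytes saved by storage index');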

I am going to do some more testing, but this certainly is impressive!!

Saturday, April 16, 2011

pin x wait on s wait event

Chances are you've stumbled onto my website because you are searching for the keywords "pin x wait on s" "wait" "event" and "parsing".

Well, I have been fighting a fire for the last 2 weeks in our performance environment on just these "keywords".

We have been doing our testing by sending through a set workload with very similar processing needs. This workload is date sensitive, so it has been a challenge to keep moving dates and retesting. Recently, we have begun utilizing "flashback database" for our testing. This has great promise for getting consistent testing done!
You run through your test case scenario and capture your performance data. You flash back to the start point, make the change, then run through the exact same scenario. Pretty cool, huh?

Well, all this was going as planned. We ran our processing, saved AWR data, and reran. Suddenly we started seeing these "pin X wait on S" events holding up processing for minutes at a time. Where did these come from? Was it the release?? We had to dig in and find out.

Well here is some background.
  • Our queries are very, very complex.  They often take ~ 1000ms to parse (1 second), and 4ms to execute.
  • flashback will flush your cache, kind of obvious, but pertinent to my situation.
  • Our processing is very structured, and sends multiple processing streams through the same steps in the same order
So what was happening? Well, it turns out that this is caused by hard parsing multiple copies of the same SQL (probably 50+ in our case) simultaneously. This wait event is Oracle parsing the first SQL statement and making all the others wait. This parse queuing doesn't do well with SQL of this complexity, and it throws off the parsing, causing these wait events.
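If you want to see who is holding everyone up while it is happening, a quick look at v$session helps (the full event name in 11g is "cursor: pin S wait on X"; this is my own diagnostic query):

select sid, serial#, sql_id, blocking_session, seconds_in_wait
  from v$session
 where event = 'cursor: pin S wait on X';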

Just wanted to pass this on: if you see this event and you are doing similar testing, look at the complexity of the queries, and consider simultaneous hard parsing of the same SQL as the culprit.


Friday, April 8, 2011

Exadata class

This week I attended the "Exadata and Database Machine Administration" class. It was a good class, but they usually are from OU.

This class covers a lot of material in 3 days, and it is a good intro to what it's going to be like managing an Exadata. Take heed of the prereqs. They aren't mandatory, but knowing about RAC and Grid is necessary to understand how Exadata implements those pieces.

For those who might attend, my advice is to do more with dbcontrol. The class doesn't say much about dbcontrol or grid control.

Well, to start at the beginning, the class is in a virtual environment. Each student has their own virtual 1/4 rack with 1 CPU per node. This works out well because everyone has their own isolated environment. This environment has a database with the default SH schema.

In order to get the most out of the class
1) I changed the database passwords for sys, system, dbsnmp, and sysman.
2) I recreated the password file.
3) I installed dbcontrol (dbca)
4) I set up ssh keys for celladmin and cellmonitor to go to each cell from the oracle account. Be aware that to set up ssh for cellmonitor, copy the keys as root and change permissions. The cellmonitor account has very few privileges.
5) I created my own account in asm and granted sysasm to it.

After doing all this I was able to start up dbcontrol and connect to my virtual environment. This gave me a great view into how the disks were configured, and I was able to add the Exadata cells to dbcontrol.

I think having dbcontrol set up made the class much more useful so I could visually see all the items newly available on the exadata.

Tuesday, April 5, 2011

Hadoop

I know after all this Exadata posting, you are probably wondering why my blood doesn't run black and red.

Well, I've been hearing about Hadoop for years, and I think it's finally about time that I set up a Hadoop environment.

If you have never heard of Hadoop, here is a starting point.

There are also 2 very interesting related projects:

Cassandra - this allows for HA through clustering your hadoop database with no single point of failure
Hive          - This allows for data warehousing with the building of materialized views.

The thing that grabbed my attention is the ability to handle large amounts of log data. Like most major companies, we produce unwieldy amounts of log files from our application server farms with no way to sift through them. It seems like a lot of people are utilizing Hadoop to organize and search log files.

Interesting subject that I will be pursuing further. If anyone has any experience, or words of wisdom on this pass it on.

Sunday, April 3, 2011

Configuring an Exadata (lessons learned)

Well, the time is finally here. 

Twas the night before Exadata,
and all through the data warehouse,
Not a keyboard was stirring,
Not even a mouse

ACS was all snug in their hotel beds,
with visions of "ONE" scripts danced in their heads.
And DBA's with laptops, and I with my smart phone,
had just settled in to see the hype all get blown.

When out on the server room there arose such a clatter,

I sprang from the bed to see what was the matter.
Away to the laptop I flew like a flash,
Tore open the lid and started up bash.

The nodes on the cabinet of  the new server appliance
Gave the heat of mid-day to everything near the servers in the alliance.
When, what to my wondering eyes should appear,
But a man with a sweater drinking a beer.
he was a sailing pro with the magic of a fairy,

I knew in a moment it must be Larry.

More rapid than eagles his crew they came,
And he whistled, and shouted, and called them by name!

Now Dan Norris! now, Kerry Osborne! now, .....

Well you get the picture..

Anyway, here are my lessons learned. These are mostly the things that would have helped us get from purchase to ACS coming on site a little faster.

1) IP addresses. Yes, the Exadata needs lots of IP addresses. Here is what you need for an X2-2 full rack:


Ethernet Subnet 1 IP addresses: 51
  ILOM for Database Servers: 8
  ILOM for Exadata Cells: 14
  Eth0 for Database Servers: 8
  Eth0 for Exadata Cells: 14
  Mgmt port for IB switches: 3
  IP address for KVM: 1
  IP address for Ethernet Switch: 1
  IP address for PDUs: 2

Ethernet Subnet 2 IP addresses: 19
  Eth1 for Database Servers: 8
  VIPs for Database Servers: 8
  SCAN Addresses (per Cluster): 3

Total: 70

2) Naming - The naming convention for the server name is used for all the components within the Exadata. Even the disks themselves include the name of the server so you can track down any issues. That's not saying that your standard host name isn't usable; it just means that you give Oracle 1 name, and it is the building block for everything else.

3) Default database - Surprisingly, the default database is probably going to be useless to you (unless you are going to use the same configuration as Oracle provides). Oracle creates a default database with default parameters and a default character set. Anything other than that, you are on your own.

4) Backup. If you've read my previous posts, you have probably found that this is the most confusing area. Each database server comes with 4 1ge ports and 2 10ge ports. If you are using 10ge you are all set: bond and aggregate, and you are in business. For InfiniBand you just use the 2 InfiniBand ports.
If you are still on 1ge like a lot of datacenters, you can't easily make it redundant. Eth0 is reserved for management. Eth1 and eth2 are usually bonded for active-passive redundancy. This leaves one port for your tape backup. No redundancy. This is one of the most important things if you are planning on using 1ge. Make sure you understand how you are going to configure the Exadata, and that you most likely will not have redundancy.

Finally, I'll pass on two of my favorite comments to make on all this.

"Buying an exadata is like putting a window airconditioner in your 1920's house" -- This is from a time when I had a house with original wiring.. My wife would try to blow dry her air with the airconditioner on, and the fuse would blow.. If you put an exadata in your datacenter running 1ge , you will blow a fuse.

"Buying an Exadata doesn't make things easier.. It is easier to be told we can't afford it, it's more difficult if they buy it.  It's like telling your mangement that you need a ferarri to go fast, and they say yes.. now drive it 320MPH without crashing." -This should be pretty self explainitory.  There are strong expectations from managment when you buy one of these.








Friday, April 1, 2011

New Free oracle products are out

Oracle XE 11g beta just came out today, SQL Developer 3.0 came out, and Oracle SQL Developer Data Modeler came out a couple of months ago. All free.




Wednesday, March 30, 2011

Configuring an Exadata (part III)

Well, the time has come to finally get the Exadata configured. We are coming down to the end, and we are still figuring out the network connections. The problem is the lack of 1ge ports.

The Exadata comes with 4 1ge ports, 1 of which is reserved for management (patching, monitoring, etc.). It also comes with 2 10ge ports. This is where the fun begins. Our standard is to bond 2 ports together for public (active-passive) for redundancy, then bond 2 ports for backup, aggregating them to get 2ge speed and have redundancy for the backup. How do we do this with 3 ports? This leaves us with 3 choices.

1) Take the 2 10ge ports and funnel them down to 2 1ge ports. Bond and aggregate these 2 ports together, and we have our 2 TAN ports. We would be non-standard, and the only ones doing this as far as I know.

2) Disable the management services, and utilize the other 2 1ge ports for TAN. This means 2 1ge ports bonded for public, and 2 ports for TAN bonded and aggregated. Again non-standard.

3) Utilize 2 1ge ports bonded for public, 1 management port, and only 1 TAN port. This would be standard, but the least desirable.

In looking at the documentation, it states

When connecting the media servers to the Database Machine through Ethernet, connect the eth3 interfaces from each database server directly into the data center network. For high availability, multiple network interfaces on the database servers and multiple network interfaces on the media server can be bonded together. In this configuration, configure the eth3 interface as the preferred or primary interface and configure eth2 as the redundant interface.


If throughput is of a concern then connect both eth2 and eth3 interfaces from each database server directly into the data center’s redundant network. The two interfaces can then be bonded together in a redundant and aggregated way to provide increased throughput and redundancy.
But this certainly doesn't explain what it means to bond eth2 and eth3. Is Oracle suggesting not bonding public and utilizing 2 of the 3 available ports for TAN, or are they suggesting backing up over the LAN?

In any case this whole network configuration of the Exadata has been very confusing.





Thursday, March 24, 2011

Duplicating an ODI interface module

Here I am, day 4 of my ODI class, and I am on my quest to copy all the WRH$ performance data to a central repository. I think after this day in class I have all the tools to create jobs to do this.

Of course, being a curious, recycling-conscious individual, I tried to reuse some of the code. Specifically, I tried to export an interface to an XML file, do a replace-all of the table name to the next table, then import the interface with the new name!! Everything looked good with the mapping, until I looked at the name of the primary key. It still had the primary key name from the original interface. This means that there must be some "hooks" from the interface XML document to other related objects in the database. Oh well.

It looks like for now I will be creating interfaces for the objects I need to pull into my repository.

I have been very impressed with the flexibility of the product, and the way I can easily reuse it to add another source system. Since I'm going to be pulling from 15+ sources, flexibility is important.

I'm also going to be using APEX as the front end of all this data. With some simple tools like ODI and APEX, a DBA type can do some serious reporting!

Wednesday, March 23, 2011

ADG with ODI and Exadata

Recently I've been taking a class on ODI. It is really a very interesting ELT tool (notice I didn't say ETL). I am planning on using it to take data from my ADG (Active Data Guard) copy of production to another server. Perfect, right? Pull from a read-only copy of an Oracle database to another database.

I picked the Oracle-to-Oracle LKM (Load Knowledge Module). Unfortunately, the current knowledge module creates a view on the source. As you can imagine, with ADG this is impossible. The only way to get ODI working against ADG is to create your own knowledge module, so I've been spending my evenings creating my very own. I am hoping this can help others who are running into the same issue.

First, this is a great site explaining HOW to create your very own knowledge module:
http://www.oracle.com/technetwork/articles/bethke-odi-090881.html
This is a great site to find all the syntax you need.
http://gerardnico.com/doc/odi/webhelp/en/index.htm#ref_api/

Finally, these are the steps I took to make my own knowledge module.

1) Copy the Oracle to Oracle (DBLINK) module.
2) Give it a new name, like Oracle to Oracle (ADG).
3) Remove the following steps:

- 70 create view/table on source
- 80 Create temp indexes on source
- 150 drop view on source
- 160 Drop temp indexes

4) Change the "drop synonym on target" step to "drop view on target":

drop synonym <%=odiRef.getTable("L", "COLL_NAME", "W")%>

becomes

drop view <%=odiRef.getTable("L", "COLL_NAME", "W")%>


5) Last change. "create synonym on target" becomes "create view on target"

create synonym <%=odiRef.getTable("L", "COLL_NAME", "W")%>
for <%=odiRef.getTable("R", "COLL_NAME", "W")%>

becomes

<% if ((odiRef.getOption("ENABLE_EDITION_SUPPORT")).equals("0")) { %>
create or replace view <%=odiRef.getTable("L", "COLL_NAME", "W")%>
(
<%=odiRef.getColList("", "[CX_COL_NAME]", ",\n\t", "", "")%>
)
as select <%=odiRef.getPop("DISTINCT_ROWS")%>
<%=odiRef.getColList("", "[COL_NAME]", ",\n\t", "", "")%>
from <%=odiRef.getSrcTablesList("", "[SCHEMA].[TABLE_NAME]@", ", ", "")%><%=odiRef.getInfo("SRC_DSERV_NAME")%>
where (1=1)
<%=odiRef.getFilter()%>
<%=odiRef.getJrnFilter()%>
<%=odiRef.getJoin()%>
<%=odiRef.getGrpBy()%>
<%=odiRef.getHaving()%>
<% } else { %>
create table <%=odiRef.getTable("L", "COLL_NAME", "W")%> as
select <%=odiRef.getPop("DISTINCT_ROWS")%>
<%=odiRef.getColList("", "[COL_NAME]\t[CX_COL_NAME]", ",\n\t", "", "")%>
from <%=odiRef.getSrcTablesList("", "[SCHEMA].[TABLE_NAME]@", ", ", "")%><%=odiRef.getInfo("SRC_DSERV_NAME")%>
where (1=1)
<%=odiRef.getFilter()%>
<%=odiRef.getJrnFilter()%>
<%=odiRef.getJoin()%>
<%=odiRef.getGrpBy()%>
<%=odiRef.getHaving()%>
<%}%>



As you can see the idea is to remove any updates to the source, and switch the synonym in the target to a view pointing at the source.

And some advice: if you are using the simulation button to test the new knowledge module, the "getInfo" commands only contain data at runtime. The simulation will show nothing, and you will only see data when you actually execute (lost about an hour on that one).

Enjoy.. I am posting my actual XML knowledge module here.

The usual disclaimers here: test well. I also want to point out that I only changed the Oracle-to-Oracle knowledge module. If you are going from Oracle to Netezza, for example, you need to make the appropriate changes to that knowledge module.

I am including another article I found on a knowledge module that doesn't create the link:
http://www.business-intelligence-quotient.com/?p=1187

Tuesday, March 22, 2011

My quest to consolidate AWR data

I am still embarking on my quest to consolidate all the AWR data from all the databases into a central performance database.

My first thought, to use a simple CDC tool (GoldenGate), failed. GoldenGate will not replicate SYS objects. Boo.

I am in class for ODI this week, so my current plan is to use ODI to replicate the data from all my sources to a single target.

So far so good, and I will update on how things go with my quest to consolidate reporting data.

If this goes well with ODI, I will use it to also consolidate tablespace sizing data, etc. from all my databases. Wahoo!

Sunday, March 20, 2011

Large SGAs and what they mean for the future of databases

Well, first off, I don't have the answer to this one, just some musings.

I've noticed that memory on servers has gotten bigger lately, or cheaper, depending on how you look at it. Case in point is the X2-8 Exadata: 1 TB of memory per database node. Then you add the new 11gR2 ability to parallelize across nodes without having to keep passing blocks, and you have close to 2 TB of available memory.

So what does this mean for a database? What does this mean for disk I/O? What is a database doing if the blocks are all in memory? Essentially you are writing out changed blocks and logging; that's it.

So what do you need a big disk array for?

Then, with all the really awesome IP-based disk arrays out there (like the Isilon), what is the future of disk? Like many companies, we are still running 4Gb fibre for all our servers, connecting to a SAN array. Should we go to 8Gb fibre or 10Gb IP?

I would be interested in opinions on what people see as the future of disk: IP, fibre, or FCoE? How important is the speed of a disk array going to be? Just put your redo logs on SSD (or flash cache?).

Update :

I just saw that Arup Nanda posted some writing on this topic. You can read it here. He basically said that because of consistent read and other mechanisms, you might find that your database blocks are in the cache multiple times, utilizing much more of your buffer cache than you probably realize.
He recommends using a specialized in-memory database (like TimesTen) to make sure everything is in memory.
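
If you want to see this effect on your own system, here is a rough sketch of a query against V$BH that counts how many buffer copies exist for each block (the extra copies are typically consistent-read clones):

select objd, file#, block#, count(*) copies
  from v$bh
 group by objd, file#, block#
having count(*) > 1
 order by copies desc;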

Sunday, March 13, 2011

Social media and being an oracle professional

First I have to admit I'm old. I went to college at a time when card decks were still being used. Those were simpler times. You would submit your program on a deck of punch cards, go out for the night, and get the output the next day. There were numerous advantages to this.

The professors didn't expect you to code and debug through the night. It just wasn't possible.

You couldn't be sloppy. If you didn't syntax check and debug your code in your head, you would get lots of output back that required rework. Rework with a 24-hour turnaround time gets old fast.

There were very few resources and very few people to turn to.

Times have changed since then. There are a lot of places to find great information. Here are just a few.

LinkedIn groups. There is a great Exadata group.

Blogs. Most people who are famous in our industry blog

Twitter - Follow the people you like on twitter.

All this social media really helps to keep up to date on what is happening. You don't have to feel all alone in figuring out whether huge pages are worth it, or how to deal with the problems of a new feature.

I have turned to these media a couple of times lately and been amazed that some of the top people have answered my questions. Most recently, Tanel Poder answered my question on huge pages, and tweeted that he is happy to answer any question I tweet him. Here is his answer.

Wednesday, March 9, 2011

Configuration of an Exadata Part II

After seeing how popular Configuration of an Exadata Part I was, I figured I would follow up with Part II.

First off, there are a couple of things I'd like to point out. Along with an Exadata waiting to be configured, I also have a Netezza TwinFin waiting to be configured. Many of the reasons it is taking so long aren't specifically because it is an Exadata.

We did do a POC on an Exadata, and Oracle was able to bring it in quickly. It took a day or so to fill out the configuration sheet, and Oracle had the machine up within a week. One onsite person for a couple of days, and we were off and running. Although the Netezza video http://www.youtube.com/watch?v=jB7rlChCG-E is very humorous, it doesn't tell the whole picture. I would guess that if you are coming from SQL Server, which some people are, then the video is probably a little more realistic.

Anyway, as many of you can guess, a POC is nothing like bringing a new production box into your production datacenter to be supported by your production support staff. In the POC, there was no Data Guard to instantiate, no backups to take, everyone on the POC knew the root and oracle passwords, and the machine sat in the middle of a lab.

Bringing the box into a production environment is where all the questions begin. In a company that needs approval from support teams, the Exadata brings up a lot of questions. An appliance like the Netezza brings up fewer questions, but there are still some hurdles.

At this point we are still coming up with a backup strategy that gives everyone some degree of comfort without constantly being in hot backup mode. Utilizing 1GbE for backup can be very painful. The configuration worksheets are finally completed, and ACS will be scheduled to come in once we get some new switches.

In case you missed Part I in the Netezza-slams-Exadata series, it's here.

Tuesday, March 8, 2011

How to utilize AWR beyond what grid gives you

Like many of you, I use AWR reports all the time to see what is happening in the database, and especially to compare between different points in time. Whenever the users tell me the system is "running slow", I always try to find a comparison time ("what other recent day has a workload similar to today?") and run an AWR comparison between the two periods. Through Grid (or dbconsole), this can be frustrating for a RAC database. Sometimes you want to compare between nodes for the same period, or different periods across nodes. And this is only the beginning: I also want to compare across databases.

As you can guess the reports available through the grid don't give you these choices. But guess what ? DBReplay gives you these choices. In order to support DBReplay, Oracle has added some pretty nifty features into AWR.

First a non-dbreplay feature, the "colored SQL".

As many of you know, Oracle only saves the top 50 SQL of interest. Of course you can change the number of top SQL saved with the MODIFY_SNAPSHOT_SETTINGS procedure.

dbms_workload_repository.modify_snapshot_settings(
   retention IN NUMBER DEFAULT NULL,
   interval  IN NUMBER DEFAULT NULL,
   topnsql   IN NUMBER DEFAULT NULL,
   dbid      IN NUMBER DEFAULT NULL);
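
For example, a minimal sketch of bumping the number of top SQL captured per snapshot (the value 100 is just an illustration):

exec dbms_workload_repository.modify_snapshot_settings(topnsql => 100);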


The problem is, what happens if you are interested in a SQL statement that isn't in the top SQL? If you have a very active system, you might find that the SQL isn't in the top 50, or even the top 100. So what is there to do? "Color" the SQL. Use this procedure to mark a SQL_ID so that information about it is gathered at every snapshot, even if it isn't a top SQL. Here is the description from the documentation:

Routine to add a colored SQL ID. If an SQL ID is colored, it will always be captured in every snapshot, independent of its level of activities (i.e. does not have to be a TOP SQL). Capturing will occur if the SQL is found in the cursor cache at snapshot time

dbms_workload_repository.add_colored_sql(
   sql_id IN VARCHAR2,
   dbid   IN NUMBER DEFAULT NULL);
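
For example, a minimal sketch of coloring a statement (the SQL_ID below is made up):

begin
  dbms_workload_repository.add_colored_sql(sql_id => 'abcd1234wxyz9');
end;
/

There is a matching REMOVE_COLORED_SQL procedure to undo it when you are done.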

Now onto my original problem.

How do I do comparisons across instances, or even databases? Well, when I took a good look at the AWR_DIFF_REPORT_HTML function, I noticed that you can call it with parameters identifying each side of the comparison (like instance number and dbid).

dbms_workload_repository.awr_diff_report_html(
   dbid1     IN NUMBER,
   inst_num1 IN NUMBER,
   bid1      IN NUMBER,
   eid1      IN NUMBER,
   dbid2     IN NUMBER,
   inst_num2 IN NUMBER,
   bid2      IN NUMBER,
   eid2      IN NUMBER)
RETURN awrrpt_html_type_table PIPELINED;

If you run awr_diff_report_html with two different instance numbers, you compare the instances against each other. Now how do you do databases? Look at MetaLink note 872733.1 (yes, I still call it MetaLink). This note tells you how to export the AWR information into a file and import it into a new database.
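
Since it is a pipelined table function, a minimal sketch of calling it directly looks like this (the dbid and snapshot numbers below are made up; spool the output to an HTML file):

select output
  from table(dbms_workload_repository.awr_diff_report_html(
         1234567890, 1, 100, 110,    -- first period:  dbid, instance, begin snap, end snap
         1234567890, 2, 100, 110));  -- second period: dbid, instance, begin snap, end snap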

Here is my plan to handle all this.

1) Create a centralized repository of all AWR information for all databases in the infrastructure.

2) Follow the MetaLink note to export/import the data into the centralized repository.

3) In the centralized repository, create a table mapping DBID to source database (a minimal sketch follows this list).

4) Configure APEX (Application Express) to pick DBIDs, instances, and snapshot ranges and run a comparison report. Display the report in APEX as a webpage.
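
For step 3, a minimal sketch of the mapping table (the table and column names are just my working assumption):

create table awr_dbid_map
( dbid     number       not null,
  db_name  varchar2(30) not null,
  hostname varchar2(64),
  constraint awr_dbid_map_pk primary key (dbid)
);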

This is my summer project, and I will let everyone know how it goes.

Tuesday, February 22, 2011

Configuring an Exadata (part I)

I am getting ready to do a presentation on "real world experiences" with the Exadata, and I am still in the process of getting the Exadata set up. I figured I would share some of the pain we are having getting the configuration checklist ready for ACS (Advanced Customer Support). This service comes with your Exadata, and the ACS folks configure the box by installing the OS and the database software. They even set up a starter database. This is where the fun begins. The Exadata is a beast of a box. Remember, in a full rack there are 8 database nodes and 14 storage servers, plus switches and power supplies. To support all this it needs lots and lots of IPs, and lots and lots of names for these pieces. When setting up an Exadata you get 4 characters for a name. For many datacenters it is hard to get descriptive in those 4 characters, and fitting it all into 4 characters might not follow your standards. This is very rigid because of the "one" script. If you've never heard that term, it is the one script that installs everything. Think of it as the silent install for the whole Exadata box. Everything is driven off of it: server names, cell disk names, EVERYTHING. If you add it up, there are probably 200+ items in an Exadata that need to be named with this script (thus the reason the 4 character name is so important).

Same with the database. Oracle creates a simple UTF8 database, nothing more. And they create dbconsole within the Exadata, nothing more.

If you are prepared it should be pretty simple to set up.

I think this is where the difference lies between the Exadata and a true appliance. With an appliance, you wheel it in, hook it up, and run. With an Exadata, you need to give some thought to how to set it up. The Exadata gives you a lot of flexibility (you can run multiple applications and section them off across nodes, and you can create indexes for OLTP-type transactions), but with that flexibility comes more complexity. It's a trade-off.

Continued in part 2