Wednesday, April 27, 2011

Cardinality Feedback Example

I have been fighting cardinality feedback issues all week, and I wanted to document an example of cardinality feedback in action.

First, build my test case; the script is below.

drop user testcard cascade;
create user testcard identified by testcard;
grant dba to testcard;
create TABLE testcard.object_sample1 as select * from dba_objects where 1=2;
create TABLE testcard.object_sample2 as select * from dba_objects where 1=2;

Insert into testcard.object_sample1 (select * from dba_objects where owner in ('SYS','TESTCARD') );
Insert into testcard.object_sample1 (select * from testcard.object_sample1 );
Insert into testcard.object_sample1 (select * from testcard.object_sample1 );
Insert into testcard.object_sample2 (select * from dba_objects where owner in ('SYS','TESTCARD') );

create index testcard.idx1_object_sample2 on testcard.object_sample2(owner);
create index testcard.idx1_object_sample1 on testcard.object_sample1(owner);

EXECUTE DBMS_STATS.GATHER_TABLE_STATS( ownname => 'TESTCARD', tabname => 'OBJECT_SAMPLE1', method_opt => 'FOR ALL COLUMNS SIZE 1',granularity => 'ALL',cascade => TRUE,degree => DBMS_STATS.DEFAULT_DEGREE);

EXECUTE DBMS_STATS.GATHER_TABLE_STATS( ownname => 'TESTCARD', tabname => 'OBJECT_SAMPLE2', method_opt => 'FOR ALL COLUMNS SIZE 1',granularity => 'ALL',cascade => TRUE,degree => DBMS_STATS.DEFAULT_DEGREE);


Now let's take a look at what we have:

select owner,count(1) from testcard.object_sample1 group by owner;

OWNER                            COUNT(1)
------------------------------ ----------
TESTCARD                                8
SYS                                126620

select owner,count(1) from testcard.object_sample2 group by owner;

OWNER                            COUNT(1)
------------------------------ ----------
TESTCARD                                2
SYS                                 31655


Now here is the query:

select * from testcard.object_sample1 a,
testcard.object_sample2 b
where a.owner='SYS' and a.owner is not null and a.object_type='SSSSS'
and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner
 
Notice the "is not null" clause, and notice the cardinality.  Finally, notice that I did not create histograms on the columns.  If you run this query for an owner of 'TESTCARD', it should return 2 rows; if you run it for 'SYS', it has to scan 126,000+ rows.
 
Well, the developers are running it for 'TESTCARD', and it is doing a full table scan. They want it to use the index on OWNER.  So they find a cheat and rewrite the query. This fools the optimizer into thinking the index is a better bet, since a function (decode) is applied to the column.
 
Below are the query with the "is not null" and the query with the "decode", and you can see the 2 different paths.
 

"A.OWNER IS NOT NULL"
 
EXPLAIN PLAN
SET STATEMENT_ID = 'abc' FOR
select * from testcard.object_sample1 a,
testcard.object_sample2 b
where a.owner='SYS' and a.owner is not null and a.object_type='SSSSS'
and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner;
set pagesize 25
set linesize 121
set long 1000000
col xplan format a100
SELECT dbms_xplan.display_plan(statement_id => 'abc') AS XPLAN
FROM dual;
 


 Plan Hash Value  : 3876965040

-------------------------------------------------------------------------------------
| Id  | Operation            | Name           | Rows   | Bytes    | Cost | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                | 294519 | 56547648 |  650 | 00:00:08 |
| * 1 |   HASH JOIN          |                | 294519 | 56547648 |  650 | 00:00:08 |
| * 2 |    TABLE ACCESS FULL | OBJECT_SAMPLE2 |    384 |    36864 |  136 | 00:00:02 |
| * 3 |    TABLE ACCESS FULL | OBJECT_SAMPLE1 |   1535 |   147360 |  513 | 00:00:07 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
* 2 - filter("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS')
* 3 - filter("A"."OBJECT_TYPE"='SSSSS' AND "A"."OWNER"='SYS')



"DECODE(A.OWNER,NULL,1,2)=2"
EXPLAIN PLAN

SET STATEMENT_ID = 'cde' FOR
select * from testcard.object_sample1 a,
testcard.object_sample2 b
where a.owner='SYS' and decode(a.owner,null,1,2)=2 and a.object_type='SSSSS'
and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner ;
set pagesize 25
set linesize 121
set long 1000000
col xplan format a100
SELECT dbms_xplan.display_plan(statement_id => 'cde') AS XPLAN
FROM dual;



 Plan Hash Value  : 2829654189

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                | Rows | Bytes  | Cost | Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                     | 2945 | 565440 |  279 | 00:00:04 |
| * 1 |   HASH JOIN                    |                     | 2945 | 565440 |  279 | 00:00:04 |
| * 2 |    TABLE ACCESS BY INDEX ROWID | OBJECT_SAMPLE1      |   15 |   1440 |  142 | 00:00:02 |
| * 3 |     INDEX RANGE SCAN           | IDX1_OBJECT_SAMPLE1 |  629 |        |  133 | 00:00:02 |
| * 4 |    TABLE ACCESS FULL           | OBJECT_SAMPLE2      |  384 |  36864 |  136 | 00:00:02 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
* 2 - filter("A"."OBJECT_TYPE"='SSSSS')
* 3 - access("A"."OWNER"='SYS')
* 3 - filter(DECODE("A"."OWNER",NULL,1,2)=2)
* 4 - filter("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS')


So as you can see, the decode plan is a better plan for an owner of 'TESTCARD'.  The development folks then rewrite their query with this "cheat", thinking they beat the optimizer...


But cardinality feedback kicks in when the query is executed with an owner of 'SYS'.

Below is the output of display_cursor. Notice the 2 plans, and notice that cardinality feedback was used to change the plan from an index lookup back to a full table scan.


select * from testcard.object_sample1 a,testcard.object_sample2 b where a.owner='SYS' and decode(a.owner,null,1,2)=2 and a.object_type='SSSSS' and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner;


PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
SQL_ID  91ugxuy8jc3g4, child number 0
-------------------------------------
 select * from testcard.object_sample1 a,
testcard.object_sample2 b  where a.owner='SYS' and
decode(a.owner,null,1,2)=2  and a.object_type='SSSSS'       and
b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner

Plan hash value: 2829654189

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |       |       |   279 (100)|          |
|*  1 |  HASH JOIN                   |                     |  2945 |   552K|   279   (1)| 00:00:04 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| OBJECT_SAMPLE1      |    15 |  1440 |   142   (1)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | IDX1_OBJECT_SAMPLE1 |   629 |       |   133   (1)| 00:00:02 |
|*  4 |   TABLE ACCESS FULL          | OBJECT_SAMPLE2      |   384 | 36864 |   136   (0)| 00:00:02 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


   1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
   2 - filter("A"."OBJECT_TYPE"='SSSSS')
   3 - access("A"."OWNER"='SYS')
       filter(DECODE("A"."OWNER",NULL,1,2)=2)
   4 - filter(("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS'))






select * from testcard.object_sample1 a,testcard.object_sample2 b where a.owner='SYS' and decode(a.owner,null,1,2)=2 and a.object_type='SSSSS' and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner;

select * from table(dbms_xplan.display_cursor('91ugxuy8jc3g4',null,'typical +peeked_binds'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
SQL_ID  91ugxuy8jc3g4, child number 0
-------------------------------------
 select * from testcard.object_sample1 a,
testcard.object_sample2 b  where a.owner='SYS' and
decode(a.owner,null,1,2)=2  and a.object_type='SSSSS'       and
b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner

Plan hash value: 2829654189

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |       |       |   279 (100)|          |
|*  1 |  HASH JOIN                   |                     |  2945 |   552K|   279   (1)| 00:00:04 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| OBJECT_SAMPLE1      |    15 |  1440 |   142   (1)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | IDX1_OBJECT_SAMPLE1 |   629 |       |   133   (1)| 00:00:02 |
|*  4 |   TABLE ACCESS FULL          | OBJECT_SAMPLE2      |   384 | 36864 |   136   (0)| 00:00:02 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


   1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND "A"."OWNER"="B"."OWNER")
   2 - filter("A"."OBJECT_TYPE"='SSSSS')
   3 - access("A"."OWNER"='SYS')
       filter(DECODE("A"."OWNER",NULL,1,2)=2)
   4 - filter(("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS'))

SQL_ID  91ugxuy8jc3g4, child number 1
-------------------------------------
 select * from testcard.object_sample1 a,
testcard.object_sample2 b  where a.owner='SYS' and
decode(a.owner,null,1,2)=2  and a.object_type='SSSSS'       and
b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner

Plan hash value: 1738482415

-------------------------------------------------------------------------------------
| Id  | Operation          | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                |       |       |   650 (100)|          |
|*  1 |  HASH JOIN         |                |  2945 |   552K|   650   (1)| 00:00:08 |
|*  2 |   TABLE ACCESS FULL| OBJECT_SAMPLE1 |    15 |  1440 |   513   (1)| 00:00:07 |
|*  3 |   TABLE ACCESS FULL| OBJECT_SAMPLE2 |   384 | 36864 |   136   (0)| 00:00:02 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."OBJECT_TYPE"="A"."OBJECT_TYPE" AND
              "A"."OWNER"="B"."OWNER")
   2 - filter(("A"."OBJECT_TYPE"='SSSSS' AND "A"."OWNER"='SYS' AND
              DECODE("A"."OWNER",NULL,1,2)=2))
   3 - filter(("B"."OBJECT_TYPE"='SSSSS' AND "B"."OWNER"='SYS'))

Note
-----
   - cardinality feedback used for this statement


This should be enough to play with and see how cardinality feedback can kick in to dynamically change the plan on you.
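
One way to catch it in the act: in 11.2, v$sql_shared_cursor has a USE_FEEDBACK_STATS column that flags child cursors built because of feedback. A quick sketch, using the sql_id from above:

-- Child cursors marked 'Y' were created because cardinality
-- feedback changed the optimizer's estimates.
SELECT sql_id, child_number, use_feedback_stats
FROM   v$sql_shared_cursor
WHERE  sql_id = '91ugxuy8jc3g4';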

NOTE : I've played around further, and this is what I found.  Anyone correct me if any of these are wrong.

1) v$sqlstats only showed one plan_hash_value, even though the plan changed.


2) The plan didn't change immediately when I executed the query back to back. When I put in a sleep, the plan changed (there must be some sort of delay before cardinality feedback kicks in).

3) Once locked into the new plan, it stays with it.

4) Adaptive Cursor sharing only kicks in if a histogram is used, or if the bind values fall outside the high/low values.

5) Oracle avoids using plans that are suspect (e.g. in my example with decode).

6) Oracle tends to use cardinality feedback more when the plan is "suspect" because of some sort of rough estimation.

7) Using dynamic sampling causes Oracle to use cardinality feedback more.



Cardinality feedback seems to be more of a weighting thing. Oracle tries to look at the "situation": the less reliable the estimate is likely to be, the more likely the optimizer is to use cardinality feedback to choose another plan.






Friday, April 22, 2011

cardinality feedback

My big challenge this week was an 11g database that wouldn't behave.  I'm sure you've been there, getting frustrated by a database that doesn't seem to perform right.

I can tell you that challenge has gotten even bigger in 11g.  A new feature was added called "cardinality feedback".  The idea is simple: Oracle comes up with a plan, including a cardinality estimate for each step, and then executes the plan.  If, after executing the plan, the optimizer finds that the actual cardinality is out of line with the estimated cardinality, it will pick a new plan for the next execution.

Sounds like a great feature, right?  But what if you have issues with histograms, and your actual data is lopsided?  Oracle may choose a plan, run it against the lopsided data, and decide the plan is no good.  Oracle morphs the plan for you. No charge, no control.  It gets frustrating.

The only quick fix I've found is the COE_PROFILE script that is part of the SQLT tuning kit available from MOS.

There is a quick way to test this behavior (if you think it's occurring): use this undocumented parameter.

-- Turn on cardinality feedback

alter session set "_optimizer_use_feedback" = true;

-- Turn off (disable) cardinality feedback
alter session set "_optimizer_use_feedback" = false;
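
If you only want to toggle it for a single statement, the OPT_PARAM hint should work too (a sketch; it's the same undocumented parameter, so test systems only):

select /*+ opt_param('_optimizer_use_feedback' 'false') */ *
from testcard.object_sample1 a, testcard.object_sample2 b
where a.owner='SYS' and a.object_type='SSSSS'
and b.owner='SYS' and b.object_type=a.object_type and a.owner=b.owner;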

Tuesday, April 19, 2011

HCC compression

In my free time this week I have been playing on the Exadata.  The first thing I wanted to determine was the compression ratios for my data, and the select performance.

One of the first observations I made was how much HCC saves you on selects. I took a 158g table from another system, and loaded it up in HCC query mode (I loaded it direct-path). The table is now 6g in size. WOW, but that didn't impress me as much as my select.
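
For anyone who wants to try the same thing, the load looked roughly like this (the table names here are made up). Remember that HCC only compresses rows that arrive via direct-path loads:

-- COMPRESS FOR QUERY HIGH is the HCC "query" mode.
create table big_fact_hcc compress for query high
as select * from big_fact where 1=2;

-- Direct-path insert (APPEND) so the rows actually get HCC compressed.
insert /*+ APPEND */ into big_fact_hcc select * from big_fact;
commit;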

First I selected from the table with storage indexes off: 2.02 seconds.  2.02 seconds?  That was 1.7 billion rows that used to be 158g of data.  Wow.

Next I selected from the table with storage indexes on: it took 9.02 seconds.  The extra 7 seconds was the Exadata building the storage indexes.  Then I selected again: 0.30 seconds.  No indexes on the table at all.
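
If you want to reproduce the with/without comparison, storage indexes can be disabled at the session level with an undocumented parameter (test systems only; undocumented parameters can change between releases):

-- Turn storage indexes off for this session
alter session set "_kcfis_storageidx_disabled" = true;

-- Turn them back on
alter session set "_kcfis_storageidx_disabled" = false;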

I am going to do some more testing, but this certainly is impressive!!

Saturday, April 16, 2011

cursor: pin S wait on X wait event

Chances are you've stumbled onto my website because you are searching for the keywords "cursor: pin S wait on X", "wait", "event", and "parsing".

Well, I have been fighting a fire for the last 2 weeks in our performance environment on just these "keywords".

We have been doing our testing by sending through a set workload with very similar processing needs.  This workload is date sensitive, so it has been a challenge to keep moving dates and retesting. Recently, we have begun utilizing "flashback database" for our testing. This has great promise for getting consistent testing done!
You run through your test case scenario and capture your performance data.  You flash back to the start point, make the change, then run through the same exact scenario. Pretty cool, huh?
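
The basic cycle looks something like this (the restore point name is made up):

-- Before the test run
create restore point before_test guarantee flashback database;

-- ... run the workload, capture AWR ...

-- Rewind, make the change, and rerun
shutdown immediate
startup mount
flashback database to restore point before_test;
alter database open resetlogs;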

Well, all this was going as planned.  We ran our processing, saved the AWR data, and reran.  Suddenly we started seeing these "cursor: pin S wait on X" events holding up processing for minutes at a time. Where did these come from? Was it the release?  We had to dig in and find out.

Well here is some background.
  • Our queries are very, very complex.  They often take ~ 1000ms to parse (1 second), and 4ms to execute.
  • flashback will flush your cache, kind of obvious, but pertinent to my situation.
  • Our processing is very structured, and sends multiple processing streams through the same steps in the same order
So what was happening?  Well, it turns out that this is caused by hard parsing multiple copies of the same sql (probably 50+ in our case) simultaneously.  This wait event is Oracle parsing the first sql statement while having all the others wait.  This parse queuing doesn't do well with sql of this complexity, and it throws off the parsing, causing these wait events.
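
When it's happening, a query like this should show both the queue and the session actually doing the hard parse (11g populates blocking_session for this event):

-- Who is stuck behind the mutex, and which session holds it
select sid, sql_id, event, blocking_session, seconds_in_wait
from   v$session
where  event = 'cursor: pin S wait on X';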

Just wanted to pass this on: if you see this event and you are doing similar testing, look at the complexity of the queries, and consider simultaneous hard parsing of the same sql as the culprit.


Friday, April 8, 2011

Exadata class

This week I attended the "Exadata and Database Machine Administration" class.  It was a good class, but they usually are from OU.

This class covers a lot of material in 3 days, and it is a good intro to what it's going to be like managing an Exadata.  Take heed of the prereqs.  They aren't mandatory, but knowing about RAC and grid is necessary to understand how Exadata implements those pieces.

For those who might attend, my advice is to do more with dbcontrol.  The class doesn't say much about dbcontrol or grid control.

Well, to start at the beginning: the class is run in a virtual environment, and each student has their own virtual 1/4 rack with 1 cpu per node.  This works out well because everyone has their own isolated environment.  The environment has a database with the default SH schema.

In order to get the most out of the class:
1) I changed the database passwords for sys, system, dbsnmp, and sysman.
2) I recreated the password file.
3) I installed dbcontrol (dbca).
4) I set up ssh keys for celladmin and cellmonitor to go to each cell from the oracle account.  Be aware that to set up ssh for cellmonitor, you copy the keys as root and change permissions.  The cellmonitor account has very few privileges.
5) I created my own account in ASM and granted sysasm to it.

After doing all this I was able to start up dbcontrol and connect to my virtual environment.  This gave me a great view into how the disks were configured, and I was able to add the Exadata cells to dbcontrol.

I think having dbcontrol set up made the class much more useful, since I could visually see all the items newly available on the Exadata.

Tuesday, April 5, 2011

Hadoop

I know after all this Exadata posting, you are probably wondering why my blood doesn't run black and red?

Well, I've been hearing about Hadoop for years, and I think it's finally about time that I set up a Hadoop environment.

If you have never heard of hadoop here is a starting point.

There are also 2 very interesting related projects.

Cassandra - This allows for HA through clustering, with no single point of failure.
Hive          - This allows for data warehousing on top of Hadoop, with the building of views.

The thing that grabbed my attention is the ability to handle large amounts of log data.  Like most major companies, we produce unwieldy amounts of log files from our application server farms with no way to sift through them.  It seems like a lot of people are utilizing Hadoop to organize and search log files.

Interesting subject that I will be pursuing further. If anyone has any experience, or words of wisdom on this pass it on.

Sunday, April 3, 2011

Configuring an Exadata (lessons learned)

Well, the time is finally here. 

Twas the night before Exadata,
and all through the data warehouse,
Not a keyboard was stirring,
Not even a mouse

ACS was all snug in their hotel beds,
While visions of "ONE" scripts danced in their heads.
And DBA's with laptops, and I with my smart phone,
had just settled in to see the hype all get blown.

When out in the server room there arose such a clatter,

I sprang from the bed to see what was the matter.
Away to the laptop I flew like a flash,
Tore open the lid and started up bash.

The nodes on the cabinet of  the new server appliance
Gave the heat of mid-day to everything near the servers in the alliance.
When, what to my wondering eyes should appear,
But a man with a sweater drinking a beer.
he was a sailing pro with the magic of a fairy,

I knew in a moment it must be Larry.

More rapid than eagles his crew they came,
And he whistled, and shouted, and called them by name!

Now Dan Norris! now, Kerry Osborne! now, .....

Well, you get the picture.

Anyway, here are my lessons learned.  These are mostly the things that would have helped us get from purchase to ACS coming on site a little faster.

1) IP addresses. Yes, the Exadata needs lots of IP addresses.  Here is what you need for an x2-2 full rack:


Ethernet Subnet 1 IP addresses      51
  ILOM for Database Servers          8
  ILOM for Exadata Cells            14
  Eth0 for Database Servers          8
  Eth0 for Exadata Cells            14
  Mgmt ports for IB switches         3
  IP address for KVM                 1
  IP address for Ethernet Switch     1
  IP addresses for PDUs              2
Ethernet Subnet 2 IP addresses      19
  Eth1 for Database Servers          8
  VIPs for Database Servers          8
  SCAN Addresses (per Cluster)       3
Total                               70

2) Naming - The naming convention for a server name is used for all the components within the Exadata.  Even the disks themselves include the name of the server, so you can track down any issues.  That's not saying that your standard host name isn't usable; it just means that you give Oracle 1 name, and it is the building block for everything else.

3) Default database -  Surprisingly, the default database is probably going to be useless to you (unless you are going to use the exact configuration Oracle provides).  Oracle creates a default database with default parameters and a default character set.  Anything other than that, you are on your own.

4) Backup.  If you've read my previous posts, you have probably found that this is the most confusing area.  Each database server comes with 4 1ge ports and 2 10ge ports.  If you are using 10ge you are all set: bond and aggregate, and you are in business. InfiniBand, you just use the 2 InfiniBand ports.
If you are still on 1ge, like a lot of datacenters, you can't easily make it redundant.  Eth0 is reserved for management.  Eth1 and eth2 are usually bonded for active-passive redundancy. This leaves one port for your tape backup, with no redundancy.  This is one of the most important things if you are planning on using 1ge: make sure you understand how you are going to configure the Exadata, and that you most likely will not have redundancy.

Finally, I'll pass on two of my favorite comments to make on all this.

"Buying an exadata is like putting a window airconditioner in your 1920's house" -- This is from a time when I had a house with original wiring.. My wife would try to blow dry her air with the airconditioner on, and the fuse would blow.. If you put an exadata in your datacenter running 1ge , you will blow a fuse.

"Buying an Exadata doesn't make things easier.. It is easier to be told we can't afford it, it's more difficult if they buy it.  It's like telling your mangement that you need a ferarri to go fast, and they say yes.. now drive it 320MPH without crashing." -This should be pretty self explainitory.  There are strong expectations from managment when you buy one of these.








Friday, April 1, 2011

New Free oracle products are out

Oracle XE 11g beta just came out today, SQL Developer 3.0 came out, and Oracle SQL Developer Data Modeler came out a couple of months ago. All free.




Wednesday, March 30, 2011

Configuring an Exadata (part III)

Well, the time has come to finally get the Exadata configured.   We are coming down to the end, and we are still figuring out the network connections.  The problem is the lack of 1ge ports.

The Exadata comes with 4 1ge ports, 1 of which is reserved for management (patching, monitoring etc).  It also comes with 2 10ge ports.  This is where the fun begins.  Our standard is to bond 2 ports together for public (active-passive) for redundancy, then bond 2 ports for backup, aggregating them to get 2ge speed and have redundancy for the backup.  How do we do this with 3 ports?  This leaves us with 3 choices.

1) Take the 2 10ge ports and funnel them down to 2 1ge ports.  Bond and aggregate these 2 ports together, and we have our 2 TAN ports.  We would be non-standard, and the only ones doing this as far as I know.

2) Disable the management services and utilize the 2 other 1ge ports for TAN.  This means 2 1ge ports for public, bonded, and 2 ports for TAN, bonded and aggregated.  Again non-standard.

3) Utilize 2 1ge ports for public, bonded, 1 management port, and only 1 TAN port.  This would be standard, but the least desirable.

In looking at the documentation, it states:

When connecting the media servers to the Database Machine through Ethernet, connect the eth3 interfaces from each database server directly into the data center network. For high availability, multiple network interfaces on the database servers and multiple network interfaces on the media server can be bonded together. In this configuration, configure the eth3 interface as the preferred or primary interface and configure eth2 as the redundant interface.


If throughput is of a concern then connect both eth2 and eth3 interfaces from each database server directly into the data center’s redundant network. The two interfaces can then be bonded together in a redundant and aggregated way to provide increased throughput and redundancy.
But this certainly doesn't explain what it means to bond eth2 and eth3. Is Oracle suggesting not bonding public and utilizing 2 of the 3 available ports for TAN, or are they suggesting backing up over the LAN?

In any case this whole network configuration of the Exadata has been very confusing.





Thursday, March 24, 2011

Duplicating an ODI interface module

Here I am, day 4 of my ODI class, and I am on my quest to copy all the wrh$ performance data to a central repository. I think after this day in class I have all the tools to create jobs to do this.

Of course, being a curious, recycling-conscious individual, I tried to reuse some of the code. Specifically, I tried to export an interface to an XML file, do a replace-all of the table name to the next table, then import the interface with the new name!! Everything looked good with the mapping, until I looked at the name of the primary key. It still had the primary key name from the original interface. This means that there must be some "hooks" from the interface XML document to other related objects in the database. Oh well.

It looks like for now I will be creating interfaces for the objects I need to pull into my repository.

I have been very impressed with the flexibility of the product, and the way I can easily reuse it to add another source system. Since I'm going to be pulling from 15+ sources, flexibility is important.

I'm also going to be using APEX as the front end for all this data. With some simple tools like ODI and APEX, a DBA type can do some serious reporting!

Wednesday, March 23, 2011

ADG with ODI and Exadata

Recently I've been taking a class on ODI. It is really a very interesting ELT tool (notice I didn't say ETL). I am planning on using it to take data from my ADG (Active Data Guard) copy of production to another server. Perfect, right? Pull from a read-only copy of an Oracle database to another database. I picked my LKM (Load Knowledge Module) of Oracle to Oracle. Unfortunately, the current knowledge module creates a view on the source. As you can imagine, with ADG this is impossible. The only way to get ODI working against ADG is to create your own knowledge module, so I've been spending my evenings creating my very own. I am hoping this can help others who are running into the same issue. First, this is a great site explaining HOW to create your very own knowledge module...
http://www.oracle.com/technetwork/articles/bethke-odi-090881.html
This is a great site to find all the syntax you need.
http://gerardnico.com/doc/odi/webhelp/en/index.htm#ref_api/

Finally, these are the steps I took to make my own knowledge module.

1) copy the oracle to oracle(DBLINK) module
2) Give it a new name like oracle to oracle(ADG)
3) Remove the following steps

- 70 create view/table on source
- 80 Create temp indexes on source
- 150 drop view on source
- 160 Drop temp indexes

4) Change the "drop synonym on target" step to "drop view on target":

drop synonym <%=odiRef.getTable("L", "COLL_NAME", "W")%>

becomes

drop view <%=odiRef.getTable("L", "COLL_NAME", "W")%>


5) Last change: "create synonym on target" becomes "create view on target".

create synonym <%=odiRef.getTable("L", "COLL_NAME", "W")%>
for <%=odiRef.getTable("R", "COLL_NAME", "W")%>

becomes

<% if ((odiRef.getOption("ENABLE_EDITION_SUPPORT")).equals("0")) { %>
create or replace view <%=odiRef.getTable("L", "COLL_NAME", "W")%>
(
<%=odiRef.getColList("", "[CX_COL_NAME]", ",\n\t", "", "")%>
)
as select <%=odiRef.getPop("DISTINCT_ROWS")%>
<%=odiRef.getColList("", "[COL_NAME]", ",\n\t", "", "")%>
from <%=odiRef.getSrcTablesList("", "[SCHEMA].[TABLE_NAME]@", ", ", "")%><%=odiRef.getInfo("SRC_DSERV_NAME")%>
where (1=1)
<%=odiRef.getFilter()%>
<%=odiRef.getJrnFilter()%>
<%=odiRef.getJoin()%>
<%=odiRef.getGrpBy()%>
<%=odiRef.getHaving()%>
<% } else { %>
create table <%=odiRef.getTable("L", "COLL_NAME", "W")%> as
select <%=odiRef.getPop("DISTINCT_ROWS")%>
<%=odiRef.getColList("", "[COL_NAME]\t[CX_COL_NAME]", ",\n\t", "", "")%>
from <%=odiRef.getSrcTablesList("", "[SCHEMA].[TABLE_NAME]@", ", ", "")%><%=odiRef.getInfo("SRC_DSERV_NAME")%>
where (1=1)
<%=odiRef.getFilter()%>
<%=odiRef.getJrnFilter()%>
<%=odiRef.getJoin()%>
<%=odiRef.getGrpBy()%>
<%=odiRef.getHaving()%>
<%}%>



As you can see the idea is to remove any updates to the source, and switch the synonym in the target to a view pointing at the source.

And some advice: if you are using the simulation button to test the new knowledge module, the "getInfo" commands only contain data at runtime. The simulation will show nothing, and you will only see data when you actually execute (I lost about an hour on that one).

Enjoy.. I am posting my actual XML knowledge module here.

The usual disclaimers here: test well. I also want to point out that I only changed the Oracle-to-Oracle knowledge module. If you are going from Oracle to Netezza, for example, you need to make the appropriate changes to that knowledge module.

I am also including another article I found on a knowledge module that doesn't create the link.
http://www.business-intelligence-quotient.com/?p=1187

Tuesday, March 22, 2011

My quest to consolidate AWR data

I am still embarking on my quest to consolidate all the AWR data from all the databases into a central performance database.

My first thought, to use a simple CDC tool (GoldenGate), failed. GoldenGate will not replicate sys objects. Boo.

I am in class for ODI this week, so my current plan is to use ODI to replicate the data from all my sources to a single target.

So far so good, and I will update on how things go with my quest to consolidate reporting data.

If this goes well with ODI, I will use it to also consolidate tablespace sizing data, etc., from all my databases. Wahoo!

Sunday, March 20, 2011

Large SGA's and what it means to the future for databases

Well, first off, I don't have the answer to this one, just some musings.

I've noticed that memory on servers has gotten bigger lately, or cheaper, depending on how you look at it. Case in point: the x2-8 Exadata, with 1tb of memory per database node. Then you add the new 11gr2 ability to parallelize across nodes without having to keep passing blocks, and you have close to 2tb of available memory.

So what does this mean for a database? What does this mean for disk I/O? What is a database doing if the blocks are all in memory? Essentially you are writing out changed blocks and logging; that's it.

So what do you need a big disk array for ?

Then, with all the really awesome IP-based disk arrays out there (like the Isilon), what is the future of disk? Like many companies, we are still running on 4gb fiber for all our servers and connecting to a SAN array. Should we go to 8gb fiber or 10gb IP?

I would be interested in opinions on what people see as the future of disk: IP, fiber, or FCoE? How important is the speed of a disk array going to be? Just put your redo logs on SSD (or flash cache?).

Update :

I just saw that Arup Nanda posted some writing on this topic.  You can read it here.  He basically said that, because of consistent read and other mechanisms, you might find that your database objects are in the cache multiple times, utilizing much more of your buffer cache than you probably realize. 
He recommends using a specialized in-memory database (like TimesTen) to make sure everything is in memory.
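
You can get a rough feel for this yourself. Here is a sketch that counts buffer copies per object (the 'cr' status rows are the consistent-read clones he is talking about; the schema name is just an example):

select o.object_name, bh.status, count(*) buffers
from   v$bh bh
join   dba_objects o on o.data_object_id = bh.objd
where  o.owner = 'SCOTT'
group  by o.object_name, bh.status
order  by buffers desc;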

Sunday, March 13, 2011

Social media and being an Oracle professional

First, I have to admit I'm old. I went to college at a time when card decks were still being used. Those were simpler times. You would submit your program on a deck of punch cards, go out for the night, and get the output the next day. There were numerous advantages to this.

The professors didn't expect you to code and debug through the night. It just wasn't possible.

You couldn't be sloppy. If you didn't syntax check and debug your code in your head, you would get lots of output back that required rework. Rework with a 24-hour turnaround gets old fast.

There were very few resources and very few people to turn to.

Times have changed since then. There are a lot of places to find great information. Here are just a few.

LinkedIn groups. There is a great Exadata group.

Blogs. Most people who are famous in our industry blog

Twitter - Follow the people you like on twitter.

All this social media really helps you keep up to date on what is happening. You don't have to feel all alone in figuring out whether huge pages are worth it, or how to deal with the problems of a new feature.

I have turned to these media a couple of times lately and been amazed that some of the top people have answered my questions. Most recently, Tanel Poder answered my question on huge pages, and tweeted to me that he is happy to answer any question I tweet him. Here is his answer.

Wednesday, March 9, 2011

Configuration of an Exadata Part II

After seeing how popular Configuration of An Exadata Part I was, I figured I would follow up with a part II.

First off, there are a couple of things I'd like to point out. Along with an Exadata waiting to be configured, I also have a Netezza TwinFin waiting to be configured. Many of the reasons why it is taking so long aren't specifically because it is an Exadata.

We did do a POC on an Exadata, and Oracle was able to bring it in quickly. It took a day or so to fill out the configuration sheet, and Oracle had the machine up within 1 week. One onsite person for a couple of days, and we were off and running. Although the Netezza video http://www.youtube.com/watch?v=jB7rlChCG-E is very humorous, it doesn't tell the whole picture. I would guess if you are coming from SQL Server, which some people are, then the video is probably a little more realistic.

Anyway, as many of you can guess, a POC is nothing like bringing a new production box into your production datacenter to be supported by your production support staff. In the POC there was no dataguard to instantiate and no backups to take; everyone on the POC knew the root and oracle passwords, and the machine sat in the middle of a lab.

Bringing the box into a production environment is where all the questions begin. In a company that needs approval from support teams, the Exadata brings up a lot of questions. An appliance like the Netezza brings up fewer questions, but still there are some hurdles.

At this point we are still coming up with a backup strategy that gives everyone some degree of comfort without constantly being in hot backup mode. Utilizing 1ge for backup can be very painful. The configuration worksheets are finally completed, and ACS will be scheduled to come in once we get some new switches.

In case you missed part I in the Netezza-slams-Exadata series, it's here.

Tuesday, March 8, 2011

How to utilize AWR beyond what grid gives you

Like many of you, I use AWR reports all the time to see what is happening in the database, and especially to compare between different points in time. Whenever the users tell me the system is "running slow", I always try to find a comparison time: "What other day recently had a similar workload to today?", and then run an AWR comparison between the time periods. Through grid (or dbconsole), this can be frustrating for a RAC database. Sometimes you want to compare between nodes for the same period, or different periods across nodes. And this is only the beginning: I also want to compare across databases.

As you can guess, the reports available through grid don't give you these choices. But guess what? DBReplay does. In order to support DBReplay, Oracle has added some pretty nifty features to AWR.

First, a non-DBReplay feature: "colored SQL".

As many of you know, Oracle only saves the top 50 SQL of interest. Of course, you can change the number of top sql saved with the MODIFY_SNAPSHOT_SETTINGS procedure.

dbms_workload_repository.modify_snapshot_settings(
   retention IN NUMBER DEFAULT NULL,
   interval  IN NUMBER DEFAULT NULL,
   topnsql   IN NUMBER DEFAULT NULL,
   dbid      IN NUMBER DEFAULT NULL);
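
For example, to keep the top 200 statements in every snapshot (a minimal sketch):

EXECUTE dbms_workload_repository.modify_snapshot_settings(topnsql => 200);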


The problem is, what happens if you are interested in a SQL that isn't in the top sql statements? If you have a very active system, you might find that the sql isn't in the top 50, or even the top 100. So what is there to do? "Color" the sql. Use this procedure to mark a sql so that information about the SQL_ID is gathered every snapshot, even if it isn't a top sql. Here is the description:

Routine to add a colored SQL ID. If an SQL ID is colored, it will always be captured in every snapshot, independent of its level of activities (i.e. does not have to be a TOP SQL). Capturing will occur if the SQL is found in the cursor cache at snapshot time

dbms_workload_repository.add_colored_sql(
sql_id IN VARCHAR2,
dbid IN NUMBER DEFAULT NULL);
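
So coloring the statement from my cardinality feedback post would look something like this (there is a matching remove_colored_sql to undo it):

EXECUTE dbms_workload_repository.add_colored_sql(sql_id => '91ugxuy8jc3g4');

EXECUTE dbms_workload_repository.remove_colored_sql(sql_id => '91ugxuy8jc3g4');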

Now onto my original problem.

How do I do comparisons against instances or even databases? Well, when I took a good look at the AWR_DIFF_REPORT_HTML function, I noticed that you can call it with two sets of parameters (like instance id and dbid).

dbms_workload_repository.awr_diff_report_html(
   dbid1     IN NUMBER,
   inst_num1 IN NUMBER,
   bid1      IN NUMBER,
   eid1      IN NUMBER,
   dbid2     IN NUMBER,
   inst_num2 IN NUMBER,
   bid2      IN NUMBER,
   eid2      IN NUMBER)
RETURN awrrpt_html_type_table PIPELINED;

If you run awr_diff_report_html with 2 different instances, you compare them against each other. Now how do you do databases? Look at MOS note 872733.1 (yes, I still call it metalink). This note tells you how to export the AWR information into a file and import it into a new database.
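
Comparing the same snapshot range across two RAC instances then looks something like this (the dbid and snapshot ids are made up):

-- Compare snaps 100-110 on instance 1 against the same snaps on instance 2
select output
from   table(dbms_workload_repository.awr_diff_report_html(
         1234567890, 1, 100, 110,
         1234567890, 2, 100, 110));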

Here is my plan to handle all this.

1) Create a centralized repository of all AWR information for all databases in the infrastructure.

2) Follow the metalink note to export/import the data into the centralized repository (the scripts are sketched after this list).

3) In the centralized repository create a table mapping DBID to source database.

4) Configure APEX (Application Express) to pick dbids, instances, and snapshot ranges to run a comparison report. Display the report in APEX as a web page.
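
The export/import itself is done with two scripts that ship with the database; run them from sqlplus, and they prompt for the dbid, the snapshot range, and a directory object for the Data Pump file:

-- On each source database: dump a snapshot range to a file
@?/rdbms/admin/awrextr.sql

-- On the central repository: load the file into a staging schema,
-- then move it into the repository's AWR tables
@?/rdbms/admin/awrload.sql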

This is my summer project, and I will let everyone know how it goes.

Tuesday, February 22, 2011

Configuring an Exadata (part I)

I am getting ready to do a presentation on "real world experiences" with the Exadata, and I am still in the process of getting the Exadata set up. I figured I would share some of the pain we are having getting the configuration checklist ready for ACS (Advanced Customer Support). This service comes with your Exadata, and the ACS folks configure the box by installing the OS and the database software. They even set up a starter database. This is where the fun begins.

The Exadata is a beast of a box. Remember, in a full rack there are 8 database nodes and 14 storage servers, plus switches, power supplies, and more. To support all this it needs lots and lots of IPs, and lots and lots of "names" for these pieces. When setting up an Exadata you get 4 characters for a name. For many datacenters it is hard to get descriptive in those 4 characters, and fitting it all into 4 characters might not follow your standards. This is very rigid because of the "one" script. If you've never heard that term, it is the "one" script that installs everything. Think of it as the silent install for the whole Exadata box. Everything is driven off of it: server names, cell disk names, EVERYTHING. If you add it up, there are probably 200+ items in an Exadata that need to be named with this script (thus the reason why the 4 character name is so important).

Same with the database. Oracle creates a simple UTF8 database, nothing more. And they create dbcontrol within the Exadata, nothing more.

If you are prepared it should be pretty simple to set up.

I think this is where the difference lies between the Exadata and a true appliance. An appliance, you wheel it in, hook it up, and run. An Exadata, you need to give some thought to how to set it up. The Exadata gives you a lot of flexibility (you can run multiple applications and section them off across nodes, and you can create indexes for OLTP-type transactions), but with that flexibility comes more complexity. It's a trade-off.

Continued in part 2

Monday, January 31, 2011

DBMS_FILE_TRANSFER.COPY_FILE and locking

I was working on copying an RMAN backup from an NFS mount into ASM to do a restore of a database. I figured: a great opportunity to use DBMS_FILE_TRANSFER to copy it in!!
I copied the first file into ASM and deleted it off the mount. I then copied over my next file (I only had room for 1). BAM. I'm out of space. I looked at the file system, and the space was used up (per 'df -k'), but I couldn't find the file using up the space. I thought: maybe dbms_file_transfer doesn't release the file handle from the OS? I shut down the database, and my space returned.
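
For reference, the copy itself looks something like this (the directory objects and file names here are made up):

-- Directory objects over the NFS mount and the ASM destination
create directory nfs_dir as '/mnt/nfs/backups';
create directory asm_dir as '+DATA/MYDB/backups';

begin
  dbms_file_transfer.copy_file(
    source_directory_object      => 'NFS_DIR',
    source_file_name             => 'backup_piece_01.bkp',
    destination_directory_object => 'ASM_DIR',
    destination_file_name        => 'backup_piece_01.bkp');
end;
/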

I haven't had time to dig into it further, but beware that DBMS_FILE_TRANSFER seems to hold the file open, so deleting the source doesn't free the space. I'm not sure how to release the handle short of bouncing the database.