Friday, February 10, 2012

OLTP compression slow for large data set

I am working on loading a large dataset (about 500 Million rows that take up about 100g of space).

I am loading them into a partitioned table and I was hoping to use HCC compression, but at least OLTP compression.

After loading for a while, the inserts seem to go slow and slower,  I was able to test my table structure with OLTP and no compression, and found that there was indeed a bottleneck with compress, but it didn't really get bad until about an hour into the procession.

My process is to do a bulk collect (in pl/sql) of 500 rows, and insert them into my partitioned table.




Below is an example.. From 7:00 am, until 9:30 (in red) I was inserting data into the table with compression off.
You can see that that the number of rows processed in each interval (15 minutes) was consistently ~30 million. for a throughput of 127 Million/hour.
Also take a look at buffer gets/exec, elapsed time/exec, and CPU time/exec.  These values all remain fairly consistent after the first 45 minutes.
At the end of 2.5 hours 351 Million rows were loaded

Now compare to the Blue  (10:15 - 10:45), I was inserting data into the same table with compression on.
You can see that the rows processed started at 22 Million (for the first 15 minutes), but it kept trending downward.  You will also notice that the reads went up,
Compare same values (buffer gets, rows processed, cpu time), and you can see the performance just continues to degrade.
Finally, look at the Violet. This is a snapshot of the currently running load after over 250 Million of data has been loaded.
Notice that we are processing at about 10 Million rows/ hour, the buffer gets are up, and the CPU time has increased.

OLTP compression seems to be significantly slowing down the loads once they get moving along.

 
Anyone see this, or have any idea why it slows down ?  The only theory I can come up with is "garbage collection" for the partitions.. I reach a point, where I am inserting into blocks, that haven't been compressed, and oracle is now going back and compressing the blocks to make room.

Here are the performance numbes.  I've also included the AWR logic read output, If you take number of executions * buffer gets, you find that the logical reads are all from the inserts.


END_TIME
ELAPSED_TIME
EXECUTIONS
TOTAL_READS_PER_EXECUTION
ROWS_PROCESSED Total
BUFFER_GETS
CPU_TIME


2/10/2012 7:00
0.004
56,711
283
28,355,500
283
4,316


2/10/2012 7:15
0.004
83,225
262
41,612,500
262
4,206


2/10/2012 7:30
0.004
81,178
293
40,589,000
293
4,332


2/10/2012 7:45
0.007
66,630
945
33,315,000
945
6,821


2/10/2012 8:00
0.007
62,374
1,190
31,187,000
1,190
7,353


2/10/2012 8:15
0.009
58,031
1,640
29,015,500
1,640
8,912


2/10/2012 8:30
0.008
59,598
1,442
29,799,000
1,442
8,292


2/10/2012 8:45
0.009
57,116
1,648
28,558,000
1,648
8,952


2/10/2012 9:00
0.008
60,477
1,410
30,238,500
1,410
8,057


2/10/2012 9:15
0.009
56,334
1,710
28,167,000
1,710
9,060


2/10/2012 9:30
0.008
61,627
1,293
30,813,500
1,293
7,681






351,650,500







Throughput
127,872,909






















2/10/2012 10:15
0.013
45,964
1,940
22,982,000
1,940
12,878


2/10/2012 10:30
0.019
33,048
3,014
16,524,000
3,014
19,466


2/10/2012 10:45
0.018
36,192
2,235
18,096,000
2,235
18,024


2/10/2012 11:00
0.017
37,362
1,737
18,681,000
1,737
17,507


2/10/2012 11:15
0.018
34,992
1,526
17,496,000
1,526
17,799


2/10/2012 11:30
0.036
20,757
6,253
10,378,500
6,253
35,703


2/10/2012 11:45
0.046
16,744
8,714
8,372,000
8,714
46,436






112,529,500







throughput
64,302,571













END_TIME
ELAPSED_TIME_DELTA
EXECUTIONS_DELTA
TOTAL_READS_PER_EXECUTION
ROWS_PROCESSED_DELTA
DISK_READS_DELTA
BUFFER_GETS
CPU_TIME

2/9/2012 22:00
0.186
4,572
33,631
2,286,000
11
33,620
171,338

2/9/2012 22:15
0.188
4,632
33,240
2,316,000
11
33,229
171,302

2/9/2012 22:30
0.19
4,545
33,574
2,272,500
10
33,564
174,641

2/9/2012 22:45
0.182
4,762
33,027
2,381,000
11
33,016
167,433














9,255,500













Segments by Logical Reads

  • Total Logical Reads: 159,380,402
  • Captured Segments account for 99.5% of Total
Owner
Tablespace Name
Object Name
Subobject Name
Obj. Type
Logical Reads
%Total
BGRENN
BGRENN_2009
FACT_BGRENN_DETL
ERD_BGRENN_2009
TABLE PARTITION
36,379,968
22.83
BGRENN
BGRENN_2010
FACT_BGRENN_DETL
ERD_BGRENN_2010
TABLE PARTITION
35,459,344
22.25
BGRENN
BGRENN_2011
FACT_BGRENN_DETL
ERD_BGRENN_2011
TABLE PARTITION
34,801,888
21.84
BGRENN
BGRENN_2008
FACT_BGRENN_DETL
ERD_BGRENN_2008
TABLE PARTITION
33,651,856
21.11
BGRENN
BGRENN_2007
FACT_BGRENN_DETL
ERD_BGRENN_2007
TABLE PARTITION
9,641,168
6.05

No comments:

Post a Comment