Tuesday, January 24, 2012

Calculating disk space usage on an Exadata

I have been working on figuring out where all our space is, and how much space is actually available on our Exadata.  First, to clarify what all the calculations are based on:

1/2 rack, SATA drives, normal redundancy.

This means we have
  • 7 storage cells
  • Each storage cell contains 12 disks
  • Each disk is 2 tb (which is about 1.862 tb usable)
  • The first 2 disks in each storage cell have ~30g already partitioned for the OS (which is mirrored).
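
The 1.862 figure looks like a plain units conversion. Here is a quick sketch of it, assuming the drives are sold as 2 x 10^12 bytes and that the storage cell reports sizes in 1024-based gigabytes (both are my assumptions, and the constant names are made up for the example):

```python
# Convert a drive marketed as 2 TB (decimal) into binary gigabytes.
BYTES_PER_MARKETED_TB = 10**12   # assumption: vendor uses decimal terabytes
BYTES_PER_BINARY_GB = 2**30      # assumption: the "G" on the cell is 1024-based

marketed_tb = 2
binary_gb = marketed_tb * BYTES_PER_MARKETED_TB / BYTES_PER_BINARY_GB

print(f"{binary_gb:.1f}g")       # roughly 1.862 when divided by 1000
```
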
Next I looked at how the disks were allocated within each storage cell (the cells are all configured the same way):

CellCLI> list griddisk attributes name, celldisk, size
         DATA_DMPF_CD_00_srcell1        CD_00_srcell1  733G
         DATA_DMPF_CD_01_srcell1        CD_01_srcell1  733G
         DATA_DMPF_CD_02_srcell1        CD_02_srcell1  733G
         DATA_DMPF_CD_03_srcell1        CD_03_srcell1  733G
         DATA_DMPF_CD_04_srcell1        CD_04_srcell1  733G
         DATA_DMPF_CD_05_srcell1        CD_05_srcell1  733G
         DATA_DMPF_CD_06_srcell1        CD_06_srcell1  733G
         DATA_DMPF_CD_07_srcell1        CD_07_srcell1  733G
         DATA_DMPF_CD_08_srcell1        CD_08_srcell1  733G
         DATA_DMPF_CD_09_srcell1        CD_09_srcell1  733G
         DATA_DMPF_CD_10_srcell1        CD_10_srcell1  733G
         DATA_DMPF_CD_11_srcell1        CD_11_srcell1  733G
         DBFS_DG_CD_02_srcell1          CD_02_srcell1  29.109375G
         DBFS_DG_CD_03_srcell1          CD_03_srcell1  29.109375G
         DBFS_DG_CD_04_srcell1          CD_04_srcell1  29.109375G
         DBFS_DG_CD_05_srcell1          CD_05_srcell1  29.109375G
         DBFS_DG_CD_06_srcell1          CD_06_srcell1  29.109375G
         DBFS_DG_CD_07_srcell1          CD_07_srcell1  29.109375G
         DBFS_DG_CD_08_srcell1          CD_08_srcell1  29.109375G
         DBFS_DG_CD_09_srcell1          CD_09_srcell1  29.109375G
         DBFS_DG_CD_10_srcell1          CD_10_srcell1  29.109375G
         DBFS_DG_CD_11_srcell1          CD_11_srcell1  29.109375G
         RECO_DMPF_CD_00_srcell1        CD_00_srcell1  1099.546875G
         RECO_DMPF_CD_01_srcell1        CD_01_srcell1  1099.546875G
         RECO_DMPF_CD_02_srcell1        CD_02_srcell1  1099.546875G
         RECO_DMPF_CD_03_srcell1        CD_03_srcell1  1099.546875G
         RECO_DMPF_CD_04_srcell1        CD_04_srcell1  1099.546875G
         RECO_DMPF_CD_05_srcell1        CD_05_srcell1  1099.546875G
         RECO_DMPF_CD_06_srcell1        CD_06_srcell1  1099.546875G
         RECO_DMPF_CD_07_srcell1        CD_07_srcell1  1099.546875G
         RECO_DMPF_CD_08_srcell1        CD_08_srcell1  1099.546875G
         RECO_DMPF_CD_09_srcell1        CD_09_srcell1  1099.546875G
         RECO_DMPF_CD_10_srcell1        CD_10_srcell1  1099.546875G
         RECO_DMPF_CD_11_srcell1        CD_11_srcell1  1099.546875G


This gives me a lot of information about how things were configured as griddisks.

I can tell from this that there are 3 sets of griddisks (for my diskgroups).

data - 12 disks per cell, each with a 733g lun
reco - 12 disks per cell, each with a 1100g lun
dbfs - 10 disks per cell, each with a 29g lun

As I mentioned previously, the first 2 disks are used for the OS (mirrored), which is why there are only 10 luns of 29g available for the dbfs disk group.

I then ran the numbers for each one (7 cells * #disks * lun size)

data -  61.572 tb
reco -  92.4 tb
dbfs  -   2.03 tb
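
The arithmetic above can be reproduced with a short Python sketch. The variable names are mine, and like the numbers above it uses the rounded lun sizes (733g, 1100g, 29g) and treats 1000g as 1 tb.

```python
# Raw space per disk group: cells * disks per cell * lun size.
# Sizes are the rounded griddisk sizes in g; 1000g is treated as 1 tb.
CELLS = 7

griddisks = {
    # group: (disks per cell, lun size in g)
    "data": (12, 733),
    "reco": (12, 1100),
    "dbfs": (10, 29),
}

raw_tb = {
    group: CELLS * disks * lun_g / 1000
    for group, (disks, lun_g) in griddisks.items()
}

for group, tb in raw_tb.items():
    print(f"{group} - {tb:.3f} tb")
```
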

Remember this is the raw disk available, and I am running in normal redundancy (mirrored). If you are running high redundancy (triple mirrored), keep this in mind.

Now this gave me a starting point, so I took a look at what ASM is showing for disk usage to try to see what is going on.


There are 3 values that I am trying to figure out.

Disk group      SIZE      USED      USABLE FREE
data            61.572    32.692    10.042
reco            92.4       3.003    38.082
dbfs             2.03      2.018     -.135

Now these numbers don't seem to add up. Only the size seems to match what I was expecting.

These are the things I started wondering about:
  • How can I be using 33 tb out of 62 tb raw when I am mirrored (unless it is the total raw used)?
  • How can my usable free be 10 tb if I am using 1/2 of the raw disk?
  • How can my usable free be negative?
Well, in looking at the numbers further, I was able to answer the first question. The 32.692 tb is the raw space used, so to state it again in terms of actual (mirrored) usage:

Disk group      mirrored_used
data            16.346
reco             1.502
dbfs             1.009

OK, this makes a little more sense.  Looking at this, the following must also be true (raw left = SIZE - USED):

Disk group      raw left
data            28.88
reco            89.397
dbfs              .019
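
Both of these follow directly from the ASM SIZE and USED columns. A small sketch, with names of my own choosing: halve USED to get the mirrored usage, and subtract USED from SIZE to get the raw space left. With the rounded dbfs figures from the ASM table, the difference comes out nearer .012 than .019, so the last digits there are sensitive to rounding.

```python
# ASM figures in tb (raw), taken from the SIZE and USED columns.
asm = {
    # group: (SIZE, USED)
    "data": (61.572, 32.692),
    "reco": (92.4, 3.003),
    "dbfs": (2.03, 2.018),
}

MIRROR_COPIES = 2  # normal redundancy keeps two copies of the data

summary = {}
for group, (size, used) in asm.items():
    mirrored_used = used / MIRROR_COPIES   # space the data itself takes
    raw_left = size - used                 # raw space not yet allocated
    summary[group] = (mirrored_used, raw_left)
    print(f"{group:5} mirrored_used={mirrored_used:7.3f}  raw_left={raw_left:7.3f}")
```
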


OK, first number solved. Now let's look at the next number. The usable free must be the amount of mirrored storage available (rather than raw), so if I go back to the usable free and convert it back to raw (x2 for mirrored) I get:

Disk group      Usable free     Raw usable
data            10.042          20.084
reco            38.082          76.164
dbfs             -.135           -.270

OK, I'm getting close, but why the discrepancy, and why the negative number? Let's look at the difference.

Disk group      Raw left     Raw usable     Missing raw storage
data            28.88        20.084          8.8
reco            89.397       76.164         13.233
dbfs              .019        -.270           .29
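
For anyone who wants to redo the subtraction, here is a minimal sketch. It takes the raw left and usable free values as given, doubles the usable free to get back to raw, and reports the gap; the dictionary and variable names are just for illustration.

```python
# Raw space left (SIZE - USED) and USABLE FREE as reported by ASM, in tb.
figures = {
    # group: (raw left, usable free)
    "data": (28.88, 10.042),
    "reco": (89.397, 38.082),
    "dbfs": (0.019, -0.135),
}

missing = {}
for group, (raw_left, usable_free) in figures.items():
    raw_usable = usable_free * 2            # mirrored -> raw
    missing[group] = raw_left - raw_usable  # raw space ASM is holding back
    print(f"{group:5} raw_usable={raw_usable:7.3f}  missing={missing[group]:7.3f}")
```
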

Now let's take a closer look at the numbers, and at what it means to be negative.

TIP The usable free space specifies the amount of space that can be safely used for data. A value above zero means that redundancy can be properly restored after a disk failure.

So we need to reserve some space to absorb a disk loss. Hmm, in this case, it means being able to lose a whole storage cell and still re-mirror its data on the other cells. So let's take that calculation and see what happens:
lun size (tb) * disks per cell

Disk group      Calculation      Storage cell usage
data            (.733 x 12)       8.8
reco            (1.1 x 12)       13.2
dbfs            (.029 x 10)        .29

Well, there is the missing space, and I have answers to all my questions.
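
To check this idea against ASM, here is a sketch that works from the exact griddisk sizes in the listing instead of the rounded ones. It assumes the reserve is exactly one cell's worth of griddisks and that usable free is (size - used - reserve) / 2; the function and variable names are mine.

```python
CELLS = 7
MIRROR_COPIES = 2  # normal redundancy

# group: (disks per cell, griddisk size in g, ASM USED in tb)
groups = {
    "data": (12, 733.0, 32.692),
    "reco": (12, 1099.546875, 3.003),
    "dbfs": (10, 29.109375, 2.018),
}

def usable_free_tb(disks, lun_g, used_tb, cells=CELLS):
    """Usable free space in tb, holding back one cell's worth of raw space."""
    one_cell_tb = disks * lun_g / 1000   # space needed to re-mirror a lost cell
    size_tb = cells * one_cell_tb        # raw size of the disk group
    return (size_tb - used_tb - one_cell_tb) / MIRROR_COPIES

results = {group: usable_free_tb(*values) for group, values in groups.items()}
for group, tb in results.items():
    print(f"{group:5} usable free ~ {tb:.3f} tb")
```

Under those assumptions the results land within about a gigabyte of the USABLE FREE column for all three disk groups, including the negative value for dbfs.
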

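To summarize:

1) How much space is there to use on a 1/2 rack with 2 tb SATA drives mirrored (normal redundancy)?
    ((29g * 10 disks)     * 6 cells +
    (1833g * 12 disks) * 6 cells) / 2

    66.858 tb mirrored

    (6 cells rather than 7, because one cell's worth of space is held in reserve; 1833g is the 733g data lun plus the 1100g reco lun.)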

2) What do the values USED and SIZE mean when I am looking at ASM?
SIZE is the raw space available across all cells, and USED is the amount of raw space allocated.

3) What does USABLE FREE show me?
This is the amount of space you can safely allocate to your data. Unlike the 2 values above, it is not measured in raw space; it is measured in usable (mirrored) space.
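
The calculation in point 1 can be written out as a short sketch. The constant names are mine; the sizes are the rounded ones used in the summary, with 1000g treated as 1 tb.

```python
# Usable (mirrored) space with one storage cell held back as reserve.
CELLS = 7
RESERVED_CELLS = 1       # room to re-mirror if a whole cell is lost
MIRROR_COPIES = 2        # normal redundancy

dbfs_g_per_cell = 29 * 10          # 10 disks with a 29g dbfs lun
data_reco_g_per_cell = 1833 * 12   # 12 disks with 733g data + 1100g reco

usable_tb = (
    (dbfs_g_per_cell + data_reco_g_per_cell)
    * (CELLS - RESERVED_CELLS)
    / MIRROR_COPIES
    / 1000
)
print(f"{usable_tb:.3f} tb mirrored")
```
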


If anyone sees anything wrong with my calculations, let me know.  They seem to add up, and explain all the numbers.





Here is some good information, and the display from the storage cell to confirm my sizes for what's available from the disks. My numbers match up.

http://blog.enkitec.com/wp-content/uploads/2011/02/Enkitec-Exadata-Storage-Layout11.pdf


CellCLI> list celldisk attributes name, devicePartition, size where diskType = 'HardDisk'
         CD_00_srcell1  /dev/sda3       1832.59375G
         CD_01_srcell1  /dev/sdb3       1832.59375G
         CD_02_srcell1  /dev/sdc        1861.703125G
         CD_03_srcell1  /dev/sdd        1861.703125G
         CD_04_srcell1  /dev/sde        1861.703125G
         CD_05_srcell1  /dev/sdf        1861.703125G
         CD_06_srcell1  /dev/sdg        1861.703125G
         CD_07_srcell1  /dev/sdh        1861.703125G
         CD_08_srcell1  /dev/sdi        1861.703125G
         CD_09_srcell1  /dev/sdj        1861.703125G
         CD_10_srcell1  /dev/sdk        1861.703125G
         CD_11_srcell1  /dev/sdl        1861.703125G
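
As a cross-check, a small sketch can add up the griddisk sizes carved out of each celldisk and compare them with the celldisk sizes in this listing. The names are made up for the example; the sizes are copied from the two CellCLI listings in this post.

```python
# Griddisk sizes (g) carved out of each celldisk, from the griddisk listing.
DATA_G, RECO_G, DBFS_G = 733.0, 1099.546875, 29.109375

# name: (celldisk size in g, total griddisk size allocated on it in g)
celldisks = {
    "CD_00, CD_01 (share the disk with the OS)": (1832.59375, DATA_G + RECO_G),
    "CD_02 .. CD_11": (1861.703125, DATA_G + RECO_G + DBFS_G),
}

leftover = {}
for name, (celldisk_g, allocated_g) in celldisks.items():
    leftover[name] = celldisk_g - allocated_g
    print(f"{name}: {allocated_g}g of {celldisk_g}g allocated, "
          f"{leftover[name]}g left over")

# The two celldisk sizes differ by exactly one dbfs griddisk.
os_gap_g = 1861.703125 - 1832.59375
print(f"gap between the two celldisk sizes: {os_gap_g}g")
```

Both kinds of celldisk come out almost fully allocated, and the gap between the two sizes is the same as the dbfs griddisk size, which fits with dbfs only living on the 10 disks that do not hold the OS.
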

2 comments:

  1. Nice demonstration !

    Thanks

    One question: you said "...restored after a disk failure..." but the demonstration shows that an entire cell is mirrored on the others, not the failed disk.

  2. Nice post. I just got an alert from OEM, and this post has given me a clear idea about the alert.