Pillar claiming their RAID5 is more reliable than RAID6? Wizardry or fiction?

I'm competing against Pillar at an account. One of the things they said is that their RAID5 is superior in reliability to RAID6. I wanted to put this in the public domain and, if true, invite Pillar engineers to comment here and explain how it works for all to see. If untrue, I again invite the Pillar engineers to comment and explain why it’s untrue.

The way I see it: very simply, RAID5 is N+1 protection, RAID6 is N+2. Mathematically, RAID5 is about 4,000 times more likely to lose data than a RAID6 group with the same number of data disks. Even RAID10 is about 160 times more likely to lose data than RAID6.

The only downside to RAID6 is performance. If you want the protection of RAID6 but with extremely high performance, look at NetApp: the RAID-DP that NetApp employs by default in many cases performs better than even RAID10. Oracle has several PB of DBs running on NetApp RAID-DP. Can’t be all that bad.

See here for some info…

D

2 Replies to “Pillar claiming their RAID5 is more reliable than RAID6? Wizardry or fiction?”

  1. @ former netapp:

    Indeed!

    In light of the recent reconstruct-time frenzy, it appears that some vendors claim to make up some of the headway that any Raid6 scheme offers.

    The factor mentioned by Dimitris (probability of data loss being more than 4,000 times lower with any Raid6 scheme) applies under the following assumptions:

    a) Parity overhead stays equal, i.e. 7+1 Raid4 or Raid5 vs. 14+2 Raid6

    b) Reconstruct Time for a failed disk stays constant (limited by the disk where the reconstruct is done)

    c) Latent defects occur at a rate significantly lower than 1 per 10^14 (otherwise, during the reconstruction of a 1 TB disk, you’ll hit a latent defect and have data loss; note further that silent corruption, a.k.a. lost writes, and of course the more prevalent neighbor-track magnetic influence, fall into this category).

    For Raid groups with the same number of data disks, i.e. 5, the difference is actually more like 10,000…
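    To put a rough number on assumption c), here is a minimal sketch of the chance of hitting at least one latent defect during a reconstruct. It rests on assumptions of mine that are not stated above: the rate is read as 1 defect per 10^14 bits read, defects are independent, a 1 TB disk holds 10^12 bytes, and rebuilding one disk of a 7+1 group means reading the 7 surviving disks in full. The function name is purely illustrative.

```python
import math

def p_latent_defect(bits_read: float, defect_rate: float = 1e-14) -> float:
    """Probability of at least one latent defect while reading `bits_read` bits.

    Assumes independent defects at `defect_rate` per bit read (an assumption).
    Computes 1 - (1 - rate)^bits in a numerically stable way.
    """
    return -math.expm1(bits_read * math.log1p(-defect_rate))

TB_BITS = 8e12  # one 1 TB disk (10^12 bytes) expressed in bits

# Rebuilding one disk in a 7+1 group: all 7 surviving disks are read in full.
p_rebuild = p_latent_defect(7 * TB_BITS)
print(f"chance of a latent defect during rebuild: {p_rebuild:.0%}")
```

    Under these assumptions the chance comes out at a bit over 40%, which is why a single-parity scheme that cannot tolerate a second problem during reconstruct looks fragile at this disk size.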

    Anyone interested in learning the maths behind this can read this public document:

    http://media.netapp.com/documents/tr-3574.pdf

    The math required is only high-school calculus. Also, the paper has proper citations for the actual formulas used, and lists the assumptions made.
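    For a feel of the kind of arithmetic involved, here is a rough sketch using the classic textbook MTTDL approximations for single and double parity. These are not the formulas from the paper: they ignore latent defects entirely, so they will not reproduce the 4,000 figure. The MTTF and rebuild-time values are hypothetical placeholders chosen only for illustration.

```python
def mttdl_single_parity(n_disks: int, mttf_h: float, mttr_h: float) -> float:
    """Textbook MTTDL approximation for a group that survives one disk failure."""
    return mttf_h**2 / (n_disks * (n_disks - 1) * mttr_h)

def mttdl_double_parity(n_disks: int, mttf_h: float, mttr_h: float) -> float:
    """Textbook MTTDL approximation for a group that survives two disk failures."""
    return mttf_h**3 / (n_disks * (n_disks - 1) * (n_disks - 2) * mttr_h**2)

MTTF_H = 500_000.0  # hypothetical disk MTTF in hours (assumption)
MTTR_H = 24.0       # hypothetical reconstruct time in hours (assumption)

# Equal parity overhead, as in assumption a): 7+1 vs. 14+2.
raid5 = mttdl_single_parity(8, MTTF_H, MTTR_H)
raid6 = mttdl_double_parity(16, MTTF_H, MTTR_H)
print(f"RAID6 / RAID5 MTTDL ratio: {raid6 / raid5:.0f}")
```

    With the same total number of disks n in both groups, the ratio in this simplified model collapses to MTTF / ((n - 2) * MTTR), which shows directly why the advantage shrinks only if reconstruct time drops by orders of magnitude.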

    Thus, the only way to keep that claim true would be to

    a) have disks showing significantly fewer latent defects than the rest of the storage community (availability? cost?)

    and

    b) reduce the time spent in reconstructions by extreme factors (not 2 or 3 times faster, but something like 1000+ times faster, well exceeding the interface speed of at least the disk where the reconstruction is to be done).

    or

    c) simply and silently ignore defective media blocks during reconstruct…

    Disclaimer: I have no knowledge of any vendor doing this explicitly. However, vendors seem to be paranoid to different degrees when it comes to making sure the magnetic domains keep their data correct.

    Also, there are limits (limits as in laws of physics) on what you can do if your protection scheme can only cover a single erasure and only detect a single error. These are two distinctly different things in information theory. The one thing often overlooked is that RAID is designed to deal with erasures (you know which disk failed), not errors (which bit has flipped?). Basically, Shannon’s theorem tells us that a single erased bit can be recovered with a single parity bit, while a single (information-theory) error bit requires at least 2 bits of redundant information to detect, localize and thereby correct it. Also note that lost writes are especially nasty, as the previous write will have left all the previous CRCs in a pristine state; a violated CRC would turn this case into an erasure, and a single parity could recover it.

    Neither a nor b is very likely, with the basic building blocks (disks of spinning rust) being quite optimized as they are…

    Unfortunately, basic math is not very well acknowledged these days…

    Example:

    a) erasure:
    D1 D2 D3 D4 P
    X 1 0 1 0

    (D1 is known to be defective; P is the xor sum of all the data. Therefore, D1 has to be 0)
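    The erasure case above can be sketched in a few lines. This is only an illustration of XOR parity on single bits; the function name is made up for the example.

```python
from functools import reduce
from operator import xor

def recover_erasure(stripe, missing):
    """Rebuild the block at index `missing` by XOR-ing all the other blocks.

    `stripe` holds D1..D4 followed by P; the value at `missing` is ignored.
    """
    survivors = (bit for i, bit in enumerate(stripe) if i != missing)
    return reduce(xor, survivors, 0)

stripe = [None, 1, 0, 1, 0]   # D1 is known to be lost; P is the last entry
d1 = recover_erasure(stripe, missing=0)
print("recovered D1 =", d1)
```

    Because the position of the failure is known, one parity bit is enough: the missing value is simply whatever makes the XOR sum come out right.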

    b) error:
    D1 D2 D3 D4 P
    1 1 0 1 0

    Now, a check of the parity against all the data (which is unlikely to happen if D1 returns data with a valid CRC) would show a mismatch between the parity and the data… What to do?
    Which disk contains the problematic block?
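    A small sketch of why single parity cannot answer that question: the check fails, but flipping any one of the five blocks would make it pass again, so every block is an equally plausible culprit. The helper names are illustrative only.

```python
from functools import reduce
from operator import xor

def parity_ok(stripe):
    """True if the XOR of all data bits and the parity bit is zero."""
    return reduce(xor, stripe) == 0

def single_flip_candidates(stripe):
    """Indices whose single-bit flip would make the stripe consistent again."""
    return [
        i for i in range(len(stripe))
        if parity_ok(stripe[:i] + [stripe[i] ^ 1] + stripe[i + 1:])
    ]

stripe = [1, 1, 0, 1, 0]      # D1..D4, P, with one silently wrong bit somewhere
consistent = parity_ok(stripe)
candidates = single_flip_candidates(stripe)
print("parity consistent:", consistent)
print("possible culprits:", candidates)
```

    The mismatch is detected, but the candidate list covers D1 through D4 and P alike, so a single parity bit gives detection without localization.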

    c) error with 2 Parity Disks
    D1 D2 D3 D4 P DP
    1 1 0 1 0 X
    X 0 X X X X
    X X 0 X X X
    X X X 0 X X
    X X X X 0 X
    X X X X X 0

    (For simplicity, this is an example of Raid-DP; Reed-Solomon, LDPC, Even-Odd, Star and all the other Raid-6 schemes work the same way in principle, but the math required is more complex)

    Here, a check of the parity again shows that one of the data blocks (or the parity block) is silently corrupt. Using the diagonal through D1 (I left out all the others, assuming those are correct), it becomes clear that D1 is the defective block, and you can fix it immediately (in this case, by flipping the bit).
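    A toy sketch of that localization step, using a simplified row-diagonal layout: 4 data disks plus a row-parity disk, 4 rows per stripe, and diagonals numbered (row + disk) mod 5 with one diagonal left unstored. This is my own illustration under those assumptions, not any vendor's implementation, and it only handles a single flipped bit in a data or row-parity block.

```python
from functools import reduce
from operator import xor

P = 5  # prime: 4 data disks + 1 row-parity disk, 4 rows per stripe

def diagonal_sums(grid):
    """XOR of each stored diagonal (0..P-2) over data and row-parity columns."""
    sums = [0] * (P - 1)
    for r, row in enumerate(grid):
        for c, bit in enumerate(row):
            d = (r + c) % P
            if d != P - 1:              # diagonal P-1 is never stored
                sums[d] ^= bit
    return sums

def encode(data):
    """Append row parity to each row and compute the diagonal parity."""
    grid = [row + [reduce(xor, row)] for row in data]
    return grid, diagonal_sums(grid)

def locate_single_error(grid, diag_parity):
    """Return (row, disk) of a single flipped bit, or None if all is consistent."""
    bad_rows = [r for r, row in enumerate(grid) if reduce(xor, row) != 0]
    bad_diags = [d for d, (s, p) in
                 enumerate(zip(diagonal_sums(grid), diag_parity)) if s != p]
    if not bad_rows and not bad_diags:
        return None
    if len(bad_rows) != 1 or len(bad_diags) > 1:
        raise ValueError("not a single error in a data or row-parity block")
    r = bad_rows[0]
    # No diagonal mismatch means the bad block sits on the unstored diagonal.
    d = bad_diags[0] if bad_diags else P - 1
    return r, (d - r) % P

data = [[0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 0, 1], [0, 0, 1, 1]]
grid, dp = encode(data)
grid[0][0] ^= 1                          # D1 in the first row silently flips to 1
row, disk = locate_single_error(grid, dp)
grid[row][disk] ^= 1                     # repair by flipping the bit back
print("corrupt block was at row", row, "disk", disk)
```

    The failing row says which row is affected, the failing diagonal says which disk within that row, and the intersection of the two is the block to repair.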

    Of course, some scheme that continuously checks the parity has to be in place, and for performance reasons, this is again an area of differentiation…
