In this post I will try to help you understand how to objectively calculate the cost of space-efficient storage solutions – there’s just too much misinformation out there and it’s getting irritating since certain vendors aren’t exactly honest with how they do certain calculations…

### A brief history lesson:

The faster a storage device, the smaller and more expensive it usually is. Flash was initially insanely expensive relative to spinning disk, so it was used in small amounts, typically as a tier and/or cache augmentation.

With every TB of flash costing so much, flash-based storage systems started implementing some of the more interesting space efficiency techniques around. Interesting because it’s algorithmically easy to reduce data dramatically, but hard to do under high load while maintaining impressive IOPS *and* low latency.

Space efficiencies plus lower flash media costs bring us to today, where all-flash storage is cost-effective at ever larger capacities.

### But how does one figure out the best deal?

There are some factors I won’t get into in this article. Company size and viability, support staff strength, maturity of the code, automation, overall features etc. all may play a huge role depending on the environment and requirements (and, indeed, will often eliminate several of the players from further consideration). However, I want to focus on the basics.

### Recommended metric: Cost per effective TB

It’s easy to get lost in the hype. One company says they reduce by 3:1, another might say 5:1, yet another 10:1, etc. The high efficiency ratios seem to be attractive, right?

Well – you’re not paying for a high efficiency ratio. What you *are* paying for is usable capacity.

If all solutions cost the same, the systems with high efficiency ratios would win this battle every day of the week and twice on Sundays.

However, solutions don’t all cost the same. Ask your vendor what the projected effective capacity will be for each specific configuration, and the Cost/Effective TB is a trivial calculation.

But there’s *one* more thing to do in order for the calculation to be correct:

### Insist on calculating the efficiency ratio yourself.

Most storage systems will show a nice picture in the GUI with an overall efficiency ratio. Looks nice and easy. Well – the devil is in the details.

If a vendor is upfront about how they measure efficiency, your numbers might make sense.

This is where you trust but verify. Some pointers:

- Take a note of the initial usable space *before* putting anything on the system.
- If you store a 1TB DB and do *nothing* else to the data, what’s the efficiency? **Calculate the ratio yourself!** Divide the amount of capacity the data is taking in the OS by the amount it’s taking on the storage.
- Does the number make sense given the size of the data you just put on the system and how much usable space is left now?
- If you take 10 snapshots of the data, what’s the efficiency? How about if you delete the snaps, does the efficiency change?
- If you take a clone of the DB, what’s the efficiency?
- If you delete the clone you just took, what’s the efficiency?
- Create a large LUN (10TB for example) and only store 1TB of data in it. What’s the efficiency? Do you count thin provisioning as data reduction?
- Does this all add up if you do the math manually instead of the GUI doing it for you?
- **Does it all meet your expectations?** For example, if a vendor is claiming 5:1 reduction, can you actually store 5 *different* DBs in the space of one? Or do they really mean something else? That’s a pretty easy test…
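The manual check in the pointers above can be sketched in a few lines of Python. This is only an illustration: the function name `measured_efficiency` and the capacity readings are made up, and it assumes you note the array's free usable space before and after loading the data.

```python
def measured_efficiency(logical_tb: float, free_before_tb: float, free_after_tb: float) -> float:
    """Data size as seen by the OS divided by the space it consumed on the array."""
    consumed_tb = free_before_tb - free_after_tb
    if consumed_tb <= 0:
        raise ValueError("array reports no space consumed; re-check the readings")
    return logical_tb / consumed_tb


# Hypothetical readings: 20 TB usable when empty, 19.5 TB left after loading a 1 TB DB.
ratio = measured_efficiency(logical_tb=1.0, free_before_tb=20.0, free_after_tb=19.5)
print(f"Measured efficiency: {ratio:.1f}:1")
```

Repeat the same measurement after each step (snapshots, clones, deleting them again) and compare the result with what the GUI reports.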

You see, most vendors count savings a bit differently. In the examples above, that 1TB DB, if stored in a 10TB LUN, and cloned 10 times, will probably result in a very high efficiency number. It doesn’t mean however that 10 *different* DBs of the same size would have nearly the same efficiency ratio.
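To make that concrete, here is a rough sketch of how the same physical consumption can be reported as three very different ratios depending on what gets counted. All figures are assumptions for illustration (in particular the 0.5TB of physical space, i.e. a 2:1 real reduction, and clones that share every block with the original); no specific vendor's accounting is implied.

```python
db_tb = 1.0        # data actually written, as seen by the OS
lun_tb = 10.0      # provisioned size of the thin LUN
clones = 10        # clones, assumed to share all blocks with the original
physical_tb = 0.5  # assumed space consumed on the array (2:1 real reduction)

# Only the data that was really written counts.
reduction_only = db_tb / physical_tb

# Every clone is counted as if it were a full extra copy of the data.
with_clones = db_tb * (1 + clones) / physical_tb

# Clones counted, and unwritten thin-provisioned space counted as "saved" too.
with_clones_and_thin = lun_tb * (1 + clones) / physical_tb

# Space that 10 *different* DBs would need at the real reduction ratio.
ten_different_dbs_tb = 10 * db_tb / reduction_only

print(f"Data reduction only:        {reduction_only:.0f}:1")
print(f"Counting clones:            {with_clones:.0f}:1")
print(f"Counting clones + thin LUN: {with_clones_and_thin:.0f}:1")
print(f"10 different DBs would need {ten_different_dbs_tb:.1f} TB")
```

Under these assumptions the headline number can be two orders of magnitude higher than the ratio that determines how many different DBs actually fit.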

If you don’t have time to do a test in-house, have the vendor prove their claims and show how they do their math in their labs while you watch. You will typically find that each data type has a wildly different space efficiency ratio.

### The bottom line

It’s pretty easy. Figure out the efficiency ratio on your own *based on how you expect to use the system*, then plug that ratio into the Price/Effective TB formula like so:

Real Cost per TB = Price / (Usable TB × *Real* Efficiency Ratio as a multiplier)
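As a worked example of the formula, the sketch below compares two made-up quotes using both the claimed and the self-measured ratio. The vendor names, prices, capacities and ratios are all hypothetical; plug in your own figures.

```python
def cost_per_effective_tb(price: float, usable_tb: float, efficiency_ratio: float) -> float:
    """Price / (Usable TB x efficiency ratio used as a multiplier)."""
    return price / (usable_tb * efficiency_ratio)


# Hypothetical quotes: "claimed" is the vendor's ratio, "measured" is your own.
quotes = {
    "Vendor A": {"price": 200_000, "usable_tb": 20, "claimed": 5.0, "measured": 2.0},
    "Vendor B": {"price": 150_000, "usable_tb": 25, "claimed": 3.0, "measured": 2.5},
}

results = {}
for name, q in quotes.items():
    claimed = cost_per_effective_tb(q["price"], q["usable_tb"], q["claimed"])
    real = cost_per_effective_tb(q["price"], q["usable_tb"], q["measured"])
    results[name] = (claimed, real)
    print(f"{name}: {claimed:,.0f} per TB on paper, {real:,.0f} per TB measured")
```

In this invented case both quotes look identical on paper, yet the one with the higher claimed ratio costs roughly twice as much per TB once the measured ratio is used.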

### And, finally, a word on capacity guarantees:

Some vendors will guarantee capacity efficiencies. Always, *always* demand to see the fine print. If a vendor insists they will *guarantee* x:1 efficiency, have them sign an official *legally binding agreement* that has the backing of the vendor’s HQ (and isn’t some desperate local sales office ploy that might not be worth the paper it’s printed on).

Insist the guarantee states you will get that claimed efficiency *no matter what you’re storing on the box.*

Notice how quickly the small print will come 🙂

D


Hi Dimitris,

Couldn’t agree more with you on this topic.

Why and how can Vendor X claim 10:1, Vendor Y 5:1, Vendor Z 3:1, and so on and so on…?

Like any cake, the proof is in the eating….

The ingredients and assumptions and caveats behind many of the reduction claims are pretty humorous if not downright laughable once you pull them apart – and focus on the bottom line: How much of MY actual data can I store out of what I am paying for??

To all our beloved customers out there – educate yourself and look into the devil’s eyes !!

cheers,

Paul H.

[HP Storage APJ]