This is going to be a short post, to atone for my past sins of overly long posts but mostly because I want to eat dinner.
On storage systems with spinning disks, a favorite method for getting more performance is short-stroking the disk.
It’s a weird term but firmly based on science. Some storage vendors even made a big deal about being able to place data on certain parts of the disk, geometrically speaking.
Consider the relationship between angular and linear velocity first:
Assuming something round that rotates at a constant speed (say, 15 thousand revolutions per minute), the angular speed is constant.
The linear speed, on the other hand, increases the further you get away from the center of rotation.
This means the part furthest from the center of rotation has the highest linear speed.
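A quick back-of-the-envelope calculation makes this concrete. The platter dimensions below are my own rough assumptions for a 3.5-inch, 15K RPM drive, not measurements from any particular model:

```python
import math

RPM = 15_000                      # constant angular speed of the spindle
omega = RPM / 60 * 2 * math.pi    # angular speed in radians per second

# Assumed platter geometry (approximate, for illustration only).
inner_radius_m = 0.012   # innermost data track, ~1.2 cm from the spindle
outer_radius_m = 0.046   # outermost data track, ~4.6 cm from the spindle

# Linear speed v = omega * r: same angular speed, very different linear speed.
v_inner = omega * inner_radius_m
v_outer = omega * outer_radius_m

print(f"inner track: {v_inner:.0f} m/s")   # ~19 m/s
print(f"outer track: {v_outer:.0f} m/s")   # ~72 m/s
```

So under these assumed dimensions the outer edge moves nearly four times faster than the inner tracks, even though the whole platter spins at the same RPM.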
Now imagine a hard disk, and let’s say you want to measure its performance for some sort of random workload demanding low latency.
What if you could put all the data of your benchmark at the very outer edge of the disk?
You would get several benefits:
- The data would enjoy the highest linear speed,
- The tracks at the outer edge hold more sectors per revolution, so more data passes under the head per rotation, further increasing throughput, plus
- The disk heads would only have to move a small amount to reach all the data (the short-stroking part). This greatly reduces seek times, which is highly beneficial for most workloads.
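The seek-time benefit is easy to see with a toy simulation (my own sketch, not any vendor's model): it compares the average head travel between random requests spread across the whole surface versus requests confined to the outer 10% of the track range, with track positions normalized to [0, 1].

```python
import random

random.seed(42)

def avg_seek_distance(lo: float, hi: float, n: int = 100_000) -> float:
    """Average distance the head travels between consecutive random
    requests placed uniformly in the track range [lo, hi]."""
    pos = random.uniform(lo, hi)
    total = 0.0
    for _ in range(n):
        nxt = random.uniform(lo, hi)
        total += abs(nxt - pos)
        pos = nxt
    return total / n

full = avg_seek_distance(0.0, 1.0)    # whole surface: ~1/3 of full stroke
short = avg_seek_distance(0.9, 1.0)   # outer 10% only: ~1/30 of full stroke

print(f"full surface:  {full:.3f}")
print(f"short-stroked: {short:.4f}")
```

In this simplified model, confining the workload to a tenth of the tracks cuts the average head travel by a factor of ten. Real seek times don't scale perfectly linearly with distance (head settling dominates short seeks), but the direction of the effect is the same.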
I whipped this up to explain it pictorially:
Using a lot of data in a benchmark would not be enough to avoid short-stroking. One would also need to ensure that the access pattern touches the entire disk surface.
This is why NetApp always randomizes data placement for the write-intensive SPC-1 benchmark, to ensure we are not accused of short-stroking, no matter how much data the benchmark uses.
Hope this clears it up.
If you are participating in any vendor proof-of-concept involving performance, I advise you to consider the implications of short-stroking. Caching, too, but I feel that has been tackled sufficiently.