Filesystem and OS benchmark extravaganza: Software makes a huge difference

I’ve published benchmarks of various OSes and filesystems in the past, but this time I thought I’d try a slightly different approach.

For the regular readers of my articles, I think there is something in here for you. The executive summary is this:

Software is what makes hardware sing. 

All too frequently I see people looking at various systems and focus on what CPU each box has, how many GHz, cores etc, how much memory. They will try to compare systems by putting the specs on a spreadsheet.

OK – let’s see how well that approach works.

For this article I used the old postmark benchmark. You can find the source and various executables here. I’ve been using this version for many years.

It is important to note that this is not a modern benchmark any more. It creates highly compressible data (which makes it bad for anything that does compression) and isn’t properly multithreaded. But it will serve to illustrate the purpose of the article.

The way I always run it:

  • set directories 5
  • set number 10000
  • set transactions 20000
  • set read 4096
  • set write 4096
  • set buffering false
  • set size 500 100000
  • run

This way it will create 10,000 files, and do 20,000 things with them, all in 5 directories, with a 4K read and write, no buffered I/O, and the files will range in size from 500 bytes to 100,000 bytes in size. It will also ask the OS to bypass the buffer cache. Then it will delete everything.

It’s fairly easy to interpret since the workload is pre-determined, and a low total time to complete the workload is indicative of good performance, instead of doing an IOPS/latency measurement.

This is very much a transactional/metadata-heavy way to run the benchmark, and not a throughput one (as you will see from the low MB/s figures). Overall, this is an old benchmark, not multithreaded, and really not the best way to test modern stuff any more, but, again, for the purposes of this experiment it’s more than enough to illustrate the point of the article.

It’s very important to realize one thing before looking at the numbers: This article was NOT written to show which OS or filesystem is overall faster. Merely to make the point that an OS and filesystem can potentially massively impact I/O performance given the same exact base hardware and the same application driving the same workload.

More of a reductio ad absurdum experiment, if you like.


OSes tested (all 64-bit, latest patches as of the time of writing):

  • Windows Server 2012
  • Ubuntu 12.10 (in Linux Mint guise)
  • Fedora 18
  • Windows 8
  • Windows 7

For Linux, I compiled postmark with the -O3 directive, the deadline scheduler was used, I disabled filesystem barriers and used the following sysctl tweaks:

  • vm.vfs_cache_pressure=50
  • vm.swappiness=20
  • vm.dirty_background_ratio=10
  • vm.dirty_ratio=30

Filesystems tested:

  • NTFS
  • XFS
  • EXT4

The hardware is unimportant, the point is it was the exact same for all tests and that there was no CPU or memory starvation… There was no SSD involved, BTW, even though some of the results may make it seem like there was.

The results

First, transactions per second. That doesn’t include certain operations. Higher is better.

Postmark tps

Second, total time. This includes all transactions, file creation and deletion:

Postmark time

Finally, the full table that includes the mount parameters and other details:

Postmark table

Some interesting discoveries…

…At least in the context of this test, the same findings may or may not apply with other workloads and hardware configurations:

  • Notice how incredibly long Windows 8 took with its default configuration. I noticed the readyboost file being modified while the benchmark was running, which must be a serious bug, since readyboost isn’t normally used for ephemeral stuff (windows 7 didn’t touch that file while the benchmark was running). If you’re running Windows 8 you probably want to consider disabling the superfetch service… (I believe it’s disabled by default if Windows detects an SSD).
  • BTRFS performance varies too much between Linux kernels (to be expected from a still experimental filesystem). With The Ubuntu 12.10 kernel, it looks like it ignored the directive to not use buffer cache – I didn’t see the drive light blink throughout the benchmark 🙂 Kinda suspect behavior, especially when compared to the Fedora BTRFS results, which were closer to what I was expecting.
  • Filesystem compression works and improves performance dramatically for extremely compressible data (so, beware – this isn’t normal).
  • BTRFS does use CPU much more than the other filesystems, with the box I was using it was OK but it’s probably not a good choice for something like an ancient laptop. But I think overall it shows great promise, given the features it can deliver.
  • The Linux NTFS driver did pretty well compared to Windows itself 🙂
  • Windows Server 2012 performed well, similar to Ubuntu Linux with ext4
  • XFS did better than ext4 overall

What did we learn?

We learned that the choice of OS and filesystem is enough to create a huge performance difference given the exact same workload generated by the exact same application compiled from the exact same source code.

When looking at purchasing something, looking at the base hardware specs is not enough.

Don’t focus too much on the underlying hardware unless the hardware you are looking to purchase is running the exact same OS (for example, you are comparing laptops running the same rev of Windows, or phones running the same rev of Android, or enterprise storage running the same OS, etc). You can apply the same logic to some extent to almost anything – for example, car engines. Bigger can mean more powerful but it depends.

And, ultimately, what really matters isn’t the hardware or even the OS it’s running. It’s how fast it will run your stuff in a reliable fashion with the features that will make you more productive.


Technorati Tags: , , , , , , , ,