Sunday, February 12, 2017

mkfile(8) is severely syscall limited on OS X

When I got my brand-new MacBook Pro (late 2016), I was interested in testing out the phenomenal SSD performance I had been reading about, reportedly around 2GB/s. Unix geek that I am, I opened a Terminal window and tapped out the following:
mkfile 8G /tmp/testfile

To my great surprise and consternation, both the time command and an iostat 1 running in another window showed a measly 250MB/s throughput. That's not much faster than a spinning rust disk, and certainly much, much slower than previous SSDs, never mind the speed demon that the MBP 2016's SSD is supposed to be.

So what was going on? Were the other reports false? At first, my suspicions fell on FileVault, which I was now using for the first time. It didn't make any sense, because what I had heard was that FileVault had only a minimal performance impact, whereas this was roughly a factor of 8 slowdown.

Alas, turning FileVault off and waiting for the disk to be decrypted did not change anything. Still 250MB/s. Had I purchased a lemon? Could I return the machine because the SSD didn't perform as well as advertised? Except, of course, that the speed wasn't actually advertised anywhere.

It never occurred to me that the problem could be with mkfile(8). Of course, that's exactly where the problem was. If you check the mkfile source code, you will see that it writes to disk in 512-byte chunks. That doesn't actually affect the I/O path, which will coalesce those writes. However, you are spending one syscall per 512 bytes, and that turns out to be the limiting factor. Upping the buffer size increases throughput until we hit 2GB/s at a 512KB buffer size. After that, throughput stays flat.
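To make the effect easy to reproduce, here is a minimal sketch of the experiment in Python rather than in mkfile's C. It is not the mkfile source: write_file is a hypothetical helper, the buffer sizes and the 32MB total are arbitrary picks, and it does no fsync, so on a small file you are mostly timing the page cache. Python also adds its own per-call overhead on top of the syscall, so expect the shape of the curve rather than the exact numbers.

```python
import os
import tempfile
import time


def write_file(path, total_bytes, buf_size):
    """Write total_bytes of zeros to path, one os.write() call per chunk.

    Hypothetical helper for illustration. Returns (write_calls, seconds).
    """
    buf = bytes(buf_size)  # zero-filled buffer
    calls = 0
    remaining = total_bytes
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        while remaining > 0:
            # The last chunk may be shorter than the buffer.
            chunk = buf if remaining >= buf_size else buf[:remaining]
            # os.write may write fewer bytes than asked; count what it reports.
            remaining -= os.write(fd, chunk)
            calls += 1
    finally:
        os.close(fd)
    return calls, time.perf_counter() - start


if __name__ == "__main__":
    total = 32 * 1024 * 1024  # assumption: small enough for a quick run
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "testfile")
        for buf_size in (512, 4096, 65536, 512 * 1024):
            calls, elapsed = write_file(path, total, buf_size)
            mb_per_s = total / 1e6 / elapsed if elapsed > 0 else float("inf")
            print(f"{buf_size:>7} B buffer: {calls:>6} writes, {mb_per_s:8.1f} MB/s")
```

For anything resembling a real measurement, raise the total well past the amount of RAM the system will happily use for caching, the way the 8G in the original command does.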

Figure: mkfile SSD throughput. The X-axis is buffer size in KB. The original 512-byte size isn't on there because it would be 0.5KB, or the entire axis would need to be in bytes, which would also be awkward at the larger sizes. Also note that the X-axis is logarithmic.

Radar filed: 30482489. I did not check on other operating systems, but my guess is that the results would be similar.

UPDATE: In the HN discussion, a number of people interpreted this as saying that syscall speed is slow on OS X. AFAIK that is no longer the case, and in any case it is not the point. The point is that the hardware has changed so dramatically that even seemingly extremely safe and uncontroversial assumptions no longer hold. Heck, 250MB/s would be perfectly fine if we still had spinning rust, but SSDs in general, and particularly the scorchingly fast ones Apple has put in these laptops, have changed the equation so that something that used to not factor in at all, such as syscall performance, can now be the deciding factor.
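A quick back-of-the-envelope calculation shows why the per-call cost matters at these speeds. This is only a sketch: per_call_budget is a hypothetical helper, it assumes MB and GB mean powers of ten, and it charges all of the elapsed time to the write calls, so the result is an upper bound on what each call can cost.

```python
def per_call_budget(throughput_bytes_per_s, buf_size):
    """Return (write calls per second, microseconds available per call).

    Hypothetical helper: assumes every byte goes through a write() of
    exactly buf_size bytes and that nothing else takes any time.
    """
    calls_per_s = throughput_bytes_per_s / buf_size
    return calls_per_s, 1e6 / calls_per_s


if __name__ == "__main__":
    # The observed 250MB/s with mkfile's 512-byte buffer.
    calls, usec = per_call_budget(250e6, 512)
    print(f"512 B buffer at 250 MB/s: {calls:,.0f} calls/s, {usec:.2f} us per call")

    # The 2GB/s plateau with a 512KB buffer.
    calls, usec = per_call_budget(2e9, 512 * 1024)
    print(f"512 KB buffer at 2 GB/s: {calls:,.0f} calls/s, {usec:.0f} us per call")
```

Under those assumptions, the 512-byte case works out to roughly half a million calls per second, or about 2 microseconds each. That is not slow for a syscall; it is just that the disk can now absorb data faster than 512 bytes every 2 microseconds.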

In the I/O tests I did for my iOS/macOS performance book (see sidebar), I found that CPU nowadays generally dominates over actual hardware I/O performance. This was a case where I just wouldn't have expected it, because the CPU should be doing virtually nothing, and it took me over a day to find the culprit. But once again, assumptions just got trampled by hardware advancements.

So check your assumptions.