Tuesday, September 9, 2014

No Virginia, Swift is not 10x faster than Objective-C

About a month ago, Jesse Squires published a post titled Apples to Apples, documenting benchmark results that he claims show Swift now with a roughly 10x performance advantage over Objective-C. Although completely bogus, the post was retweeted by Chris Lattner (who should know better, and was supposedly mostly interested in highlighting the improvements in the Swift optimizer, rather than the bogus comparison) and has now been referenced a number of times as background knowledge as to the state of Swift. More importantly, though the actual mistake Jesse makes is pretty basic and not that interesting, it does point to some deeper misunderstandings about performance and language that I at least do find interesting.

So what's the mistake? Ironically, given the post's title, it is that he is comparing apples to oranges, so to speak. The following table, which shows the time in milliseconds to sort an array of 10000 numbers 10 times, illustrates the problem:

[Table: sort times in milliseconds; columns: NSNumber, native integer]
Jesse compared the two versions highlighted, that is, native Swift integers against Objective-C NSNumber object wrappers. All times are for binaries with optimization enabled; the machine was a 13" MBR with a 2.9 GHz Intel Core i7 and 8GB of RAM. The integer sort was done using a C integer array and the system qsort() function. When you compare apples to apples, Objective-C has a roughly 2x edge with NSNumbers and is around 18% slower for native integers, at least when using qsort().

Why the 18% disadvantage? The qsort() function is made generically applicable to different types of arrays using a function pointer parameter for the comparison function, which in turn is parametrized using pointers to the elements to be compared. This means there is a per-comparison overhead of one function call and two pointer dereferences. That overhead overwhelms the actual comparison operation, which is a single machine instruction on most processors.
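To make that overhead concrete, here is a minimal sketch of an integer sort built on qsort(). The names compare_ints and sort_ints_with_qsort are made up for illustration; this is not the benchmark code from either post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback in the shape qsort() requires: it receives untyped
   pointers to two elements and has to dereference both before comparing.
   (compare_ints is an illustrative name, not taken from the benchmark.) */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;   /* first pointer dereference */
    int y = *(const int *)b;   /* second pointer dereference */
    return (x > y) - (x < y);  /* the comparison itself; avoids overflow of x - y */
}

/* Sorts `count` ints in place. qsort() reaches compare_ints through a
   function pointer, once for every comparison it performs. */
void sort_ints_with_qsort(int *values, size_t count)
{
    assert(values != NULL);
    qsort(values, count, sizeof values[0], compare_ints);
}
```

The only part of compare_ints that does real work is the last line; the call and the two casts-plus-loads around it are the generic plumbing the paragraph above describes.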

Swift, on the other hand, appears to produce a version of the sort function that is specialized to the integer type, with the comparison function inlined into the generated function so there is no function call or pointer dereference overhead. That is very clever and a Good Thing™ for performance. Sort of. The drawback is that this breaks separate compilation, because the functions actually have to be combined during the compile/link process every time the sort is used (I assume there is caching going on so we only get one per type combination).
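A rough C analogue of that kind of specialization, written by hand rather than by a compiler, is a macro that stamps out one sort function per element type. This is only a sketch: DEFINE_SPECIALIZED_SORT and the generated function names are hypothetical, and insertion sort is used purely to keep the example short, not because it is what Swift generates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: expands to a complete insertion sort for one element
   type. The `>` comparison is compiled straight into the inner loop, so
   there is no function pointer call and no void* dereference per
   comparison. Every use of the macro adds another full copy of the code. */
#define DEFINE_SPECIALIZED_SORT(NAME, TYPE)                  \
    void NAME(TYPE *values, size_t count)                    \
    {                                                        \
        assert(values != NULL);                              \
        for (size_t i = 1; i < count; i++) {                 \
            TYPE key = values[i];                            \
            size_t j = i;                                    \
            while (j > 0 && values[j - 1] > key) {           \
                values[j] = values[j - 1];                   \
                j--;                                         \
            }                                                \
            values[j] = key;                                 \
        }                                                    \
    }

/* One generated function per type combination we care about. */
DEFINE_SPECIALIZED_SORT(sort_int_array, int)
DEFINE_SPECIALIZED_SORT(sort_double_array, double)
```

Note that the two instantiations share no machine code, which is the executable-size side of the tradeoff.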

Apart from making the compiler/linker slower, possibly significantly so (like C++ headers, though I presume they use LLVM bitcode to optimize the process), it also likely bloats the executable, causing cache and memory pressure. So it's a tradeoff, as usual, and while I think having the ability to specialize at compile-time is good, not being able to control it is not.

Objective-C doesn't have this ability to automagically adapt a function or method to its parameters; if you want inlining, the relationship has to be known at the point of definition, not at the point of use. However, if the benefit of inlining is only 21% for the most primitive type, a machine integer, then it is clear that the set of types for which compile-time specialization is beneficial at all is small.

Cocoa of course already provides specialized collection classes for the byte and unichar types, NSData and NSString respectively. I never quite understood why this wasn't extended to the other primitive types, particularly integer and float/double. On the other hand, the omission never bothered me much: I just implemented those classes myself in MPWFoundation. MPWRealArray even has support for Display PostScript binary object sequences, it's that old!

Both MPWRealArray and the corresponding MPWIntArray classes are small and fairly trivial to implement, and once I have them, using a specialized integer or real array is at least as convenient as using an NSArray, just a lot faster. They could also be quite a bit smaller than they are, sharing code either via subclassing or poor-man's generic programming via include files. Once I have a nice OO interface, I can swap out the implementation for something really quick like a dual-pivot integer sort I found in Java-land and adapted to C. (It is surprising just how similar they are at that level). With that sort, the test time drops to 0.56 ms, so 42% faster than the Swift version and almost twice as fast as the system qsort() function.
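As a sketch of the idea in plain C (not the actual MPWIntArray interface, and not the adapted Java sort mentioned above), a small integer array type can hide its sort behind one function, so the implementation can be swapped without touching callers. IntArray, int_array_add and int_array_sort are hypothetical names, and the sort is a simplified dual-pivot quicksort without the small-range cutoffs a tuned version would have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of native ints. */
typedef struct {
    int *values;
    size_t count;
    size_t capacity;
} IntArray;

/* Appends one value, growing the buffer as needed. Returns 1 on success,
   0 if memory could not be allocated. */
int int_array_add(IntArray *a, int value)
{
    assert(a != NULL);
    if (a->count == a->capacity) {
        size_t new_capacity = a->capacity ? a->capacity * 2 : 16;
        int *grown = realloc(a->values, new_capacity * sizeof *grown);
        if (grown == NULL) return 0;
        a->values = grown;
        a->capacity = new_capacity;
    }
    a->values[a->count++] = value;
    return 1;
}

static void swap_ints(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Simplified dual-pivot quicksort on the inclusive range [lo, hi]. */
static void dual_pivot_sort(int *v, long lo, long hi)
{
    if (lo >= hi) return;
    if (v[lo] > v[hi]) swap_ints(&v[lo], &v[hi]);
    int p = v[lo], q = v[hi];  /* two pivots, p <= q */
    long lt = lo + 1;          /* v[lo+1 .. lt-1] <  p */
    long gt = hi - 1;          /* v[gt+1 .. hi-1] >  q */
    long i = lo + 1;           /* v[lt .. i-1] is between p and q */
    while (i <= gt) {
        if (v[i] < p) {
            swap_ints(&v[i], &v[lt]);
            lt++;
            i++;
        } else if (v[i] > q) {
            swap_ints(&v[i], &v[gt]);
            gt--;              /* v[i] is new and still unexamined */
        } else {
            i++;
        }
    }
    lt--;
    gt++;
    swap_ints(&v[lo], &v[lt]); /* move pivots to their final positions */
    swap_ints(&v[hi], &v[gt]);
    dual_pivot_sort(v, lo, lt - 1);
    dual_pivot_sort(v, lt + 1, gt - 1);
    dual_pivot_sort(v, gt + 1, hi);
}

/* Public sort entry point; callers never see which algorithm is used. */
void int_array_sort(IntArray *a)
{
    assert(a != NULL);
    if (a->count > 1) dual_pivot_sort(a->values, 0, (long)a->count - 1);
}
```

Replacing the body of int_array_sort with a call to qsort() or any other routine changes nothing for code that uses the array, which is the point of having the interface first.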

So the takeaway is that if you are using NSNumber objects in performance-sensitive code: stop. This is always a mistake. The native number types for Objective-C are int, float, double and friends, not NSNumber. After all, how do you perform arithmetic? Either directly on a primitive or by unboxing the NSNumber and then performing arithmetic on the primitive and then reboxing. Use primitive scalar types as much as possible where they make sense.

A second takeaway is that the question "which language is faster" doesn't really make sense; a more relevant and interesting question is "does this language make it hard/possible/easy to write fast code". Objective-C lets you write really fast code, if you want to, because it has the low-level chops and an understandable performance model. Swift so far can achieve reasonable performance at times and ludicrously bad performance at other times (especially with the optimizer turned off, which hardly fazes Objective-C), with as far as I can tell fairly little predictability or control. Having 10% faster (or slower) performance for code I don't particularly care about is not worth nearly as much as the knowledge that I can get the 1-5% of code that I do care about in shape no matter what. Swift is definitely not there yet, and given the direction it is taking I am not sure whether it will allow that kind of control, at least in comprehensible ways.

A third point is something more general about language. The whole argument that NSNumber and NSArray are "built in" somehow and int is not displays a lack of understanding of Objective-C that to me seems staggering. Even more so, the whole idea that you must only use what comes provided with Cocoa and are not allowed to build your own flies in the face of modern language design, throwing us back to the times of BASIC (Arthur Luehrmann, in the comments):

I had added graphics primitives to Dartmouth Basic around 1976 and developed an X-Y pen-plotter to carry out graphics commands mixed in with the text being sent to Teletype terminals.
The idea is that a language is a bundle of features, or to put it linguistically, a language is a list of words to be used as is.

Both C and Pascal introduced me to a new notion: that languages are not lists of words, but means of constructing your own words. For example, C did/does not have I/O as a language feature. I/O was just another set of functions placed in a library that you included just like any of your own functions/libraries. And there were two sets of them, the stdio package and the raw Unix I/O.

At around the same time I was introduced to both top-down and bottom-up programming. Both assume there is a recursive decomposition of the problem at hand (assuming the problem is sufficiently complex to warrant it).

In bottom-up programming, you build up the vocabulary (the procedures and functions) that is necessary to succinctly describe your top-level problem, and then you describe your program in terms of that vocabulary you created. In top-down programming, you start at the other end and write your top-level program in terms of the vocabulary you wish you had to optimally describe the problem. Then you fill in the blanks.

In both, you define your own language to fit the problem, then you solve the problem using the language you defined. You would not add plotting commands to the language; you would either add plotting commands as a library or, if that were not possible, add a way of adding plotting commands as a library. You would not look at whether plotting comes with the "standard library" or not. To quote Guy Steele in Growing a Language:

This is the nub of what I want to say. A language design can no longer be a thing. It must be a pattern—a pattern for growth—a pattern for growing the pattern for defining the patterns that programmers can use for their real work and their main goal.
So build your own libraries, your own abstractions. It's easy, fun and useful. It's the heart of Domain Driven Design, probably the most productive and effective software construction technique we as an industry have come up with to date. See what abstractions you can build easily and which ones are hard. Analyze the latter and you have started on the road to modern language design.

CORRECTION (June 4th 2015): I misattributed the Dartmouth BASIC quote to Cathy Doser, when the comment line on the Macintosh folklore entry clearly said Arthur Luehrmann. (Cathy's comment was a bit earlier).


Truelove said...

What about JavaScript (Node.js) vs Swift speed?

Adam R. Maxwell said...

This and the "map function in Cocoa" post are just depressing. I'm not doing much Cocoa/Obj-C development anymore, and it looks like the C++/Java developers have taken over Apple in the meantime, without taking time to understand and enjoy the benefits of Obj-C. Performance of NSNumbers in an array of a few thousand elements will be entirely adequate in the real world, in most cases, but I'd never use it as a performance test.