Monday, April 14, 2014

cc -Osmartass

I have to admit I am a bit startled to see people seriously (?) advocate exploiting "undefined behavior" in the C standard to just eliminate the affected code altogether, arguing that undefined means literally anything is OK. I've certainly seen it justified many times. Apart from being awful, this idea smacks of hubris on the part of the compiler writers.

The job of the compiler is to do the best job it can at turning the programmer's intent into executable machine code, as expressed by the program. It is not to show how clever the optimizer writer is, how good at lawyering the language standard, or to wring out a 0.1% performance improvement on <benchmark-of-choice>, at least not when it conflicts with the primary goal.

For let's not pretend that these optimizations are actually useful or significant: Proebsting's law shows that all compiler optimizations have been at best 1/10th as effective at improving performance as hardware advances, and recent research suggests that even that may be optimistic.

That doesn't mean that I don't like my factor 2 or 3 improvement in performance for code where basic optimizations apply. But almost all of those performance gains come at the lowest levels of optimization; the more sophisticated stuff just doesn't bring much, if any, additional benefit. (There's a reason Apple recommends -Os and not -O3 as the default.) So don't get ahead of yourselves: other, non-compiler optimizations can often achieve 2-3 orders of magnitude improvement, and for a lot of Objective-C code, for example, the compiler's optimizations barely register at all. Again: perspective!

Furthermore, the purpose of "undefined behavior" was (not sure it still is) to be inclusive, so for example compilers for machines with slightly odd architectures could still be called ANSI-C without having to do unnatural things on that architecture in order to conform to over-specification. Sometimes, undefined behavior is needed for programs to work.

So when there is integer overflow, for example, that's not a license to silently perform dead code elimination at certain optimization levels. It's a license to do the natural thing on the platform, which on most platforms these days is to let the integer overflow, because that is what a C programmer is likely to expect. In addition, feel free to emit a warning. The same goes for optimizing away an out-of-bounds array access that is intended to terminate a loop. If you are smart enough to figure out the out-of-bounds access, warn about it and then proceed to emit the code. Eliminating the check and turning a terminating loop into an infinite loop is never the right answer.
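To make the integer overflow case concrete, here is a minimal sketch in C. The function names are mine and purely illustrative. The first function is the kind of wraparound test a C programmer might naturally write. Because signed overflow is undefined, an optimizer is allowed to assume `x + 1 < x` can never be true and drop the check. The other two show ways to get the intended behavior without relying on undefined behavior; the last one assumes a two's-complement platform and a compiler (GCC and Clang do this) that wraps when converting an out-of-range unsigned value back to `int`.

```c
#include <limits.h>
#include <stdbool.h>

/* The "natural" overflow test: relies on signed wraparound.
 * Undefined when x == INT_MAX, so an optimizer may treat the
 * comparison as always false and eliminate it as dead code. */
bool will_overflow_naive(int x) {
    return x + 1 < x;
}

/* Well-defined alternative: compare against the limit
 * before doing any arithmetic. */
bool will_overflow_checked(int x) {
    return x == INT_MAX;
}

/* Wrapping addition done in unsigned arithmetic, which is defined
 * to wrap modulo 2^N. ASSUMPTION: converting the out-of-range
 * unsigned result back to int wraps as well; that conversion is
 * implementation-defined, and GCC and Clang document it as wrapping. */
int add_wrapping(int a, int b) {
    return (int)((unsigned)a + (unsigned)b);
}
```

GCC and Clang also accept `-fwrapv`, which tells the compiler to treat signed overflow as wrapping, so the naive version then does what it looks like it does.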

So please don't do this. You're not producing value: those optimizations will cease to "help" when programmers "fix" their code. You are also not producing value because any additional gains are extremely modest compared to the cost. So please stop doing it, certainly stop doing it on purpose, and please carefully evaluate the cost/benefit ratio when introducing optimizations that cause this to happen as a side effect...and then don't. Or do, and label them appropriately.

Saturday, April 12, 2014

Sophisticated Simplicity

This quote from Steve Jobs is one that's been an inspiration to me for some time:
[...] when you first attack a problem it seems really simple because you don't understand it. Then when you start to really understand it, you come up with these very complicated solutions because it's really hairy. Most people stop there. But a few people keep burning the midnight oil and finally understand the underlying principles of the problem and come up with an elegantly simple solution for it. But very few people go the distance to get there.
In other words:
  1. Naive Simplicity
  2. Sophisticated Complexity
  3. Sophisticated Simplicity
It's from the February 1984 Byte interview introducing the Macintosh.

UPDATE: Well, it seems that Heinlein got there first:

Every technology goes through three stages: first, a crudely simple and quite unsatisfactory gadget; second, an enormously complicated group of gadgets designed to overcome the shortcomings of the original and achieving thereby somewhat satisfactory performance through extremely complex compromise; third, a final stage of smooth simplicity and efficient performance [..]
(From the book The Rolling Stones, 1952)