Saturday, June 28, 2014

Compiler Writers Gone Wild: ARC Madness

In this week's episode of CWGW: This can't possibly crash, yet crash it does.

In a project I am currently working on, the top crash for the last week or so has been the following NSOutlineView delegate method:

- (BOOL)outlineView:(NSOutlineView *)outlineView isGroupItem:(id)item
{
    return NO;
}
The team had been ignoring it, because it just didn't make any sense and they had other things to do. (Fortunately there were not too many other crashes; the app is pretty solid at this point.) When they turned to me, I was also initially puzzled, because all this should do on x86 is stuff a zero into %eax and return. This cannot possibly crash[1], so everyone just assumed that the stack traces were off, as they frequently are.

Fortunately I had just looked at the project settings and noticed that we were compiling with -O0, that is, with optimizations disabled, and my suspicion was that ARC was doing some unnecessary retaining. That suspicion turned out to be on the money: otool -Vt revealed that ARC had turned our innocuous return NO; into the following monstrosity:

-[SomeOutlineViewDelegeate outlineView:isGroupItem:]:
00000001001bfdb0        pushq   %rbp
00000001001bfdb1        movq    %rsp, %rbp
00000001001bfdb4        subq    $0x30, %rsp
00000001001bfdb8        leaq    -0x18(%rbp), %rax
00000001001bfdbc        movq    %rdi, -0x8(%rbp)
00000001001bfdc0        movq    %rsi, -0x10(%rbp)
00000001001bfdc4        movq    $0x0, -0x18(%rbp)
00000001001bfdcc        movq    %rax, %rdi
00000001001bfdcf        movq    %rdx, %rsi
00000001001bfdd2        movq    %rcx, -0x30(%rbp)
00000001001bfdd6        callq   0x10027dbaa             ## symbol stub for: _objc_storeStrong
00000001001bfddb        leaq    -0x20(%rbp), %rdi
00000001001bfddf        movq    $0x0, -0x20(%rbp)
00000001001bfde7        movq    -0x30(%rbp), %rsi
00000001001bfdeb        callq   0x10027dbaa             ## symbol stub for: _objc_storeStrong
00000001001bfdf0        leaq    -0x20(%rbp), %rdi
00000001001bfdf4        movabsq $0x0, %rsi
00000001001bfdfe        movl    $0x1, -0x24(%rbp)
00000001001bfe05        callq   0x10027dbaa             ## symbol stub for: _objc_storeStrong
00000001001bfe0a        movabsq $0x0, %rsi
00000001001bfe14        leaq    -0x18(%rbp), %rax
00000001001bfe18        movq    %rax, %rdi
00000001001bfe1b        callq   0x10027dbaa             ## symbol stub for: _objc_storeStrong
00000001001bfe20        movb    $0x0, %r8b
00000001001bfe23        movsbl  %r8b, %eax
00000001001bfe27        addq    $0x30, %rsp
00000001001bfe2b        popq    %rbp
00000001001bfe2c        retq
00000001001bfe2d        nopl    (%rax)
Yikes! Of course, this is how ARC works: it generates an insane amount of retains and releases (hidden inside objc_storeStrong()), then relies on a special optimization pass to remove the insanity and leave behind the necessary retains/releases. Turn on the "standard" optimization -Os and we get the following, much more reasonable result:
-[WLTaskListsDataSource outlineView:isGroupItem:]:
00000001000e958a        pushq   %rbp
00000001000e958b        movq    %rsp, %rbp
00000001000e958e        xorl    %eax, %eax
00000001000e9590        popq    %rbp
00000001000e9591        retq
Much better!

It isn't clear why those retains/releases were crashing; all the objects involved looked OK in the debugger. But at least we will no longer be puzzled by code that can't possibly crash...crashing, and therefore have a better chance of actually debugging it.

Another issue is performance. I just benchmarked the following equivalent program:

#import <Foundation/Foundation.h>

@interface Hi:NSObject {}
-(BOOL)doSomething:arg1 with:arg2;
@end

@implementation Hi
-(BOOL)doSomething:arg1 with:arg2
{
  return NO;
}
@end

int main( int argc, char *argv[] ) 
{
  Hi *hi=[Hi new];
  for (int i=0;i < 100000000; i++ ) {
    [hi doSomething:hi with:hi];
  } 
  return 0;
}
On my 13" MBPR, it runs in roughly 0.5 seconds with ARC disabled and in 13 seconds with ARC enabled. That's 26 times slower, meaning we now have a highly non-obvious performance model, where performance is extremely hard to predict and control. The simple and obvious performance model was one of the main reasons Objective-C code tended to actually be quite fast if even minimal effort was expended on performance, despite the fact that some parts of Objective-C aren't all that fast.

I find the approach of handing off all control and responsibility to the optimizer writers worrying. My worries stem partly from the fact that I've never actually had that work in the past. With ARC, it also happens that the optimizer can't figure out that a retain/release isn't needed, so you need to sprinkle a few __unsafe_unretained qualifiers throughout your code (not many, but you need to figure out which).

Good optimization has always been something that needed a human touch (with automatic assistance), so the message "just trust the compiler" doesn't resonate with me. Especially since, and this is the other part I am worried about, compiler optimizations have been getting crazier and crazier. clang, for example, thinks there is nothing wrong with producing two different values for dereferencing the same pointer, at the same time, with no stores in between (source: http://blog.regehr.org/archives/767):

#include <stdio.h>
#include <stdlib.h>
 
int main() {
  int *p = (int*)malloc(sizeof(int));
  int *q = (int*)realloc(p, sizeof(int));
  *p = 1;
  *q = 2;
  if (p == q)
    printf("%d %d\n", *p, *q);
}
I tested this with clang-600.0.34.4 on my machine and it also gives this nonsensical result: 1 2. There are more examples, which I also wrote about in my post cc -Osmartass. Of course, Swift moves further in this direction, with expensive default semantics and reliance on the compiler to remove the resulting glaring inefficiencies.

In what I've seen reported and tested myself, this approach results in differences between normal builds and -Ofast-optimized builds of more than a factor of 100. That's not close to being OK, and it makes code much harder to understand and optimize. My guess is that we will be looking at assembly a lot more when optimizing Swift than we ever did in Objective-C, and then scratching our heads as to why the optimizer didn't manage to optimize that particular piece of code.

I fondly remember the "Java optimization" WWDC sessions back when we were supposed to rewrite all our code in that particular new hotness. In essence, we were given a model of the capabilities of HotSpot's JIT optimizer, so in order to optimize code we had to know what the resulting generated code would be, what the optimizer could handle (not a lot), and then translate that information back into the original source code. At that point, it's simpler to just write the assembly yourself that you are trying to goad the JIT into emitting for you. Or portable Macro Assembler. Or object-oriented portable Macro Assembler.

[1] Well, it could if the stack had previously reached its limit.
Discuss here or on HN

Thursday, June 26, 2014

How to Swiftly Destroy a $370 Million Dollar Rocket with Overflow "Protection"

Apple's new Swift programming language has been heavily promoted as being a safer alternative to Objective-C, with a much stronger emphasis on static typing, for example. While I am dubious about the additional safety of static typing (I argue that it produces far more safyness than actual safety), this post is going to look at a different feature: overflow protection.

Overflow protection means that when an arithmetic operation on an integer exceeds the maximum value for that integer type, the value doesn't wrap around as it does on most CPU ALUs, and by extension C. Instead, the program signals an exception, and since Swift has no exception handling, the program crashes.

While this looks a little like the James Bond anti-theft device in For Your Eyes Only, which just blows up the car, the justification is that the program should be protected from operating on values that have become bogus. While I understand the reasoning, I am dubious that it really is safer to have every arithmetic operation on integers and every conversion from higher precision to lower in the entire program become a potential crash site, when before those operations could never crash (except for division by zero).

While it would be interesting to see what evidence there is for this argument, I can give at least one very prominent example against it. On June 4th 1996, ESA's brand new Ariane 5 rocket blew up during launch, due to a software fault, with a total loss of US $370 million, apparently one of the most expensive software faults in history. What was that software fault? An overflow protection exception triggered by a floating point to (short) integer conversion.

The resulting core-dump/diagnostics were then interpreted by the next program in line as valid data, causing effectively random steering inputs that caused the rocket to break up (and self destruct when it detected it was breaking up).

What's interesting is that almost any other handling of the overflow apart from raising an exception would have been OK and saved the mission and $370 million. Silently truncating/clamping the value to the maximum permissible range (which some in the static typing community incorrectly claim was the problem) would have worked perfectly and was the actual solution used for other values.

Even wraparound might have worked: at least there would have been only one bogus transition, after which values would have been mostly monotonic again. Certainly better than effectively random values.
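To make the difference between these strategies concrete, here is a minimal C sketch of the two non-trapping conversions from a floating-point value to a 16-bit integer. The function names clamp_to_i16 and wrap_to_i16 are made up for illustration, as is the choice to map NaN to 0. This is not the Ariane code, just one way such conversions could look:

```c
#include <stdint.h>

/* Saturating conversion: out-of-range values stick at the nearest limit. */
int16_t clamp_to_i16(double v)
{
    if (v != v) return 0;                       /* NaN; mapping it to 0 is an assumption */
    if (v >= (double)INT16_MAX) return INT16_MAX;
    if (v <= (double)INT16_MIN) return INT16_MIN;
    return (int16_t)v;                          /* in range: truncates toward zero */
}

/* Wrapping conversion: keep only the low 16 bits, as a typical ALU would.
   Assumes v fits in an int64_t; converting a larger double is undefined in C. */
int16_t wrap_to_i16(double v)
{
    int32_t low = (int32_t)((uint64_t)(int64_t)v & 0xFFFF);
    /* Reinterpret the 16-bit pattern as a two's-complement value. */
    return (int16_t)(low >= 0x8000 ? low - 0x10000 : low);
}
```

With these, clamp_to_i16(40000.0) gives 32767, while wrap_to_i16(40000.0) gives -25536: one is pinned at the limit, the other is bogus but at least deterministic. Neither one stops the program.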

Ideally, the integer would have just overflowed into a higher precision, as in a dynamic language such as Smalltalk, or even PostScript. Even JavaScript's somewhat wonky idea that all numbers are floats, but some just don't know it yet, would have been better in this particular case. Considering the limitations of the hardware, those languages weren't options, but nowadays the required computational horsepower is there.

In Ada you at least could potentially trap the exception generated by overflow, but in Swift the only protection is to manually trace back the inputs of every arithmetic operation on integers and enforce ranges for all possible combinations of inputs that do not result in that operation overflowing. For any program with external inputs and even slightly complex data paths and arithmetic, I would venture to say that that is next to impossible.
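For a sense of what that manual range enforcement means in practice, here is a small C sketch of a guarded addition. The name checked_add is hypothetical; the idea is simply to test the operands against the limits before adding, and to report failure to the caller instead of trapping or wrapping:

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b. On success stores the sum in *out and returns true;
   returns false (leaving *out untouched) if the sum would not fit in an int.
   The range test happens before the addition, so no overflow ever occurs. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

That is one guard for one addition. Doing the equivalent for every arithmetic operation along every data path is the part that I would expect to become unmanageable.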

The only viable method for avoiding arithmetic overflow is to not use integer arithmetic with any external input, ever. Hello JavaScript!

You can try the Ada code with GNAT, or online:

with Ada.Text_IO,Ada.Integer_Text_IO;
use Ada.Text_IO,Ada.Integer_Text_IO;
procedure Hello is
  b : FLOAT;
  a : INTEGER;
begin
  b:=3123123.0;
  b:=b*b;
  a:=INTEGER(b);
  
  Put("a=");
  Put(a);
end Hello;
You can watch your Swift playground crash using the following code:

var a = 2
var b:Int16
for i in 1..100 {
  a=a*2
  println(a)
  b=Int16(a)
}
Note that neither the Ada nor Swift compilers have static checks that detect the overflow, even when all the information is statically available, for example in the following Swift code:

var a:UInt8
a = 254
println(a)
a += 2
println(a)
What's even worse is that the -Ofast flag will remove the checks, so the integer will just wrap around. Optimization flags in general should not change visible program behavior, except for performance. Or maybe this is good: since it looks like we need that flag to get decent performance at all, we also remove the overflow crashers...

Discuss here or on Hacker News.

Thursday, June 19, 2014

The Safyness of Static Typing

I like static (manifest) typing. This may come as a shock to those who have read other posts of mine, but it is true. I certainly am more comfortable with having a MPWType1FontInterper *interpreter rather than id interpreter. Much more comfortable, in fact, and this feeling extends to Xcode saying "0 warnings" and the clang static analyzer agreeing.

Safety

The question though is: are those feelings actually justified? The rhetoric on the subject is certainly strong, and very rigid/absolute. I recently had a Professor of Computer Science state unequivocally that anyone who doesn't use static typing should have their degree revoked. In a room full of Squeakers. And that's not an extreme or isolated case. Just about any discussion on the subject seems to quickly devolve into proponents of static typing claiming absolutely that dynamic typing invariably leads to programs that are steaming piles of bugs and crash left and right in production, whereas statically typed programs have their bugs caught by the compiler and are therefore safe and sound. In fact, Milner has supposedly made the claim that "well typed programs cannot go wrong". Hmmm...

That the compiler is capable of catching (some) bugs using static type checks is undeniably true. However, what is also obviously true is that not all bugs are type errors: most of the 25 top software errors don't look like type errors to me, neither goto fail; nor Heartbleed looks like a type error, and neither do the top errors in my different projects. So having the type-checker give our programs a clean bill of health does not make them bug free; it only eliminates a certain type or class of bugs.

With that, we can take the question from the realm of religious zealotry to the realm of reasoned inquiry: how many bugs does static type checking catch?

Alas, this is not an easy question to answer, because we are looking for something that is not there. However, we can invert the question: what is the incidence of type-errors in dynamically typed programs, ones that do not benefit from the bug-removal that the static type system gives us and should therefore be steaming piles of those type errors?

With the advent of public source repositories, we now have a way of answering that question, and Robert Smallshire did the grunt work to come up with an answer: 2%.

The 2%

He talks about this some more in the talk titled The Unreasonable Effectiveness of Dynamic Typing, which I heartily recommend. However, this isn't the only source. For example, there was a study with the following title: An experiment about static and dynamic type systems: doubts about the positive impact of static type systems on development time (pdf), which found the following to be true in experiments: not only were development times significantly shorter on average with dynamically typed languages, so were debug times.

So all those nasty type errors were actually not having any negative impact on debug times; in fact, the reverse was true. Which of course makes sense if the incidence of type errors is even near 2%, because then other factors are almost certain to dominate. Completely.

There are more studies, for example on generics: Do developers benefit from generic types?: an empirical comparison of generic and raw types in java. The authors found a documentation benefit, no error-fixing benefits and a negative impact on extensibility.

Others have said it more eloquently than I can:

Some people are completely religious about type systems and as a mathematician I love the idea of type systems, but nobody has ever come up with one that has enough scope. If you combine Simula and Lisp—Lisp didn’t have data structures, it had instances of objects—you would have a dynamic type system that would give you the range of expression you need.
Even stringent advocates of strong typing such as Uncle Bob Martin, with whom I sparred many a time on that and other subjects in comp.lang.object, have now come around to this point of view: yeah, it's nice, maybe, but just not that important. In fact, he has actually reversed his position, as seen in this video of him debating static typing with Chad Fowler.

Truthiness and Safyness

What I find interesting is not so much whether one or the other is right/wronger/better/whatever, but rather the disparity between the vehemence of the rhetoric, at least on one side of the debate ("revoke degrees!", "can't go wrong!") and both the complete lack of empirical evidence for (there is some against) and the lack of magnitude of the effect.

Stephen Colbert coined the term "truthiness" for "a "truth" that a person making an argument or assertion claims to know intuitively 'from the gut' or because it 'feels right' without regard to evidence, logic, intellectual examination, or facts." [Wikipedia]

To me it looks like a similar effect is at play here: as I notice myself, it just feels so much safer if the computer tells you that there are no type errors. Especially if it is quite a bit of effort to get to that state, which it is. As I wrote, I notice that effect myself, despite the fact that I actually know the evidence is not there, and have been a long-time friendly skeptic.

So it looks like static typing is "safy": people just know intuitively that it must be safe, without regard to evidence. And that makes the debate both so heated and so impossible to decide rationally, just like the political debate on "truth" subjects.

Discuss on Hacker News.

Friday, June 6, 2014

Remove features for greater power, aka: Swift and Objective-C initializers

One of the things I find curious is how Apple's new Swift language rehashes mistakes that were made in other languages. Let's take construction or initializers.

Objective-C/Smalltalk

These are the rules for initializers in Smalltalk and Objective-C:
  1. An "initializer" is a normal method and a normal message send.
  2. There is no second rule.
There's really nothing more to it; the rest follows organically and naturally from this simple fact and various things you like to see happen. For example, is there a rule that you have to send the initial initializer (alloc or new) to the class? No, there isn't; it's just a convenient and obvious place to put it, since we don't have the instance yet and the class exists and is an obvious place to go to for instances of that class. However, we could just as well ask a different class to create the object for us.

The same goes with calling super. Yes, that's usually a good idea, because usually you want the superclass's behavior, but if you don't want the superclass's behavior, then don't call. Again, this is not a special rule for initializers, it usually follows from what you want to achieve. And sometimes it doesn't, just like with any other method you override: sometimes you call super, sometimes you do not.

The same goes for assigning the return value, doing the self=[super init]; dance. Again, this is not at all required by the language or the frameworks, although apparently it is a common misconception that it is, a misconception that is, IMHO, promoted by careless creation of "best practices" as "immutable rules", something I wrote about earlier when talking about the useless typing out of the id type in method declarations.

However, returning self and using the returned value is a useful convention, because it makes it possible for init methods to return a different object than what they started with (for example a specific subclass or a singleton).

Swift initializers

Apple's new Swift language has taken a page from the C++ and Java playbooks and made initialization a special case. Well, lots of special cases actually. The Swift book has 30 pages on initialization, and they aren't just illustration and explanation, they are dense with rules and special cases. For example:
  1. You can set a default value of a property in the variable definition.
  2. Or you can set the default value in an initializer.
  3. Designated initializers are now a first class language construct.
  4. Parameterized initializers have local and external parameter names, like methods.
  5. Except that the first parameter name is different and so Swift automatically provides an external parameter name for all arguments, which it doesn't with methods.
  6. Constant properties aren't constant in initializers.
  7. Swift creates a default initializer for both classes and structs.
  8. Swift also creates a default memberwise initializer, but only for structs.
  9. Initializers can (only) call other initializers, but there are special rules for what is and is not allowed and these rules are different for structs and classes.
  10. Providing specialized initializers removes the automatically-provided default initializers.
  11. Initializers are different from other methods in that they are not inherited, usually.
  12. Except that there are specific circumstances where they are inherited.
  13. Confused yet? There's more!
  14. If your subclass provides no initializers itself, it inherits all the superclass's initializers.
  15. If your subclass overrides all the superclass's designated initializers, it inherits all the convenience initializers (that's also a language construct). How does this not break if the superclass adds initializers? I think we've just re-invented the fragile-base-class problem.
  16. Oh, and you can initialize instance variables with the values returned by closures or functions.
Well, that was easy, but that's probably only because I missed a few. Having all these rules means that this new way of initialization is less powerful than the one before it, because all of these rules restrict the power that a general method has.

Particularly, it is not possible to substitute a different value or return nil to indicate failure to initialize, nor is it possible to call other methods (as far as I can tell).

To actually provide these useful features, we need something else:

  1. Use the Factory method pattern to actually do the powerful stuff you need to do ...
  2. ...which gets you back to where we were at the beginning with Objective-C or Smalltalk, namely sending a normal message.
Of course, we are familiar with this because both C++ and Java also have special constructor language features, plagued by the same problems. They are also the source of the Factory method pattern, at least as a separate "pattern". Smalltalk and Objective-C simply made that pattern the default for object creation; in fact, Brad Cox called classes "Factory Objects", long long before the GoF patterns book.
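The same point can be sketched even in plain C, where creation has never been anything but an ordinary function call. The Number type and the number_create/number_destroy names below are invented for this illustration. The creation function is free to fail by returning NULL, or to hand back a different object than a freshly allocated one, which are exactly the two things the special-cased initializers make hard:

```c
#include <stdlib.h>

typedef struct { int value; } Number;

/* Hypothetical shared instance, standing in for a singleton. */
static Number shared_zero = { 0 };

/* An ordinary function acting as the factory: it may fail, substitute
   a shared object, or allocate a new one. No special language rules. */
Number *number_create(int value)
{
    if (value < 0) return NULL;              /* signal failure to initialize */
    if (value == 0) return &shared_zero;     /* substitute a different object */
    Number *n = malloc(sizeof *n);
    if (n) n->value = value;
    return n;
}

/* Releases a Number obtained from number_create; the shared one is never freed. */
void number_destroy(Number *n)
{
    if (n && n != &shared_zero) free(n);
}
```

Callers just check the result, the same way Objective-C code checks the value returned from an init method.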

So with all due respect to Michael A. Jackson:

First rule of baking programming conventions into the language: Don't do it!
The second rule of baking programming conventions into the language (experts only): Don't do it yet!


p.s.: I have filed a radar, please dup
p.p.s.: HN