
Wednesday, October 31, 2018

Even Easier Notification Sending with Notification Protocols in Objective-C

Having revisited Notification Protocols in order to ensure they work with Swift, I had another look at how to make them even more natural in Objective-C.

Fortunately, I found a way:


[@protocol(ModelDidChange) notify:@"Payload"];
This will send the ModelDidChange notification with the object parameter @"Payload". Note that the compiler will check that there is a Protocol called ModelDidChange and the runtime will verify that it is an actual notification protocol, raising an exception if that is not true. You can also omit the parameter:
[@protocol(ModelDidChange) notify];
In both cases, the amount of boilerplate or semantic noise is minimised, while the core of what is happening is put at the forefront. Compare this to the traditional way:
[[NSNotificationCenter defaultCenter] postNotificationName:@"ModelDidChange" object:@"Payload"]
Here, almost the entire expression is noise, with the relevant pieces being buried near the end of the expression as parameters. No checking is (or can be) performed to ensure that the argument string actually refers to a notification name.

Of course, you can replace that literal string with a string constant, but that constant is also not checked, and since it lives in a global namespace with all other identifiers, it needs quite a bit of prefixing to disambiguate:


[[NSNotificationCenter defaultCenter] postNotificationName:WLCoreNotificationModelDidChange object:@"Payload"]
Would it be easy to spot that this was supposed to be WLCoreNotificationModelWasDeleted?

The PROTOCOL_NOTIFY() macro has been removed, while the sendProtocolNotification() function is retained for Swift compatibility.

Notification Protocols from Swift

When I introduced Notification Protocols, I mentioned that they should be Usable from Swift.

This is code from a sample Swift Playground that shows how to do this. The Playground needs to have access to MPWFoundation, for example by being inside an Xcode workspace that includes it.


import Foundation
import MPWFoundation

@objc protocol ModelDidChange:MPWNotificationProtocol {
    func modelDidChange( payload:NSNotification );
}

class MyView : NSObject,ModelDidChange {
    override public init() {
        super.init()
        self.installProtocolNotifications()
    }
    func modelDidChange( payload:NSNotification ) {
        print("I was notified, self: \(self) payload: \"\(payload.object!)\"")
    }
}

let target1 = MyView()
let target2 = MyView()

sendProtocolNotification( ModelDidChange.self , "The Payload")

A brief walkthrough:
  1. We declare a ModelDidChange notification protocol.
  2. We indicate that it is a notification protocol by adopting MPWNotificationProtocol.
  3. The notification protocol has the message modelDidChange.
  4. We declare that MyView conforms to ModelDidChange. This means we declaratively indicate that we receive ModelDidChange notifications, which will result in MyView instances being sent modelDidChange() messages.
  5. It also means that we have to implement modelDidChange(), which will be checked by the compiler.
  6. We need to call installProtocolNotifications() in order to activate the declared relationships.
  7. We use sendProtocolNotification() with the Protocol object as the argument and a payload.
  8. The fact that we need a protocol object instead of any old String gives us additional checking.
Enjoy!

Saturday, April 21, 2018

Even Simpler Notifications: Notification Messages

The Notification Protocols implementation I showed yesterday is actually just one in a series of implementations that came out of thinking about notifications.

The basic insight, and it is a pretty trivial one, is that a notification is just a broadcast message. So when we have real messaging, which we do, notifications consist of sending a message to a dynamically defined set of receivers.

How can we send a message, for example -modelDidChange: to a set of receivers? With HOM that's also trivial:


[[receivers do] modelDidChange:aPath];

So how do we define the set of receivers? Well, the collection is defined as all the objects listening to a particular notification, and in this case the notification is the message itself. So instead of do, we need a notify: HOM that collects all the receivers registered for that message:


[[center notifyListenersOf:@selector(modelDidChange:)] modelDidChange:aPath];

But of course this is redundant, we already have the modelDidChange: message in the argument message of the HOM. So we can remove the argument from the notify HOM:


[[center notify] modelDidChange:aPath];

A trivial notification center now is just a dictionary of arrays of receivers, with the keys of the dictionary being the message names. With that dictionary called receiversByNotificationName, the notify HOM would look as follows:
DEFINE_HOM(notify, void)
{
    NSString *key = NSStringFromSelector([invocation selector]);
    NSArray *receivers = self.receiversByNotificationName[key];
    [[invocation do] invokeWithTarget:[receivers each]];
}

Of course, you can also integrate with NSNotificationCenter by using the converted message name as the NSNotification name.

You would then register for the notification with the following code:


-(void)registerNotificationMessage:(SEL)aMessage
{
    [[NSNotificationCenter defaultCenter] addObserver:self selector:aMessage name:NSStringFromSelector(aMessage) object:nil];
}

The HOM code is a little more involved, I'll leave that as an exercise for the reader.

So, by taking messaging seriously, we can get elegant notifications in even less code than Notification Protocols. The difference is that Notification Protocols actually let you declare adoption of a notification statically in your class's interface, and have that declaration be meaningful.


@interface MyGreatView : NSView <ModelDidChangeNotification>

Another aspect is that while HOM is almost certainly too much for poor old Swift to handle, it should be capable of dealing with Notification Protocols. It is!

Wednesday, October 7, 2015

Jitterdämmerung

So, Windows 10 has just been released, and with it the Ahead Of Time (AOT) compilation feature .NET Native. Google also just recently introduced ART for Android, and I just discovered that Oracle is planning an AOT compiler for mainstream Java.

With Apple doggedly sticking to Ahead of Time Compilation for Objective-C and now their new Swift, JavaScript is pretty much the last mainstream hold-out for JIT technology. And even in JavaScript, the state of the art for achieving maximum performance appears to be asm.js, which largely eschews JIT techniques: it acts as a kind of object code, represented in JavaScript, that other languages are AOT-compiled into for execution in the browser.

I think this shift away from JITs is not a fluke but was inevitable, in fact the big question is why it has taken so long (probably industry inertia). The benefits were always less than advertised, the costs higher than anticipated. More importantly though, the inherent performance characteristics of JIT compilers don't match up well with most real world systems, and the shift to mobile has only made that discrepancy worse. Although JITs are not going to go away completely, they are fading into the sunset of a well-deserved retirement.

Advantages of JITs less than promised

I remember reading the IBM Systems Journal issue on Java Technology back in 2000, I think. It had a bunch of research articles describing super amazing VM technology with world-beating performance numbers. It also had a single real-world report from IBM's San Francisco project. In the real world, it turned out, performance was a bit more "mixed" as they say. In other words: it was terrible, and they had to do an incredible amount of work for the system to be even remotely usable.

There was also the experience of the New Typesetting System (NTS), a rewrite of TeX in Java. Performance was atrocious, the team took it with humor and chose a snail as their logo.

[Image: NTS at full speed]
One of the reasons for this less than stellar performance was that JITs were invented for highly dynamic languages such as Smalltalk and Self. In fact, the Java Hotspot VM can be traced in a direct line to Self via the Strongtalk system, whose creator Animorphic Systems was purchased by Sun in order to acquire the VM technology.

However, it turns out that one of the biggest benefits of JIT compilers in dynamic languages is figuring out the actual types of variables. This is a problem that is theoretically intractable (equivalent to the halting problem) and practically fiendishly difficult to do at compile time for a dynamic language. It is trivial to do at runtime, all you need to do is record the actual types as they fly by. If you are doing Polymorphic Inline Caching, just look at the contents of the caches after a while. It is also largely trivial to do for a statically typed language at compile time, because the types are right there in the source code!
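
To make "record the actual types as they fly by" concrete, here is a toy sketch in Swift (the protocol, the names and the call site are all made up, and a real VM does this per call site in the generated code, not in a dictionary): one call site that simply remembers which dynamic receiver types it has seen, which is essentially the information a polymorphic inline cache accumulates.

protocol Shape { func area() -> Double }
struct Circle: Shape { let r: Double; func area() -> Double { return Double.pi * r * r } }
struct Square: Shape { let s: Double; func area() -> Double { return s * s } }

var seenTypes: [String: Int] = [:]   // the "cache contents" for this one call site

func areaAtCallSite(_ shape: Shape) -> Double {
    seenTypes[String(describing: type(of: shape)), default: 0] += 1   // record the type as it flies by
    return shape.area()
}

let shapes: [Shape] = [Circle(r: 1), Square(s: 2), Circle(r: 3)]
_ = shapes.map(areaAtCallSite)
print(seenTypes)   // e.g. ["Circle": 2, "Square": 1], the observed receiver types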

So gathering information at runtime simply isn't as much of a benefit for languages such as C# and Java as it was for Self and Smalltalk.

Significant Costs

The runtime costs of a JIT are significant. The obvious cost is that the compiler has to be run alongside the program to be executed, so time compiling is not available for executing. Apart from the direct costs, this also means that your compiler is limited in the types of analyses and optimizations it can do. The impact is particularly severe on startup, so short-lived programs like for example the TeX/NTS are severely impacted and can often run slower overall than interpreted byte-code.

In order to mitigate this, you start having to have multiple compilers and heuristics for when to use which compilers. In other words: complexity increases dramatically, and you have only mitigated the problem somewhat, not solved it.

A less obvious cost is an increase in VM pressure, because the code-pages created by the JIT are "dirty", whereas executables paged in from disk are clean. Dirty pages have to be written to disk when memory is required, clean pages can simply be unmapped. On devices without a swap file like most smartphones, dirty vs. clean can mean the difference between a few unmapped pages that can be swapped in later and a process getting killed by the OS.

VM and cache pressure is generally considered a much more severe performance problem than a little extra CPU use, and often even than a lot of extra CPU use. Most CPUs today can multiply numbers in a single cycle, yet a single main memory access has the CPU stalled for a hundred cycles or more.

In fact, it could very well be that keeping non-performance-critical code as compact interpreted byte-code may actually be better than turning it into native code, as long as the code-density is higher.

Security risks

Having memory that is both writable and executable is a security risk. And forbidden on iOS, for example. The only exception is Apple's own JavaScript engine, so on iOS you simply can't run your own JITs.

Machines got faster

On the low-end of performance, machines have gotten so fast that pure interpreters are often fast enough for many tasks. Python is used for many tasks as is and PyPy isn't really taking the Python world by storm. Why? I am guessing it's because on today's machines, plain old interpreted Python is often fast enough. Same goes for Ruby: it's almost comically slow (in my measurements, serving http via Sinatra was almost 100 times slower than using libµhttp), yet even that is still 400 requests per second, exceeding the needs of the vast majority of web-sites including my own blog, which until recently didn't see 400 visitors per day.

The first JIT I am aware of was Peter Deutsch's PS (Portable Smalltalk), but only about a decade later Smalltalk was fine doing multi-media with just a byte-code interpreter. And native primitives.

Successful hybrids

The technique used by Squeak (an interpreter plus C primitives for the heavy lifting, for example for multi-media or cryptography) has been applied successfully in many different cases. This hybrid approach was described in detail by John Ousterhout in Scripting: Higher-Level Programming for the 21st Century: high level "scripting" languages are used to glue together high performance code written in "systems" languages. Examples include Numpy, but the ones I found most impressive were "computational steering" systems apparently used in supercomputing facilities such as Oak Ridge National Laboratories. Written in Tcl.

What's interesting with these hybrids is that JITs are being squeezed out at both ends: at the "scripting" level they are superfluous, at the "systems" level they are not sufficient. And I don't believe that this idea is only applicable to specialized domains, though there it is most noticeable. In fact, it seems to be an almost direct manifestation of the observations in Knuth's famous(ly misquoted) quip about "Premature Optimization":

Experience has shown (see [46], [51]) that most of the running time in non-IO-bound programs is concentrated in about 3 % of the source text.

[..] The conventional wisdom shared by many of today's software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can't debug or maintain their "optimized" programs. In established engineering disciplines a 12 % improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn't bother making such optimizations on a one-shot job, but when it's a question of preparing quality programs, I don't want to restrict myself to tools that deny me such efficiencies.

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3 %. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail. After working with such tools for seven years, I've become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.

[..]

(Most programs are probably only run once; and I suppose in such cases we needn't be too fussy about even the structure, much less the efficiency, as long as we are happy with the answers.) When efficiencies do matter, however, the good news is that usually only a very small fraction of the code is significantly involved.

For the 97%, a scripting language is often sufficient, whereas the critical 3% are both critical enough as well as small and isolated enough that hand-tuning is possible and worthwhile.

I agree with Ousterhout's critics who say that the split into scripting languages and systems languages is arbitrary; Objective-C, for example, combines both approaches in a single language, though one that is very much a hybrid itself. The "Objective" part is very similar to a scripting language, in both performance and ease/speed of development, despite the fact that it is compiled ahead of time, while the C part does the heavy lifting of a systems language. Alas, Apple has worked continuously and fairly successfully at destroying both of these aspects and turning the language into a bad caricature of Java. However, although the split is arbitrary, the competing and diverging requirements are real; see Erlang's split into a functional language in the small and an object-oriented language in the large.

Unpredictable performance model

The biggest problem I have with JITs is that their performance model is extremely unpredictable. First, you don't know when optimizations are going to kick in, or when extra compilation is going to make you slower. Second, predicting which bits of code will actually be optimized well is also hard and a moving target. Combine these two factors, and you get a performance model that is somewhere between unpredictable and intractable, and therefore at best statistical: on average, your code will be faster. Probably.

While there may be domains where this is acceptable, most of the domains where performance matters at all are not of this kind, they tend to be (soft) real time. In real time systems average performance matters not at all, predictably meeting your deadline does. As an example, delivering 80 frames in 1 ms each and 20 frames in 20 ms each (480 ms total) means failure: you missed your 60 fps target 20% of the time. Delivering 100 frames in 10 ms each (1000 ms total) means success: you met your 60 fps target 100% of the time. And that despite the fact that the first scenario is more than twice as fast on average.
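
The arithmetic, as a quick sketch in Swift (assuming the 60 fps deadline of roughly 16.7 ms per frame; the scenario data is just the example above):

let deadline = 1000.0 / 60.0   // about 16.67 ms per frame at 60 fps
let scenarioA = Array(repeating: 1.0, count: 80) + Array(repeating: 20.0, count: 20)
let scenarioB = Array(repeating: 10.0, count: 100)
for (name, frames) in [("A", scenarioA), ("B", scenarioB)] {
    let total = frames.reduce(0, +)
    let missed = frames.filter { $0 > deadline }.count
    print("\(name): \(total) ms total, \(total / Double(frames.count)) ms average, \(missed)/\(frames.count) deadlines missed")
}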

I really learned this in the 90ies, when I was doing pre-press work and delivering highly optimized RIP and Postscript processing software. I was stunned when I heard about daily newspapers switching to pre-rendered, pre-screened bitmap images for their workflows. This is the most inefficient format imaginable for pre-press work, with each page typically taking around 140 MB of storage uncompressed, whereas the Postscript source would typically be between 1/10th and 1/1000th of the size. (And at the time, 140MB was a lot even for disk storage, never mind RAM or network capacity.)

The advantage of pre-rendered bitmaps is that your average case is also your worst case. Once you have provisioned your infrastructure to handle this case, you know that your tech stack will be able to deliver your newspaper on time, no matter what the content. With Postscript (and later PDF) workflows, your average case is much better (and your best case ridiculously so), but you simply don't get any bonus points for delivering your newspaper early. You just get problems if it's late, and you are not allowed to average the two.

"Eve could survive and be useful even if it were never faster than, say, Excel. The Eve IDE, on the other hand, can't afford to miss a frame paint. That means Imp must be not just fast but predictable - the nemesis of the SufficientlySmartCompiler."
I also saw this effect in play with Objective-C and C++ projects: despite the fact that Objective-C's primitive operations are generally more expensive, projects written in Objective-C often had better performance than comparable C++ projects, because Objective-C's performance model was so much simpler, more obvious and more predictable.

When Apple was still pushing the Java bridge, Sun engineers did a stint at a WWDC to explain how to optimize Java code for the Hotspot JIT. It was comical. In order to write fast Java code, you effectively had to think of the assembler code that you wanted to get, then write the Java code that you thought might net that particular bit of machine code, taking into account the various limitations of the JIT. At that point, it is a lot easier to just write the damn assembly code. And vastly more predictable, what you write is what you get.

Modern JITs are capable of much more sophisticated transformations, but what the creators of these advanced optimizers don't realize is that they are making the problem worse rather than better. The more they do, the less predictable the code becomes.

The same, incidentally, applies to SufficientlySmart AOT compilers such as the one for the Swift language, though the problem is not quite as severe as with JITs because you don't have the dynamic component. All these things are well-intentioned but all-in-all counter-productive.

Conclusion

Although the idea of Just in Time Compilers was very good, their area of applicability, which was always smaller than imagined and/or claimed, has shrunk ever further due to advances in technology, changing performance requirements and the realization that for most performance critical tasks, predictability is more important than average speed. They are therefore slowly being phased out in favor of simpler, generally faster and more predictable AOT compilers. Although they are unlikely to go away completely, their significance will be drastically diminished.

Alas, the idea that writing high-level code without any concessions to performance (often justified by misinterpreting or simply just misquoting Knuth) and then letting a sufficiently smart compiler fix it lives on. I don't think this approach to performance is viable; more predictability is needed, and a language with a hybrid nature and the ability for the programmer to specify behavior-preserving transformations that alter the performance characteristics of code is probably the way to go for high-performance, high-productivity systems. More on that another time.

What do you think? Are JITs on the way out or am I on crack? Should we have a more manual way of influencing performance without completely rewriting code or just trusting the SmartCompiler?

Update: Nov. 13th 2017

The Mono Project has just announced that they are adding a byte-code interpreter: "We found that certain programs can run faster by being interpreted than being executed with the JIT engine."

Update: Mar. 25th 2021

Facebook has created an AOT compiler for JavaScript called Hermes. They report significant performance improvements and reductions in memory consumption (with of course some trade-offs for specific benchmarks).

Speaking of JavaScript, Google launched Ignition in 2017, a JS bytecode interpreter for faster startup and better performance on low-memory devices. It also did much better in terms of pure performance than they anticipated.

So in 2019, Google introduced their JIT-Less JS engine. Not even an AOT compiler, just a pure interpreter. While seeing slowdowns in the 20-80% range on synthetic benchmarks, they report real-world performance only a few percent lower than their high-end, full-throttle TurboFan JIT.

Oh, how could I forget WebAssembly? "One of the reasons applications target WebAssembly is to execute on the web with predictable high performance."

I've also been told by People Who Know™ that the super-amazing Graal JIT VM for Java is primarily used for its AOT capabilities by customers. This is ironic, because the aptly named Graal pretty much is the Sufficiently Smart Compiler that was always more of a joke than a real goal...but they achieved it nonetheless, and as a JIT no less! And now that we have it, it turns out customers mostly don't care. Instead, they care about predictability, start-up performance and memory consumption.

Last but not least, Apple's Rosetta 2 translator for running Intel binaries on ARM is an AOT binary-to-binary translator. And it is much faster, relatively, than the original Rosetta or comparable JIT-based binary translators, with translated Intel binaries clocking in at around 80% of the speed of native ARM binaries. Combined with the amazing performance of the M1, this makes M1 Macs frequently faster than Intel Macs at running Intel Mac binaries. (Of course it also includes a (much slower) JIT component as a fallback when the AOT compiler can't figure things out. Like when the target program is itself a JIT...)

Update: Oct. 3rd 2021

It appears that 45% of CVEs for V8 and more than half of "in the wild" Chrome exploits abused a JIT bug, which is why Microsoft is experimenting with what they call SDSM (Super Duper Secure Mode). What's SDSM? JavaScript without the JIT.

Unsurprisingly at this point, performance is affected far less than one might have guessed, even for the benchmarks, and even less in real-world tasks: "Anecdotally, we find that users with JIT disabled rarely notice a difference in their daily browsing."

Update: Mar. 30th 2023

To my eyes, this looks like startup costs, much of which will be JIT startup costs.

HN user jigawatts appears to confirm my suspicion:

Teams is an entire web server written in an interpreted language compiled on the fly using a "Just in Time" optimiser (node.js + V8) -- coupled to a web browser that is basically an entire operating system. You're not "starting Teams.exe", you are deploying a server and booting an operating system. Seen in those terms, 9 seconds is actually pretty good. Seen in more... sane terms, showing 1 KB of text after 9 seconds is about a billion CPU instructions needed per character.
It also looks to be the root problem, or at least one of the root problems, with Visual Studio's launch time.

Update: Apr. 1st 2023

"One of the most annoying features of Julia is its latency: The laggy unresponsiveness of Julia after starting up and loading packages." -- Julia's latency: Past, present and future

I find it interesting that the term "JIT" is never mentioned in the post, I guess it's just taken as a given.

I am pretty sure I've missed some developments.

Wednesday, June 17, 2015

Protocol-Oriented Programming is Object-Oriented Programming

Crusty here, I just saw that my young friend Dave Abrahams gave a talk that was based on a little keyboard session we had just a short while ago. Really sharp fellow, you know, I am sure he'll go far someday, but that's the problem with young folk these days: they go rushing out telling everyone what they've learned when the lesson is only one third of the way through.

You see, I was trying to impart some wisdom on the fellow using the old Hegelian dialectic: thesis, antithesis, synthesis. And yes, I admit I wasn't completely honest with him, but I swear it was just a little white lie for a good educational cause. You see, I presented ADT (Abstract Data Type) programming to him and called it OOP. It's a little ruse I use from time to time, and decades of Java, C++ and C# have gone a long way to making it an easy one.

Thesis

So the thesis was simple: we don't need all that fancy shmancy OOP stuff, we can just use old fashioned structs 90% of the time. In fact, I was going to show him how easy things look in MC68K assembly language, with a few macros for dispatch, but then thought better of it, because he might have seen through my little educational ploy.

Of course, a lot of what I told him was nonsense. For example, OOP isn't at all about subclassing: the guy who coined the term, Alan I think, wrote: "So I decided to leave out inheritance as a built-in feature until I understood it better." So not only is inheritance not the defining feature of OOP, as I had let on, it actually wasn't even in the original conception of the thing that was first called "object-oriented programming".

Absolute reliance on inheritance and therefore structural relationships is, in fact, a defining feature of ADT-oriented programming, particularly when strong type systems are involved. But more on that later. In fact, OOP best practices have always (since the late 80ies and early 90ies) called for composition to be used for known axes of customization, with inheritance used for refinement, when a component needs to be adapted in a more ad-hoc fashion. If that knowledge had filtered down to young turks writing their master's thesis back in what, 1997, you can rest assured that the distinction was well known and not exactly rocket science.

Anyway, I kept all that from Dave in order to really get him excited about the idea I was peddling to him, and it looks like I succeeded. Well, a bit too well, maybe.

Antithesis

Because the idea was really to first get him all excited about not needing OOP, and then turn around and show him that all the things I had just shown him in fact were OOP. And still are, as a matter of fact. Always have been. It's that sort of confusion of conflicting truth-seeming ideas that gets the gray matter going. You know, "sound of one hand clapping" kind of stuff.

The reason I worked with him on a little graphics context example was, of course, that I had written a graphics context wrapper on top of CoreGraphics a good three years ago. In Objective-C. With a protocol defining the, er, protocol. It's called MPWDrawingContext and lives on github, but I also wrote about it, showed how protocols combine with blocks to make CoreGraphics patterns easy and intuitive to use and how to combine this type of drawing context with a more advanced OO language to make live coding/drawing possible. And of course this is real live programming, not the "not-quite-instant replay" programming that is all that Swift playgrounds can provide.

The simple fact is that actual Object Oriented Programming is Protocol Oriented Programming, where Protocol means a set of messages that an object understands. In a true and pure object oriented language like Smalltalk, it is all that can be, because the only way to interact with an object is to send messages. Even if you do simple metaprogramming like checking the class, you are still sending a message. Checking for object identity? Sending a message. Doing more intrusive metaprogramming like "directly" accessing instance variables? Message. Control structures like if and while? Message. Creating ranges? Message. Iterating? Message. Comparing object hierarchies? I think you get the drift.

So all interacting is via messages, and the set of messages is a protocol. What does that make OO? Say it together: Protocol Oriented Programming.

Synthesis

So we don't need objects when we have POP, but at the same time POP is OOP. Confused? Well, that's kind of the point of a good dialectic argument.

One possible solution to the conflict could be that we don't need any of this stuff. C, FORTRAN and assembly were good enough for me, they should be good enough for you. And that's true to a large extent. Excellent software was written using these tools (and ones that are much, much worse!), and tooling is not the biggest factor determining success or failure of software projects.

On the other hand, if you want to look beyond what OOP has to offer, statically typed ADT programming is not the answer. It is the question that OOP answered. And statically typed ADT programming is not Protocol Oriented Programming, OOP is POP. Repeat after me: OOP is POP, POP is OOP.

To go beyond OOP, we actually need to go beyond it, not step back in time to the early 90ies, forget all we learned in the meantime and declare victory. My personal take is that our biggest challenges are in "the big", meaning programming in the large. How to connect components together in a meaningful, tractable and understandable fashion. Programming the components is, by and large, a solved problem, making it a tiny bit better may make us feel better, but it won't move the needle on productivity.

Making architecture malleable, user-definable and thus a first class citizen of our programming notation, now that is a worthwhile goal and challenge.

Crusty out.

As always, comments welcome here and on HN.

Sunday, June 7, 2015

Steve Jobs on Swift

No, there is no actual evidence of Steve commenting on Swift. However, he did say something about the road to sophisticated simplicity.

In short, at first you think the problem is easy because you don't understand it. Then you begin to understand the problem and everything becomes terribly complicated. Most people stop there, and Apple used to make fun of the ones that do.

To me this is the perfect visual illustration of the crescendo of special cases that is Swift.

The answer to this, according to Steve, is "[..] a few people keep burning the midnight oil and finally understand the underlying principles of the problem and come up with an elegantly simple solution for it. But very few people go the distance to get there."

Apple used to be very much about going that distance, and I don't think Swift lives up to that standard. That doesn't mean it's all bad or that it's completely irredeemable, there are good elements. But they stopped at sophisticated complexity. And "well, it's not all bad" is not exactly what Apple stands for or what we as Apple customers expect and, quite frankly, deserve. And had there been a Steve in Dev Tools, he would have said: do it again, this is not good enough.

As always, comments welcome here or on HN

Thursday, March 19, 2015

Why overload operators?

One of the many things that's been puzzling me for a long time is why operator overloading appears to be at the same time problematic and attractive in languages such as C++ and now Swift. I know I certainly feel the same way, it's somehow very cool to massage the language that way, but at the same time the thought of having everything redefined underneath me fills me with horror, and what little I've seen and heard of C++ with heavy overloading confirms that horror, except for very limited domains. What's really puzzling is that binary messages in Smalltalk, which are effectively the same feature (special characters like *, + etc. can be used as message names taking a single argument), do not seem to have either of these effects: they are neither particularly attractive to Smalltalk programmers, nor are their effects particularly worrisome. Odd.

Of course we simply don't have that problem in C or Objective-C: operators are built-in parts of the language, and neither the C part nor the Objective part has a comparable facility, which is a large part of the reason we don't have a useful number/magnitude hierarchy in Objective-C and numeric/array libraries aren't that popular: writing [number1 multipliedBy:number2] is just too painful.

Some recent articles and talks that dealt with operator overloading in Apple's new Swift language just heightened my confusion. But as is often the case, that heightened confusion seems to have been the last bit of resistance that pushed through an insight.

Anyway, here is an example from NSHipster Matt Thompson's excellent post on Swift Operators, an operator for exponentiation wrapping the pow() function:

func ** (left: Double, right: Double) -> Double {
    return pow(left, right)
}
This is introduced as "the arithmetic operator found in many programming languages, but missing in Swift [is **]". Here is an example of the difference:
pow( left, right )
left ** right
pow( 2, 3 )
2 ** 3
How come this is seen as an improvement (and to me it does seem like one)? There are two candidates for what the difference might be: the fact that the operation is now written in infix notation, and that it's using special characters. Do these two factors contribute evenly, or is one more important than the other? Let's look at the same example in Smalltalk syntax, first with a normal keyword message and then with a binary message (Smalltalk uses raisedTo:, but let's stick with pow: here to make the comparison similar):
left pow: right.
left ** right.
2 pow: 3.
2 ** 3.
To my eyes at least, the binary-message version is no improvement over the keyword message, in fact it seems somewhat worse to me. So the attractiveness of infix notation appears to be a strong candidate for why operator overloading is desirable. Of course, having to use operator overloading to get infix notation is problematic, because special characters generally do not convey the meaning of the operation nearly as well as names, conventional arithmetic aside.

Note that dot notation for message sends/method calls does not really seem to have the same effect, even though it could technically also be considered an infix notation:

left.pow( right)
left ** right
2.pow( 3 )
2 ** 3
There is more anecdotal evidence. In Chris Eidhof's talk on functional Swift, scrub to around the 10 minute mark. There you'll find the following code with some nested and curried function calls:
let result = colorOverlay( overlayColor)(blur(blurRadius)(image))
"This does not look to nice [..] it gets a bit unreadable, it's hard to see what's going on" is the quote.
Having a special compose function doesn't actually make it better
let myFilter = composeFilters(blur(blurRadius),colorOverlay(overlayColor))
let result = myFilter(image)
Infix to the rescue! Using the |> operator:
let myFilter = blur(blurRadius) |> colorOverlay(overlayColor)
let result = myFilter(image)
Chris is very fair-minded about this: he mentions that due to the special characters involved, you can't really infer what |> means from looking at the code, you have to know, and that having many of these sorts of operators makes code effectively incomprehensible. Or as one Twitter user put it: Like most things in engineering, it's a trade-off, though my guess is the trade-off would shift if we had infix without requiring non-sensical characters.
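
For readers who haven't seen such a definition, the operator itself is only a few lines of Swift. A minimal sketch of a left-to-right composition operator, so that (f |> g)(x) is g(f(x)); this is not Chris's actual definition, and the precedence group name and the Int example are made up:

precedencegroup CompositionPrecedence {
    associativity: left
}
infix operator |> : CompositionPrecedence

func |> <A, B, C>(f: @escaping (A) -> B, g: @escaping (B) -> C) -> (A) -> C {
    return { x in g(f(x)) }
}

// With plain Int functions standing in for blur(blurRadius) and colorOverlay(overlayColor):
let double = { (x: Int) in x * 2 }
let addOne = { (x: Int) in x + 1 }
let doubleThenAddOne = double |> addOne
print(doubleThenAddOne(3))   // 7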

Built in
I do believe that there is another factor involved, one that is more psychologically subtle, having to do with the idea of language as a (pre-defined) thing vs. a mechanism for building your own abstractions, which I mentioned in my previous post on Swift performance.

In that post, I mentioned BASIC as the primary example of the former, a language as a collection of built-in features, with C and Pascal as (early) examples of the latter, languages as generic mechanisms for building your own features. However, those latter languages don't treat all constructs equally. Specifically, all the operators are built-in, not user-definable or -overridable. They also correspond closely to those operations that are built into the underlying hardware and map to single instructions in assembly language. In short: even in languages with a strong "user-defined" component, there is a hard line between "user-defined" and "built-in", and that line just happens to map almost 1:1 to the operator/function boundary.

Hackers don't like boundaries. Or rather: they love boundaries, the overcoming of. I'd say that overloaded operators are particularly attractive (to hacker mentalities, but that's probably most of us) in languages where this boundary between user-defined and built-in stuff exists, because overloaded operators let you cross that boundary and do things normally reserved for language implementors.

If you think this idea is too crazy, listen to John Siracusa, Guy English and Rene Ritchie discussing Swift language features and operator overloading on Debug Podcast Number 49, Siracusa Round 2, starting at 45:45. I've transcribed a bit below, but I really recommend you listen to the podcast, it's very good:

  • 45:45 Swift is a damning comment on C++ [benefits without the craziness]
  • 46:06 You can't do what Swift did [putting basic types in the standard library] without operator overloading. [That's actually not true, because in Swift the operators are just syntax -> but it is exactly the idea I talked about earlier]
  • 47:50 If you're going to add something like regular expressions to the language ... they should have some operators of their own. That's a perfect opportunity for operator overloading
  • 48:07 If you're going to add features to the language, like regular expressions or so [..] there is well-established syntax for this from other languages.
  • 48:40 ...or range operators. Lots of languages have range operators these days. Really it's just a function call with two different operands. [..] You're not trying to be clever. All you're trying to do is make it natural to use features that exist in many other languages. The thing about Swift is you don't have to add syntax to the language to do it. Because it's so malleable. If you're not adding a feature, like I'm adding regular expressions to the language. If you're not doing that, don't try to get clever. Consider the features as existing for the benefit of the expansion of the language, so that future features look natural in it and not bolted on, even though technically everything is in a library. Don't think of it as: in my end user code I'm going to come up with symbols that combine my types in novel ways, because what are you even doing there?
  • 50:17 if you have a language like this, you need new syntax and new behavior to make it feel natural. [new behavior strings array] and it has the whole struct thing. The basics of the language, the most basic things you can do, have to be different, look different and behave different for a modern language.
  • 51:52 "using operator overloading to add features to the language" [again, not actually true]
The interesting thing about this idea of a boundary between "language things" and "user things" is that it does not align with the boundary between "operators" and "named operators" in Swift, but apparently it still feels like it does, so "extending the language" is seen as roughly equivalent to "adding some operators", with all the sound caveats that apply.

In fact, going back to Matt Thompson's article from above, it is kind of odd that he talks about the exponentiation operator as missing from the language, when in fact the operation is available in the language. So if the operation crosses the boundary from function to operator, then and only then does it become part of the language.

In Smalltalk, on the other hand, the boundary has disappeared from view. It still exists in the form of primitives, but those are well hidden all over the class hierarchy and not something that is visible to the developer. So in addition to having infix notation available for named operations, Smalltalk doesn't have the notion of something being "part of the language" rather than "just the library" just because it uses non-sensical characters. Everything is part of the library, the library is the language and you can use names or special characters as appropriate, not because of asymmetries in the language.

And that's why operator overloading is a thing even in languages like Swift, whereas it is a non-event in Smalltalk.

Wednesday, September 10, 2014

collect is what for does

I recently stumbled on Rob Napier's explanation of the map function in Swift. So I am reading along yadda yadda when suddenly I wake up and my eyes do a double take:
After years of begging for a map function in Cocoa [...]
Huh? I rub my eyes, probably just a slip up, but no, he continues:
In a generic language like Swift, “pattern” means there’s a probably a function hiding in there, so let’s pull out the part that doesn’t change and call it map:
Not sure what he means by a "generic language", but here's how we would implement a map function in Objective-C.
#import <Foundation/Foundation.h>

typedef id (*mappingfun)( id arg );

static id makeurl( NSString *domain ) {
  return [[[NSURL alloc] initWithScheme:@"http" host:domain path:@"/"] autorelease];
}

NSArray *map( NSArray *array, mappingfun theFun )
{
  NSMutableArray *result=[NSMutableArray array];
  for ( id object in array ) {
    id objresult=theFun( object );
    if ( objresult ) {
       [result addObject:objresult];
    }
  }
  return result;
}

int main(int argc, char *argv[]) {
  NSArray *source=@[ @"apple.com", @"objective.st", @"metaobject.com" ];
  NSLog(@"%@",map(source, makeurl ));
}

This is less than 7 non-empty lines of code for the mapping function, and took me less than 10 minutes to write in its entirety, including a trip to the kitchen for an extra cookie, recompiling 3 times and looking at the qsort(3) manpage because I just can't remember C function pointer declaration syntax (though it took me less time than usual, maybe I am learning?). So really, years of "begging" for something any mildly competent coder could whip up between bathroom breaks or during a lull in their twitter feed?

Or maybe we want a version with blocks instead? Another 2 minutes, because I am a klutz:


#import <Foundation/Foundation.h>

typedef id (^mappingblock)( id arg );

NSArray *map( NSArray *array, mappingblock theBlock )
{
  NSMutableArray *result=[NSMutableArray array];
  for ( id object in array ) {
    id objresult=theBlock( object );
    if ( objresult ) {
       [result addObject:objresult];
    }
  }
  return result;
}

int main(int argc, char *argv[]) {
  NSArray *source=@[ @"apple.com", @"objective.st", @"metaobject.com" ];
  NSLog(@"%@",map(source, ^id ( id domain ) {
    return [[[NSURL alloc] initWithScheme:@"http" host:domain path:@"/"] autorelease];
        }));
}

Of course, we've also had collect for a good decade or so, which turns the client code into the following, much more readable version (Objective-Smalltalk syntax):
NSURL collect URLWithScheme:'http' host:#('objective.st' 'metaobject.com') each path:'/'.
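
For comparison, the same mapping as in the Objective-C map() examples above looks like this with Swift's built-in machinery. A small sketch using compactMap and the modern URL(string:) initializer, not code from Rob's post:

import Foundation

let source = ["apple.com", "objective.st", "metaobject.com"]
// compactMap drops nil results, mirroring the nil check in the Objective-C map() above.
let urls = source.compactMap { URL(string: "http://\($0)/") }
print(urls)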

As I wrote in my previous post, we seem to be regressing to a mindset about computer languages that harkens back to the days of BASIC, where everything was baked into the language, and things not baked into the language or provided by the language vendor do not exist.

Rob goes on to write "The mapping could be performed in parallel [..]", for example like parcollect? And then "This is the heart of good functional programming." No. This is the heart of good programming.
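
And for what it's worth, a parallel variant is also only a handful of lines. A rough sketch in Swift using GCD (this is not parcollect, and it is deliberately naive):

import Foundation

func parallelMap<T, U>(_ input: [T], _ transform: (T) -> U) -> [U] {
    var results = [U?](repeating: nil, count: input.count)
    let writeQueue = DispatchQueue(label: "parallelMap.results")   // serializes the writes
    DispatchQueue.concurrentPerform(iterations: input.count) { i in
        let value = transform(input[i])          // the transforms run concurrently
        writeQueue.sync { results[i] = value }   // store each result without racing
    }
    return results.map { $0! }
}

print(parallelMap([1, 2, 3, 4]) { $0 * $0 })   // [1, 4, 9, 16]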

Having processed that shock, I fly over a discussion of filter (select) and stumble over the next whopper:

It’s all about the types

Again...huh?? Our map implementation certainly didn't need (static) types for the list, and all the Smalltalkers and LISPers that have been gleefully using higher order techniques for 40-50 years without static types must also not have gotten the memo.

We [..] started to think about the power of functions to separate intent from implementation. [..] Soon we’ll explore some more of these transforming functions and see what they can do for us. Until then, stop mutating. Evolve.
All modern programming separates intent from implementation. Functions are a fairly limited and primitive way of doing so. Limiting power in this fashion can be useful, but please don't confuse the power of higher order programming with the limitations of functional programming, they are quite distinct.

Thursday, June 26, 2014

How to Swiftly Destroy a $370 Million Dollar Rocket with Overflow "Protection"

Apple's new Swift programming language has been heavily promoted as being a safer alternative to Objective-C, with a much stronger emphasis on static typing, for example. While I am dubious about the additional safety of static typing (I argue that it produces far more safyness than actual safety), this post is going to look at a different feature: overflow protection.

Overflow protection means that when an arithmetic operation on an integer exceeds the maximum value for that integer type, the value doesn't wrap around as it does on most CPU ALUs, and by extension in C. Instead, the program signals an exception, and since Swift has no exception handling, the program crashes.

While this looks a little like the James Bond anti-theft device in For Your Eyes Only, which just blows up the car, the justification is that the program should be protected from operating on values that have become bogus. While I understand the reasoning, I am dubious that it really is safer to have every arithmetic operation on integers, and every conversion from higher precision to lower, in the entire program become a potential crash site, when before those operations could never crash (except for division by zero).

While it would be interesting to see what evidence there is for this argument, I can give at least one very prominent example against it. On June 4th 1996, ESA's brand new Ariane 5 rocket blew up during launch, due to a software fault, with a total loss of US $370 million, apparently one of the most expensive software faults in history. What was that software fault? An overflow protection exception triggered by a floating point to (short) integer conversion.

The resulting core-dump/diagnostics were then interpreted by the next program in line as valid data, causing effectively random steering inputs that caused the rocket to break up (and self destruct when it detected it was breaking up).

What's interesting is that almost any other handling of the overflow apart from raising an exception would have been OK and saved the mission and $370 million. Silently truncating/clamping the value to the maximum permissible range (which some in the static typing community incorrectly claim was the problem) would have worked perfectly and was the actual solution used for other values.

Even wraparound might have worked, at least there would have been only one bogus transition after which values would have been mostly monotonic again. Certainly better than effectively random values.

Ideally, the integer would have just overflowed into a higher precision, as in a dynamic language such as Smalltalk, or even Postscript. Even JavaScript's somewhat wonky idea that all numbers are floats (but some just don't know it yet) would have been better in this particular case. Considering the limitations of the hardware those languages weren't options, but nowadays the required computational horsepower is there.
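
For reference, the first two alternatives are easy to express in Swift itself. A small sketch, nothing to do with the actual Ariane code:

// Wraparound: the &* / &+ overflow operators wrap instead of trapping.
let wrapped: Int16 = 30_000 &* 2                                          // -5536, no crash

// Clamping: pin the value to the representable range before converting.
let big = 3_123_123.0 * 3_123_123.0
let clamped = Int16(max(Double(Int16.min), min(Double(Int16.max), big)))  // 32767

print(wrapped, clamped)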

In Ada you at least could potentially trap the exception generated by overflow, but in Swift the only protection is to manually trace back the inputs of every arithmetic operation on integers and enforce ranges for all possible combinations of inputs that do not result in that operation overflowing. For any program with external inputs and even slightly complex data paths and arithmetic, I would venture to say that that is next to impossible.

The only viable method for avoiding arithmetic overflow is to not use integer arithmetic with any external input, ever. Hello JavaScript!

You can try the Ada code with GNAT, or online:

with Ada.Text_IO,Ada.Integer_Text_IO;
use Ada.Text_IO,Ada.Integer_Text_IO;
procedure Hello is
  b : FLOAT;
  a : INTEGER;
begin
  b:=3123123.0;
  b:=b*b;
  a:=INTEGER(b);
  
  Put("a=");
  Put(a);
end Hello;
You can watch your Swift playground crash using the following code:

var a = 2
var b:Int16
for i in 1..100 {
  a=a*2
  println(a)
  b=Int16(a)
}
Note that neither the Ada nor Swift compilers have static checks that detect the overflow, even when all the information is statically available, for example in the following Swift code:

var a:UInt8
a = 254
println(a)
a += 2
println(a)
What's even worse is that the -Ofast flag will remove the checks: the integer will just wrap around. Optimization flags in general should not change visible program behavior, except for performance. Or maybe this is good: since it looks like we need that flag to get decent performance at all, we also remove the overflow crashers...

Discuss here or on Hacker News.