Tuesday, March 19, 2019

LISP Macros, Delayed Evaluation and the Evolution of Smalltalk

At a recent Clojure Berlin Meetup, Veit Heller gave an interesting talk on Macros in Clojure. The meetup was very enjoyable, and the talk also brought me a little closer to understanding the relationship between functions and macros and a bit of Smalltalk evolution that had so far eluded me.

The question, which has been bugging me for some time, is when do we actually need metaprogramming facilities like macros, and why? After all, we already have functions and methods for capturing and extracting common functionality. A facile answer is that "Macros extend the language", but so do functions, in their way. Another answer is that you have to use Macros when you can't make progress any other way, but that doesn't really answer the question either.

The reason the question is relevant is, of course, that although it is fun to play around with powerful mechanisms, we should always use the least powerful mechanism that will accomplish our goal, as it will be easier to program with, easier to understand, easier to analyse and build tools for, and easier to maintain.

Anyway, the answer in this case seemed to be that macros were needed in order to "delay evaluation", that is, to send unevaluated parameters to the macros. A quick question to the presenter confirmed that this was the case for most of the examples. Which raises the question: if we had a generic mechanism for delaying evaluation, could we have used plain functions (or methods) instead? And indeed the answer was that this was the case.

One of the examples was a way to build your own if, which most languages have built in, but Smalltalk famously implements in the class library: there is an ifTrue:ifFalse: message that takes two blocks (closures) as parameters. The True class evaluates the first block parameter and ignores the second, the False class evaluates the second block parameter and ignores the first.
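
As a rough illustration of the mechanism just described, here is a minimal sketch in Python rather than Smalltalk. The names (TrueObject, FalseObject, if_true_if_false, block) are made up for this example, and closures stand in for Smalltalk blocks. The only point is that the conditional is an ordinary method and that the branch not taken is never evaluated.

```python
# Hypothetical stand-ins for Smalltalk's True and False classes.
class TrueObject:
    def if_true_if_false(self, true_block, false_block):
        # Evaluate only the first block; the second is ignored.
        return true_block()


class FalseObject:
    def if_true_if_false(self, true_block, false_block):
        # Evaluate only the second block; the first is ignored.
        return false_block()


evaluated = []  # records which branches actually ran


def block(name):
    """Build a zero-argument closure, the analogue of a Smalltalk block."""
    def body():
        evaluated.append(name)
        return name
    return body


result = TrueObject().if_true_if_false(block("then"), block("else"))
print(result, evaluated)  # then ['then']
```

Passing the branches as closures is what delays their evaluation: the caller builds both, but only the receiver decides which one to run.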

The Clojure macro example worked almost exactly the same way, but where Smalltalk uses blocks to delay evaluation, the example used macros. So where LISP might use macros, Smalltalk uses blocks. That macros and blocks might be related was new to me, and took me a while to process. Once I had processed it, a bit of Smalltalk history that I had always struggled with, this bit about Smalltalk-76, suddenly made sense:



Why did it "have to" provide such a mechanism? It doesn't say. It says this mechanism was replaced by the equivalent blocks, but blocks/anonymous functions seem quite different from alternate argument-passing mechanisms. Huh?

With this new insight, it suddenly makes sense. Smalltalk-72 just had a token stream. There were no "arguments" as such; the new method just took over parsing the token stream and picked up its parameters from there. In a sense, this is the ultimate macro system, and ultimately powerful, but also quite unusable, incomprehensible, unmaintainable and not compilable. In that system, "arguments" are by definition unevaluated, and so you can do all the macro-like magic you want.
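
To make the token-stream idea a little more concrete, here is a toy model in Python. It is not Smalltalk-72 syntax or semantics; the handler table, the `if` and `boom` words and the integer-only literals are all assumptions made up for this sketch. What it shows is a "method" that pulls its own parameters off the token stream and therefore gets them unevaluated for free.

```python
from collections import deque


def eval_next(tokens, handlers):
    """Evaluate one expression from the front of the token stream."""
    token = tokens.popleft()
    if token in handlers:
        # The handler takes over the stream and picks up its own parameters.
        return handlers[token](tokens, handlers)
    return int(token)  # in this toy, everything else is an integer literal


def toy_if(tokens, handlers):
    condition = eval_next(tokens, handlers)  # the condition is evaluated
    then_token = tokens.popleft()            # the branches are taken raw
    else_token = tokens.popleft()
    chosen = then_token if condition else else_token
    return eval_next(deque([chosen]), handlers)


def boom(tokens, handlers):
    raise RuntimeError("this branch should never be evaluated")


HANDLERS = {"if": toy_if, "boom": boom}


def run(source):
    return eval_next(deque(source.split()), HANDLERS)


print(run("if 1 42 boom"))  # 42
```

Even in this tiny version you can see the downside: nothing outside `toy_if` knows how many tokens it will consume, which is a large part of why such a system is hard to read and hard to compile.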

Dan's Smalltalk-76 effort was largely about compiling for better performance and having a stable, comprehensible and composable syntax. But there are times you still need unevaluated arguments, for example if you want to implement an if that only evaluates one of its branches, not both of them, without baking it into the language. Smalltalk did not have a macro mechanism, and it no longer had the Smalltalk-72 token-stream where un-evaluated "arguments" came for free, so yes, there "had" to be some sort of mechanism for unevaluated arguments.

Hence the open-colon syntax.

And we have a progression of: Smalltalk-72 token stream → Smalltalk-76 open colon parameters → Smalltalk-80 blocks.
All serving the purpose of enabling macro-like capabilities without actually having macros by providing a general language facility for passing un-evaluated parameters.

Aha!

Friday, March 8, 2019

Software-ICs, Binary Compatibility, and Objective-Swift

Swift recently achieved ABI stability, meaning that we can now ship Swift binaries without having to ship the corresponding Swift libraries. While it's been a long time coming, it's also great to have finally reached this point. However, it turns out that this does not mean you can reasonably ship binary Swift frameworks, for reasons described very well by Peter Steinberger of PSPDFKit and the good folks at instabug.

To reach this not-quite-there-yet state took almost 5 years, which is pretty much the total time NeXT shipped their hardware, and it mirrors the state with C++, which is still not generally suitable for binary distribution of libraries. Objective-C didn't have these problems, and as it turns out this is not a coincidence.

Software ICs

Objective-C was created specifically to implement the concept of Software-ICs. I briefly referenced the concept in a previous article, and also mentioned its relationship to the scripted components pattern, but the comments indicated that this is no longer a concept people are familiar with.

As the name suggests, the intention was to bring the benefits the hardware world had reaped from the introduction of Integrated Circuits to the software world.

It is probably hard to overstate the importance of ICs to the development of the computer industry. Instead of assembling computers from discrete components, you could now put entire subsystems onto one component, and then compose these subsystems to form systems. The interfaces are standardised pins, and the relationship between the outside interface and the complexity hidden inside can be staggering. Although the socket of the CPU I am writing this on is a beast, with 1151 pins, the chip inside has a staggering 2.1 billion transistors. With a ratio of almost two million to one, that's a very deep interface, even if you disregard the fact that the bulk of those pins are actually voltage supply and ground pins.

The important point is that you do not have to, and in fact cannot, look inside the IC. You get the pins, very much a binary interface, and the documentation, a specification sheet. With Software-ICs, the idea was the same: you get a binary, the interface and a specification sheet. Here are two BYTE articles that describe the concepts:


A lot of what they write seems quaint now, for example a MailFolder that inherits from Array(!), but the concepts are very relevant, particularly with a couple of decades worth of perspective and the new circumstances we find ourselves in.

Although the authors pretty much equate Software-ICs with objects and object-oriented programming, it is a slightly different form of object-oriented programming than the one we mostly use today. They do write about object/message programming, similar to Alan Kay's note that 'The big idea is "messaging"'.

With messaging as the interconnect, similar to Unix pipes, our interfaces are sufficiently well-defined and dynamic that we really can deliver our Software-ICs in binary form and be compatible, something our more static languages like C++ and Swift struggle with.
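
A loose analogy in Python may help show why late-bound messages make for a forgiving interconnect. This is a sketch, not how the Objective-C runtime dispatches messages; the `send` helper, the `does_not_understand` hook and the two MailFolder classes are hypothetical names invented for the example.

```python
def send(receiver, selector, *args):
    """Look the message up by name at run time, not at build time."""
    method = getattr(receiver, selector, None)
    if callable(method):
        return method(*args)
    # The receiver gets a chance to handle messages it does not know.
    return receiver.does_not_understand(selector, args)


class MailFolderV1:
    def count(self):
        return 2

    def does_not_understand(self, selector, args):
        return f"{type(self).__name__} does not understand {selector}"


class MailFolderV2(MailFolderV1):
    # A newer "binary" adds a method; existing clients are unaffected.
    def unread_count(self):
        return 1


print(send(MailFolderV1(), "count"))         # 2
print(send(MailFolderV2(), "unread_count"))  # 1
print(send(MailFolderV1(), "unread_count"))  # MailFolderV1 does not understand unread_count
```

The client only ever depends on message names, never on object layout or method tables, so a component can be swapped for a newer version without recompiling its users.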

Objective-C is middleware with language features.

Message Oriented Middleware

Virtually all systems based on static languages eventually grow an additional, separate and more dynamic component mechanism. Windows has COM, IBM has SOM, Qt has signals and slots and the meta-object system, Be had BMessages etc.

In fact, the problem of binary compatibility of C++ objects was one of the reasons for creating COM:

Unlike C++, COM provides a stable application binary interface (ABI) that does not change between compiler releases.

COM has been incredibly successful, it enables(-ed?) much of the Windows and Office ecosystems. In fact, there is even a COM implementation on macOS: CFPlugIn, part of CoreFoundation.

CFPlugIn provides a standard architecture for application extensions. With CFPlugIn, you can design your application as a host framework that uses a set of executable code modules called plug-ins to provide certain well-defined areas of functionality. This approach allows third-party developers to add features to your application without requiring access to your source code. You can also bundle together plug-ins for multiple platforms and let CFPlugIn transparently load the appropriate plug-in at runtime. You can use CFPlugIn to add plug-in capability to, or write a plug-in for, your application.

That COM implementation is still in use, for example for writing Spotlight importers. However, there are, er, issues:

Creating a new Spotlight importer is tricky because they are based on CFPlugIn, and CFPlugIn is… well, how to say this diplomatically?… super ugly )-: One option here is to use Xcode 9 to create your plug-in based on the old template. Honestly though, I don’t recommend that because the old template… again, diplomatically… well, let’s just say that the old template lets the true nature of CFPlugIn shine through! (-:

Having written both Spotlight importers and even some COM component on Windows (I think it was just for testing), I can confirm that COM's success is not due to the elegance or ease-of-use of the implementation, but due to the fact that having an interoperable, stable binary interface is incredibly enabling for a platform.

That said, all this talk of COM is a bit confusing, because we already have NSBundle.

Apple uses bundles to represent apps, frameworks, plug-ins, and many other specific types of content.

So NSBundle already does everything a CFPlugIn does and a lot more, but is really just a tiny wrapper around a directory that may contain a dynamic shared library. All the interfacing, introspection and binary compatibility features come automagically with Objective-C. In fact, NeXT had a Windows product called d'OLE that pretty automagically turned Objective-C libraries into COM-compatible OLE servers (.NET has similar capabilities). Again, this is not a coincidence: the Software-IC concept that Objective-C is based on is predicated on exactly this sort of interoperation scenario.

Objective-C is middleware with language features.

Frameworks and Microservices

To me, the idea of a Software-IC is actually somewhat higher level than a single object, I tend to see it at the level of a framework, which just so happens to provide all the trappings of a self-contained Software-IC: a binary, headers to define the interface and hopefully some documentation, which could even be provided in the Resources directory of the bundle. In addition, frameworks are instances of NSBundle, so they aren't limited to being linked into an application, they can also be loaded dynamically.

I use that capability in Objective-Smalltalk, particularly together with stsh, the Smalltalk Scripting Shell. By loading frameworks, this shell can easily be transformed into an application-specific scripting language. An example of this is pdfsh, a shell for examining and manipulating PDF files using EGOS, the Extensible Graphical Object System.


#!/usr/local/bin/stsh
#-<void>pdfsh:<ref>file
framework:EGOS_Cocoa load.
pdf := MPWPDFDocument alloc initWithData: file value.
shell runInteractiveLoop

The same binary framework is also used in PdfCompress, PostView and BookLightning. With this framework, my record for creating a drag-and-drop application to do something useful with a PDF file was 5 minutes, and the only reason I was so slow was that I thought I had remembered the PDF dictionary entry...and had not.

Framework-oriented programming is awesome, alas it was very much deprecated by Apple for quite some time, in fact even impossible on iOS until dynamic libraries were allowed. Even now, though, the idea is that you create an app, which consists of all the source-code needed to create it (exception: Apple code!), even if some of that code may be organised into framework units that otherwise don't have much meaning to the build.

Apps, however, are not Software-ICs; they aren't the right packaging technology for reuse (AppleScript notwithstanding). And so iOS and macOS development shops routinely get themselves into big messes, also known as the Big Ball of Mud architectural pattern.

Of course, there are reasons that things didn't quite work out the way we would have liked. Certainly Apple's initial Mac OS X System Architecture book showed a much more flexible arrangement, with groups of applications able to share a set of frameworks, for example. However, DLL hell is a thing, and so we got a much more restricted approach where every app is a little fortress and frameworks in general and binary frameworks in particular are really something for Apple to provide and for the rest to use. However, the fact that we didn't manage to get this right doesn't mean that the need went away.

Swift has been making this worse, by strongly "suggesting" that everything be compiled together and leading to such wonderful oxymorons as "whole module optimisation in debug mode", meaning without optimisation. That and not having a binary modularity story for going on half a decade. The reason for compiling whole modules together is that the modularity mechanism is, practically speaking, very much source-code based, with generics and specialisation etc. (Ironically, Swift also does some pretty crazy things to enable separate compilation, but that hasn't really panned out so far).

On the other hand, Swift's compiler is so slow that teams are rediscovering forms of framework-oriented programming as a self-defense mechanism. In order to get feedback cycles down from ludicrously bad to just plain awful, they split up their projects into independent frameworks that they then compile and run independently during development. So in a somewhat roundabout way, Swift is encouraging good development practices.

I find it somewhat interesting that the industry is rediscovering variants of the Software-IC, in this case on the backend in the form of Microservices. Why do I say that Microservices are a form of Software-IC? Well, they are a binary unit of deployability, fairly loosely coupled and dynamically typed. In fact, Fred George, one of the people who came up with the idea, refers to them as Smalltalk objects:

Of course, there are issues with this approach, one being that reliable method calls are replaced with unreliable network calls. Stepping back for a second should make it clear that the purported benefits of Microservices also largely apply to Software-ICs. At least real Software-ICs. Objective-C made the mistake of equating Software-ICs with objects, and while the concepts are similar with quite a bit of overlap, they are not quite the same. You certainly can use Objective-C to build and connect Software-ICs if you want to do that. It will also help you in this endeavour, but of course you have to know that this is something you want. It doesn't do this automatically and over time the usage of Objective-C has shifted to just a regular old object-oriented language, something it is OK but not that brilliant at.

Interoperability

One of the interesting aspects of Microservices is that they are language-agnostic: it doesn't matter what language a particular service is written in, as long as the services can somehow communicate via HTTP(S). This is another similarity to Software-ICs (and other middleware such as COM, SOM, etc.): there is a fairly narrowly defined, simple interface, and as long as you can somehow service that interface, you can play.

Microservices are pretty good at this, Unix filters are probably the most extreme example, and just about every language and every kind of application on Windows can talk to and via COM. NeXT only ever sold 50000 computers, but within a few years the NeXT community had bridges to just about every language imaginable. There were a number of Objective- languages, including Objective-Fortran. Apple alone has around 140K employees (though probably a large number of those in retail), and there are over 2.8 million iOS developers, yet the only language integration for Swift I know of is the Python support, and that took significant effort, compiler changes and the original Swift creator, Chris Lattner.

This is not a coincidence. Swift is designed as a programming language, not as middleware with language features. Therefore its modularity features are an add-on to the language, and try to transport the full richness of that programming model. And Swift's programming model is very rich.

Objective-Swift

The middlewares I've talked about use the opposite approach, from the outside in. For SOM, it is described as such:
SOM allows classes of objects to be defined in one programming language and used in another, and it allows libraries of such classes to be updated without requiring client code to be recompiled.

So you define interfaces separately from their implementations. I am guessing this is part of the reason we have @interface in Objective-C. Having to write things down twice can be a pain (and I've worked on projects that auto-generated Objective-C headers from implementation files), but having a concrete manifestation of the interface that precedes the implementation is also very valuable. (One of the reasons TDD is so useful is that it also forces you to think about the interface to your object before you implement it).
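
For readers who have not worked with separately declared interfaces, here is a small Python sketch of the outside-in idea using `typing.Protocol`. The `Importer` interface, the `PlainTextImporter` class and the `run_importer` function are hypothetical names for this example, not anything from SOM, COM or Objective-C.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Importer(Protocol):
    """The interface, written down first and on its own."""

    def import_file(self, path: str) -> dict: ...


class PlainTextImporter:
    """One implementation; it never mentions Importer at all."""

    def import_file(self, path: str) -> dict:
        return {"path": path, "kind": "text"}


def run_importer(importer: Importer, path: str) -> dict:
    # The host only relies on what the interface promises.
    return importer.import_file(path)


print(run_importer(PlainTextImporter(), "notes.txt"))
# {'path': 'notes.txt', 'kind': 'text'}
```

The host code is written against the interface alone, so any component that answers `import_file` can be plugged in, whatever it looks like inside.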

In Swift, a class is a single implementation-focused entity, with its interface at best a second-class and second-order effect. This makes writing components more convenient (no need to auto-generate headers...), but connecting components is more complicated.

Which brings us back to that other complication, the lack of stable binary compatibility for libraries and frameworks. One workaround is to write frameworks exclusively in Objective-C, which was particularly necessary before ABI stability had been reached. The other workaround, if you have Swift code, is to have an Objective-C wrapper as an interface to your framework. The fact that Swift interoperates transparently with the Objective-C runtime makes this fairly straightforward.

Did I mention that Objective-C is middleware with language features?

So maybe this supposed "workaround" is actually the solution? Have Objective-C or Objective- as our message-oriented middleware, the way it was always intended? Maybe with a bit of a tweak so that it loses most of the C legacy and gains support for pipes and filters, REST/Microservices and other architectural patterns?

Just sayin'.