Monday, November 11, 2019

What Alan Kay Got Wrong About Objects

One of the anonymous reviewers of my recently published Storage Combinators paper (pdf) complained that hiding disk-based, remote, and local abstractions behind a common interface was a bad idea, citing Jim Waldo's A Note on Distributed Computing.

Having read both this and the related 8 Fallacies of Distributed Computing a while back, I didn't see how this would apply, and re-reading confirmed my vague recollections: these are about the problems of scaling things up from the local case to the distributed case, whereas Storage Combinators and In-Process REST are about scaling things down from the distributed case to the local case. The Waldo paper in particular is also very specifically about objects and messages; REST is a different beast.

And of course, scaling things down happens to be a time-honored tradition with a pretty good track record:

In computer terms, Smalltalk is a recursion on the notion of computer itself. Instead of dividing "computer stuff" into things each less strong than the whole—like data structures, procedures, and functions which are the usual paraphernalia of programming languages—each Smalltalk object is a recursion on the entire possibilities of the computer. Thus its semantics are a bit like having thousands and thousands of computers all hooked together by a very fast network.
Mind you, I think this is absolutely brilliant: in order to get something that will scale up, you simply start with something large and then scale it down!

But of course, this is not what actually happened. As we all experienced, scaling local objects and messaging up to the distributed case did not (CORBA, SOAP, ...), and, as Waldo explains, cannot, in fact, work. What gives?

My guess is that the method described wasn't actually used: when Alan came up with his version of objects, there were no networks with thousands of computers. So Alan could not actually look at how they communicated; he had to imagine it. It was a Gedankenexperiment. And thus objects and messages were not a scaled-down version of an actual larger thing, but a scaled-down version of an imagined larger thing.

Today, we do have a large network of computers, with not just thousands but billions of nodes. And they communicate via HTTP using the REST architectural style, not via distributed objects and messages.

So maybe if we took that communication model and scaled it down, we might be able to do even better than objects and messages, which already did pretty brilliantly. Hence In-Process REST, Polymorphic Identifiers and Storage Combinators, and yes, the results look pretty good so far!
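The scaled-down idea can be sketched in a few lines. This is a hypothetical illustration, not the actual Storage Combinators API from the paper: stores answer a tiny, uniform verb set (here just `get`/`put`) over hierarchical paths, and a combinator wraps another store behind the very same interface.

```python
# Hypothetical sketch of "in-process REST": uniform get/put verbs over
# paths, with stores and combinators sharing one interface. All names
# here are illustrative, not the paper's actual API.

class DictStore:
    """An in-memory store addressed by hierarchical path, REST-style."""
    def __init__(self):
        self._data = {}

    def get(self, path):
        return self._data[path]

    def put(self, path, value):
        self._data[path] = value


class UppercasingStore:
    """A combinator: wraps another store, transforming values on the way out."""
    def __init__(self, source):
        self._source = source

    def get(self, path):
        return self._source.get(path).upper()

    def put(self, path, value):
        self._source.put(path, value)


store = UppercasingStore(DictStore())
store.put("/users/42/name", "alice")
print(store.get("/users/42/name"))  # ALICE
```

Because the wrapper exposes the same verb/path interface as the thing it wraps, clients cannot tell (and need not care) whether they are talking to a plain store, a transforming combinator, or, in principle, something disk-based or remote.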

The big idea is "messaging" -- that is what the kernal of Smalltalk/Squeak is all about (and it's something that was never quite completed in our Xerox PARC phase). The Japanese have a small word -- ma -- for "that which is in between" -- perhaps the nearest English equivalent is "interstitial". The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. Think of the internet -- to live, it (a) has to allow many different kinds of ideas and realizations that are beyond any single standard and (b) to allow varying degrees of safe interoperability between these ideas.

So of course Alan is right after all, just not about objects and messages, which are too specific: "ma", or "interstitialness" or "connector" is the big idea, messaging is just one incarnation of that idea.

5 comments:

SD said...

An HTTP request *is* a message.

Anonymous said...

The part I'm not following is the REST vs. Objects and Messages distinction...

REST is just leveraging HTTP messages for your API.
An HTTP server is an Object.
HTTP requests and responses are just Messages.

So REST *is* sending messages to objects.

Marcel Weiher said...

Good points. In addition, the underlying TCP/IP protocols are also sending messages, and of course the In-Process REST APIs are also implemented as messages.

And if you go further, it's all Lambda Calculus, or Turing machines. Or NAND gates.

The crucial point is that the "is implemented by" relation is not the identity function. (In-Process) REST is built on top of messaging, just as it is built on top of NAND gates, but it imposes additional constraints, and these constraints are crucial.

Just like Unix pipes and filters are built on top of C functions, but require those C functions to be used in very particular ways in order to participate in the pipes-and-filters model.
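The constraint-on-top-of-functions point can be sketched as follows (a hypothetical illustration in Python rather than C): these are ordinary functions, but they become composable filters only by adhering to a shared convention — an iterator of lines in, an iterator of lines out.

```python
# Ordinary functions, constrained to a shared convention (iterable of
# lines in, iterator of lines out) so they compose like Unix filters.
# The convention, not the function mechanism, is what makes the model.

def grep(pattern, lines):
    """Pass through only the lines containing pattern, like grep(1)."""
    return (line for line in lines if pattern in line)

def upcase(lines):
    """Uppercase every line, like tr '[:lower:]' '[:upper:]'."""
    return (line.upper() for line in lines)

text = ["error: disk full", "ok", "error: timeout"]
# Composition mirrors the shell pipeline: grep error | upcase
result = list(upcase(grep("error", text)))
# ['ERROR: DISK FULL', 'ERROR: TIMEOUT']
```

A function that ignored the convention — say, one that returned a single joined string — would still be a perfectly valid function, but it could no longer participate in the pipeline, which is exactly the "additional constraints" point above.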

Gambler said...

>Today, we do have a large network of computers, with not just thousands but billions of nodes. And they communicate via HTTP using the REST architectural style, not via distributed objects and messages.

You're conflating the Web and REST-based web services. The Web is a content delivery system with billions of nodes. Attempts to implement actual communication networks on top of it resulted in laughably bad architecture, with layers upon layers of ugly duct tape masking security holes, performance bottlenecks, and missing crucial features.

The situation is so bad that to hold the whole thing together Google is currently redesigning HTTP for the second time and fusing it with the next two protocol layers. It's a giant, ugly mess. If anything, the whole ordeal shows that Kay was right in his criticism of the Web in the mid-90s.

As far as REST goes, I have seen maybe two real-life web services that actually tried to implement the "hypermedia" and "state transfer" ideas from Fielding's dissertation. 98% of so-called RESTful APIs out there are just RPC via JSON over HTTP.

Marcel Weiher said...

@Gambler,

interesting perspective, but the Google HTTP redesigns are certainly a red herring at best, as they don't affect the architectural model whatsoever. They are pure performance optimisations, squarely aimed at the web-browser use case and of a magnitude that matters mostly for operators working at Google scale.

Given that, the difference you claim between a content delivery system (does that not communicate?) and a communication network (what sort of communication?) at the very least requires some crisper definitions.

Finally, you seem to be fixated on the HATEOAS aspect, which I have to admit I don't see as very relevant for machine-to-machine communication, as you can either fix the URIs or fix the keys where the expected URIs are to be found. Potayto / potahto. And unlike with a human observer, unexpected locations tend not to make much sense.

The big one you seem to be completely missing is the separation between URIs and verbs, and the fact that there is a very small and largely well-defined set of verbs. That's where the magic happens.
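To illustrate the fixed-verbs point with a hypothetical sketch (the names are stand-ins, not any particular API): because every resource answers the same small verb set, a generic operation like "copy" can be written once, against the verbs, and it works across any pair of stores and any URI.

```python
# Sketch: variable URIs, fixed verbs. "copy" knows nothing about what
# it copies; it only speaks the shared get/put verb set. Store classes
# are hypothetical stand-ins.

class MemStore:
    def __init__(self):
        self._data = {}

    def get(self, path):
        return self._data[path]

    def put(self, path, value):
        self._data[path] = value


def copy(src, dst, path):
    """Generic operation written once against the verbs."""
    dst.put(path, src.get(path))


a, b = MemStore(), MemStore()
a.put("/config", "x=1")
copy(a, b, "/config")
print(b.get("/config"))  # x=1
```

With per-object ad-hoc method names there is no such generic `copy`; the small, well-defined verb set is what makes operations like this, and HTTP intermediaries like caches and proxies, possible at all.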