Saturday, April 27, 2019

What's Going Down at the TIOBE Index? Swift, Surprisingly

Last month I expressed my surprise at the fact that Objective-C was recovering its rankings in the TIOBE index, not quite to the lofty #3 spot it enjoyed a while ago, but to a solid 10, once again surpassing Swift, which had dropped to #17.

This month, Swift has dropped to #19 almost looking like it's going to fall out of the top 20 altogether.

Strange times.

Wednesday, April 3, 2019

Accessors Have Message Obsession

Just came across and older post by Nat Pryce on Message Obsession, which he describes as the opposite end of a spectrum from Primitive Obsession.

The example is a set of commands for moving a robot:


-moveNorth.
-moveSouth.
-moveWest.
-moveEast.

Although the duplication is annoying, the bigger problem is that there are two things, the verb "move" and a direction argument, mushed together into the message name. And that can cause further problems down the road:
"It’s awkward to work with this kind of interface because you can’t pass around, store or perform calculations on the direction at all."

He argues, convincingly IMHO, that the combined messages should be replaced by a single move: message with a separate direction argument. The current fashion would be to make direction an enum, but he (wisely, IMHO) turns it into a class that can encode different directions:
-move:direction.

class Direction {
  ...
}

So far so good. However...

...we have this message obsessions at a massively larger scale with accessors.


-attribute.
-setAttribute:newValue.

Every single attribute of every single class gets its own accessor or accessor pair, again with the action (get/set) mushed together with the name of the attribute to work on. The solution is the same as for the directions in Nat's example: there are only two actual messages, with reified identifiers.
-get:identifier.
-set:identifier to:value.

These, of course, correspond to the GET and PUT HTTP verbs. Properties, now available in a number of mainstream languages, are supposed to address this issue, but they only really address to 2:1 problem (getter and setter for an attribute). The much bigger N:2 problem (method pair for every attribute) remains unaddressed, and particularly you also cannot pass around, store or perform calculations on the identifier.

And it turns out that passing those identifiers around performing calculations on them is tremendously powerful, even if you don't have language support. Without language support, the interface between the world of reified identifiers and objects can be a bit awkward.

Saturday, March 23, 2019

What's up at the TIOBE Index? Surprisingly, Objective-C

When Apple introduced Swift, Objective-C quickly dropped down from its number 3 spot in the TIOBE index. Way down. And it certainly seemed obvious that from that day on, this was the only direction it would ever go.

Imagine my surprise when I looked earlier this March and found it back up, no, not in the lofty heights it used to occupy, but at least in tenth place (up from 14th a year earlier), and actually surpassing Swift again, which dropped by almost half in its percent rating and from 12th to 17th place in the rankings.

What's going on here?

Tuesday, March 19, 2019

LISP Macros, Delayed Evaluation and the Evolution of Smalltalk

At a recent Clojure Berlin Meetup, Veit Heller gave an interesting talk on Macros in Clojure. The meetup was very enjoyable, and the talk also brought me a little closer to understanding the relationship between functions and macros and a bit of Smalltalk evolution that had so far eluded me.

The question, which has been bugging me for some time, is when do we actually need metaprogramming facilities like macros, and why? After all, we already have functions and methods for capturing and extracting common functionality. A facile answer is that "Macros extend the language", but so do functions, in their way. Another answer is that you have to use Macros when you can't make progress any other way, but that doesn't really answer the question either.

The reason the question is relevant is, of course, that although it is fun to play around with powerful mechanisms, we should always use the least powerful mechanism that will accomplish our goal, as it will be easier to program with, easier to understand, easier to analyse and build tools for, and easier to maintain.

Anyway, the answer in this case seemed to be that macros were needed in order to "delay evaluation", to send unevaluated parameters to the macros. A quick question to the presenter confirmed that this was the case for most of the examples. Which begs the question: if we had a generic mechanism for delaying evluation, could we have used plain functions (or methods) instead, and indeed the answer was that this was the case.

One of the examples was a way to build your own if, which most languages have built in, but Smalltalk famously implements in the class library: there is an ifTrue:ifFalse: message that takes two blocks (closures) as parameters. The True class evaluates the first block parameter and ignores the second, the False class evaluates the second block parameter and ignores the first.

The Clojure macro example worked almost exactly the same way, but where Smalltalk uses blocks to delay evaluation, the example used macros. So where LISP might use macros, Smalltalk uses blocks. That macros and blocks might be related was new to me, and took me a while to process. Once I had processed it, a bit of Smalltalk history that I had always struggled with, this bit about Smalltalk-76, suddenly made sense:



Why did it "have to" provide such a mechanism? It doesn't say. It says this mechanism was replaced by the equivalent blocks, but blocks/anonymous functions seem quite different from alternate argument-passing mechanisms. Huh?

With this new insight, it suddenly makes sense. Smalltalk-72 just had a token-stream, there were no "arguments" as such, the new method just took over parsing the token stream and picked up the paramters from there. In a sense, the ultimate macro system and ultimately powerful, but also quite unusable, incomprehensible, unmaintainable and not compilable. In that system, "arguments" are per-definition unevaluated and so you can do all the macro-like magic you want.

Dan's Smalltalk-76 effort was largely about compiling for better performance and having a stable, comprehensible and composable syntax. But there are times you still need unevaluated arguments, for example if you want to implement an if that only evaluates one of its branches, not both of them, without baking it into the language. Smalltalk did not have a macro mechanism, and it no longer had the Smalltalk-72 token-stream where un-evaluated "arguments" came for free, so yes, there "had" to be some sort of mechanism for unevaluated arguments.

Hence the open-colon syntax.

And we have a progression of: Smalltalk-72 token stream → Smalltalk-76 open colon parameters → Smalltalk-80 blocks.
All serving the purpose of enabling macro-like capabilities without actually having macros by providing a general language facility for passing un-evaluated parameters.

Aha!

Friday, March 8, 2019

Software-ICs, Binary Compatibility, and Objective-Swift

Swift recently achieved ABI stability, meaning that we can now ship Swift binaries without having to ship the corresponding Swift libraries. While it's been a long time coming, it's also great to have finally reached this point. However, it turns out that this does not mean you can reasonably ship binary Swift frameworks, for reasons described very well by Peter Steinberger of PSPDFKit and the good folks at instabug.

To reach this not-quite-there-yet state took almost 5 years, which is pretty much the total time NeXT shipped their hardware, and it mirrors the state with C++, which is still not generally suitable for binary distribution of libraries. Objective-C didn't have these problems, and as it turns out this is not a coincidence.

Software ICs

Objective-C was created specifically to implement the concept of Software-ICs. I briefly referenced the concept in a previous article, and also mentioned its relationship to the scripted components pattern, but the comments indicated that this is no longer a concept people are familiar with.

As the name suggests the intention was to bring the benefits the hardware world had reaped from the introduction of the Integrated Circuits to the software world.

It is probably hard to overstate the importance of ICs to the development of the computer industry. Instead of assembling computers from discrete components, you could now put entire subsystem onto one component, and then compose these subsystems to form systems. The interfaces are standardised pins, and the relationship between the outside interface and the complexity hidden inside can be staggering. Although the socket of the CPU I am writing is a beast, with 1151 pins, the chip inside has a staggering 2.1 billion transistors. With a ratio of one million to one, that's a very deep interface, even if you disregard the fact that the bulk of those pins are actually voltage supply and ground pins.

The important point is that you do not have to, and in fact cannot, look inside the IC. You get the pins, very much a binary interface, and the documentation, a specification sheet. With Software-ICs, the idea was the same: you get a binary, the interface and a specification sheet. Here are two BYTE articles that describe the concepts:


A lot of what they write seems quaint now, for example a MailFolder that inherits from Array(!), but the concepts are very relevant, particularly with a couple of decades worth of perspective and the new circumstances we find ourselves in.

Although the authors pretty much equate Software-ICs with objects and object-oriented programming, it is a slightly different form of object-oriented programming than the one we mostly use today. They do write about object/message programming, similar to Alan Kay's note that 'The big idea is "messaging"'.

With messaging as the interconnect, similar to Unix pipes, our interfaces are sufficiently well-defined and dynamic that we really can deliver our Software-ICs in binary form and be compatible, something our more static languages like C++ and Swift struggle with.

Objective-C is middleware with language features.

Message Oriented Middleware

Virtually all systems based on static languages eventually grow an additional, separate and more dynamic component mechanism. Windows has COM, IBM has SOM, Qt has signals and slots and the meta-object system, Be had BMessages etc.

In fact, the problem of binary compatibility of C++ objects was one of the reasons for creating COM:

Unlike C++, COM provides a stable application binary interface (ABI) that does not change between compiler releases.
COM has been incredibly successful, it enables(-ed?) much of the Windows and Office ecosystems. In fact, there is even a COM implementation on macOS: CFPlugin, part of CoreFoundation.

CFPlugIn provides a standard architecture for application extensions. With CFPlugIn, you can design your application as a host framework that uses a set of executable code modules called plug-ins to provide certain well-defined areas of functionality. This approach allows third-party developers to add features to your application without requiring access to your source code. You can also bundle together plug-ins for multiple platforms and let CFPlugIn transparently load the appropriate plug-in at runtime. You can use CFPlugIn to add plug-in capability to, or write a plug-in for, your application.
That COM implementation is still in use, for example for writing Spotlight importers. However, there are, er, issues:
Creating a new Spotlight importer is tricky because they are based on CFPlugIn, and CFPlugIn is… well, how to say this diplomatically?… super ugly )-: One option here is to use Xcode 9 to create your plug-in based on the old template. Honestly though, I don’t recommend that because the old template… again, diplomatically… well, let’s just say that the old template lets the true nature of CFPlugIn shine through! (-:
Having written both Spotlight importers and even some COM component on Windows (I think it was just for testing), I can confirm that COM's success is not due to the elegance or ease-of-use of the implementation, but due to the fact that having an interoperable, stable binary interface is incredibly enabling for a platform.

That said, all this talk of COM is a bit confusing, because we already have NSBundle.

Apple uses bundles to represent apps, frameworks, plug-ins, and many other specific types of content.
So NSBundle already does everything a CFPlugin does and a lot more, but is really just a tiny wrapper around a directory that may contain a dynamic shared library. All the interfacing, introspection and binary compatibility features come automagically with Objective-C. In fact, NeXT had a Windows product called d'OLE that pretty automagically turned Objective-C libraries into COM-comptible OLE servers (.NET has similar capabilities). Again, this is not a coincidence, the Software-IC concept that Objective-C is based on is predicated on exactly this sort of interoperation scenario.

Objective-C is middleware with language features.

Frameworks and Microservices

To me, the idea of a Software-IC is actually somewhat higher level than a single object, I tend to see it at the level of a framework, which just so happens to provide all the trappings of a self-contained Software-IC: a binary, headers to define the interface and hopefully some documentation, which could even be provided in the Resources directory of the bundle. In addition, frameworks are instances of NSBundle, so they aren't limited to being linked into an application, they can also be loaded dynamically.

I use that capability in Objective-Smalltalk, particularly together with the stsh the Smalltalk Scripting Shell. By loading frameworks, this shell can easily be transformed into an application-specific scripting language. An example of this is pdfsh, a shell for examining an manipulating PDF files using EGOS, the Extensible Graphical Object System.


#!/usr/local/bin/stsh
#-<void>pdfsh:<ref>file
framework:EGOS_Cocoa load.
pdf := MPWPDFDocument alloc initWithData: file value.
shell runInteractiveLoop

The same binary framework is also used in in PdfCompress, PostView and BookLightning. With this framework, my record for creating a drag-and-drop applicaton to do something useful with a PDF file was 5 minutes, and the only reason I was so slow was that I thought I had remembered the PDF dictionary entry...and had not.

Framework-oriented programming is awesome, alas it was very much deprecated by Apple for quite some time, in fact even impossible on iOS until dynamic libraries were allowed. Even now, though, the idea is that you create an app, which consists of all the source-code needed to create it (exception: Apple code!), even if some of that code may be organised into framework units that otherwise don't have much meaning to the build.

Apps, however are not Software-ICs, they aren't the right packaging technology for reuse (AppleScript notwithstanding). And so iOS and macOS development shops routinely get themselves into big messes, also known as the Big Ball of Mud architectural pattern.

Of course, there are reasons that things didn't quite work out the way we would have liked. Certainly Apple's initial Mac OS X System Architecture book showed a much more flexible arrangement, with groups of applications able to share a set of frameworks, for example. However, DLL hell is a thing, and so we got a much more restricted approach where every app is a little fortress and frameworks in general and binary frameworks in particular are really something for Apple to provide and for the rest to use. However, the fact that we didn't manage to get this right doesn't mean that the need went away.

Swift has been making this worse, by strongly "suggesting" that everything be compiled together and leading to such wonderful oxymorons as "whole module optimisation in debug mode", meaning without optimisation. That and not having a binary modularity story for going on half a decade. The reason for compiling whole modules together is that the modularity mechanism is, practically speaking, very much source-code based, with generics and specialisation etc. (Ironically, Swift also does some pretty crazy things to enable separate compilation, but that hasn't really panned out so far).

On the other hand, Swift's compiler is so slow that teams are rediscovering forms of framework-oriented programming as a self-defense mechanism. In order to get feedback cycles down from ludicrously bad to just plain awful, they split up their projects into independent frameworks that they then compile and run independently during development. So in a somewhat roundabout way, Swift is encouraging good development practices.

I find it somewhat interesting that the industry is rediscovering variants of the Software-IC, in this case on the backend in the form of Microservices. Why do I say that Microservices are a form of Software-IC? Well, they are a binary unit of deployability, fairly loosely coupled and dynamically typed. In fact, Fred George, one of the people who came up with the idea refers to them as Smalltalk objects:

Of course, there are issues with this approach, one being that reliable method calls are replaced with unreliable network calls. Stepping back for a second should make it clear that the purported benefits of Microservices also largely apply to Software-ICs. At least real Software-ICs. Objective-C made the mistake of equating Software-ICs with objects, and while the concepts are similar with quite a bit of overlap, they are not quite the same. You certainly can use Objective-C to build and connect Software-ICs if you want to do that. It will also help you in this endeavour, but of course you have to know that this is something you want. It doesn't do this automatically and over time the usage of Objective-C has shifted to just a regular old object-oriented language, something it is OK but not that brilliant at.

Interoperability

One of the interesting aspects of Microservices is that they are language-agnostic, it doesn't matter what language a particular services is written in, as long as they can somehow communicate via HTTP(S). This is another similarity to Software-ICs (and other middleware such as COM, SOM, etc.): there is a fairly narrowly defined, simple interface, and as long as you can somehow service that interface, you can play.

Microservices are pretty good at this, Unix filters probably the most extreme example and just about every language and every kind of application on Windows can talk to and via COM. NeXT only ever sold 50000 computers, but in a short number of years the NeXT community had bridges to just about every language imaginable. There were a number of Objective- languages, including Objective-Fortran. Apple alone has around 140K employees (though probably a large number of those in retail), and there are over 2.8 million iOS developers, yet the only language integration for Swift I know of is the Python support, and that took significant effort, compiler changes and the original Swift creator, Chris Lattner.

This is not a coincidence. Swift is designed as a programming language, not as middleware with language features. Therefore its modularity features are an add-on to the language, and try to transport the full richness of that programming model. And Swift's programming model is very rich.

Objective-Swift

The middlewares I've talked about use the opposite approach, from the outside in. For SOM, it is described as such:
SOM allows classes of objects to be defined in one programming language and used in another, and it allows libraries of such classes to be updated without requiring client code to be recompiled.
So you define interfaces separately from their implementations. I am guessing this is part of the reason we have @interface in Objective-C. Having to write things down twice can be a pain (and I've worked on projects that auto-generated Objective-C headers from implementation files), but having a concrete manifestation of the interface that precedes the implementation is also very valuable. (One of the reasons TDD is so useful is that it also forces you to think about the interface to your object before you implement it).

In Swift, a class is a single implementation-focused entity, with its interface at best a second-class and second-order effect. This makes writing components more convenient (no need to auto-generate headers...), but connecting components is more complicated.

Which brings us back to that other complication, the lack of stable binary compatibility for libraries and frameworks. One consequence of this is to write frameworks exclusively in Objective-C, which was particularly necessary before ABI stability had been reached. The other workaround, if you have Swift code, is to have an Objective-C wrapper as an interface to your framework. The fact that Swift interoperates transparently with the Objective-C runtime makes this fairly straightforward.

Did I mention that Objective-C is middleware with language features?

So maybe this supposed "workaround" is actually the solution? Have Objective-C or Objective- as our message-oriented middleware, the way it was always intended? Maybe with a bit of a tweak so that it loses most of the C legacy and gains support for pipes and filters, REST/Microservices and other architectural patterns?

Just sayin'.

Wednesday, February 20, 2019

An Unexpected Benefit of Uniform Interfaces

In the previous post, My Objective-Smalltalk Tipping Point: Scheme Handler for a Method Browser, I described how I had reached a point where Objective-Smalltalk code is just so much better than the equivalent Objective-C that the idea of porting back some Objective-Smalltalk to Objective-C in order to better integrate with the surrounding Objective-C code seems a bit abhorrent.

So instead I have chosen to just integrate the code into that Objective-C code base. So load the code:


NSData *classdefSchemeCode=[self frameworkResource:@"classdef-method-browser-scheme" category:@"stsh"];
[interpreter evaluateScriptString:[classdefSchemeCode stringValue]];

Then instantiate the class after finding it reflectively via NSClassFromString()
-(void)awakeFromNib
{
	...
    self.methodStore = [NSClassFromString(@"ClassBrowser") store];
}

So far, so easy. Of course, the big problem with bridged code usually comes now: the compiler doesn't know about code loaded at runtime, so you need to somehow duplicate the class's interface and somehow convince the compiler that the newly loaded class conforms to that interface.

But wait! This is a scheme-handler, which is a store, meaning it doesn't really have any unique interface of its own, but rather just implements the MPWStorage protocol. So all I have to do is the following:


@property (nonatomic, strong)   id  methodStore;

And I'm good to go!

This is unexpected in two ways: first, the integration pain I was expecting just didn't appear. Happy. Second, and maybe more importantly, the benefit of uniform interfaces, which I thought should appear, actually did appear!

So very happy.

Friday, February 15, 2019

My Objective-Smalltalk Tipping Point: Scheme Handler for a Method Browser

The psychological tipping point for a programming language comes when you really, really don't want to go back to the previous language (usually the implementation language). I remember this feeling from many years ago when I was cobbling together an Objective-C implementation (this was pre-NeXT and pre gcc-Objective-C). At some point, enough of Objective-C was working, and the feel so much more pleasant, that I wanted to only implement new stuff with the new language, no matter how many warts the implementation still had (many).

It seems I have now reached that point with Objective-Smalltalk. Now it was pretty much always the case that I preferred the Objective-Smalltalk code to equivalent Objective-C code, but since it was mostly scripts and other "final" code, the comparison never really came up. With a somewhat workable if incomplete class and scheme (and filter) definition syntax in place, and therefore Objective-Smalltalk now at least theoretically capable of delivering reusable code, that is no longer the case.

Specifically, I have a little Smalltalk-inspired method browser that I use for ObjST-based live coding environments (AppLive for Apps and SiteBuilder for websites).

Yes: "Programmer UI". I'll clean that up later. What I am currently cleaning up is the interface between the browser and the "method store", which is horrendous. It is also tied to a specific, property-list based implementation of the method store.

The idea there is to allow external editing of the classes/methods. A browser for hierarchically nested structures...could this be a job for scheme handlers? Why yes, glad you asked!


#!/usr/local/bin/stsh
#-methodbrowser:classdef


scheme ClassBrowser  {
  var dictionary.
  -initWithDictionary:aDict {
     self setDictionary:aDict.
     self.
  }
  -classDefs {
     self dictionary at:'methodDict'.
  }
  /. { 
     |= {
       self classDefs allKeys.
     }
  }

  /:className/:which/:methodName  { 
     |= {
       self classDefs at:className | at:which | at:methodName.
     }
     =| {
       self classDefs at:className | at:which | at:methodName put:newValue.
     }
  }

  /:className/:which { 
     |= {
       self classDefs at:className | at:which | allKeys.
     }
  }

}

scheme:browser := ClassBrowser alloc initWithDictionary: classdef value propertyList.
stdout do println: browser:. each. 
shell runInteractiveLoop.

Despite the fact that there are still some issues to resolve, for example this code makes it clear that something needs to be done about navigation vs. final access, it still strikes me as a clear and succinct expression of what is going on. Now I sort of need to rewrite it in Objective-C, because both AppLive and SiteBuilder are Objective-C projects.

And I really don't want to.

I really, really don't want to. The idea of recasting this logic just strikes me as abhorrent, so much that the somewhat daunting prospect of significantly improving my Objective-Smalltalk native compilation facilities looks much more attractive.

That's the tipping point.







For reference, here's the original Objective-C code. What, you didn't believe me that it's horrendous? It also does a few things that the Objective-Smalltalk code doesn't do yet, but those aren't that significant.


//
//  MethodDict.h
//  MPWTalk
//
//  Created by Marcel Weiher on 10/16/11.
//  Copyright (c) 2012 Marcel Weiher. All rights reserved.
//

#import <Foundation/Foundation.h>

@protocol MethodDict

-(NSArray*)instanceMethodsForClass:(NSString*)className;
-(NSArray*)classMethodsForClass:(NSString*)className;

-(NSString*)fullNameForMethodName:(NSString*)shortName ofClass:(NSString*)className;
-(NSString*)methodForClass:(NSString*)className methodName:(NSString*)methodName;
-(void)setClassMethod:(NSString*)methodBody name:(NSString*)methodName  forClass:(NSString*)className;
-(void)setInstanceMethod:(NSString*)methodBody name:(NSString*)methodName  forClass:(NSString*)className;

-(void)deleteInstanceMethodName:(NSString*)methodName forClass:(NSString*)className;
-(void)deleteClassMethodName:(NSString*)methodName forClass:(NSString*)className;

-(NSMutableDictionary*)addClassWithName:(NSString*)newClassName;
-(void)deleteClass:(NSString*)className;

-(NSString*)instanceMethodForClass:(NSString*)className methodName:(NSString*)methodName;
-(NSString*)classMethodForClass:(NSString*)className methodName:(NSString*)methodName;

@end



@interface MethodDict : NSObject
{
    NSMutableDictionary *dict;
}


- (NSDictionary *)dict;
-initWithDict:(NSDictionary*)newDict;


-(NSArray*)classes;


@end



//
//  MethodDict.m
//  MPWTalk
//
//  Created by Marcel Weiher on 10/16/11.
//  Copyright (c) 2012 Marcel Weiher. All rights reserved.
//

#import "MethodDict.h"
#import <MPWFoundation/MPWFoundation.h>
#import <ObjectiveSmalltalk/MethodHeader.h>

@implementation NSString(methodName)

-methodName
{
    MPWMethodHeader *header=[MPWMethodHeader methodHeaderWithString:self];
    return [header methodName];
}

@end

@implementation MethodDict

objectAccessor(NSMutableDictionary, dict, setDict)


-initWithDict:(NSDictionary*)newDict
{
    self = [super init];
    [self setDict:[[newDict mutableCopy] autorelease]];
    return self;
}

-(NSData*)asXml
{
    NSData *data=[NSPropertyListSerialization dataFromPropertyList:[self dict] format:NSPropertyListXMLFormat_v1_0 errorDescription:nil];
    return data;
}

-(NSArray*)classes
{
    return [[[self dict] allKeys] sortedArrayUsingSelector:@selector(compare:)];
}

-(NSMutableDictionary*)classDictForName:(NSString*)className
{
    return [[self dict] objectForKey:className];
}

-(void)deleteClass:(NSString*)className
{
    return [[self dict] removeObjectForKey:className];
}

-(NSMutableDictionary*)methodDictForClass:(NSString*)className classMethods:(BOOL)isClassMethod
{
    NSString *key=isClassMethod ? @"classMethods" : @"instanceMethods";
    
    return [[[self dict] objectForKey:className] objectForKey:key];
}



-(NSArray*)methdodsForClass:(NSString*)className getClassMethods:(BOOL)classMethods
{
    NSDictionary *methodDict=[self methodDictForClass:className classMethods:classMethods];
    NSArray* methodKeys = [methodDict allKeys];
    if ( [methodKeys count]) {
        return [(NSArray*)[[methodKeys collect] methodName] sortedArrayUsingSelector:@selector(compare:)];
    }
    return [NSArray array];
}

-(NSArray*)instanceMethodsForClass:(NSString*)className
{
    return  [self methdodsForClass:className getClassMethods:NO];
}

-(NSArray*)classMethodsForClass:(NSString*)className
{
    return  [self methdodsForClass:className getClassMethods:YES];
}


-(NSString*)fullNameForMethodName:(NSString*)shortName ofClass:(NSString*)className
{
    NSArray *fullNames = [[self methodDictForClass:className classMethods:NO] allKeys];
    fullNames=[fullNames arrayByAddingObjectsFromArray:[[self methodDictForClass:className classMethods:YES] allKeys]];
    for ( NSString *fullName in fullNames ) {
        if ( [[fullName methodName] isEqual:shortName] ) {
            return fullName;
        }
    }
    return nil;
}

-(NSString*)instanceMethodForClass:(NSString*)className methodName:(NSString*)methodName
{
    
    return [[self methodDictForClass:className classMethods:NO] objectForKey:[self fullNameForMethodName:methodName ofClass:className]];
}

-(NSString*)classMethodForClass:(NSString*)className methodName:(NSString*)methodName
{
    
    return [[self methodDictForClass:className classMethods:YES] objectForKey:[self fullNameForMethodName:methodName ofClass:className]];
}

-(NSString*)methodForClass:(NSString*)className methodName:(NSString*)methodName
{
    
    return [self instanceMethodForClass:className methodName:methodName];
}


-(void)setMethod:(NSString*)methodBody name:(NSString*)methodName  forClass:(NSString*)className isClassMethod:(BOOL)isClassMethod
{
    NSMutableDictionary *methodDict = [self methodDictForClass:className classMethods:isClassMethod];
    if ( !methodDict ) {
        [self addClassWithName:className];
        methodDict = [self methodDictForClass:className classMethods:isClassMethod];
    }
    [methodDict setObject:methodBody forKey:methodName];
}

-(void)setClassMethod:(NSString*)methodBody name:(NSString*)methodName  forClass:(NSString*)className
{
    [self setMethod:methodBody name:methodName forClass:className isClassMethod:YES];
}

-(void)setInstanceMethod:(NSString*)methodBody name:(NSString*)methodName  forClass:(NSString*)className
{
    [self setMethod:methodBody name:methodName forClass:className isClassMethod:NO];
}

-(NSMutableDictionary*)addClassWithName:(NSString*)newClassName
{
    NSMutableDictionary *classDict=[self classDictForName:newClassName];
    if ( !classDict) {
        classDict=[NSMutableDictionary dictionary];
        classDict[@"instanceMethods"]=[NSMutableDictionary dictionary];
        classDict[@"classMethods"]=[NSMutableDictionary dictionary];
        dict[newClassName]=classDict;
    }
    return classDict;
}


-(void)deleteMethodName:(NSString*)methodName forClass:(NSString*)className isClassMethod:(BOOL)isClassMethod
{
    NSMutableDictionary *methodDict = [self methodDictForClass:className classMethods:isClassMethod];

    [methodDict removeObjectForKey:[self fullNameForMethodName:methodName ofClass:className]];
}

-(void)deleteInstanceMethodName:(NSString*)methodName forClass:(NSString*)className
{
    [[self methodDictForClass:className classMethods:NO] removeObjectForKey:[self fullNameForMethodName:methodName ofClass:className]];
}

-(void)deleteClassMethodName:(NSString*)methodName forClass:(NSString*)className
{
    [[self methodDictForClass:className classMethods:YES] removeObjectForKey:[self fullNameForMethodName:methodName ofClass:className]];
}

-(NSString*)description
{
    return [NSString stringWithFormat:@"<%@:%p: %@>",[self class],self,dict];
}

@end

Sunday, February 10, 2019

Why Architecture Oriented Programming Matters

On re-reading John Hughes influential Why Functional Programming Matters, two things stood out for me.

The first was the claim that, "...a functional programmer is an order of magnitude more productive than his or her conventional counterpart, because functional programs are an order of magnitude shorter." That's a bold claim, though he cleverly attributes this claim not to himself but to unspecified others: "Functional programmers argue that...".

If there is evidence for this 10x claim, I'd like to see it. The only somewhat systematic assessment of "language productivity" is the Caper Jones language-level evaluation, which is computed on the lines of code needed to achieve a function point worth of software. In this evaluation, Haskell achieved a level of 7.5, whereas Smalltalk was rated 15, so twice as productive. While I don't see this is conclusive, it certainly doesn't support a claim of vastly superior productivity.

That little niggle out of way, I think he does make some very insightful and important points in the second section. Talking about structured programming he concludes that:

With the benefit of hindsight, it’s clear that these properties of structured programs, although helpful, do not go to the heart of the matter. The most important difference between structured and unstructured programs is that structured programs are designed in a modular way. Modular design brings with it great productivity improvements. First of all, small modules can be coded quickly and easily. Second, general-purpose modules can be reused, leading to faster development of subsequent programs. Third, the modules of a program can be tested independently, helping to reduce the time spent debugging.
He goes on:
However, there is a very important point that is often missed. When writing a modular program to solve a problem, one first divides the problem into subproblems, then solves the subproblems, and finally combines the solutions.
And then comes the zinger:
The ways in which one can divide up the original problem depend directly on the ways in which one can glue solutions together. Therefore, to increase one’s ability to modularize a problem conceptually, one must provide new kinds of glue in the programming language.
Yes, I put that in bold. I'd also put a <blink>-tag on it, but fortunately for everyone involved that tag is obsolete. Anyway, he then makes some partly good, partly debatable points about the benefits of the two kinds of glue he says FP provides: function composition and lazy evaluation.

For me, the key point here is not those specific benefits, but the number 2. He just made the important point that "one must provide new kinds of glue" and then the amount of "new kinds of glue" is the smallest number that actually justifies using the plural. That seems a bit on the low side, particularly because the number also seems to be fixed.

I'd venture to say that we need a lot more different kinds of glue, and we need our kinds of glue to be extensible, to be user-defined. But first, what is this "glue"? Do we have some other term for it? Maybe Alan Kay can help?

The Japanese have a small word -- ma -- for "that which is in between" -- perhaps the nearest English equivalent is "interstitial". The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. Think of the internet -- to live, it (a) has to allow many different kinds of ideas and realizations that are beyond any single standard and (b) to allow varying degrees of safe interoperability between these ideas.
Alan Kay: prototypes vs classes was: Re: Sun's HotSpot

OK, that's an additional insight along the same lines, but doesn't really help us. Fortunately the Software Architecture community has something for us, the idea of a Connector.

Connectors are the locus of relations among components. They mediate interactions but are not “things” to be hooked up (they are, rather, the hookers-up).
Mary Shaw: Procedure Calls Are the Assembly Language of Software Interconnection: Connectors Deserve First-Class Status

The subtitle of that paper by Mary Shaw is the solution: connectors deserve first-class-status. Connectors are the "ma" that goes in-between, the glue that we need "lots more" of ("lots" > 2); and when we give them first class status, we can create more, put them in libraries, and adapt them to our needs.

That's why I think Architecture Oriented Programming matters, and why I am creating Objective-Smalltalk.

Friday, February 8, 2019

A small (and objective) taste of Objective-Smalltalk

I've been making good progress on Objective-Smalltalk recently. Apart from the port to GNUstep that allowed me to run the official site on it (shrugging off the HN hug of death in the process), I've also been concentrating on not just pushing the concepts further, but also on adding some of the more mundane bits that are just needed for a programming language.

And so my programs have been getting longer and more useful, and I am starting to actually see the effect of "I'd rather write this on Objective-Smalltalk than Objective-C". And with that, I thought I'd share one of these slightly larger examples, show how it works, what's cool and possibly a bit weird, and where work is still needed (lots!).

The code I am showing is a script that implements a generic scheme handler for sqlite databases and then uses that scheme handler to access the database given on the command line. When you run it, in this case with the sample database Chinook.db, it allows you to interact with the database using URIs using the db scheme. For example, db:. lists the available tables:


> db:. 
( "albums","sqlite_sequence","artists","customers","employees","genres","invoices",
 "invoice_items","media_types","playlists","playlist_track","tracks","sqlite_stat1") 

You can then access entire tables, for example db:albums would show all the albums, or you can access a specific album:
> db:albums/3
{ "AlbumId" = 4;
"Title" = "Let There Be Rock";
"ArtistId" = 1;
} 

With that short intro and without much further ado, here's the code :


#!/usr/local/bin/stsh
#-sqlite:<ref>dbref

framework:FMDB load.


class ColumnInfo {
  var name.
  var type.
  -description {
      "Column: {var:self/name} type: {var:self/type}".
  }
}

class TableInfo  {
  var name.
  var columns.
  -description {
    cd := self columns description.
    "Table {var:self/name} columns: {cd}".
  }
}

class SQLiteScheme : MPWScheme {
  var db.

  -initWithPath: dbPath {
     self setDb:(FMDatabase databaseWithPath:dbPath).
     self db open.
     self.
  }

  -dictionariesForResultSet:resultSet
  {
    results := NSMutableArray array.
    { resultSet next } whileTrue: { results addObject:resultSet resultDictionary. }.
    results.
  }

  -dictionariesForQuery:query {
     self dictionariesForResultSet:(self db executeQuery:query).
  }

  /. { 
     |= {
       resultSet := self dictionariesForQuery: 'select name from sqlite_master where [type] = "table" '.
       resultSet collect at:'name'.
     }
  }

  /:table/count { 
     |= { self dictionariesForQuery: "select count(*) from {table}" | firstObject | at:'count(*)'. }
  }

  /:table/:index { 
     |= { self dictionariesForQuery: "select * from {table}" | at: index. }
  }

  /:table { 
     |= { self dictionariesForQuery: "select * from {table}". }
  }

  /:table/:column/:index { 
     |= { self dictionariesForQuery: "select * from {table}" | at: index.  }
  }

  /:table/where/:column/:value { 
     |= { self dictionariesForQuery: "select * from {table} where {column} = {value}".  }
  }

  /:table/column/:column { 
     |= { self dictionariesForQuery: "select {column} from {table}"| collect | at:column. } 
  }

  /schema/:table {
     |= {
        resultSet := self dictionariesForQuery: "PRAGMA table_info({table})".
	    columns := resultSet collect: { :colDict | 
            #ColumnInfo{
				#'name': (colDict at:'name') ,
				#'type': (colDict at:'type')
			}.
        }.
        #TableInfo{ #'name': table, #'columns': columns }.
     }
  } 

  -tables {
	 db:. collect: { :table| db:schema/{table}. }.
  }
  -<void>logTables {
     stdout do println: scheme:db tables each.	
  }
}

extension NSObject {
  -initWithDictionary:aDict {
    aDict allKeys do:{ :key |
      self setValue: (aDict at:key) forKey:key.
    }.
    self.
  }
}


scheme:db := SQLiteScheme alloc initWithPath: dbref path.
stdout println: db:schema/artists
shell runInteractiveLoop.


Let's walk through the code in detail, starting with the header:
#!/usr/local/bin/stsh
#-sqlite:<ref>dbref

This is a normal Unix shell script invoking stsh, the Smalltalk Shell. The Smalltalk Shell is a bigger topic for another day, but for now let's focus on the second line, which looks like a method declaration, and that's exactly what it is! In order to ease the transition between small scripts and larger systems (successful scripts tend to get larger, and successful large systems evolve from successful small systems), scripts have a dual nature, being at the same time callable from the Unix command line and also usable as a method (or filter) from a program.

Since this script is interactive, that part is not actually that important, but a nice side effect is that the declaration of a parameter gets us automatic command-line parameter parsing, conversion, and error checking. Specifically, stsh knows that the script takes a single parameter of type <ref> (a reference, so a filename or URL) and will put that in the dbref variable as a reference. If the script is invoked without that parameter, it will exit with an error message, all without any further work by the script author. These declarations are optional, without them parameters will go into an args array without further interpretation.

Next up, we load a dependency, Gus Mueller's wonderful FMDB wrapper for SQLite.


framework:FMDB load.

The framework scheme looks for frameworks on the default framework path, and the load message is sent to the NSBundle that is returned.

The next bit is fairly straightforward, defining the ColumnInfo class with two instance variables, name and type, and a -descritpion method.


class ColumnInfo {
  var name.
  var type.
  -description {
      "Column: {var:self/name} type: {var:self/type}".
  }
}

Again, this is very straightforward, with maybe the missing superclass specification being slightly unusual. Different constructs may have different implicit superclasses, for class it is assumed to be NSObject. The description method, introduced by "-" just like in Objective-C, uses string interpolation with curly braces. (I currently need to use fully qualified names like var:self/name to access instance variables, that should be fixed in the near future). It also doesn't have a return statement or the like, a method return can be specified by just writing out the return value.

To me, this has the great effect of putting the focus on the essential "this is the description" rather than on the incidental, procedural "this is how you build the description". It is obviously only a very small instance of this shift, but I think even this small examples speaks to what that shift can look like in the large.

The way instance variables are defined is far from being done, but for now the var syntax does the job. The TableInfo class follows the same pattern as ColumnInfo, and of course these two classes are used to represent the metadata of the database.

So on to the main attraction, the scheme-handler itself, which is just a plain old class inheriting from MPWScheme, with an instance variable and an initialisation method:


class SQLiteScheme : MPWScheme {
  var db.

  -initWithPath: dbPath {
     self setDb:(FMDatabase databaseWithPath:dbPath).
     self db open.
     self.
  }

Having advanced language features largely defined as/by plain old classes goes back to the need for a stable starting point. However, it has turned out to be a little bit more than that, because the mapping to classes is not just the trivial one of "since this written in on OO language, obviously the implementation of features is somehow done with classes". Instead, the classes map onto the language features very much in an Open Implementation kind of way, except that in this case it is Open Language Implementation.

That means that unlike a typical MOP, the classes actually make sense outside the specific language implementation, making their features usable from other languages, albeit less conveniently. Being easily accessible from other languages is important for an architectural language.

With this mapping, a very narrow set of syntactic language mechanism can be used to map a large and extensible (thus infinite) set of semantic features into the languages. This is of course similar to language features like procedures, methods and classes, but is expanded to things that usually haven't been as extensible.

The next two methods handle the interface to FMDB, they are mostly straightforward and, I think, understandable to both Smalltalk and Objective-C programmers without much explanation.


  -dictionariesForResultSet:resultSet
  {
    results := NSMutableArray array.
    { resultSet next } whileTrue: { results addObject:resultSet resultDictionary. }.
    results.
  }

  -dictionariesForQuery:query {
     self dictionariesForResultSet:(self db executeQuery:query).
  }

Smalltalk programmers may balk a little at the use of curly braces rather than square brackets to denote blocks. To me, this is a very easy concession to "the mainstream"; I have bigger fish to fry. To Objective-C programmers, the fact that the condition of the while-loop is implemented as a message sent to a block rather than as syntax might take a little getting used to, but I don't think it presents any fundamental difficulties.

Next up we have some property path definitions, the meat of the scheme handler. Each property path lets you define code that will run for a specific subset of the scheme's namespace, with the subset defined by the property path's URI pattern. As the name implies, property paths can be regarded as a generalisation of Objective-C properties, extended to handle both entire sets of properties, sub-paths and the combination of both.


  /. { 
     |= {
       resultSet := self dictionariesForQuery: 'select name from sqlite_master where [type] = "table" '.
       resultSet collect at:'name'.
     }
  }

The first property path definition is fairly straightforward as it only applies to a single path, the period (so the db:. example from above). Property path definitions start with the forward slash ("/"), similar to the way that instance methods start with "-" and class methods with "+" in Objective-C (and Objetive-Smalltalk). The slash seemed natural to indicate the definition of paths/URIs.

Like C# or Swift property definitions, you need to be able to deal with (at least) "get" and/or "set" access to a property. I really dislike having random keywords like "get" or "set" for program structure, I prefer to see names reserved for things that have domain meaning. So instead of keywords, I am using constraint connector syntax: "|=" means the left hand side is constrained to be the same as the right hand side (aka "get"). "=|" means the right hand side is constrained to be the same as the left hand side (aka "set"). The idea is that the "left hand side" in this case is the interface, the outside of the object/scheme handler, whereas the "right hand side" is the inside of the object, with properties mediating between the outside and the inside of the object.

As most everything, this is currently experimental, but so far I like it more than I expected to, and again, it shifts us away from being action oriented to describing relationships. For example, delegating both get and set to some other object could then be described by using the bidirectional constraint connector: /myProperty =|= var:delegate/otherroperty.

Getting the result set is a straightforward message-send with the SQL query as a constant, non-interpolated string (single quotes, double quotes is for interpolation). We then need to extract the name of the table from the return dictionaries, which we do via the collect HOM and the Smalltalk-y -at: message, which in this case maps to Foundation's -objectForKey:. The next property paths map URIs to queries on the tables. Unlike the previous example, which had a constant, single element path and so was largely equivalent to a classic property, these all have variable path elements, multiple path segments or both.


  /:table/count { 
     |= { self dictionariesForQuery: "select count(*) from {table}" | firstObject | at:'count(*)'. }
  }

  /:table/:index { 
     |= { self dictionariesForQuery: "select * from {table}" | at: index. }
  }

  /:table { 
     |= { self dictionariesForQuery: "select * from {table}". }
  }

Starting at the back, /:table returns the data from the entire table specified in the URI using the parameter :table. The leading semicolon means that this path segment is a parameter that will match any single string and deliver it the method as the parameter name used, in this case "table". Wildcards are also possible.

Yes, the SQL query is performed using simple string interpolation without any sanitisation. DON'T DO THAT. At least not in production code. For experimenting in an isolated setting it's OK.

The second query retrieves a specific row of the table specified. The pipe "operator" is for method chaining with keyword syntax without having to bracket like crazy:


self dictionariesForQuery: "select count(*) from {table}" | firstObject | at:'count(*)'
((self dictionariesForQuery: "select count(*) from {table}") firstObject) at:'count(*)'

I find the "pipe" version to be much easier to both just visually scan and to understand, because it replaces nested (recursive) evaluation with linear piping. And yes, it is at least partly a by-product of integrating pipes/filters, which is a part of the larger goal of smoothly integrating multiple architectural styles. That this integration would lead to an improvement in the procedural part was an unexpected but welcome side effect.

The first property path, /:table/count returns the size of the given table, using the optimised SQL code select count(*). This shows an interesting effect of property paths. In a standard ORM, getting the count of a table might look something like this: db.artists.count. Implemented naively, this code retrieves the entire "artists" table, converts that to an array and then counts the array, which is incredibly inefficient. And of course, this was/is a real problem of many ORMs, not just a made up example.

The reason it is a real problem is that it isn't trivial to solve, due to the fact that OOPLs are not structurally polymorphic. If I have something like db.artists.count, there has to be some intermediate object returned by artists so I can send it the count message. The natural thing for that is the artists table, but that is inefficient. I can of course solve this by returning some clever proxy that doesn't actually materialise the table unless it has to, or I can have count handled completely separately, but neither of these solutions are straightforward, which is why this has traditionally been a problem.

With property paths, the problem just goes away, because any scheme handler (or object) has control over its sub-structure to an arbitrary depth.

Queries are handled in a similar matter, so db:albums/where/ArtistId/6 retrieves the two albums by band Apocalyptica. This is obviously very manual, for any specific database you'd probably want to specialise this generic scheme handler to give you named relationships and also to return actual objects, rather than just dictionaries. A step in that direction is the /schema/:table property path:


  /schema/:table {
     |= {
        resultSet := self dictionariesForQuery: "PRAGMA table_info({table})".
	    columns := resultSet collect: { :colDict | 
            #ColumnInfo{
				#'name': (colDict at:'name') ,
				#'type': (colDict at:'type')
			}.
        }.
        #TableInfo{ #'name': table, #'columns': columns }.
     }
  } 

This property path returns the SQL schema in terms of the objects we defined at the top. First is a simple query of the SQLite table that holds the schema information, returning an array of dictionaries. These individual dictionaries are then converted to ColumnInfo objects using object literals.

Similar to defining the -description method above as simple the parametrized string literal instead of as instructions to build the result string, object literals allow us to simple write down general objects instead of constructing them. The example inside the collect defines a ColumnInfo object literal with the name and type attributes set from the column dictionary retrieved from the database.

Similarly, the final TableInfo is defined by its name and the column info objects. Object literals are a fairly trivial extension of Objective-Smalltalk dictionary literals, #{ #'key': value }, with a class-name specified between the "#" and the opening curly brace. Being able to just write down objects is, I think, one of the better and under-appreciated features of WebObjects .wod files (though it's not 100% the same thing), as well as QML and I think also part of what makes React "declarative".

Not entirely coincidentally, the "configurations" of architectural description languages can also be viewed as literal object definitions.

With that information in hand, and with the Objective-Smalltalk runtime providing class definition objects that can be used to install objects with utheir methods in the runtime, we now have enough information to create some classes straight from the SQL table definitions, without going through the intermediate steps of generating source code, adding that to a project and compiling it.

That isn't implemented, and it's also something that you don't really want, but it's a stepping stone towards creating a general mechanism for orthogonal modular persistence. The final two utility methods are not really all that noteworthy, except that they do show how expressive and yet straightforward even ordinary Objective-Smalltalk code is.


  -tables {
	 db:. collect: { :table| db:schema/{table}. }.
  }
  -<void>logTables {
     stdout do println: scheme:db tables each.	
  }

The -tables method just gets the all the schema information for all the tables. The -logTables methods prints all the tables to stdout, but individually, not as an array. Finally, there is a class extension to NSObject that supports the literal syntax on all objects and the script code that actually initialises the scheme with a database and starts an interactive session. This last feature has also been useful in Smalltalk scripting: creating specialized shells that configure themselves and then run the normal interactive prompt.

So that's it!

It's not a huge revelation, yet, but I do hope this example gives at least a tiny glimpse of what Objective-Smalltalk already is and of what it is poised to become. There is a lot that I couldn't cover here, for example that scheme-handlers aren't primarily for interactive exploration, but for composition. I also only mentioned pipes-and-filters in passing, and of course there "is" a lot more that just "isn't" there, quite yet.

As always, but even more than usual, I would love to get feedback! And of course the code is on github

Wednesday, February 6, 2019

Why Objective-Smalltalk is (not) a Smalltalk

One of the main goals of Objective-Smalltalk is to make it possible to express programs in mixtures of diverse architectural styles, and variations of those styles, naturally and comprehensibly. That needs a mechanism that abstracts from any specific style, while at the same time accommodating all, or at least a wide variety of styles.

I don't know how to do that.

Yet.

A Smalltalk

Since I don't know how to do that yet, I can't build what I need, and since I can't build what I need I can't experiment with it and learn how to do what I need to do.

Instead of waiting for divine inspiration, designing something from thin air (that will obviously be perfect) or just throwing my hands up in despair, I need to pick some starting point from which I can start iterating. In terms of styles, there are a few to choose from, for example call-return, pipes-and-filters and REST. Call-return seems like an obvious choice, because it is something that is familiar, has widespread support and is very capable of implementing other styles.

Objective-C would have been a pleasant choice: I am very familiar with it and it is very suitable for building architectural and language adapters. This suitability is not an accident either, connectivity was the primary design goal, and the designers succeeded admirably given the constraints.

However, Objective-C also has significant drawbacks as a starting point: having already merged two languages, Smalltalk and C, it is rather large and unwieldy, with weird overlaps and other oddities. Being a superset of C, it also pretty much demands being compiled, whereas I want at least the option of interpretation.

WebScript would be an alternative, but it is not just dead, but also proprietary. It also very closely mimics the somewhat baroque Objective-C syntax, but without the actual reasons for those syntactic compromises or the benefits that this sacrifice brings in Objective-C.

These and most other existing language choices would mean extending an existing language, grammar and implementation, meaning that the architectural features of the language would be almost certainly be 2nd class citizens, being tacked on wherever they can fit. That's not a good starting point (see: Objective-C).

So it really had to be a brand new language, one whose syntax and semantics would be under full control, and a syntactic starting point for the procedural part of that language. For this, I don't know of a better choice than the Smalltalk (keyword) message syntax. The syntax itself is tiny, with almost no reserved words or special characters reserved by the language, and almost all "language" features implemented as objects and messages without special syntax.

Having the procedural base as trivial as possible to implement is important, because I don't want to spend too much effort on it. I have (much) bigger fish to fry. Smalltalk also integrates well conceptually and syntactically with Cocoa and Objective-C, my preferred environment, a point not lost on the plethora of Smalltalk-based scripting languages available for Cocoa and GNUstep. And having a rich environment to integrate with is important for a language intended to connect existing pieces, which is what an architectural language does, or should do.

Not a Smalltalk

So given all the arguments for Smalltalk, surely Objective-Smalltalk is based on one of the existing Smalltalks, such as Squeak, enjoying the great development environment, malleable infrastructure etc.

Not so fast.

Although Smalltalk fits very well conceptually and syntactically, it doesn't fit at all well technically. It generally runs in its own world, the image, requires a complex and sophisticated VM, with all the integration headaches that entails (FFI, JNI, etc.), and comes with and requires a GC. Integrating multiple tracing GCs is essentially an unsolved problem, so that's a bit of a downer for a language that wants to be able to glue existing pieces together.

The philosophy of current Smalltalks is the exact opposite: it wants to be the entire world. This is understandable: when Smalltalk was created there simply wasn't a "rest of the world" to talk to, and there certainly wasn't room for it on the same machine, an Alto with 128-512KB of RAM and a 2.5MB removable hard disk.

A current Smalltalk has a lot of code, and communities that think it's just the greatest thing on earth, so resistance to change is significant and reasonable, both to smaller changes, because they just aren't worth it in that context, and to larger changes because they make things just too different. However, I want to make both large and small changes and not be hobbled by linguistic backwards compatibility.

[..] when ST hit the larger world, it was pretty much taken as "something just to be learned", as though it were Pascal or Algol. Smalltalk-80 never really was mutated into the next better versions of OOP. Given the current low state of programming in general, I think this is a real mistake.
Alan Kay: prototypes vs classes was: Re: Sun's HotSpot

So Objective-Smalltalk takes cues from Smalltalk where this is helpful, but it is not really a new version of Smalltalk. It's not a "better old thing" (>45 years), but a (probably worse) "new thing", and for that reason it has to strike out on its own.

Thursday, January 10, 2019

Objective-Smalltalk on Linux via GNUstep and Docker

Although Objective-Smalltalk's architectural orientation should be an ideal fit for The Cloud™, this has been largely theoretical in recent times as Objective-Smalltalk only ran on macOS and iOS.

Having used both GNUstep and Cocotron to port, for example, MusselWind to Linux in the past, I knew this wasn't an insurmountable probl...er opportunity. However, dealing with VMs and OSes, and configurations getting everything to compile was always painful, and usually not reproducible.

Enter Docker for Mac. I had obviously heard quite a bit about Docker, but so far never had an opportunity to actually try it out in anger. Well, maybe not "anger", more "mild chagrin", because Docker is not really meant for running Linux binaries on your Mac. However, it does that job really, really well. Instead of futzing around with VMs, dealing with special guest OS tools in order to get window sizes to match, Retina not working, displays not resizing etc. you just get a login onto a Linux "box" right in your Terminal window, no muss no fuss. Nice. And mounting directories from the Mac is also trivial. Niiiice.

The other part that makes this different is the whole idea of reproducibility via Dockerfiles. Because a container is at least conceptually ephemeral (practically you can keep it around for quite a bit), you don't capture your knowledge in the state of the VM you are working on. Instead, you accumulate the knowledge in the Dockerfile(s) used to construct your image.

The Dockerfile for creating a GNUstep image/container that is capable of compiling + running Objective-Smalltalk can be found here:

https://github.com/mpw/MPWFoundation/tree/master/GNUstep/gnustep-combined
The main Dockerfile loads first loads a bunch of packages via apt-get:
FROM ubuntu

ENV LC_ALL en_US.UTF-8
ENV LANG en_US.UTF-8
ENV TZ 'Europe/Berlin'
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone


RUN apt-get update && apt-get install -y \
    git \
    make \
    ssh \
    sudo \
    curl \
    inetutils-ping \
    vim build-essential  \
    clang llvm libblocksruntime-dev libkqueue-dev libpthread-workqueue-dev libxml2-dev cmake \
    libffi-dev \
    libreadline6-dev \
    libedit-dev \
    libmicrohttpd-dev \
    gnutls-dev 


RUN useradd -ms /bin/bash gnustep

COPY install-gnustep-clang /home/gnustep/
RUN chmod u+x /home/gnustep/install-gnustep-clang
RUN /home/gnustep/install-gnustep-clang  

COPY bashrc /home/gnustep/.bashrc
COPY profile /home/gnustep/.profile
COPY bashrc /root/.bashrc
COPY profile /root/.profile
COPY bashrc /.bashrc
COPY profile /.profile

COPY build-gnustep-clang /home/gnustep/build-gnustep-clang
RUN  mkdir -p /home/gnustep/patches/libobjc2-1.8.1/
COPY patches/libobjc2-1.8.1/objcxx_eh.cc /home/gnustep/patches/libobjc2-1.8.1/objcxx_eh.cc

RUN chmod u+x /home/gnustep/build-gnustep-clang
RUN /home/gnustep/build-gnustep-clang  


COPY build-gnustep-clang /home/gnustep/build-gnustep-clang
RUN  mkdir -p /home/gnustep/patches/libobjc2-1.8.1/
COPY patches/libobjc2-1.8.1/objcxx_eh.cc /home/gnustep/patches/libobjc2-1.8.1/objcxx_eh.cc

RUN chmod u+x /home/gnustep/build-gnustep-clang
RUN /home/gnustep/build-gnustep-clang  

CMD ["bash"]

It adds a user "gnustep", then runs a script that will downloads specific version of the gnustep and libobjc2 sources using a script, patches one of those sources which wouldn't compile for me and finally builds and installs the whole thing using another script, both adapted from Tobias Lensing's post.


#!/bin/bash
#


cd /home/gnustep/

echo Installing gnustep-make
export CC=clang
echo compiler is $CC

tar zxf gnustep-make-2.7.0.tar.gz
cd gnustep-make-2.7.0
./configure
make install
cd ..

echo 
echo 
echo ======================
echo Installing libobjc2


tar zxf libobjc2-1.8.1.tar.gz
cp patches/libobjc2-1.8.1/* libobjc2-1.8.1/
cd libobjc2-1.8.1
mkdir Build
cd Build
# cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang -DTESTS=OFF -DLLVM_OPTS=OFF  ..
cmake -DTESTS=OFF .. 
make install
cd ../..


cd gnustep-make-2.7.0
make clean
./configure --with-library-combo=ng-gnu-gnu
make install
cd ..

source /usr/local/share/GNUstep/Makefiles/GNUstep.sh
tar zxf gnustep-base-1.25.1.tar.gz 
cd gnustep-base-1.25.1
./configure
make installn
cd ..

echo Installation script finished successfully

One of the tricky bits about this is that you need to install the gnustep-make package twice, first to get libobjc2 to install, the second time with libobjc2 installed so all the other packages find the right runtime. The apt-get packages did not work for me, because I needed newer compiler/runtime features.

Once you have the image, you can start it up and log into it. I then su -l to the gnustep account and work with the MPWFoundation, ObjectiveSmalltalk and ObjectiveHTTPD mounted from my local filesystem, rather than having separate copies inside the container.

Not everything is ported, but basic expressions work, as does messaging back and forth between Objective-Smalltalk and Objective-C, Higher Order Messaging, etc. Considering some of the runtime trickery involved, I was surprised how straightforward it was to get this far.

In the future, I expect to also build a runtime image that has the libraries and executable pre-built.

Friday, January 4, 2019

Swifter Sieving of Primes with Software ICs

I've been working much more on Objective-Smalltalk again, which is great, but that also made me think more intently on the motivations, because a lot of what is there is probably not quite obvious without knowing why and how it got there.

One of the obvious motivations has been Objective-C, and in particular the notion of Software-ICs, components that are connected via dynamic messages. This is actually an instance of the Scripted Components (pdf) pattern, with the interesting twist that unlike most of the instances of this pattern, there is only a single programming language for both the scripts and the components, that language being Objective-C.

In a sense, Objective-C is not really a language, it is middleware with some language features. And therein lies much confusion, because people try to use it as "just" a language. In particular, they try to use the "Objective" parts as an algorithmic programming language, something it is eminently unsuited for. This is what the "C" part is for, which is somewhere around 90% of the actual language.

To back off for a bit, a Scripted Component system is one where you have components that are written in one language, often C or C++, that are then glued together using a more dynamic scripting language, for example Unix shell, Visual Basic. One of the more extreme examples of this pattern is computational steering of supercomputer applications using Tcl(!). Which brings up the point of performance: one of the interesting aspects of the Scripted Components pattern is that it combines the excellent performance of the component language with the flexibility and interactivity of the scripting language.

Which brings me back to Objective-C, almost. No one at the supercomputer centers that used Tcl for steering would have had the idea of implementing their actual computation in Tcl. No, implementing the components is what you use C or C++ or maybe FORTRAN for. You then combine, parametrise and maybe invoke those components from the scripting language.

Yet when it comes to Objective-C, that is exactly what people do: they implement the algorithmic parts in the scripting/middleware part of Objective-C and then wonder why it comes out looking like the following (taken from Vincent's post):


NSArray<NSNumber *> *sieve(NSArray<NSNumber *> *sorted) {
  if ([sorted count] == 0) { return [NSArray new]; }
  NSUInteger head = sorted[0].unsignedIntegerValue;
  NSArray *tail = [sorted subarrayWithRange:NSMakeRange(1, sorted.count - 1)];
  NSIndexSet *indexes = [tail indexesOfObjectsPassingTest:^BOOL(NSNumber *element, NSUInteger idx, BOOL *stop) {
    return element.unsignedIntegerValue % head > 0;
  }];
  NSArray *sieved = [tail objectsAtIndexes:indexes];
  return [@[@(head)] arrayByAddingObjectsFromArray:sieve(sieved)];
}

NSMutableArray *numbers = [NSMutableArray new];
for (NSUInteger i = 2; i <= 100; i++) {
  [numbers addObject: @(i)];
}
NSArray *primes = sieve(numbers);
NSLog(@"%@ -> %@", numbers, primes);

Yes, this code doesn't make sense. Or it makes about as much sense as implementing your computational fluid dynamics simulation in Tcl and then scripting it using C. And yes, the Swift code definitely looks a bit nicer:


func sieve(_ sorted: [Int]) -> [Int] {
	guard !sorted.isEmpty else { return [] }
	let (head, tail) = (sorted[0], sorted[1..&now lt;sorted.count])
	return [head] + sieve(tail.filter { $0 % head > 0 })
}

let numbers = Array(2...100)
let primes = sieve(numbers)
print("\(numbers) -> \(primes)")

The first thing I noticed was that I couldn't even tell what this was doing, it took me a bit of close reading of the code to realise that my confusion was due to the fact that although this does compute primes, it isn't actually the Sieve of Eratosthenes!

The hallmark of the Sieve is that it computes primes while avoiding expensive division (and multiplication by anything other than 2, a simple shift left) operations. It does this by allocating an array with the candidate numbers (it starts from 2, because all numbers are multiples of 1) and then eliminating any known non-primes (multiples of some other number) by stepping through the array at a certain step-size. To eliminate all multiples of 2, step through the array with a step size of 2. To eliminate all multiples of 3, step through the array with a step size of 3, and so on. This was great on small microcomputers that did not have multiplication or division operations in hardware.

The algorithm presented here just checks all remaining numbers if they are divisible by the current number using a modulo operation.

But the fact that this is the wrong algorithm wasn't really what caught my eye, rather that was the unidiomatic and to me, quite frankly, rather bizarre way of using Objective-C: this example uses Objective-C Foundation objects to implement algorithms on integers.

It is not surprising that this code looks weird, because it is. Objective-C is a hybrid language intended for the construction and connection of Software-ICs. You implement the algorithms inside the Software-IC using C and then package it up (and connect) with the dynamic messaging provided by the "Objective" part.

So let's do that! First, we grab a C implementation of the (real) Sieve of Eratosthenes algorithm, for example from the first Duck Go Go search result here:


#include <stdio.h>

#define LIMIT 1500000 /*size of integers array*/
#define PRIMES 100000 /*size of primes array*/

int main(){
    int i,j,numbers[LIMIT];
    int primes[PRIMES];

    /*fill the array with natural numbers*/
    for (i=0;i<limit;i++){
        numbers[i]=i+2;
    }

    /*sieve the non-primes*/
    for (i=0;i<limit;i++){
        if (numbers[i]!=-1){
            for (j=2*numbers[i]-2;j<limit;j+=numbers[i])
                numbers[j]=-1;
        }
    }

    /*transfer the primes to their own array*/
    j = 0;
    for (i=0;i<limit&&j<primes;i++)
        if (numbers[i]!=-1)
            primes[j++] = numbers[i];

    /*print*/
    for (i=0;i<primes;i++)
        printf("%dn",primes[i]);

return 0;
}

This somewhat longer than the previous examples, but it does have the distinct advantage of actually being a Sieve of Eratosthenes :-). The way it works is that it first intialises an array of size N with the integers from 1 to N. The Sieve then knocks out multiples of n for n in 1 to N. The "knocking out" is performed by setting the array element to -1. Finally, all the non-negative numbers are retrieved from the array. These are the primes, which are then printed.

Sine Objective-C is a strict superset of C, we could simply use this code as-is. However, it does lack some conveniences, for example the array sizes are fixed, the code isn't reusable and for example printing is very manual. So using the idea of the Software-IC, we notice that an array of integers is the central data structure here. Apple's Foundation doesn't provide such a beast, but that isn't a big deal, we can easily write one ourselves, or more precisely reuse the one provided in MPWFoundation.

As this is going to be running all over the array, we probably want to break open the encapsulation boundary of the Software-IC and implement the core of the algorithm as an extension, er, category of MPWIntArray.

Another slight improvement is breaking the Sieve algorithm up into smaller, named pieces that tell you what is going on, and using conveniences such as -select: to filter the results. Printing of the contents is also something that is provided.


@import MPWFoundation;

@implementation MPWIntArray(primes)

-(void)knockoutMultiplesOf:(int)n
{
    int *numbers=data;
    if (n > 0){
       for (int j=2*n-2,max=[self count];j<max;j+=n) {
           numbers[j]=-1;
       }
    }
}

-(void)sievePrimes
{
    for (int i=0,max=[self count];i<max;i++){
        [self knockoutMultiplesOf:data[i]];
    }
}

@end

int main(int argc, char *argv[])
{
    int n=argc > 1 ? atoi(argv[1]) : 100000;
    MPWIntArray *numbers=[[@(2) to:@(n-2)] asArray];
    [numbers sievePrimes];

    MPWIntArray *primes = [numbers select:^(int potentialPrime) {
        return (BOOL)(potentialPrime > 0);
    }];
    printf("primes: %s\n",[[primes description] UTF8String]);
    printf("number of primes: %d\n",[primes count]);
}

One of the benefits of the Software-IC or Scripted Components style is performance, otherwise you wouldn't want to use it in supercomputer settings. In this particular case I ran the Swift version against the Software-IC version (I didn't bother with the original Objective-C version). The performance difference is so remarkable that it was actually difficult to find values for the N parameter that gave at least somewhat reasonable/comparable results for both variants:
N Swift Sieve Software-IC Sieve Ratio
100000 0,501 0,006 83,5
200000 1,78 0,007 254,3
300000 3,78 0,008 472,5
400000 6,43 0,009 714,4
500000 9,74 0,010 974,0
Sieve times
I couldn't go much higher than N=500,000 with the Swift version because at N=500,000 it was already using around 8GB of memory and rising quickly. At that point you start to see paging activity on a reasonably busy machine, particularly one with Xcode open and I didn't want to get my machine "benchmark-clean". At those same N values, the Software-IC version isn't really getting started yet, the execution time actually being dominated by startup costs and random fluctuations. In fact, launching it with N=1 also takes 6-8ms the same as for N=500,000.

So yes, using Swift to do computations is certainly a better choice than abusing the middleware part of Objective-C to do those computations. On the other hand, used properly for constructing a Software-IC and then providing an interface to it, Objective-C does quite well. Which doesn't mean we couldn't do better for both parts: yes, somewhat better algorithmics but more importantly better middleware and less muddle about which is which. It shouldn't come as a complete shock that that is the (rough) idea behind Objective-Smalltalk, but more on that later.

UPDATE:
My little post triggered a small twitter discussion that was really much better than the post itself. Although my point was really mostly about the scripted components pattern and how it relates to Objective-C and its Software-ICs, I focused a bit too much on the performance aspect (occupational hazard). It also comes off a bit too much as "Swift vs. Objective-C", so to make that clear: the Objective-C code initially presented is probably slowest, and I am reasonably certain that you can write a C-like Sieve using Swift with roughly comparable performance.

For me, the bigger point was that this code was a good illustration of the misunderstanding and misapplication of Objective-C. I am also decidedly not saying that this misapplication is due ignorance of the people doing the misapplication, I think it has to do with the fact that the tension between scripted coomponents and unified language is something we haven't figured out yet.

Anyway, Helge managed to put the point very succinctly:

Also insightful as alway, Joe Groff:

And a little bit of "hey, I didn't know that":

And last not least: apologies again to Vincent for picking on him, just a little bit :-)