
Monday, May 8, 2023

Setting up Hetzner ARM instances with and for Objective-S

The recent introduction of reasonably-priced ARM64 VPS instances by Hetzner was accompanied by a big smile and a sigh of relief on my part. I had previously decided to prioritize ARM for Objective-S (for example, the native compiler is currently ARM64-only), but the simple and low-cost VPS providers like DigitalOcean were sticking to x86 exclusively.

Although it is possible to operate in a mixed ARM/x86 environment, the added complexity is not something I want as a default, which is why I also switched the hosting of the Objective-S site from DigitalOcean to the Oracle cloud (on their "free forever" tier), as that was the only way to host on ARM without incurring monthly charges upwards of $40. With a number of alternatives now spanning the spectrum, I finally felt it was safe to commit to ARM as the default.

I've long had a strong hunch that there is both room and a strong need for something between the "we'll just hack together a few simple shell scripts" of the (very good!) Deployment from Scratch and the aircraft carrier that is Kubernetes.

With the external pieces finally in place, it's time to follow that hunch, and what better way than to control the Hetzner server API using Objective-S?

Talking to the API

Perusing the documentation, we see that the base URL for talking to the API is https://api.hetzner.cloud/v1/. So let's set up an API scheme handler for talking to the Hetzner API, and also set up the authentication header and indicate that we will be using JSON:


scheme:https setHeaders: #{
    #Content-Type: 'application/json',
    #Authorization: "Bearer {keychain:password/hetzner-api/metaobject}"
}.
scheme:api := ref:https://api.hetzner.cloud/v1 asScheme.

It's not a lot of code, but there is quite a bit going on: first, the token is stored in the macOS keychain, accessed via keychain:password/hetzner-api/metaobject and interpolated into the Bearer string inside the dictionary literal. The api: scheme is now available for talking to the Hetzner API, so for example api:servers will be sent as https://api.hetzner.cloud/v1/servers.

That setup now allows us to define a simple class for interacting with the API:


class HetznerCloud {
   var api.
   -schemeNames { [ 'api' ]. }
   -images {
	api:images.
   }
   -types {
	api:server_types.
   }
} 

It currently has two user-facing methods: -images, which lists the kinds of images that are available, and -types, which lists the server types. The method bodies may appear to be a little short, but that really is all that's needed. The -schemeNames method makes the api: scheme handler available within method bodies of this class.
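In the transcript below, a cloud instance is assumed to have been created first, presumably simply like this (my assumption, as the post doesn't show this step):

] cloud := HetznerCloud new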

Below is an excerpt of an interactive st-shell session first asking the API for image types and then for server types:


] cloud images
{ "images" = ( { "id" = 3;
"description" = "CentOS 7";
"created_from" = ;
"bound_to" = ;
"rapid_deploy" = true;
"deprecated" = ;
"os_flavor" = "centos";
"type" = "system";
"protection" = { "delete" = false;
} ;
"image_size" = ;
"labels" = { } ;
"deleted" = ;
"architecture" = "x86";
"created" = "2018-01-15T11:34:45+00:00";
"os_version" = "7";
"disk_size" = 5;
"status" = "available";
...
] cloud types
...
{ "memory" = 4;
"prices" = ( { "price_monthly" = { "net" = "3.2900000000";
"gross" = "3.9151000000000000";
} ;
...
} ;
} ) ;
"storage_type" = "local";
"id" = 45;
"cpu_type" = "shared";
"disk" = 40;
"deprecated" = ;
"architecture" = "arm";
"description" = "CAX11";
"name" = "cax11";
"cores" = 2;
}
...
 

The "CAX11" instance type is the entry-level ARM64 instance that we want to use.

Creating a server

Creating a VPS is accomplished by POSTing a dictionary describing the desired properties of the server to the servers endpoint:
extension HetznerCloud {
   -baseDefinition {
	#{ 
	    #location: 'fsn1',
	    #public_net: #{
                #enable_ipv4: true,
                #enable_ipv6: false,
           }
	}.
   }
   -armServerDefinition {
	#{
           #name:  'objst-2',
           #image: '103908070',
           #ssh_keys: ['marcel@naraht.local' ],
           #server_type: 'cax11',
	} , self baseDefinition.
   }
   -create {
	  ref:api:servers post: self armServerDefinition asJSON.
   }
}

The -create method sends the post: message directly to the reference of the endpoint.
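With that, creating the server from the st-shell is a one-liner (using the cloud instance assumed earlier):

] cloud create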

Interacting with servers

Once we have a server, we probably want to interact with it in some way, at the very least to be able to delete it again. Although we could do this using methods of the cloud API taking an extra server_id parameter, it is nicer to create a separate server abstraction that lets us interact with the server and encapsulates the necessary information.

The HetznerHost is initialized with a server response, from which it uses the IP address and the server id, the latter to define a server: scheme handler. The fact that it's a subclass of MPWRemoteHost will become relevant later.


class HetznerHost : MPWRemoteHost {
   var hostDict.
   var id.
   var server.

   +withDictionary:theServer {
	self alloc initWithDictionary:theServer.
   }
   -initWithDictionary:theServer {
       self := super initWithName:(theServer at:'public_net' | at:'ipv4' | at:'ip') user:'root'.
       self setHostDict:theServer.
       self setId: theServer['id'].
       self setServer: ref:api:/servers/{this:id} asScheme.

       self.
     }
     -schemeNames { ['server']. }
     -status { this:hostDict at:'status'. }
     -delete {
         ref:server:/ delete.
     }
}

The DELETE is handled similarly to the POST above, by sending a delete message to the root reference of the server: scheme.

We get server instances with a GET from the API's servers endpoint, the same one we POSTed to when creating the server. The collect HOM makes it straightforward to map the dictionaries returned by the API to actual server objects:


extension HetznerCloud {
   -servers {
	HetznerHost collect withDictionary: (api:servers at:'servers') each.
   }
}
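To illustrate, deleting the first of the returned servers from the st-shell would look something like this (a usage sketch; firstObject is the standard NSArray accessor):

] (cloud servers) firstObject delete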

At this point, you're probably thinking that having a class representing servers, with its own scheme-handler to boot, is a bit of overkill if all we are going to do is send a DELETE. And you'd be right, so here are some of the other capabilities:
extension HetznerHost {
     -actions { api:servers/{this:id}/actions value.  }
     -liveStatus { server:status. }
     -refresh {
         self setHostDict: (server:/ value at:'server').
     }
     -shutdown {
         ref:server:actions/shutdown post:#{}.
     }
     -start {
         ref:server:actions/poweron post:#{}.
     }
     -reinstall:osName {
         ref:server:actions/rebuild post: #{ #image: osName }.
     }
     -reinstall {
         self reinstall:'ubuntu-20.04'.
     }
}
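For example, taking a server through a shutdown/rebuild cycle in the st-shell might look like this (a usage sketch built on the methods above; ubuntu-20.04 is the default baked into -reinstall):

] server := cloud servers firstObject
] server shutdown
] server reinstall
] server refresh
] server status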

With this, we have complete lifecycle control over the server, with a surprisingly small amount of surprisingly straightforward code, thanks to Objective-S abstractions such as Polymorphic Identifiers, Storage Combinators and Higher Order Messaging.

What's more, this control is available both immediately in script form, as well as for reuse in other applications as objects.

Installing Objective-S

Now that we can create, start, stop and destroy virtual servers, it would be nice to actually do something with them. For example: run Objective-S and Objective-S-based web-servers.

This is where the MPWRemoteHost comes in. This is what it says on the tin: a representation of a remote host, very rudimentary for now. One of the few things it knows how to do is set up an ssh connection to that remote host to execute commands and transfer files via SFTP. The latter is surfaced as a store, so you can create files on a remote host as easily as assigning to a local variable:


dest:hello.txt := 'Hello world!'.

Copying files is similar:
dest:hello.txt := file:hello.txt.

The script copies a tar archive containing both GNUstep and the Objective-S libraries, which it then untars into the /usr directory of the target machine. In addition, it transfers the interactive Objective-S shell st, the runsite command that serves ".sited" bundles via HTTP, and a .bashrc that sets up some needed environment variables.
extension MPWHost { 
 -installObjS {
	scheme:dest := self store.
	filenames := [ 'ObjS-GNUstep-installed.tgz', 'st', '.bashrc', 'runsite' ].
	filenames do: { :filename | 
	     dest:{filename} := file:{filename}.
	}.
	self run:'chmod a+x st runsite';
	     run:'cd /usr ; tar zxf ~/ObjS-GNUstep-installed.tgz';
	     run:'mv st /usr/local/bin';
	     run:'mv runsite /usr/local/bin'.
   }
}
host := MPWHost host:hostip user:'root'.
host installObjS.

As this is an extension to MPWHost, which is the superclass of the MPWRemoteHost we used as the base for our HetznerHost, the server objects we use have the ability to install Objective-S on them. Neat.

And so do the server objects for the very similar script controlling DO droplets.

Conclusion

When I started out on this little excursion, my goal was not to demonstrate anything about Objective-S, I only needed to be able to use these cloud systems, and my hunch was that Objective-S would be good for the task.

It turned out even better than my hunch had suggested: the various features and characteristics of Objective-S, such as Polymorphic Identifiers, first class references, nested scheme handlers, and Higher Order Messaging, really work together quite seamlessly to allow interaction with both a REST API and with a remote host to be expressed compactly and naturally. In addition, it manages to naturally bridge the gap between ad-hoc scripting and proper modelling, remaining hackable without creating a mess.

It's working...

Tuesday, August 9, 2022

Native-GUI distributed system in a tweet

If I've been a bit quiet recently, it's not due to lack of progress, but rather the very opposite: so much progress in Objective-S land that my head is spinning, and I am having a hard time both processing it all and seeing where it goes.

But sometimes you need to pause, reflect, and show your work, in whatever intermediate state it currently is. So without further ado, here is the distributed system, with GUI, in a tweet:

It pops up a window with a text field, and stores whatever the user enters in an S3 bucket. It continues to do this until the user closes the window, at which point the program exits.

Of course, it's not much of a distributed system, particularly because it doesn't actually include the code for the S3 simulator.

Anyway, despite fitting in a tweet, the Objective-S script is actually not code golf, although it may appear as such to someone not familiar with Objective-S.

Instead, it is a straightforward definition and composition of the elements required:

  1. A storage combinator for interacting with data in S3.
  2. A text field inside a window, defined as object literals.
  3. A connection between the text field and a specific S3 bucket.
That's it, and it is no coincidence that the structure of the system maps directly onto the structure of the code. Let's look at the parts in detail.

S3 via Storage Combinator

The first line of the script sets up an S3 scheme handler so we can interact with the S3 buckets almost as if they were local variables. For example the following assignment statement stores the text 'Hello World!' in the "msg.txt" file of "bucket1":

   s3:bucket1/msg.txt ← 'Hello World!'
Retrieving it works similarly:

   stdout println: s3:bucket1/msg.txt
The URL of our S3 simulator is http://defiant.local:2345/, meaning it is running on host defiant in the local network, addressed via Bonjour and listening on port 2345. As Objective-S supports Polymorphic Identifiers (pdf), this URL is a directly evaluable identifier in the language. Alas, that directness poses a problem, because writing down an identifier in most programming languages yields the value of the variable the identifier identifies, and Objective-S is no exception. In the case of http://defiant.local:2345/, that value is the directory listing of the root of the S3 server, encoded as the following XML response:

<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Owner><ID>123</ID><DisplayName>FakeS3</DisplayName></Owner>
<Buckets>
<Bucket>
<Name>bucket1</Name>
<CreationDate>2022-08-10T15:18:32.000Z</CreationDate>
</Bucket>
</Buckets>
</ListAllMyBucketsResult>
That's not really what we want; we want to refer to the URL itself. The ref: operator allows us to do this by preventing evaluation and thus returning the reference itself, very similar to the & operator that creates pointers in C.

Except that an Objective-S reference (or more precisely, a binding) is much richer than a C pointer. One of its many capabilities is that it can be turned into a store by sending it the -asScheme message. This new store uses the reference it was created from as its base URL, all the references it receives are evaluated relative to this base reference.

The upshot is that with the s3: scheme handler defined and installed as described, the expression s3:bucket1/msg.txt evaluates to http://defiant.local:2345/bucket1/msg.txt.

This way of defining shorthands has proven extremely useful for making complex references usable and modular, and is an extremely common pattern in Objective-S code.
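For concreteness, the s3: scheme handler set up in the script's first line presumably follows exactly this pattern (my reconstruction, since the script itself isn't reproduced here):

   scheme:s3 ← ref:http://defiant.local:2345 asScheme.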

Declarative GUI with object literals

Next, we need to define the GUI: a window with a text field. With object literals, this is pretty trivial. Object literals are similar to dictionary literals, except that you get to define the class of the instance defined by the key/value pairs, instead of it always being a dictionary.

For example, the following literal defines a text field with certain dimensions and assigns it to the text local variable:

   text ← #NSTextField{ #stringValue:'',#frame:(10@45 extent:180@24) }.
And a window that contains the text field we just defined:
   window ← #NSWindow{ #frame:(300@300 extent:200@105),#title:'S3', #views:#[text]}.
It would have been nice to define the text field inline in its window definition, but we currently still need a variable so we can connect the text field (see next section).

Connecting components

Now that we have a text field (in a window) and somewhere to store the data, we need to connect these two components. Typically, this would involve defining some procedure(s), callback(s) or some extra-linguistic mechanism to mediate or define that connection. In Objective-S, we just connect the components:

   text → ref:s3:bucket1/msg.txt.
That's it.

The right-arrow "→" is a polymorphic connection "operator". The complete connection is actually significantly more complex:

  1. From a port of the source component
  2. To a role of the mediating connector compatible with that source port
  3. To a role of the mediating connector compatible with the target object's port
  4. To that compatible port of the target component
If you want, you can actually specify all these intermediate steps, but most of the time you don't have to, as the machinery can figure out what ports and roles are compatible. In this case, even the actual connector was determined automatically.

If we didn't want a remote S3 bucket, we could also have stored the data in a local file, for example:

   text → ref:file:/tmp/msg.txt.
That treats the file like a variable, replacing the entire contents of the file with the text that was entered. Speaking of variables, we could of course also store the text in a local variable:

   text → ref:var:message.
In our simple example that doesn't make a lot of sense because the variable isn't visible anywhere and will disappear once the script terminates, but in a larger application it could then trigger further processing.

Alternatively, we could also append the individual messages to a stream, for example to stdout:

   text → stdout.
So every time the user hits return in the text field, the content of the text field is written to the console. Or appended to a file, by connecting to the stream associated with the file rather than the file reference itself:
   text → ref:file:/tmp/msg.txt outputStream.
This doesn't have to be a single stream sink, it can be a complex processing pipeline.

I hope this makes it clear, or at least strongly hints, that this is not the usual low-code/no-code trick of achieving compact code by creating super-specialised components and mechanisms that work well for a specific application, but immediately break down when pushed beyond the demo.

What it is instead is a new way of creating components, defining their interfaces and then gluing them together in a very straightforward fashion.

Eval/apply vs. connect and run

Having constructed our system by configuring and connecting components, what's left is running it. CLIApp is a subclass of NSApplication that knows how to run without an associated app wrapper or Info.plist file. It is actually instantiated by the stui script runner before the script is started, with the instance dropped into the app variable for the script.
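Piecing together the fragments quoted above, the whole script presumably looks roughly like this (a reconstruction, since the tweet itself isn't reproduced here; the final app run line, which starts the CLIApp instance, is my assumption):

   #!env stui
   scheme:s3 ← ref:http://defiant.local:2345 asScheme.
   text ← #NSTextField{ #stringValue:'', #frame:(10@45 extent:180@24) }.
   window ← #NSWindow{ #frame:(300@300 extent:200@105), #title:'S3', #views:#[text] }.
   text → ref:s3:bucket1/msg.txt.
   app run.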

This is where we leave our brave new world of connected components and return (or connect with) the call/return world, similar to the way Cocoa's auto-generated main with call to NSApplicationMain() works.

The difference between eval/apply (call/return) and connect/run is actually quite profound, but more on that in another post.

Of course, we didn't leave call/return behind; it is still present and useful for certain tasks, such as transforming an element into something slightly different. However, for constructing systems, having components that can be defined, configured and connected directly ("declaratively") is far superior to doing so procedurally, even with the fluent APIs that have recently popped up and have been mislabeled as "declarative".

This project is turning out even better than I expected. I am stoked.

Monday, June 20, 2022

Blackbird: A reference architecture for local-first connected mobile apps

Wow, what a mouthful! Although this architecture has featured in a number of my other writings, I haven't really described it in detail by itself. Which is a shame, because I think it works really well and is quite simple, a case of Sophisticated Simplicity.

Why a reference architecture?

The motivation for creating and now presenting this reference architecture is that the way we build connected mobile apps is broken, and none of the proposed solutions appear to help. How are they broken? They are overly complex, require way too much code, perform poorly and are unreliable.

Very broadly speaking, these problems can be traced to the misuse of procedural abstraction for a problem space that is broadly state-based. They can be addressed by adapting a state-based architectural style such as in-process REST and combining it with well-known styles such as MVC.

More specifically, MVC has been misapplied by combining UI updates with model updates, a practice that becomes especially egregious with asynchronous callbacks. In addition, data is pushed to the UI, rather than having the UI pull data when and as needed. Asynchronous code is modelled using call/return and callbacks, leading to callback hell and the needless, arduous transformation of any dependent code into asynchronous code (see "what color is your function"), which is also much harder to read, discouraging appropriate abstractions.

Backend communication is also an issue, with newer async/await implementations not really being much of an improvement over callback-based ones, and arguably worse in terms of actual readability. (They seem readable, but what actually happens is different enough that the simplicity is deceptive.)

Overview

The overall architecture has four fundamental components:
  1. The model
  2. The UI
  3. The backend
  4. The persistence
The main objective of the architecture is to keep these components in sync with each other, so the whole thing somewhat resembles a control loop architecture: something disturbs the system, for example the user did something in the UI, and the system responds by re-establishing equilibrium.

The model is the central component, it connects/coordinates all the pieces and is also the only one directly connected to more than one piece. In keeping with hexagonal architecture, the model is also supposed to be the only place with significant logic, the remainder of the system should be as minimal, transparent and dumb as possible.

memory-model := persistence.
persistence  |= memory-model.
ui          =|= memory-model. 
backend     =|= memory-model.


Elements

Blackbird depends crucially on a number of architectural elements: first are stores of the in-process REST architectural style. These can be thought of as in-process HTTP servers (without the HTTP, of course) or composable dictionaries. The core store protocol implements the GET, PUT and DELETE verbs as messages.

The role of URLs in REST is taken by Polymorphic Identifiers. These are objects that identify values in the store, but are not direct pointers; for example, they need to be able to reference objects that aren't there yet.

Polymorphic Identifiers can be application-specific; for example, they might consist of just a numeric id.

MVC

For me, the key part of the MVC architectural style is the decoupling of input processing and the resultant output processing. That is, under MVC, the view (or a controller) makes some change to the model, and then processing stops. At some undefined later time (it could be synchronous, but does not have to be), the model informs the UI that it has changed, using some kind of notification mechanism.

In Smalltalk MVC, this is a dependents list maintained in the model that interested views register with. All these views are then sent a #changed message when the model has changed. In Cocoa, this can be accomplished using NSNotificationCenter, but really any kind of broadcast mechanism will do.

It is then the views' responsibility to update themselves by interrogating the model.

For views, Cocoa largely automates this: on receipt of the notification, the view just needs to invalidate itself; the system then automatically schedules it for redrawing the next time through the event loop.

The reason this decoupling is important to maintain is that the update notification can come for any number of other reasons, including a different user interaction, a backend request completing, or even some sort of notification or push event coming in remotely.

With the decoupled M-V update mechanism, all these different kinds of events are handled identically, and thus the UI only ever needs to deal with the local model. The UI is therefore almost entirely decoupled from network communications; we thus have a local-first application that is also largely testable locally.

Blackbird refines the MVC view update mechanism by adding the polymorphic identifier of the modified item in question and placing those PIs in a queue. The queue decouples model and view even more than in the basic MVC model; for example, it becomes fairly trivial to make the queue writable from any thread but have it empty only onto the main thread for view updates. In addition, providing update notifications is no longer synchronous: the updater just writes an entry into the queue and can then continue; it doesn't wait for the UI to finish its update.

Decoupling via a queue in this way is almost sufficient for making sure that high-speed model updates don't overwhelm the UI or slow down the model. Both of these performance problems are fairly rampant. As an example of the first, the Microsoft Office installer saturates both CPUs on a dual-core machine just painting its progress bar, because it massively overdraws.

An example of the second was one of the real performance puzzlers of my career: an installer that was extremely slow, despite both CPU and disk being mostly idle. The problem turned out to be that the developers of that installer not only insisted on displaying every single file name the installer was writing (bad enough), but also flushing the window to screen to make sure the user got a chance to see it (worse). This then interacted with a behavior of Apple's CoreGraphics, which disallows screen flushes at a rate greater than the screen refresh rate, and will simply throttle such requests. You really want to decouple your UI from your model updates and let the UI process updates at its pace.

Having polymorphic identifiers in the queue makes it possible for the UI to catch up on its own terms, and also to remove updates that are no longer relevant, for example discarding duplicate updates of the same element.

The polymorphic identifier can also be used by views in order to determine whether they need to update themselves, by matching against the polymorphic identifier they are currently handling.
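To make the mechanism concrete, here is a rough Objective-S sketch of such an update queue. This is purely illustrative: Blackbird is a reference architecture, and none of these names (ViewUpdateQueue, noteChanged:, updateItemWithIdentifier:) are prescribed by it.

class ViewUpdateQueue {
   var pending.      // mutable array of changed PIs, initialized elsewhere
   var view.
   -noteChanged:pi {
      // writable from any thread; a PI already queued means this update is covered
      (this:pending containsObject: pi) ifFalse: { this:pending addObject: pi. }.
   }
   -drain {
      // emptied only onto the main thread; views pull fresh state from the model
      this:pending do: { :pi | this:view updateItemWithIdentifier: pi. }.
      this:pending removeAllObjects.
   }
}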

Backend communication

Almost all the REST backend communication code I have seen in mobile applications has created "convenient" cover methods for every operation of every endpoint accessed by the application, possibly automatically generated.

This ignores the fact that REST has only a few verbs, combined with a great number of identifiers (URLs). In Blackbird, there is a single channel for backend communication: a queue that takes a polymorphic identifier and an HTTP verb. The polymorphic identifier is translated to a URL of the target backend system, the resulting request is executed, and when the result returns it is placed in the central store under the provided polymorphic identifier.

After the item has been stored, an MVC notification with the polymorphic identifier in question is enqueued as per above.

The queue for backend operations is essentially the same kind of queue we described for model-view communication above, for example with the same ability to deduplicate requests correctly, so that only the final version of an object gets sent if there are multiple updates. The remainder of the processing is performed in pipes-and-filters architectural style using polymorphic write streams.

If the backend needs to communicate with the client, it can send URLs via a socket or other mechanism that tells the client to pull that data via its normal request channels, implementing the same pull-constraint as in the rest of the system.

One aspect of this part of the architecture is that backend requests are reified and explicit, rather than implicitly encoded on the call-stack and its potentially asynchronous continuations. This means it is straightforward for the UI to give the user appropriate feedback for communication failures on the slow or disrupted network connections that are the norm on mobile networks, as well as avoid accidental duplicate requests.

Despite this extra visibility and introspection, the code required to implement backend communication is drastically reduced. Last but not least, the code is isolated: network code can operate independently of the UI, just as the UI can operate independently of the network code.
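A correspondingly minimal sketch of the fetch side of that single channel, reusing the hypothetical ViewUpdateQueue from the earlier sketch; every name here (BackendChannel, refForIdentifier:, the at:put: store protocol) is a stand-in of my own invention, not the actual Blackbird API:

class BackendChannel {
   var base.        // ref to the backend root
   var store.       // the central in-memory store
   var updates.     // the view-update queue from the previous section
   -fetch:pi {
      // translate the PI to a backend reference, GET it, store the result,
      // then enqueue an MVC update notification for that same PI
      this:store at:pi put: (this:base refForIdentifier: pi) value.
      this:updates noteChanged: pi.
   }
}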

Persistence

Persistence is handled by stacked stores (storage combinators).

The application is hooked up to the top of the storage stack, the CachingStore, which looks to the application exactly like the DictStore (an in-memory store). If a read request cannot be satisfied from the cache, the data is instead read from disk, converted from JSON by a mapping store.

For testing the rest of the app (rather than the storage stack), it is perfectly fine to just use the in-memory store instead of the disk store, as it has the same interface and behaves the same, except being faster and non-persistent.

Writes use the same asynchronous queues as the rest of the system, with the writer getting the polymorphic identifiers of objects to write and then retrieving the relevant object(s) from the in-memory store before persisting. Since they use the same mechanism, they also benefit from the same uniquing properties, so when the I/O subsystem gets overloaded it will adapt by dropping redundant writes.
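Expressed as code, assembling the stack might look roughly like this; the store names follow the article's terminology, but the constructor messages are my assumptions:

disk  := DiskStore storeWithDirectory:'data'.           // JSON files on disk
json  := JSONMappingStore storeWithSource: disk.        // maps objects to/from JSON
store := CachingStore storeWithCache: DictStore store source: json.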

Consequences

With the Blackbird reference architecture, we not only replace complex, bulky code with much less and much simpler code, we also get to reuse that same code in all parts of the system while making the pieces of the system highly independent of each other and optimising performance.

In addition, the combination of REST-like stores that can be composed with constraint- and event-based communication patterns makes the architecture highly decoupled. In essence it allows the kind of decoupling we see in well-implemented microservices architectures, but on mobile apps without having to run multiple processes (which is often not allowed).

Sunday, July 25, 2021

Deleting Code to Double the Performance of my Trivial Objective-S Tasks Backend

About two months ago, I showed a trivial tasks backend for a hypothetical ToDoMVC app. At the time, I noted that the performance was pretty insane for something written in a (slow) scripting language: 7K requests per second when fetching a single task.

That was using an encoder method (that writes key/value pairs to the JSON encoder) written in Objective-S, and I wondered how much faster it would go if that was no longer the case. Twice as fast, it turns out.

Yesterday, I wrote about tuning Objective-S's SQLite insert performance to around 130M rows/minute, coincidentally also for a simple tasks schema. One part of that performance story was the fact that the encoder method (writing key/value pairs to the SQLite encoder) was generated by pasting together Objective-C blocks and installing the whole thing as an Objective-C method. No interpretation, except for calling a series of blocks stored in an NSArray. I had completely forgotten about the hand-written Objective-S encoder method in the backend's Task class! Since generation is automatic but won't override an already existing method, all I had to do in order to get the better performance was delete the old method.
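For illustration, the deleted hand-written method would have looked something like the following; the exact encoder protocol is my guess, but the point is that it performs one interpreted message send per key/value pair:

-writeOnJSONCoder:aCoder {
    aCoder writeInteger: this:id forKey:'id'.
    aCoder writeInteger: this:done forKey:'done'.
    aCoder writeString: this:title forKey:'title'.
}

With that method gone, the generated encoder takes over: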


> wrk -c 1 -t 1 http://localhost:8082/tasks 
Running 10s test @ http://localhost:8082/tasks
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    66.60us    9.69us   1.08ms   96.72%
    Req/Sec    14.95k   405.55    15.18k    98.02%
  150275 requests in 10.10s, 30.67MB read
Requests/sec:  14879.22
Transfer/sec:      3.04MB
> curl  http://localhost:8082/tasks         
[{"id":1,"done":0,"title":"Clean Room"},{"id":2,"done":1,"title":"Check Twitter"}]%

More than twice the performance, and that while fetching two tasks instead of just one, so around 30K tasks/second! (And yes, I checked that I wasn't hitting a 404...).

So what's the performance if we actually fetch more than a minimal number of tasks? For 128 tasks, 64x more than before, it's still around 9K requests/s, so most of the time so far was per-request overhead. At this point we are serving a little over 1M tasks/s:


> wrk -c 1 -t 1 'http://localhost:8082/tasks/' 
Running 10s test @ http://localhost:8082/tasks/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   112.13us   76.17us   5.57ms   99.63%
    Req/Sec     9.05k   397.99     9.21k    97.03%
  90923 requests in 10.10s, 483.41MB read
Requests/sec:   9002.44
Transfer/sec:     47.86MB

If memory serves, that was around the rate we were seeing with the Wunderlist backend when we had a couple of million users, not that these are comparable in any meaningful way. For 1024 tasks there's a significant drop to slightly above 1.8K requests/s, with the task-rate almost doubling to 1.8M/s:


> wrk -c 1 -t 1 'http://localhost:8082/tasks/' 
Running 10s test @ http://localhost:8082/tasks/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   552.06us   62.77us   1.84ms   81.08%
    Req/Sec     1.82k    52.95     1.89k    90.10%
  18267 requests in 10.10s, 778.36MB read
Requests/sec:   1808.59
Transfer/sec:     77.06MB

UPDATE:
Of course, those larger request sizes also see a much larger improvement than 2x. With the old code, the 128-task case clocks in at 147 requests/s and the 1024-task case at 18 requests/s, at which point it's a 100x improvement. That gives you an idea of just how slow my Objective-S interpreter is.

Wednesday, June 30, 2021

Don't Generate Glue...Exterminate!!

Today I saw the news of GitHub's release of Copilot. As far as I can tell, it's an impressive piece of engineering that shouldn't exist. I mean, "Paste Code from Stack Overflow as a Service" was supposed to be a joke, not a product spec.

As of this writing, the first example given on the product page is some code to call a REST service that does sentiment analysis, which the AI helpfully completes. For reference, this is the code:


#!/usr/bin/env ts-node

import { fetch } from "fetch-h2";

// Determine whether the sentiment of text is positive
// Use a web service
async function isPositive(text: string): Promise<boolean> {
  const response = await fetch(`http://text-processing.com/api/sentiment/`, {
    method: "POST",
    body: `text=${text}`,
    headers: {
      "Content-Type": "application/x-www-form-urlencoded",
    },
  });
  const json = await response.json();
  return json.label === "pos";
}

Here is the same script in Objective-S:
#!env stsh
#-sentiment:text
((ref:http://text-processing.com/api/sentiment/ postForm:#{ #text: text }) at:'label') = 'pos'

And once you have it, reuse it. And keep those Daleks at bay.

Thursday, May 20, 2021

A far too simple (hardcoded) tasks backend in Objective-S

Recently there was a question as to what one should use to create a backend for an iOS/macOS app these days. I couldn't resist mentioning Objective-S, and just to check for myself whether that's feasible, I quickly jotted down the following tiny backend that returns a hardcoded list of tasks via HTTP:
#!env stsh
framework:ObjectiveHTTPD load.

class Task {
   var <bool> done.
   var title.
   -description { "Task: {this:title} done: {this:done}". }
}

taskList ← #( #Task{ #title: 'Clean my room', #done: false }, #Task{ #title: 'Check twitter feed', #done: true } ).

scheme todo {
   var taskList.
   /tasks { 
      |= { 
         this:taskList.
      }
   }
}.

todo := #todo{ #taskList: taskList }.
server := #MPWSchemeHttpServer{ #scheme: todo, #port: 8082 }.
server start.
shell runInteractiveLoop.

After loading the HTTP framework, we define a Task and a list of two example tasks. Then we define a scheme with a single path, just /tasks, which returns said tasks list. We then instantiate the scheme and serve it via HTTP on port 8082. Since this is a shell script and starting the server does not block, we finally start up the REPL.

Details such as coding the tasks as JSON, accessing a single task and modifying tasks are left as exercises for the reader.
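As a starting point for the single-task exercise, one might add a parameterized path to the scheme definition above; note that the /tasks/:index path-variable syntax is my assumption, not something verified against the actual grammar:

   /tasks/:index {
      |= {
         this:taskList at: index intValue.
      }
   }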

Saturday, December 14, 2019

The Four Stages of Objective-Smalltalk

One of the features that can be confusing about Objective-Smalltalk is that it actually has several parts that are each significant on their own. Discussions frequently focus on just one of these (which is fine!), but without realising that the other parts also exist, which is unfortunate, as they are all valuable and complement each other. In fact, they can be described as stages that are (logically) built on top of each other.

1. WebScript 2 / "Shasta"

Objective-C has always had great integration with other languages, particularly with a plethora of scripting languages, from Tcl to Python and Ruby to Lisp and Scheme and their variants, etc. This is due not just to the fact that the runtime is dynamic, but also that it is simple and C-based, not just in terms of being implemented in C, but in being a peer to C.

However, all of these suffer from having two somewhat disparate languages, with competing object models, runtimes, storage strategies, etc. One language that did not have these issues was WebScript, part of WebObjects and essentially Objective-C-Script. The language was interpreted, a true peer in which you could even implement categories on existing Objective-C classes, and so syntactically compatible that you could often just copy-paste code between the two. So it was close to the ideal scripting language for that environment.

However, the fact that Objective-C is already a hybrid with some ugly compromises means that these compromises often no longer make sense at all in the WebScript environment. For example, Objective-C strings need an added "@" character because plain double quotes are already taken by C strings, but there are no C strings in WebScript. Primitive types like int can be declared, but are really objects; the declaration is a dummy, a NOP. Square brackets for message sends are needed in Objective-C to distinguish messages from the rest of the C syntax, but that's also irrelevant in WebScript. And so on.

So the first stage of Objective-Smalltalk was/is to have all the good aspects of WebScript, but without the syntactic weirdness needed to match the syntactic weirdness of Objective-C that was needed because Objective-C was jammed into C. I am not the only one who figured out the obvious fact that such a language is, essentially, a variant of Smalltalk, and I do believe this pretty much matches what Brent Simmons called Shasta.

Implementation-wise, this works very similarly to WebScript in that everything in the language is an object and gets converted to/from primitives when sending or receiving messages as needed.

This is great for a much more interactive programming model than what we have/had (and the one we have seems to be deteriorating as we speak), and not just for isolated fragments, but for interacting with and tweaking full applications as they are running.

2. Objective-C without the C

Of course, getting rid of the (syntactic) weirdnesses of Objective-C in our scripting language means that it is no longer (syntactically) compatible with Objective-C. Which is a shame.

It is a shame because this syntactic equivalence between Objective-C and WebScript meant that you could easily move code between them. Have a script that has become stable and you want to reuse it? Copy and paste that code into an Objective-C file and you're good to go. Need it faster? Same. Have some Objective-C code that you want to explore, create variants of etc? Paste it into WebScript. Such a smooth integration between scripting and "programming" is rare and valuable.

The "obvious" solution is to have a native AOT-compiled version of this scripting language and use it to replace Objective-C. Many if not all other scripting languages have struggled mightily with becoming a compiled language, either not getting there at all or requiring JIT compilers of enormous size, complexity, engineering effort and attack surface.

Since the semantic model of our scripting language is just Objective-C, we know that we can AOT-compile this language with a fairly straightforward compiler, probably a lot simpler than even the C/Objective-C compilers currently in use, and one that plugs into the existing toolchain. Which is nice.

The idea seems so obvious, but apparently it wasn't.

Everything so far would, taken together, make for a really nice replacement for Objective-C, with a much more productive and, let's face it, fun developer experience. However, even given the advantages of a simpler language, smoothly integrated scripting/programming and instant builds, it's not really clear that yet another OO language is sufficient; for example, the Etoilé project and the eero language never went anywhere, despite both being very nice.

3. Beyond just Objects: Architecture Oriented Programming

Ever since my Diplomarbeit, Approaches to Composition and Refinement in Object-Oriented Design back in 1997, I've been interested in Software Architecture and Architecture Description Languages (ADLs) as a way of overcoming the problems we have when constructing larger pieces of software.

One thing I noticed very early is that the elements of an ADL closely match up with and generalise the elements of a programming language, for example an object-oriented language: object generalises to component, message to connector. So it seemed that any specific programming language is just a specialisation or instantiation of a more general "architecture language".

To explore this idea, I needed a language that was amenable to experimentation, by being both malleable enough as to allow a metasystem that can abstract away from objects and messages and simple/small enough to make experimentation feasible. A simple variant of Smalltalk would do the trick. More mature variants tend to push you towards building with what is there, rather than abstracting from it, they "...eat their young" (Alan Kay).

So Objective-Smalltalk fits the bill perfectly as a substrate for architecture-oriented programming. In fact, its being built on/with Objective-C, which came into being largely to connect the C/Unix world with the Smalltalk world, means it is already off to a good start.

What to build? How about not reinventing the wheel and simply picking the (arguably) 3 most successful/popular architectural styles:

  • OO (subsuming the other call/return styles)
  • Unix Pipes and Filters
  • REST
Again, surprisingly, at least to me, even these specific styles appear to align reasonably well with the elements we have in a programming language. OO is already well-developed in (Objective-)Smalltalk, dataflow maps to Smalltalk's assignment operator (which needed to be made polymorphic anyway), and REST at least partially maps to non-message identifiers, which are also not polymorphic in Smalltalk.

Having now built all of these abstractions into Objective-Smalltalk, I have to admit, again to my surprise, how well they work and work together. Yes, it was my thesis, and yes, I can now see confirmation bias everywhere, but it was also a bit of a long shot.

4. Architecture Oriented Metaprogramming

The architectural styles described above are implemented in frameworks, and their interfaces are hard-coded into the language implementation. However, with three examples, it should now be feasible to create linguistic support for defining architectural styles in the language itself, allowing users to define and refine their own architectural styles. This is ongoing work.

What now?

One of the key takeaways from this is that each stage is already quite useful, and probably a worthy project all by itself; it just gets Even Better™ with the addition of later stages. Another is that I need to get back to getting stage 2 ready, as it wasn't actually needed for stage 3, at least not initially.

Monday, November 11, 2019

What Alan Kay Got Wrong About Objects

One of the anonymous reviewers of my recently published Storage Combinators paper (pdf) complained that hiding disk-based, remote, and local abstractions behind a common interface was a bad idea, citing Jim Waldo's A Note on Distributed Computing.

Having read both this and the related 8 Fallacies of Distributed Computing a while back, I didn't see how this would apply, and re-reading confirmed my vague recollection: these are about the problems of scaling things up from the local case to the distributed case, whereas Storage Combinators and In-Process REST are about scaling things down from the distributed case to the local case. The Waldo paper in particular is also very specifically about objects and messages; REST is a different beast.

And of course, scaling things down happens to be a time-honored tradition with a pretty good track record:

In computer terms, Smalltalk is a recursion on the notion of computer itself. Instead of dividing "computer stuff" into things each less strong than the whole—like data structures, procedures, and functions which are the usual paraphernalia of programming languages—each Smalltalk object is a recursion on the entire possibilities of the computer. Thus its semantics are a bit like having thousands and thousands of computers all hooked together by a very fast network.
Mind you, I think this is absolutely brilliant: in order to get something that will scale up, you simply start with something large and then scale it down!

But of course, this is not what actually happened. As we all experienced, scaling local objects and messaging up to the distributed case did not work (CORBA, SOAP, ...), and as Waldo explains, cannot, in fact, work. What gives?

My guess is that the method described wasn't actually used: when Alan came up with his version of objects, there were no networks with thousands of computers. So Alan could not actually observe how they communicated; he had to imagine it. It was a Gedankenexperiment. And thus objects and messages were not a scaled-down version of an actual larger thing; they were a scaled-down version of an imagined larger thing.

Today, we do have a large network of computers, with not just thousands but billions of nodes. And they communicate via HTTP using the REST architectural style, not via distributed objects and messages.

So maybe if we took that communication model and scaled it down, we might be able to do even better than objects and messages, which already did pretty brilliantly. Hence In-Process REST, Polymorphic Identifiers and Storage Combinators, and yes, the results look pretty good so far!

The big idea is "messaging" -- that is what the kernal of Smalltalk/Squeak is all about (and it's something that was never quite completed in our Xerox PARC phase). The Japanese have a small word -- ma -- for "that which is in between" -- perhaps the nearest English equivalent is "interstitial". The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. Think of the internet -- to live, it (a) has to allow many different kinds of ideas and realizations that are beyond any single standard and (b) to allow varying degrees of safe interoperability between these ideas.

So of course Alan is right after all, just not about objects and messages, which are too specific: "ma", or "interstitialness" or "connector" is the big idea, messaging is just one incarnation of that idea.