tag:blogger.com,1999:blog-83973117663192152182024-03-11T04:24:18.053+01:00metablogMarcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.comBlogger176125tag:blogger.com,1999:blog-8397311766319215218.post-64614219688718768672024-01-03T10:29:00.000+01:002024-01-03T10:29:07.967+01:00DAPLs: Domain Agnostic Programming LanguagesEvery once in a while, you get an insight that hits you like a truck. Or maybe a ton of bricks.
Or a truck carrying a ton of bricks. Developing <a href="https://objective.st">Objective-S</a> has
delivered a bunch of these, but one of the biggest was that our <em>General Purpose Programming
Languages</em> are nothing of the sort. They are Domain Specific Languages for the domain
of algorithms. See also: ALGOL. To move forward, programming languages will have to support
more than just this one architectural style.<p>
Alas, communicating this insight has been...<em>challenging</em>. One method was branding
<a href="https://objective.st">Objective-S</a> as "the first general purpose programming language".
This did not always go over well.<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">This doesn't really sound much like a discussion...<br><br>I understand you're trying to make a fun rhetorical point with the framing, but I have to say it just makes me uninterested in any serious engagement.<br><br>Makes it seem like you don't care to even acknowledge existing PL work. =/</p>— Chandler Carruth (@chandlerc1024) <a href="https://twitter.com/chandlerc1024/status/1391653250529513473?ref_src=twsrc%5Etfw">May 10, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
Or
<figure>
<blockquote>
It looks interesting. But stuff like this: <br>
> Objective-S is the first general purpose programming language <br>
Those kind of statements annoy me to be honest. Is it really true? Or is it over-the-top marketing hype?
For me – and I'm sure I'm not the only person who feels this way – it creates a negative first impression.
</blockquote>
<figcaption>
<cite><a href="https://news.ycombinator.com/item?id=27100887">skissane on May 9, 2021 on: Objective-S: architecture-oriented language based ...</a></cite>
</figcaption>
</figure>
<p>
Of course, there is no such thing as bad publicity, but this had a bit too much of a lunatic-fringe
vibe, no matter how correct the insight, and no matter how ill-fitting the moniker "general purpose
language" really is for our call/return-oriented algorithm-DSLs.<p>
As Richard Feynman once put it, "One of the miseries of life is that everyone names everything a little bit wrong, and so it makes everything a little harder to understand in the world than it would be if it were named differently". Calling our algorithm-DSLs "general purpose" implies that we have solved the problem of
generality, when we have not, and that the only real alternative is to be more specific, hence DSLs.
But DSLs also don't really work that well, because the successful ones almost invariably grow
non-domain-specific features, just in a haphazard way. Or they need to be combined to cover different
fields, so we get language workbenches that allow us to define lots of little DSLs and combine them.<p>
This all points to the fact that our problem is not being too general, but too specific. Our algorithm-DSLs
just aren't very good at covering a lot of the problems programmers have to solve, though of course
they are Turing-complete and can get us there, somehow.<p>
Riffing off those ideas, and leaving aside the minefield of incorrect but entrenched terminology, I
propose the term <em>Domain Agnostic Programming Language</em>. Because any sufficiently powerful
DSL can be bent out of shape sufficiently for any purpose, just like our algorithm-DSLs can. They
just aren't a good fit. And so Objective-S is not the first general purpose language, it is the
first, and almost certainly the worst, DAPL. And hopefully its programming environment will be
dapper.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-38553818380833302932023-06-07T10:44:00.001+02:002023-06-08T23:54:07.022+02:00Mojo is a much better "Objective-C without the C" than Swift ever wasOne of the primary things that people don't understand about Objective-C is that it is a solution of
the <a href="https://thebottomline.as.ucsb.edu/2018/10/julia-a-solution-to-the-two-language-programming-problem">two language problem</a>, or more precisely a generalisation of the two language problem to
the scripted component pattern.<p>
The scripted component pattern itself is a (common) solution to the problem, first identified in the
70s, that <a href="https://dl.acm.org/doi/10.1145/390016.808431">programming-in-the-large is not the same as programming-in-the-small</a>: module implementation
languages are not necessarily suitable as module interconnection languages.<p>
And so we have all sorts of flexible connection languages, often interpreted (aka glue, scripting, and orchestration languages),
starting with the Unix shell, in addition to fast, compiled component languages such as C, C++ and
Rust, and a system will usually incorporate at least one of each kind.<p>
But then you run into the two language problem: you have to deal with these two distinct languages, with
how they integrate, and with the boundaries of the integration often not matching up very well with the
boundaries of the problem you're trying to solve.<p>
Objective-C <a href="https://blog.metaobject.com/2019/03/software-ics-binary-compatibility-and.html">solved</a> the two language problem by just jamming the two languages into one: Smalltalk for the
scripting/integration and C for the component language. Interoperability is smooth and at the statement
level, though there is some
friction due to overlaps caused by integrating two existing languages that were not designed to be
integrated.<p>
<a href="https://www.modular.com/mojo">Mojo</a> essentially uses the Objective-C approach of jamming the two languages into one. Except it
doesn't repeat Objective-C's mistake of using the component language as the base (which, inexplicably,
Swift didn't just repeat, but actually doubled down on by largely deprecating objects). The reason this
is a mistake is that it turns out that the connection language is actually the more general one, the
component language is a specialisation of the connection language.<p>
With this realisation, Mojo's approach of making the connection language the base language makes sense.
In addition, the fact that the component language is a specialisation also means that you don't
actually need to jam a full second language into your base; a few syntactic markers to indicate
the specialisations are sufficient.<p>
This is pretty much exactly stage 2 of <a href="https://blog.metaobject.com/2019/12/the-4-stages-of-objective-smalltalk.html">the 4 stages of Objective-S</a>, so I think they are using exactly the right approach for this. Except of course for the use of Python as the base instead of Smalltalk, which is a pragmatic
choice given what they are trying to accomplish, but means your connection language is unduly limited.<p>
<a href="https://objective.st/">Objective-S</a> has the same basic structure, but with a much more
capable connection language as the base.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-58632985145931448482023-05-08T12:19:00.001+02:002023-05-08T12:19:31.015+02:00Setting up Hetzner ARM instances with and for Objective-SThe recent <a href="https://www.hetzner.com/press-release/arm64-cloud/">introduction</a>
of reasonably-priced ARM64 VPS instances by Hetzner was accompanied by a big smile and
a sigh of relief on my part, as I had previously made the decision to prioritize ARM with
Objective-S (for example, the native compiler is currently ARM64-only), while the simple and
low-cost VPS providers like Digital Ocean were sticking to x86 exclusively.<p>
Although it is possible to operate in a mixed ARM/x86 environment, the added complexity
is not something I want as a default, which is why I also switched the hosting of the
<a href="http://objective.st/">Objective-S site</a> from DO to the Oracle cloud (on their
"free forever" tier), as it was the only way to host on ARM without incurring monthly
charges upwards of $40. With a number of alternatives now spanning the spectrum, I finally felt
ready to commit.<p>
I've long had a strong hunch that there is both room and a strong need for something
between the "we'll just hack together a few simple shell scripts" of the (very good!)
<a href="https://deploymentfromscratch.com/">Deployment from Scratch</a> and the
<a href="https://2020.programming-conference.org/home/salon-2020">aircraft carrier</a>
that is Kubernetes.<p>
With the external pieces finally in place, it's time to follow that hunch, and what
better way than to control the Hetzner server API using Objective-S?<p>
<h3>Talking to the API</h3>
Perusing the <a href="https://docs.hetzner.cloud/#overview">documentation</a>, we see
that the base URL for talking to the API is <code>https://api.hetzner.cloud/v1/</code>.
So let's set up an API scheme handler for talking to the Hetzner API, and also set
up the authentication header and indicate that we will be using JSON:<p>
<hr><code><pre>
scheme:https setHeaders: #{
    #Content-Type: 'application/json',
    #Authorization: "Bearer {keychain:password/hetzner-api/metaobject}",
}.
scheme:api := ref:https://api.hetzner.cloud/v1 asScheme.
</pre></code><hr>
It's not a lot of code, but there is quite a bit going on: first, the token is
stored in the macOS keychain, accessed via <code>keychain:password/hetzner-api/metaobject</code>.
This is interpolated into the Bearer string inside a dictionary literal. The <code>api:</code> scheme
is now available for talking to the Hetzner API, so for example <code>api:servers</code> will
be sent as <code>https://api.hetzner.cloud/v1/servers</code>. <p>
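As a rough illustration of what such a scheme handler does, here is a Python sketch (the class and method names are hypothetical, not part of Objective-S): it resolves relative references against the base URL while carrying the common headers along.

```python
from urllib.parse import urljoin

class SchemeHandler:
    """Toy stand-in for an Objective-S scheme handler: resolves
    relative references against a base URL and remembers the
    headers every request should carry."""

    def __init__(self, base_url, headers=None):
        # A trailing slash makes urljoin treat the base as a directory.
        self.base_url = base_url.rstrip('/') + '/'
        self.headers = dict(headers or {})

    def resolve(self, ref):
        # "api:servers" -> "https://api.hetzner.cloud/v1/servers"
        return urljoin(self.base_url, ref)

token = "..."  # the real script pulls this from the macOS keychain
api = SchemeHandler("https://api.hetzner.cloud/v1",
                    headers={"Content-Type": "application/json",
                             "Authorization": f"Bearer {token}"})
print(api.resolve("servers"))  # https://api.hetzner.cloud/v1/servers
```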
That setup now allows us to define a simple class that allows us to interact with the API:
<hr><code><pre>
class HetznerCloud {
    var api.
    -schemeNames { [ 'api' ]. }
    -images {
        api:images.
    }
    -types {
        api:server_types.
    }
}
</pre></code><hr>
It currently has two user-facing methods: <code>-images</code>, which lists the kinds of
images that are available, and <code>-types</code>, which lists the server types. The method
bodies may appear to be a little short, but that really is all that's needed. The <code>-schemeNames</code>
method makes the <code>api:</code> scheme handler available within method bodies of this class.<p>
Below is an excerpt of an interactive st-shell session first asking the API for image types and then for
server types:<p>
<hr><code><pre>
] cloud images
{ "images" = ( { "id" = 3;
"description" = "CentOS 7";
"created_from" = ;
"bound_to" = ;
"rapid_deploy" = true;
"deprecated" = ;
"os_flavor" = "centos";
"type" = "system";
"protection" = { "delete" = false;
} ;
"image_size" = ;
"labels" = { } ;
"deleted" = ;
"architecture" = "x86";
"created" = "2018-01-15T11:34:45+00:00";
"os_version" = "7";
"disk_size" = 5;
"status" = "available";
...
] cloud types
...
{ "memory" = 4;
"prices" = ( { "price_monthly" = { "net" = "3.2900000000";
"gross" = "3.9151000000000000";
} ;
...
} ;
} ) ;
"storage_type" = "local";
"id" = 45;
"cpu_type" = "shared";
"disk" = 40;
"deprecated" = ;
"architecture" = "arm";
"description" = "CAX11";
"name" = "cax11";
"cores" = 2;
}
...
</pre></code><hr>
The "CAX11" instance type is the entry-level ARM64 instance that we want to use.
<h3>Creating a server</h3>
Creating a VPS is accomplished by POSTing a dictionary describing the desired
properties of the server to the <code>servers</code> endpoint:
<hr><code><pre>
extension HetznerCloud {
    -baseDefinition {
        #{
            #location: 'fsn1',
            #public_net: #{
                #enable_ipv4: true,
                #enable_ipv6: false,
            }
        }.
    }
    -armServerDefinition {
        #{
            #name: 'objst-2',
            #image: '103908070',
            #ssh_keys: ['marcel@naraht.local' ],
            #server_type: 'cax11',
        } , self baseDefinition.
    }
    -create {
        ref:api:servers post: self armServerDefinition asJSON.
    }
}
</pre></code><hr>
The <code>-create</code> method sends the <code>post:</code> message directly
to the reference of the endpoint. <p>
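For readers more at home in mainstream languages: the <code>,</code> that combines the literal with <code>self baseDefinition</code> is essentially a dictionary merge. A rough Python equivalent (the field values are taken from the post, everything else is illustrative):

```python
import json

base_definition = {
    "location": "fsn1",
    "public_net": {"enable_ipv4": True, "enable_ipv6": False},
}

arm_server_definition = {
    "name": "objst-2",
    "image": "103908070",
    "ssh_keys": ["marcel@naraht.local"],
    "server_type": "cax11",
    **base_definition,  # plays the role of ", self baseDefinition"
}

# -create would POST this JSON body to the servers endpoint
body = json.dumps(arm_server_definition)
print(body)
```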
<h3>Interacting with servers</h3>
Once we have a server, we probably want to interact with it in some way,
at the very least to be able to delete it again. Although we could do
this using methods of the cloud API taking an extra <code>server_id</code>
parameter, it is nicer to create a separate server abstraction that
lets us interact with the server and encapsulates the necessary information.<p>
The <code>HetznerHost</code> is initialized with a server response, from which it
uses the IP address and the server id, the latter to define a <code>server:</code>
scheme handler. The fact that it's a subclass of <code>MPWRemoteHost</code> will
become relevant later.<p>
<hr><code><pre>
class HetznerHost : MPWRemoteHost {
    var hostDict.
    var id.
    var server.
    +withDictionary:theServer {
        self alloc initWithDictionary:theServer.
    }
    -initWithDictionary:theServer {
        self := super initWithName:(theServer at:'public_net' | at:'ipv4' | at:'ip') user:'root'.
        self setHostDict:theServer.
        self setId: theServer['id'].
        self setServer: ref:api:/servers/{this:id} asScheme.
        self.
    }
    -schemeNames { ['server']. }
    -status { this:hostDict at:'status'. }
    -delete {
        ref:server:/ delete.
    }
}
</pre></code><hr>
The DELETE is handled similarly to the POST above, by sending a <code>delete</code> message to
the root reference of the <code>server:</code> scheme.<p>
We get server instances with a GET from the API's <code>servers</code> endpoint, the same
one we POSTed to create the server. The <code>collect</code> HOM makes it straightforward
to map from the dictionaries returned by the API to actual server objects:<p>
<hr><code><pre>
extension HetznerCloud {
    -servers {
        HetznerHost collect withDictionary: (api:servers at:'servers') each.
    }
}
</pre></code><hr>
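The <code>collect</code> HOM plays the role of a map; as a sketch, assuming a minimal host class, the same shape in Python would be a list comprehension over the returned dictionaries:

```python
class Host:
    """Hypothetical stand-in for HetznerHost: keeps the raw server
    dictionary and extracts the public IP and the server id."""
    def __init__(self, server_dict):
        self.host_dict = server_dict
        self.ip = server_dict["public_net"]["ipv4"]["ip"]
        self.id = server_dict["id"]

def servers(api_response):
    # "HetznerHost collect withDictionary: (api:servers at:'servers') each"
    return [Host(d) for d in api_response["servers"]]

# Shape of an (abridged) response from GET /servers:
response = {"servers": [{"id": 42,
                         "public_net": {"ipv4": {"ip": "203.0.113.7"}}}]}
hosts = servers(response)
```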
At this point, you're probably thinking that having a class representing servers,
with its own scheme-handler to boot, is a bit of overkill if all we are going to
do is send a DELETE. And you'd be right, so here are some of the other capabilities:
<hr><code><pre>
extension HetznerHost {
    -actions { api:servers/{this:id}/actions value. }
    -liveStatus { server:status. }
    -<void>refresh {
        self setHostDict: (server:/ value at:'server').
    }
    -shutdown {
        ref:server:actions/shutdown post:#{}.
    }
    -start {
        ref:server:actions/poweron post:#{}.
    }
    -reinstall:osName {
        ref:server:actions/rebuild post: #{ #image: osName }.
    }
    -reinstall {
        self reinstall:'ubuntu-20.04'.
    }
}
</pre></code><hr>
With this, we have complete lifecycle control over the server, with a surprisingly
small amount of surprisingly straightforward code, thanks to Objective-S abstractions
such as Polymorphic Identifiers, Storage Combinators and Higher Order Messaging.<p>
What's more, this control is available both immediately in script form, as well
as for reuse in other applications as objects.<p>
<h3>Installing Objective-S</h3>
Now that we can create, start, stop and destroy virtual servers, it would be nice
to actually do something with them. For example: run Objective-S and Objective-S-based
web-servers.<p>
This is where the <code>MPWRemoteHost</code> comes in. This is what it says on the tin:
a representation of a remote host, very rudimentary for now. One of the few things it
knows how to do is set up an ssh connection to that remote host to execute commands
and transfer files via SFTP. The latter is surfaced as a store, so you can create
files on a remote host as easily as assigning to a local variable:
<hr><code><pre>
dest:hello.txt := 'Hello world!'.
</pre></code><hr>
Copying files is similar:
<hr><code><pre>
dest:hello.txt := file:hello.txt.
</pre></code><hr>
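Under the hood this is nothing more than "write these bytes at that path"; the following Python sketch shows the same pattern against the local filesystem (a real remote store would perform the writes over SFTP, for instance with a library such as paramiko):

```python
import pathlib, tempfile

dest = pathlib.Path(tempfile.mkdtemp())  # stand-in for the remote store

# dest:hello.txt := 'Hello world!'
(dest / "hello.txt").write_text("Hello world!")

# dest:copy.txt := file:hello.txt  -- copying is a read followed by a write
(dest / "copy.txt").write_bytes((dest / "hello.txt").read_bytes())
```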
The script copies a tar archive containing both GNUstep and the Objective-S libraries,
which it then untars into the <code>'/usr'</code> directory of the target machine. In
addition, it transfers the interactive Objective-S shell <code>st</code>, the
<code>runsite</code> command that serves ".sited" bundles via HTTP, and a <code>.bashrc</code>
that sets up some needed environment variables.
<hr><code><pre>
extension MPWHost {
    -installObjS {
        scheme:dest := self store.
        filenames := [ 'ObjS-GNUstep-installed.tgz', 'st', '.bashrc', 'runsite' ].
        filenames do: { :filename |
            dest:{filename} := file:{filename}.
        }.
        self run:'chmod a+x st runsite';
            run:'cd /usr ; tar zxf ~/ObjS-GNUstep-installed.tgz';
            run:'mv st /usr/local/bin';
            run:'mv runsite /usr/local/bin'.
    }
}
host := MPWHost host:hostip user:'root'.
host installObjS.
</pre></code><hr>
As this is an extension to <code>MPWHost</code>, which is the superclass
of the <code>MPWRemoteHost</code> we used as the base for our <code>HetznerHost</code>,
the server objects we use have the ability to install Objective-S on them. Neat.<p>
And so do the server objects for the very similar script controlling DO droplets.<p>
<h3>Conclusion</h3>
When I started out on this little excursion, my goal was not to demonstrate anything
about Objective-S; I only needed to be able to use these cloud systems, and my hunch
was that Objective-S would be good for the task.<p>
It turned out even better than my hunch had suggested: the various features and
characteristics of Objective-S, such as Polymorphic Identifiers, first class
references, nested scheme handlers, and Higher Order Messaging, really work together
quite seamlessly to allow interaction with both a REST API and with a remote host to
be expressed compactly and naturally. In addition, it manages to naturally bridge
the gap between ad-hoc scripting and proper modelling, remaining hackable without
creating a mess.<p>
It's working...<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-45008655595830497492023-01-13T19:02:00.001+01:002023-01-13T19:02:24.292+01:00Setting the Bozo Bit on AppleThe other day I was fighting once again with Apple Music. Not the service, the app. What I wanted
to do was simple: I have some practice recordings for my choir and voice lessons that I want on
my iPhone and Apple Watch. How hard could it be?<p>
Apple: hold my beer.<p>
These are sent via WhatsApp, so the audio recordings are mp4 files, which for some bizarre reason
won't open in Music and instead open in QuickTime Player, despite definitely being audio files.<p>
OK, not a biggie, so export to m4a from QT Player. Transfer to the machine that has my audio library. Create
a new playlist, transfer some previous songs over, then try to drop the new m4a's onto the
open playlist. No go. Play around for a while, figure out that the entity that accepts the
drops is the TableView, not the surrounding view. So you can't drop the new files into the
empty space below the songs, you have to drop them onto the existing songs.<p>
Who programmed this? Who didn't pay attention to this when doing QA? Who approved it for
release? iTunes used to be if not the, then certainly a flagship app for Apple.<p>
OK, plug in the iPhone, as for some reason wireless transfers don't seem to be overly reliable.<p>
No Finder, I don't want to back...too late. Ok, do your backup. Waiting. Spinner. Waiting. Repeat.
After a while it says it's finished. Unplug and ... the songs are not there.<p>
I quit Music.app, relaunch it, and lo-and-behold, the songs are now no longer in the playlist in
Music.app either. Re-add them, carefully aiming for the table, sync again (hey, it remembered we
just did a backup and doesn't try again, kudos!), and now they show up.<p>
Whew! Only took 15 minutes or so, the last time I was futzing with it for over an hour and
the songs never synced. Or one did and two did not, which is obviously Much Better.<p>
How can such basic functionality be this incredibly broken? And of course this is just one
tiny example; there are legions of others, as many people have reported.<p>
With this, I noticed that I hadn't actually expected better. I knew it <em>should</em>
be better but I hadn't expected Apple to actually make it work.<p>
In other words, I had set the <a href="https://en.wikipedia.org/wiki/Bozo_bit">Bozo Bit</a> on
Apple. By default, when Apple does something new these days, I fully and quietly expect it
to be broken. And I am surprised when they actually get something right, like Apple Silicon.
And it wasn't an angry reaction to anything; in fact, it wasn't even much
of a conscious decision, more a gradual erosion of expectations.<p>
It Just Doesn't Work™.<p>
And that's sad.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com3tag:blogger.com,1999:blog-8397311766319215218.post-7712731084219784052022-08-09T19:25:00.001+02:002022-08-11T03:55:16.693+02:00Native-GUI distributed system in a tweetIf I've been a bit quiet recently, it's not due to lack of progress, but rather the very opposite: so much progress in <a href="http://objective.st">Objective-S</a> land that my head is spinning and I am having a
hard time both processing it all and seeing where it goes.<p>
<p>
But sometimes you need to pause, reflect, and show your work, in whatever intermediate state
it currently is. So without further ado, here is the distributed system, with GUI, in a tweet:<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">#!env stui<br>scheme:s3 ← ref:http://defiant.local:2345/ asScheme<br>text ← #NSTextField{ #stringValue:'',#frame:(10@45 extent:180@24) }.<br>window ← #NSWindow{ #frame:(300@300 extent:200@105),#title:'S3', #views:#[text]}.<br>text → ref:s3:bucket1/msg.txt.<br>app runFromCLI:window.</p>— Marcel Weiher 🇪🇺 (@mpweiher) <a href="https://twitter.com/mpweiher/status/1556975759524249600?ref_src=twsrc%5Etfw">August 9, 2022</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>
It pops up a window with a text field, and stores whatever the user enters in an S3 bucket.
It continues to do this until the user closes the window, at which point the program exits.<p>
Of course, it's not <em>much</em> of a distributed system, particularly because it doesn't
actually include the code for the S3 simulator.<p>
Anyway, despite fitting in a tweet, the Objective-S script is actually not <a href="https://en.wikipedia.org/wiki/Code_golf">code golf</a>, although it may appear as such to someone not familiar with Objective-S.<p>
Instead, it is a straightforward definition and composition of the elements required:<p>
<ol>
<li>A <a href="https://2019.splashcon.org/details/splash-2019-Onward-papers/7/Storage-Combinators">storage combinator</a> for interacting with data in S3.
</li>
<li>A text field inside a window, defined as object literals.
</li>
<li>A connection between the text field and a specific S3 bucket.
</li>
</ol>
That's it, and it is no coincidence that the structure of the system maps directly onto the structure
of the code. Let's look at the parts in detail.
<h3>S3 via Storage Combinator</h3>
The first line of the script sets up an S3 scheme handler so we can interact with the S3 buckets
almost as if they were local variables. For example, the following assignment statement stores
the text 'Hello World!' in the "msg.txt" file of "bucket1":<p>
<code><pre> s3:bucket1/msg.txt ← 'Hello World!'</pre></code>
Retrieving it works similarly:<p>
<code><pre> stdout println: s3:bucket1/msg.txt</pre></code>
The URL of our S3 simulator is <code>http://defiant.local:2345/</code>, meaning it runs on host <code>defiant</code> in the local network (addressed via Bonjour) and listens on port 2345. As Objective-S supports <a href="https://dl.acm.org/doi/10.1145/2508168.2508169?cid=81316491227">Polymorphic Identifiers</a> (<a href="https://www.hirschfeld.org/writings/media/WeiherHirschfeld_2013_PolymorphicIdentifiersUniformResourceAccessInObjectiveSmalltalk_AcmDL.pdf">pdf</a>),
this URL is a directly evaluable identifier in the language.
Alas, that directness poses a problem, because writing down an identifier in most programming
languages yields the
value of the variable the identifier identifies, and Objective-S is no exception. In the case of
<code>http://defiant.local:2345/</code>, that value is the directory listing of the root of the S3 server, encoded as the following XML response:<p>
<code>
<pre>
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Owner><ID>123</ID><DisplayName>FakeS3</DisplayName></Owner>
<Buckets>
<Bucket>
<Name>bucket1</Name>
<CreationDate>2022-08-10T15:18:32.000Z</CreationDate>
</Bucket>
</Buckets>
</ListAllMyBucketsResult>
</pre>
</code>
That's not really what we want; we want to refer to the URL itself. The <code>ref:</code> prefix allows
us to do this by preventing evaluation and thus returning the reference itself, very similar to
the <code>&</code> operator that creates pointers in C.<p>
Except that an Objective-S reference (or more precisely, a <em>binding</em>) is much richer than
a C pointer. One of its many capabilities is that it can be turned into a store by sending it
the <code>-asScheme</code> message. This new store uses the reference it was created from as its
base URL; all the references it receives are evaluated relative to this base reference.<p>
The upshot is that with the <code>s3:</code> scheme handler defined and installed as described, the
expression <code>s3:bucket1/msg.txt</code> evaluates to
<code>http://defiant.local:2345/bucket1/msg.txt</code>.<p>
This way of defining shorthands has proven extremely useful for making complex references usable
and modular, and is an extremely common pattern in Objective-S code.<p>
<h3>Declarative GUI with object literals</h3>
Next, we need to define the GUI: a window with a text field. With object literals, this
is pretty trivial. Object literals are similar to dictionary literals, except that you
get to define the class of the instance defined by the key/value pairs, instead of it
always being a dictionary.<p>
For example, the following literal defines a text field with certain dimensions and assigns it
to the <code>text</code> local variable:
<code><pre> text ← #NSTextField{ #stringValue:'',#frame:(10@45 extent:180@24) }.</pre></code>
And a window that contains the text field we just defined:
<code><pre> window ← #NSWindow{ #frame:(300@300 extent:200@105),#title:'S3', #views:#[text]}.</pre></code>
It would have been nice to define the text field inline in its window definition, but we currently
still need a variable so we can connect the text field (see next section).
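For comparison, the same mechanism can be approximated in a language without object literals by instantiating the class and then setting the key/value pairs as properties; a Python sketch (the helper and the stand-in class are hypothetical):

```python
def object_literal(cls, **properties):
    """Rough analogue of #NSTextField{ #stringValue:'', ... }:
    instantiate the class, then set each key/value pair."""
    obj = cls()
    for key, value in properties.items():
        setattr(obj, key, value)
    return obj

class TextField:  # hypothetical stand-in for NSTextField
    pass

text = object_literal(TextField, stringValue="",
                      frame=((10, 45), (180, 24)))
```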
<h3>Connecting components</h3>
Now that we have a text field (in a window) and somewhere to store the data, we need to connect
these two components. Typically, this would involve defining some procedure(s), callback(s) or
some extra-linguistic mechanism to mediate or define that connection. In Objective-S,
we just connect the components:<p>
<code><pre> text → ref:s3:bucket1/msg.txt.</pre></code>
That's it.<p>
The right-arrow "→" is a polymorphic connection "operator". The complete connection is
actually significantly more complex:
<ol>
<li>From a port of the source component</li>
<li>To a role of the mediating connector compatible with that source port</li>
<li>To a role of the mediating connector compatible with the target object's port</li>
<li>To that compatible port of the target component</li>
</ol>
If you want, you can actually specify all these intermediate steps, but most of the time
you don't have to, as the machinery can figure out what ports and roles are compatible.
In this case, even the actual connector was determined automatically.<p>
If we didn't want a remote S3 bucket, we could also have stored the data in a local file,
for example:<p>
<code><pre> text → ref:file:/tmp/msg.txt.</pre></code>
That treats the file like a variable, replacing the entire contents of the file with
the text that was entered. Speaking of variables, we could of course also store the
text in a local variable:<p>
<code><pre> text → ref:var:message.</pre></code>
In our simple example that doesn't make a lot of sense because the variable
isn't visible anywhere and will disappear once the script terminates, but
in a larger application it could then trigger further processing.<p>
Alternatively, we could also append the individual messages to a stream, for
example to <code>stdout</code>:
<code><pre> text → stdout.</pre></code>
So every time the user hits return in the text field, the content of the text
field is written to the console. Or appended to a file, by connecting to the
stream associated with the file rather than the file reference itself:
<code><pre> text → ref:file:/tmp/msg.txt outputStream.</pre></code>
This doesn't have to be a single stream sink, it can be a complex
processing pipeline.<p>
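To make the idea of ports a little more concrete, here is a deliberately tiny Python sketch of such a connection, assuming a push-style protocol in which the source forwards whatever it receives to the connected sink (all names are hypothetical, not Objective-S API):

```python
class TextSource:
    """Stand-in for the text field: whatever the user 'enters'
    is pushed to the connected sink."""
    def __init__(self):
        self.sink = None

    def connect(self, sink):  # plays the role of the "→" operator
        self.sink = sink
        return sink

    def user_entered(self, value):
        if self.sink is not None:
            self.sink.write(value)

class ListSink:
    """Stand-in for a file reference, a stream, or a variable binding."""
    def __init__(self):
        self.received = []

    def write(self, value):
        self.received.append(value)

text, sink = TextSource(), ListSink()
text.connect(sink)               # text → sink
text.user_entered("Hello world!")
```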
I hope this makes it clear, or at least strongly hints, that this
is not the usual low-code/no-code trick of achieving compact code
by creating super-specialised components and mechanisms that work
well for a specific application, but immediately break down when
pushed beyond the demo.<p>
What it is instead is a new way of creating components, defining
their interfaces and then <a href="https://blog.metaobject.com/2019/02/why-architecture-oriented-programming.html">gluing</a> them together in a very straightforward
fashion.<p>
<h3>Eval/apply vs. connect and run</h3>
Having constructed our system by configuring and connecting components, what's
left is running it. <code>CLIApp</code> is a subclass of <code>NSApplication</code>
that knows how to run without an associated app wrapper or <code>Info.plist</code>
file. It is actually instantiated by the <code>stui</code> script runner before
the script is started, with the instance dropped into the <code>app</code> variable
for the script.<p>
This is where we leave our brave new world of connected components and
return (or connect with) the call/return world, similar to the way Cocoa's auto-generated
<code>main</code> with call to <code>NSApplicationMain()</code> works.<p>
The difference between eval/apply (call/return) and connect/run is actually quite profound, but
more on that in another post.<p>
Of course, we didn't leave call/return behind, it is still present and useful for certain
tasks, such as transforming an element into something slightly different. However, for
constructing systems, having components that can be defined, configured and connected
directly ("declaratively") is far superior to doing so procedurally, even with the
fluent APIs that have recently popped up and that have been mislabeled as "declarative".<p>
This project is turning out even better than I expected. I am stoked.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-29057930088284843872022-06-20T23:15:00.001+02:002022-11-12T10:51:45.373+01:00Blackbird: A reference architecture for local-first connected mobile appsWow, what a mouthful! Although this architecture has featured in a number of my other writings,
I haven't really described it in detail by itself. Which is a shame, because I think it
works really well and is quite simple, a case of <a href="https://blog.metaobject.com/2014/04/sophisticated-simplicity.html">Sophisticated Simplicity</a>.<p>
<h3>Why a reference architecture?</h3>
The motivation for creating and now presenting this reference architecture is that the way we
build connected mobile apps is broken, and none of the proposed solutions appear to help.
How are they broken? They are
overly complex, require way too much code, perform poorly and are unreliable.<p>
Very broadly speaking, these problems can be traced to the misuse of procedural abstraction for
a problem-space that is broadly state-based, and can be solved by adapting a state-based
architectural style such as in-process REST and combining it with well-known styles such
as MVC.<p>
More specifically, MVC has been misapplied by combining UI updates with the model updates, a
practice that becomes especially egregious with asynchronous call-backs. In addition, data
is pushed to the UI, rather than having the UI pull data when and as needed.
Asynchronous code is modelled using call/return and call-backs, leading to call-back hell
and to the needless and arduous transformation of any dependent code into asynchronous code (see "what
color is your function"), which is also much harder to read and so discourages appropriate
abstractions.<p>
Backend communication is also an issue, with newer async/await implementations not really
being much of an improvement over callback-based ones, and arguably worse in
terms of actual readability. (They seem readable, but what actually happens is different
enough that the simplicity is deceptive.)
<h3>Overview</h3>
The overall architecture has four fundamental components:
<ol>
<li>The model</li>
<li>The UI</li>
<li>The backend</li>
<li>The persistence</li>
</ol>
The main objective of the architecture is to keep these components in sync with each other, so the whole
thing somewhat resembles a control loop architecture: something disturbs the system, for example
the user did something in the UI, and the system responds by re-establishing equilibrium.<p>
The model is the central component: it connects/coordinates all the pieces and is also the only one directly
connected to more than one piece. In keeping with hexagonal architecture, the model is also supposed to
be the only place with significant logic; the remainder of the system should be as minimal, transparent
and dumb as possible.<p>
<pre>
memory-model := persistence.
persistence |= memory-model.
ui =|= memory-model.
backend =|= memory-model.
</pre>
Graphically:<br>
<img height=250 src="https://www.dropbox.com/s/lsxucl6m5rn7q9c/overall-arch.png?raw=1">
<h3>Elements</h3>
Blackbird depends crucially on a number of architectural elements: first are <em>stores</em>
of the in-process REST architectural style. These can be thought of as in-process HTTP servers
(without the HTTP, of course) or composable dictionaries. The core store protocol implements
the GET, PUT and DELETE verbs as messages.<p>
The role of URLs in REST is taken by Polymorphic Identifiers. These are objects that
reference and identify values in the store, but are not direct pointers. For example, they
need to be able to reference objects that aren't there yet.<p>
Polymorphic Identifiers can be application-specific; for example, they might consist
of just a numeric id.<p>
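A minimal sketch, in Python with invented names, of what such a store plus polymorphic identifiers might look like (this illustrates the idea only; it is not the Objective-S/MPWFoundation API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskPI:
    """Polymorphic identifier: names a task without pointing at it."""
    task_id: int

class DictStore:
    """Composable dictionary speaking the GET/PUT/DELETE store verbs."""
    def __init__(self):
        self._contents = {}
    def get(self, pi):
        return self._contents.get(pi)   # may be None: not there *yet*
    def put(self, pi, value):
        self._contents[pi] = value
    def delete(self, pi):
        self._contents.pop(pi, None)

store = DictStore()
pi = TaskPI(1)
assert store.get(pi) is None            # referencing before it exists is fine
store.put(pi, {"title": "Clean Room", "done": False})
```

The point is that <code>TaskPI(1)</code> is a value that names a task: it can be created, queued and compared before the task itself exists in any store.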
<h3>MVC</h3>
For me, the key part of the MVC architectural style is the decoupling of input processing
and resultant output processing. That is, under MVC, the view (or a controller) makes
some change to the model and then processing stops. At some undefined later time
(it could be synchronous, but does not have to be) the model informs the UI that it
has changed, using some kind of notification mechanism.<p>
In Smalltalk MVC, this is a
dependents list maintained in the model that interested views register with. All
these views are then sent a <code>#changed</code> message when the model has changed.
In Cocoa, this can be accomplished using <code>NSNotificationCenter</code>, but really
any kind of broadcast mechanism will do.<p>
It is then the views' responsibility to update themselves by interrogating the model.<p>
For views, Cocoa largely automates this: on receipt of the notification, the view just
needs to invalidate itself; the system then automatically schedules it for redrawing the
next time through the event loop.<p>
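A minimal Python sketch of this dependents mechanism (names invented; as in Smalltalk's <code>#changed</code>, the notification carries no payload, so the view pulls from the model):

```python
class Model:
    """Holds state and a Smalltalk-style dependents list."""
    def __init__(self):
        self._dependents = []
        self.value = 0
    def add_dependent(self, view):
        self._dependents.append(view)
    def set_value(self, v):
        self.value = v
        self.changed()          # notify; processing of the input stops here
    def changed(self):
        for d in self._dependents:
            d.update()          # broadcast "I changed", nothing more

class View:
    def __init__(self, model):
        self.model = model
        self.displayed = None
        model.add_dependent(self)
    def update(self):
        # pull: interrogate the model; no data was pushed at us
        self.displayed = self.model.value
```

Because the view only ever reacts to "changed", it cannot tell whether the change came from user input, a backend response, or a push event.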
The reason this decoupling is important to maintain is that the update
notification can arrive for any reason, including a different user interaction,
a backend request completing, or even some sort of notification or push event
coming in remotely.<p>
With the decoupled model-view update mechanism, all these different
kinds of events are handled identically, and thus the UI only ever needs to deal with
the local model. The UI is therefore almost entirely decoupled from network
communications; we thus have a local-first application that is also largely
testable locally.<p>
Blackbird refines the MVC view update mechanism by adding the polymorphic identifier
of the modified item in question and placing those PIs in a queue. The queue
decouples model and view even more than in the basic MVC model; for example, it
becomes fairly trivial to make the queue writable from any thread, but empty only
onto the main thread for view updates. In addition, providing update notifications
is no longer synchronous: the updater just writes an entry into the queue and can
then continue, without waiting for the UI to finish its update.<p>
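As an illustrative Python sketch (names invented), such a coalescing queue of polymorphic identifiers might look like the following; a production version would add locking and dispatch the drain onto the main thread:

```python
from collections import deque

class UpdateQueue:
    """Coalescing queue of polymorphic identifiers (illustrative sketch)."""
    def __init__(self):
        self._queue = deque()
    def enqueue(self, pi):
        # duplicate update of the same element: drop it, the view will
        # pull the latest state from the store when it gets around to it
        if pi in self._queue:
            return
        self._queue.append(pi)
    def drain(self, handler):
        # called from the UI side, at the UI's own pace
        while self._queue:
            handler(self._queue.popleft())
```

The updater returns as soon as <code>enqueue</code> does; how fast the model produces updates is decoupled from how fast the UI consumes them.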
Decoupling via a queue in this way is almost sufficient for making sure that
high-speed model updates don't overwhelm the UI or slow down the model. Both
of these performance problems are fairly rampant. As an example of the first,
the Microsoft Office installer saturates both CPUs on a dual-core machine
just painting its progress bar, because it massively overdraws.<p>
An example of the second was one of the real performance puzzlers of my
career: an installer that was extremely slow, despite both CPU and disk
being mostly idle. The problem turned out to be that the developers of
that installer not only insisted on displaying every single file name
the installer was writing (bad enough), but also flushing the window to
screen to make sure the user got a chance to see it (worse). This then
interacted with a behavior of Apple's CoreGraphics, which disallows
screen flushes at a rate greater than the screen refresh rate, and will
simply throttle such requests. You really want to decouple your UI
from your model updates and let the UI process updates at its pace.<p>
Having polymorphic identifiers in the queue makes it possible for the UI
to catch up on its own terms, and also to remove updates that are no longer
relevant, for example discarding duplicate updates of the same element.<p>
The polymorphic identifier can also be used by views in order to determine
whether they need to update themselves, by matching against the polymorphic
identifier they are currently handling.<p>
<h3>Backend communication</h3>
Almost all the REST backend communication code I have seen in mobile applications has
created "convenient" cover methods for every operation of every endpoint
accessed by the application, possibly automatically generated.<p>
This ignores the fact that REST only has a few verbs, combined with a great number
of identifiers (URLs). In Blackbird, there is a single channel for backend communication:
a queue that takes a polymorphic identifier and an HTTP verb. The polymorphic
identifier is translated to a URL of the target backend system, the resulting request
is executed, and when the result returns it is placed in the central store under the provided
polymorphic identifier.<p>
After the item has been stored, an MVC notification with the polymorphic identifier in
question is enqueued as per above.<p>
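A hedged Python sketch of that single channel (the URL scheme, all names, and the injectable transport function are invented for illustration):

```python
class BackendChannel:
    """Single channel: (verb, PI) pairs instead of per-endpoint cover methods."""
    def __init__(self, base_url, store, notifications, transport):
        self.base_url = base_url
        self.store = store                  # central in-process store
        self.notifications = notifications  # MVC update queue of PIs
        self.transport = transport          # does the actual HTTP; injectable
    def url_for(self, pi):
        # PI -> URL translation for the target backend system
        return f"{self.base_url}/tasks/{pi}"
    def handle(self, verb, pi):
        result = self.transport(verb, self.url_for(pi))  # execute the request
        self.store[pi] = result             # result goes into the store...
        self.notifications.append(pi)       # ...then the MVC notification

def fake_transport(verb, url):              # stands in for the network
    return {"fetched-from": url, "verb": verb}

store, queue = {}, []
channel = BackendChannel("https://example.com", store, queue, fake_transport)
channel.handle("GET", 7)
```

Because the transport is injectable, the whole channel is testable without a network, and every in-flight request is a visible, reified value rather than a frame on some call stack.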
The queue for backend operations is essentially the same one we described for model-view
communication above, for example also with the ability to deduplicate requests correctly
so only the final version of an object gets sent if there are multiple updates. The remainder
of the processing is performed in pipes-and-filters architectural style using polymorphic
write streams.<p>
If the backend needs to communicate with the client, it can send URLs via a socket or
other mechanism that tells the client to pull that data via its normal request channels,
implementing the same pull-constraint as in the rest of the system.<p>
One aspect of this part of the architecture is that backend requests are reified and
explicit, rather than implicitly encoded on the call-stack and its potentially
asynchronous continuations. This means it is straightforward for the UI to give the
user appropriate feedback for communication failures on the slow or disrupted network
connections that are the norm on mobile networks, as well as avoid accidental duplicate
requests.<p>
Despite this extra visibility and introspection, the code required to implement backend
communications is drastically reduced. Last but not least, the code is isolated: network code
can operate independently of the UI just as well as the UI can operate
independently of the network code.<p>
<h3>Persistence</h3>
Persistence is handled by stacked stores (storage combinators). <p>
<img height=250 src="https://www.dropbox.com/s/8go76u12du9e5if/disk-cache-json-aligned.png?raw=1">
<p>
The application is hooked up to the top of the storage stack, the CachingStore, which looks
to the application exactly like the DictStore (an in-memory store). If a read request cannot
be found in the cache, the data is instead read from disk, converted from JSON by a mapping
store.<p>
For testing the rest of the app (rather than the storage stack), it is perfectly fine to
just use the in-memory store instead of the disk store, as it has the same interface and
behaves the same, except being faster and non-persistent.<p>
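A minimal Python sketch of the stacking idea, assuming a common get/put store protocol (names are invented; the real stack also includes the JSON mapping store):

```python
class DictStore:
    """In-memory store; also stands in for the disk store in tests."""
    def __init__(self, contents=None):
        self._contents = dict(contents or {})
    def get(self, pi):
        return self._contents.get(pi)
    def put(self, pi, value):
        self._contents[pi] = value

class CachingStore:
    """Looks exactly like the store it fronts; the app can't tell the depth."""
    def __init__(self, cache, source):
        self.cache, self.source = cache, source
    def get(self, pi):
        value = self.cache.get(pi)
        if value is None:
            value = self.source.get(pi)    # miss: fall through to the stack below
            if value is not None:
                self.cache.put(pi, value)  # populate cache on the way back up
        return value
    def put(self, pi, value):
        self.cache.put(pi, value)
        self.source.put(pi, value)         # write through toward persistence

disk = DictStore({"task:1": {"title": "Clean Room"}})  # stands in for disk+JSON
top = CachingStore(DictStore(), disk)
```

Swapping the bottom of the stack for an in-memory store, as the text suggests for testing, changes nothing above it.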
Writes use the same asynchronous queues as the rest of the system, with the writer getting
the polymorphic identifiers of objects to write and then retrieving the relevant object(s)
from the in-memory store before persisting. Since they use the same mechanism, they also
benefit from the same uniquing properties, so when the I/O subsystem gets overloaded it
will adapt by dropping redundant writes.<p>
<img height=400 src="https://www.dropbox.com/s/h2kq2joy20lmy7f/async-writer.png?raw=1">
<p>
<h3>Consequences</h3>
With the Blackbird reference architecture, we not only replace complex, bulky code with much
less and much simpler code, we also get to reuse that same code in all parts of the system
while making the pieces of the system highly independent of each other and optimising
performance.<p>
In addition, the combination of REST-like stores that can be composed with constraint- and event-based
communication patterns makes the architecture highly decoupled. In essence it allows the
kind of decoupling we see in well-implemented microservices architectures, but on mobile
apps without having to run multiple processes (which is often not allowed).<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-26845553452973944332021-07-29T09:07:00.001+02:002021-07-29T09:12:35.530+02:00Glue Code is the Success ConditionMy previous post titled <a href="https://blog.metaobject.com/2021/06/glue-dark-matter-of-software.html">Glue: the Dark Matter of Software</a> may have given the <a href="https://www.oreilly.com/radar/thinking-about-glue/">impression</a> that I see glue code as exclusively a problem. And I have to admit
that my follow-up (and reaction to Github's Copilot), called <a href="https://blog.metaobject.com/2021/06/don-generate-glueexterminate.html">Don't Generate Glue...Exterminate</a>, may not have done much to disabuse anyone of that
impression, but I just couldn't resist the Dalek reference.<p>
However, I think it is important to remember that the fact that we have so much glue is a symptom of one of our
biggest <em>successes</em> in software technology. Even as recently as the late 80s and early 90s, we just
didn't have all that much to glue together, and software reuse was the holy grail, the unobtainium of computing, both in its
desirability and unobtainability.<p>
Now we have reuse. Boy do we have <a href="https://stackoverflow.com/questions/48568097/how-to-count-the-number-of-installed-packages-including-dependencies">reuse</a>! We have so much reuse that we need tool support to manage all
the reuse. As far as I can tell, all new programming languages now come with such tooling, and are considered
incomplete until they have it.<p>
The price of success is having a new set of problems, problems you never dreamed of before.<p>
So how will we solve these problems? <p>
Data format adaptation, as suggested by the O'Reilly article? Yes. Model-driven approaches that allow us or our tools
and languages to generate a lot of the more obvious adapter code? Sounds good, why not?<p>
This one neat trick (click <a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ">here</a>!) that will automatically
solve all these problems? No.<p>
Simpler components, written with composability and minimization of dependencies in mind? Surely.
Education, so developers get better at writing code that composes well without turning into
architecture astronauts? Very much yes.<p>
However, my contention is that developers have a hard time with this in large part because our languages only
support implementing such glue, which is a start, but do not support <em>expressing</em> it directly, or
abstracting over it, encapsulating it, playing with it. So new linguistic
mechanisms like <a href="http://objective.st">Objective-S</a> are needed to help developers write better and
thus less glue code so we can better enjoy the fruits of our reusability success.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-7808379770122707832021-07-25T13:28:00.001+02:002021-07-25T15:34:00.598+02:00Deleting Code to Double the Performance of my Trivial Objective-S Tasks BackendAbout two months ago, I <a href="https://blog.metaobject.com/2021/06/towards-todomvc-backend-in-objective-s.html">showed</a>
a trivial tasks backend for a hypothetical ToDoMVC app. At the time, I noted that the performance was pretty insane
for something written in a (slow) scripting language: 7K requests per second when fetching a single task.<p>
That was using an encoder method (that writes key/value pairs to the JSON encoder) written in Objective-S, and I
wondered how much faster it would go if that was no longer the case. Twice as fast, it turns out.<p>
Yesterday, I wrote about tuning the Objective-S's SQLite insert performance to around 130M rows/minute, coincidentally
also for a simple tasks schema. One part of that performance story was the fact that the encoder method (writing key/value
pairs to the SQLite encoder) was generated by pasting together Objective-C blocks and installing the whole thing
as an Objective-C method. No interpretation, except for calling a series of blocks stored in an <code>NSArray</code>.
I had completely forgotten about the hand-written Objective-S encoder method in the back-end's <code>Task</code> class!
Since generation is automatic, but won't override an already existing method, all I had to do in order to get the
better performance was delete the old method.<p>
<hr>
<blockquote>
<code>
<pre>
> wrk -c 1 -t 1 http://localhost:8082/tasks
Running 10s test @ http://localhost:8082/tasks
1 threads and 1 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 66.60us 9.69us 1.08ms 96.72%
Req/Sec 14.95k 405.55 15.18k 98.02%
150275 requests in 10.10s, 30.67MB read
Requests/sec: 14879.22
Transfer/sec: 3.04MB
> curl http://localhost:8082/tasks
[{"id":1,"done":0,"title":"Clean Room"},{"id":2,"done":1,"title":"Check Twitter"}]%
</pre>
</code>
</blockquote>
<hr>
More than twice the performance, and that while fetching <em>two</em> tasks instead of just one, so around 30K tasks/second! (And yes, I checked that I wasn't hitting a 404...).<p>
So what's the performance if we actually fetch more than a minimal number of tasks? For 128 tasks, 64x more than before, it's still around 9K requests/s, so most of the time so far was per-request overhead. At this point we are serving a little over 1M tasks/s:<p>
<hr>
<blockquote>
<code>
<pre>
> wrk -c 1 -t 1 'http://localhost:8082/tasks/'
Running 10s test @ http://localhost:8082/tasks/
1 threads and 1 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 112.13us 76.17us 5.57ms 99.63%
Req/Sec 9.05k 397.99 9.21k 97.03%
90923 requests in 10.10s, 483.41MB read
Requests/sec: 9002.44
Transfer/sec: 47.86MB
</pre>
</code>
</blockquote>
<hr>
If memory serves, that was around the rate we were seeing with the <a href="https://www.infoq.com/news/2014/11/gotober-wunderlist-microservices/">Wunderlist backend</a> when we had a couple
of million users, not that these are comparable in any meaningful way.
For 1024 tasks there's a significant drop to slightly above 1.8K requests/s, with the task-rate almost doubling to 1.8M/s:<p>
<hr>
<blockquote>
<code>
<pre>
> wrk -c 1 -t 1 'http://localhost:8082/tasks/'
Running 10s test @ http://localhost:8082/tasks/
1 threads and 1 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 552.06us 62.77us 1.84ms 81.08%
Req/Sec 1.82k 52.95 1.89k 90.10%
18267 requests in 10.10s, 778.36MB read
Requests/sec: 1808.59
Transfer/sec: 77.06MB
</pre>
</code>
</blockquote>
<hr>
<p>UPDATE:<br>
Of course, those larger request sizes also see a much larger increase in performance than 2x. With the old code, the 128-task case clocks in at 147 requests/s and the 1024-task case at 18 requests/s, at which point it's a 100x improvement. So that gives you an idea of just how slow my Objective-S interpreter is.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com2tag:blogger.com,1999:blog-8397311766319215218.post-20836440651569989582021-07-24T22:10:00.001+02:002021-07-24T22:59:57.802+02:00Inserting 130M SQLite Rows per Minute...from a Scripting LanguageThe other week, I stumbled on the post <a href="https://avi.im/blag/2021/fast-sqlite-inserts/">Inserting One Billion Rows in SQLite Under A Minute</a>, which was a funny coincidence, as I was just in the process of giving my own SQLite/Objective-S
adapter a bit of tune-up. (The post's title later had "Towards" prepended, because the author wasn't close to hitting that goal).<p>
This SQLite adapter was a spin-off of my earlier <a href="https://blog.metaobject.com/2020/06/beyond-faster-json-support-for-iosmacos.html">article</a> <a href="https://blog.metaobject.com/2020/04/faster-json-support-for-iosmacos-part-8.html">series</a> on <a href="https://blog.metaobject.com/2020/04/faster-json-support-for-iosmacos-part-7.html">optimizing</a> JSON performance, itself triggered by
the ludicrously bad <a href="https://blog.metaobject.com/2020/04/somewhat-less-lethargic-json-support.html">performance</a> of Swift Coding at this rather simple and relevant task. To recap: Swift's JSON coder clocked in at about 10MB/s. By using
a streaming approach and a bit of tuning, we got that to around 200MB/s.<p>
Since then, I have worked on making Objective-S much more useful for UI work, with the object-literal syntax making
defining UIs as convenient as the various "declarative" functional approaches such as React or SwiftUI. Except it is
still using the same AppKit or UIKit objects we know and love, and doesn't force us to embrace the <a href="https://blog.metaobject.com/2018/12/uis-are-not-pure-functions-of-model.html">silly</a> notion that
the UI is a pure function of the model. Oh, and you get live previews that actually work. But more on that later.<p>
So I am slowly inching towards doing a <a href="https://todomvc.com">ToDoMVC</a>, a benchmark that feels
rather <a href="https://en.wikipedia.org/wiki/Wunderlist">natural</a> to me. While I am still very partial to
just dumping JSON files, and the previous article series hopefully showed that this approach is plenty fast
enough, I realize that a lot of people prefer a "real" database, especially on the <a href="https://blog.metaobject.com/2021/06/towards-todomvc-backend-in-objective-s.html">back-end</a>, and I wanted to build that as well. One of the many benchmarks I have for Objective-S is that it should
be possible to build a nicer Rails with it. (At this point in time I am pretty sure I will hit that benchmark).<p>
One of the ways to figure out if you have a good design is to stress-test it. One very useful stress-test is seeing
how fast it can go, because that will tell you if the thing you built is lean, or if you put in unnecessary layers
and indirections.<p>
This is particularly interesting in a Scripted Components (<a href="https://web.archive.org/web/20180702110347/https://www.inf.ed.ac.uk/teaching/courses/sapm/2011-2012/slides/scripting.pdf">pdf</a>) system that combines a relatively slow
but flexible interactive scripting language with fast, optimized components. The question is whether you can actually
combine the flexibility of the scripting language while reaping the benefits of the fast components, rather than
having to dive into adapting and optimizing the components for each use case, or just getting slow performance despite
the fast components. My hunch was that the streaming approach I have been using for a while now and that worked really
well for JSON and Objective-C would also do well in this more challenging setting.<p>
Spoiler alert: it did!<p>
<h3>The benchmark</h3>
The benchmark was a slightly modified version of the script that serves as a tasks backend. Like said sample
script it also creates a tasks database and inserts some example rows. Instead of inserting two rows,
it inserts 10 million. Or a hundred million.<p>
<hr>
<blockquote>
<code>
<pre>
#!env stsh
#-<void>taskbench:<ref>dbref
#
class Task {
var <int> id.
var <bool> done.
var <NSString> title.
-description { "<Task: title: {this:title} done: {this:done}>". }
+sqlForCreate {
'( [id] INTEGER PRIMARY KEY, [title] VARCHAR(220) NOT NULL, [done] INTEGER );'.
}
}.
scheme todo : MPWAbstractStore {
var db.
var tasksTable.
-initWithRef:ref {
this:db := (MPWStreamQLite alloc initWithPath:ref path).
this:tasksTable := #MPWSQLTable{ #db: this:db , #tableClass: Task, #name: 'tasks' }.
this:db open.
self.
}
-<void>createTable {
this:tasksTable create.
this:tasksTable := this:db tables at:'tasks'.
this:tasksTable createEncoderMethodForClass: Task.
}
-createTaskListToInsert:<int>log10ofSize {
baseList ← #( #Task{ #title: 'Clean Room', #done: false }, #Task{ #title: 'Check Twitter', #done: true } ).
...replicate ...
taskList.
}
-<void>insertTasks {
taskList := self createTaskListToInsert:6.
1 to:10 do: {
this:tasksTable insert:taskList.
}.
}
}.
todo := todo alloc initWithRef:dbref.
todo createTable.
todo insertTasks.
</pre>
</code>
</blockquote>
<hr>
(I have removed the body of the method that replicates the 2 tasks into the list of millions of tasks we need to insert.
It was bulky and not relevant.)<p>
In this sample we define the Task class and use that to create the SQL Table. We could also have simply created
the table and generated a Tasks class from that.<p>
Anyway, running this script yields the following result.
<hr>
<blockquote>
<code>
<pre>
> time ./taskbench-sqlite.st /tmp/tasks1.db
./taskbench-sqlite.st /tmp/tasks1.db 4.07s user 0.20s system 98% cpu 4.328 total
> ls -al /tmp/tasks1.db*
-rw-r--r-- 1 marcel wheel 214M Jul 24 20:11 /tmp/tasks1.db
> sqlite3 /tmp/tasks1.db 'select count(id) from tasks;'
10000000
</pre>
</code>
</blockquote>
<hr>
So we inserted 10M rows in 4.328 seconds, yielding several hundred megabytes of SQLite data. This would be 138M rows
had we let it run for a minute. Nice. For comparison, the original article's numbers were 11M rows/minute for
CPython, 40M rows/minute for PyPy and 181M rows/minute for Rust, though on a slower Intel MacBook
Pro whereas I was running this on an M1 Air. I compiled and ran the Rust version on my M1 Air and it did
100M rows in 21 seconds, so just a smidgen over twice as fast as my Objective-S script, though with
a simpler schema (CHAR(6) instead of VARCHAR(220)) and less data (1.5GB vs. 2.1GB for 100M rows).
<p>
<h3>Getting SQLite fast</h3>
The initial version of the script was far, far slower, and at first it was, er, "sub-optimal" use of SQLite
that was the main culprit, mostly inserting every row by itself without batching. When SQLite sees an
INSERT (or an UPDATE for that matter) that is not contained in a transaction, it will automatically wrap that
INSERT inside a generated transaction and commit that transaction after the INSERT is processed. Since
SQLite is very fastidious about ensuring that transactions get to disk atomically, this is slow. Very slow.<p>
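The effect of those auto-generated per-INSERT transactions is easy to reproduce with Python's stdlib <code>sqlite3</code> module (a sketch borrowing the post's tasks schema, not the Objective-S code): batching all inserts inside one explicit transaction means one commit, and one sync to disk, for the whole batch instead of one per row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tasks "
            "( [id] INTEGER PRIMARY KEY, [title] VARCHAR(220) NOT NULL, [done] INTEGER )")

# 10,000 rows replicated from the two example tasks
rows = [(None, "Clean Room", 0), (None, "Check Twitter", 1)] * 5000

# One explicit transaction around the whole batch. Without it, SQLite
# wraps each INSERT in its own generated transaction and commits it,
# which is what makes naive row-at-a-time inserting so slow.
with con:  # BEGIN ... COMMIT
    con.executemany("INSERT INTO tasks VALUES (?,?,?)", rows)

count = con.execute("SELECT count(id) FROM tasks").fetchone()[0]
```

On a real on-disk database, the difference between this and per-row autocommit is typically orders of magnitude.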
The class handling SQLite inserts is a <a href="https://conf.researchr.org/details/dls-2019/dls-2019/7/Standard-Object-Out-Streaming-Objects-with-Polymorphic-Write-Streams">Polymorphic Write Stream</a>, so it knows what an array is.
When it encounters one, it sends itself the <code>beginArray</code> message, writes the contents of the array
and finishes by sending itself the <code>endArray</code> message. Since writing an array sort of implies that
you want to write all of it, this was a good place to insert the transactions:
<hr>
<blockquote>
<code>
<pre>
-(void)beginArray {
sqlite3_step(begin_transaction);
sqlite3_reset(begin_transaction);
}
-(void)endArray {
sqlite3_step(end_transaction);
sqlite3_reset(end_transaction);
}
</pre>
</code>
</blockquote>
<hr>
So now, if you want to write a bunch of objects as a single transaction, just write them as an array, as the
benchmark code does. There were some other minor issues, but after that less than 10% of the total time
was spent in SQLite, so it was time to optimize the caller: my code.<p>
<h3>Column keys and Cocoa Strings</h3>
At this point, my guess was that the biggest remaining slowdown would be my, er, "majestic" Objective-S
interpreter. I was wrong: it was Cocoa string handling. Not only was I creating the SQLite parameter
placeholder keys dynamically, allocating new NSString objects for each column of each row, but it also
happens that getting character data from an NSString object nowadays involves some very complex and slow
internal machinery using encoding conversion streams. <code>-UTF8String</code> is not your friend, and other
methods appear to fairly consistently use the same slow mechanism. I guess making NSString horribly slow is
one way to make other string handling look good in comparison.<p>
After a few transformations, the code would just look up the incoming NSString key in a dictionary that
mapped it to the SQLite parameter index. String-processing and character accessing averted.<p>
<h3>Jitting the encoder method. Without a JIT</h3>
One thing you might have noticed about the class definition in the benchmark code is that there is no
encoder method, it just defines its instance variables and some other utilities. So how is the class
data encoded for the <code>SQLTable</code>? KVC? No, that would be a bit slow, though it might make a good
fallback.<p>
The magic is the <code>createEncoderMethodForClass:</code> method. This method, as the name suggests,
creates an encoder method by pasting together a number of blocks, turning the top-level block into
a method using <code>imp_implementationWithBlock()</code>, and finally adding that method
to the class in question using <code>class_addMethod()</code>.<p>
<hr>
<blockquote>
<code>
<pre>
-(void)createEncoderMethodForClass:(Class)theClass
{
NSArray *ivars=[theClass allIvarNames];
if ( [[ivars lastObject] hasPrefix:@"_"]) {
ivars=(NSArray*)[[ivars collect] substringFromIndex:1];
}
NSMutableArray *copiers=[[NSMutableArray arrayWithCapacity:ivars.count] retain];
for (NSString *ivar in ivars) {
MPWPropertyBinding *accessor=[[MPWPropertyBinding valueForName:ivar] retain];
[ivar retain];
[accessor bindToClass:theClass];
id objBlock=^(id object, MPWFlattenStream* stream){
[stream writeObject:[accessor valueForTarget:object] forKey:ivar];
};
id intBlock=^(id object, MPWFlattenStream* stream){
[stream writeInteger:[accessor integerValueForTarget:object] forKey:ivar];
};
int typeCode = [accessor typeCode];
if ( typeCode == 'i' || typeCode == 'q' || typeCode == 'l' || typeCode == 'B' ) {
[copiers addObject:Block_copy(intBlock)];
} else {
[copiers addObject:Block_copy(objBlock)];
}
}
void (^encoder)( id object, MPWFlattenStream *writer) = Block_copy( ^void(id object, MPWFlattenStream *writer) {
for ( id block in copiers ) {
void (^encodeIvar)(id object, MPWFlattenStream *writer)=block;
encodeIvar(object, writer);
}
});
void (^encoderMethod)( id blockself, MPWFlattenStream *writer) = ^void(id blockself, MPWFlattenStream *writer) {
[writer writeDictionaryLikeObject:blockself withContentBlock:encoder];
};
IMP encoderMethodImp = imp_implementationWithBlock(encoderMethod);
class_addMethod(theClass, [self streamWriterMessage], encoderMethodImp, "v@:@" );
}
</pre>
</code>
</blockquote>
<hr>
What's kind of neat is that I didn't actually write that method for this particular use-case: I had
already created it for JSON-coding. Due to the fact that the JSON-encoder and the SQLite writer
are both Polymorphic Write Streams (as are the targets of the corresponding decoders/parsers),
the same method worked out of the box for both.<p>
(It should be noted that this encoder-generator currently does not handle the full variety of data types;
this is intentional.)
<h3>Getting the data out of Objective-S objects</h3>
The encoder method uses <code>MPWPropertyBinding</code> objects to efficiently access the instance
variables via the object's accessors, caching IMPs and converting data as necessary, so they are
both efficient and flexible. However, the actual accessors that Objective-S generated for its
instance variables were rather baroque, because they used the same basic mechanism used for
Objective-S methods, which can only deal with objects, not with primitive data types.<p>
In order to interoperate seamlessly with Objective-C, which expects methods that can
take data types other than objects, all non-object method arguments are converted
to objects on the way in, and return values are converted from objects back to primitive
values on the way out.<p>
So even the accessors for primitive types such as the integer "id" or the boolean "done"
would have their values converted to and from objects by the interface machinery. As
I noted above, I was a bit surprised that this inefficiency was overshadowed by the
NSString-based key handling.<p>
In fact, one of the reasons for pursuing the SQLite insert benchmark was to have a
reason for finally tackling this Rube Goldberg mechanism. In the end, actually
addressing it turned out to be far less complex than I had feared, with the technique
being very similar to that used for the encoder-generator above, just simpler.<p>
Depending on the type, we use a different block that gets parameterised with the
offset to the instance variable. I show the setter-generator below, because
there the code for the object-case is actually different due to retain-count
handling:
<hr>
<blockquote>
<code>
<pre>
#define pointerToVarInObject( type, anObject ,offset) ((type*)(((char*)anObject) + offset))
#ifndef __clang_analyzer__
// This leaks because we are installing into the runtime, can't remove after
-(void)installInClass:(Class)aClass
{
SEL aSelector=NSSelectorFromString([self objcMessageName]);
const char *typeCode=NULL;
int ivarOffset = (int)[ivarDef offset];
IMP setterImp=NULL;
switch ( ivarDef.objcTypeCode ) {
case 'd':
case '@':
typeCode = "v@:@";
void (^objectSetterBlock)(id object,id arg) = ^void(id object,id arg) {
id *p=pointerToVarInObject(id,object,ivarOffset);
if ( *p != arg ) {
[*p release];
[arg retain];
*p=arg;
}
};
setterImp=imp_implementationWithBlock(objectSetterBlock);
break;
case 'i':
case 'l':
case 'B':
typeCode = "v@:l";
void (^intSetterBlock)(id object,long arg) = ^void(id object,long arg) {
*pointerToVarInObject(long,object,ivarOffset)=arg;
};
setterImp=imp_implementationWithBlock(intSetterBlock);
break;
default:
[NSException raise:@"invalidtype" format:@"Don't know how to generate set accessor for type '%c'",ivarDef.objcTypeCode];
break;
}
if ( setterImp && typeCode ) {
class_addMethod(aClass, aSelector, setterImp, typeCode );
}
}
</pre>
</code>
</blockquote>
<hr>
At this point, profiles were starting to approach around two thirds of the time being spent in <code>sqlite_</code> functions,
so the optimisation efforts were starting to get into a region of diminishing returns.<p>
<h3>Linear scan beats dictionary</h3>
One final noticeable point of overhead was the (string) key to parameter index mapping, which the
optimizations above had left as an <code>NSDictionary</code> mapping from <code>NSString</code> to <code>NSNumber</code>.
As you probably know, <code>NSDictionary</code> isn't exactly the fastest. One idea was to replace that lookup
with a <a href="https://blog.metaobject.com/2020/04/equally-lethargic-json-support-for.html">MPWFastrStringTable</a>,
but that means either needing to solve the problem of fast access to <code>NSString</code> character data or changing the
protocol.<p>
So instead I decided to brute-force it: I store the actual pointers to the NSString objects in a C array indexed by
the SQLite parameter index. Before doing the dictionary lookup, which I keep as a fallback to be safe, I do a linear scan of that table
using the incoming string pointer. This little trick largely removed the parameter index lookup from my profiles.<p>
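For illustration, here is the trick sketched in plain C. All names and the stand-in slow path are hypothetical; this is not the actual MPWFoundation code, just the shape of the fast path:<p>

```c
#include <assert.h>
#include <string.h>

#define MAX_PARAMS 8

/* Cache of key-object pointers, indexed by SQLite parameter index.
   Because the caller re-uses the same key objects on every row,
   pointer identity (not string contents) decides on the hot path. */
static const char *keyCache[MAX_PARAMS];

/* Stand-in for the slow path (the NSDictionary lookup in the post);
   on success it also caches the exact key pointer it was given. */
static int slowLookupAndCache(const char *key) {
    static const char *names[] = { "id", "title", "done", NULL };
    for (int i = 0; names[i]; i++) {
        if (strcmp(names[i], key) == 0) {
            keyCache[i] = key;            /* remember this pointer */
            return i;
        }
    }
    return -1;                            /* unknown parameter name */
}

int indexForKey(const char *key) {
    for (int i = 0; i < MAX_PARAMS; i++) {
        if (keyCache[i] == key) {         /* identity check, no hashing */
            return i;
        }
    }
    return slowLookupAndCache(key);
}
```

Since the caller keeps passing the same key objects for every row, pointer identity suffices, and a linear scan over a handful of slots is cheaper than hashing the string.<p>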
<h3>Conclusion</h3>
With those final tweaks, the code is probably quite close to as fast as it is going to get. Its slower performance
compared to the Rust code can be attributed to the fact that it is dealing with more data and a more complex
schema, as well as having to actually obtain data from materialized objects, whereas the Rust code just generates
the SQLite calls on the fly.<p>
All this is achieved from a slow, interpreted scripting language, with all the variable parts (data class, steering
code) defined in said slow scripting language. So while I look forward to the native compiler for Objective-S,
it is good to know that it isn't absolutely necessary for excellent performance, and that the basic design of
these APIs is sound.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-20866688574170422442021-06-30T12:01:00.001+02:002021-06-30T12:01:06.109+02:00Don't Generate Glue...Exterminate!!Today I saw the <a href="https://news.ycombinator.com/item?id=27676266&p=2">news</a> of GitHub's release of the "AI Autopilot". As far as I can tell, it's an impressive piece
of engineering that shouldn't exist. I mean, "Paste Code from Stack Overflow as a Service" was supposed to be
a joke, not a product spec.<p>
<img src="http://icodeit.org/images/2016/05/stackoverflow-oreilly.png" alt="" title="" border="0" width="200" height="" />
<p>
As of this writing, the first example given on the <a href="https://copilot.github.com">product page</a> is some
code to call a REST service that does sentiment analysis, which the AI helpfully completes. For reference, this is the code:
<hr>
<blockquote>
<code>
<pre>
#!/usr/bin/env ts-node
import { fetch } from "fetch-h2";
// Determine whether the sentiment of text is positive
// Use a web service
async function isPositive(text: string): Promise<boolean> {
const response = await fetch(`http://text-processing.com/api/sentiment/`, {
method: "POST",
body: `text=${text}`,
headers: {
"Content-Type": "application/x-www-form-urlencoded",
},
});
const json = await response.json();
return json.label === "pos";
}
</pre>
</code>
</blockquote>
<hr>
Here is the same script in Objective-S:
<hr>
<blockquote>
<code>
<pre>
#!env stsh
#-sentiment:text
((ref:http://text-processing.com/api/sentiment/ postForm:#{ #text: text }) at:'label') = 'pos'
</pre>
</code>
</blockquote>
<hr>
And once you have it, reuse it. And keep those Daleks at bay.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com1tag:blogger.com,1999:blog-8397311766319215218.post-2177894866309439552021-06-28T10:28:00.001+02:002021-06-28T10:28:59.023+02:00Generating ARM Assembly: First StepsFinally took the plunge to start generating ARM64 assembly. As expected, the actual coding was much
easier than overcoming the barrier to just start doing it.<p>
The following snippet generates a program that prints a message to <code>stdout</code>, so
a classic "Hello World":
<hr>
<blockquote>
<code>
<pre>
#!env stsh
#-gen:msg
messageLabel ← 'message'.
main ← '_main'.
framework:ObjSTNative load
arm := MPWARMAssemblyGenerator stream
arm global: main;
align:2;
label:main;
mov:0 value:1;
adr:1 address:messageLabel;
mov:2 value: msg length;
mov:16 value:4;
svc:128;
mov:0 value:0;
ret;
label:messageLabel;
asciiz:msg.
file:hello-main.s := arm target
</pre>
</code>
</blockquote>
<hr>
One little twist is that the message to print gets passed to the generator. I like how
Smalltalk's keyword syntax keeps the code uncluttered, and often pretty close to the
actual assembly that will be generated.<p>
Of particular help here is message <em>cascading</em> using the semicolon. This
means I don't have to repeat the receiver of the message, but can just keep
sending it messages. Cascading works well together with streams, because
there are no return values to contend with, we just keep appending to
the stream.<p>
When invoked using <code>./genhello-main.st 'Hi Marcel, finish that blog post and get on your bike!'</code>, the generated code is as follows:<p>
<hr>
<blockquote>
<code>
<pre>
.global _main
.align 2
_main:
mov X0, #1
adr X1, message
mov X2, #54
mov X16, #4
svc #128
mov X0, #0
ret
message:
.asciz "Hi Marcel, finish that blog post and get on your bike!"
</pre>
</code>
</blockquote>
<hr>
And now I am going to do what it says :-)<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com2tag:blogger.com,1999:blog-8397311766319215218.post-9466057925570924872021-06-15T12:35:00.001+02:002021-06-15T12:37:52.436+02:00if let it beOne of the funkier aspects of Swift syntax is the <code>if let</code> statement. As far as I can tell, it exists pretty
much exclusively
to check that an optional variable actually does contain a value and if it does, work with a no-longer-optional version
of that variable.<p>
Swift packages this functionality in a combination <code>if</code> statement and <code>let</code> declaration:
<hr>
<code>
<pre>
if let value = value {
print("value is \(value)")
}
</pre>
</code>
<hr>
This has a bunch of problems that are explained nicely in a <a href="https://forums.swift.org/t/lets-fix-if-let-syntax/48188/25">Swift Evolution thread</a> (via <a href="https://mjtsai.com/blog/2021/05/12/fixing-swifts-if-let-syntax/">Michael Tsai</a>) together with some proposals to fix it. One of the issues is the idiomatic
repetition of the variable name, because typically you do want the same variable, just with less optionality. Alas,
code-completion apparently doesn't handle this well, so the temptation is to pick a non-descriptive variable name.<p>
In my previous post (<a href="https://blog.metaobject.com/2021/06/asynchronous-sequences-and-polymorphic.html">Asynchronous Sequences and Polymorphic Streams</a>) I noted how the fact that iteration in Smalltalk and Objective-S is done via
messages and blocks means that there is no separate concept of a "loop-variable", that is just an argument to the
block.<p>
Conditionals are handled the same way, with blocks and messages, but normally don't pass arguments to their
argument blocks, because in normal conditionals those arguments would always be just the constants
<code>true</code> or <code>false</code>. Not very interesting.<p>
When I added <code>ifNotNil:</code> some time ago, I used the same logic, but it turns out the object
is now actually potentially interesting. So <code>ifNotNil:</code> now passes the now-known-to-be-non-nil
value to the block and can be used as follows:
<hr>
<code>
<pre>
value ifNotNil:{ :value |
stdout println:value.
}
</pre>
</code>
<hr>
This doesn't eliminate the duplication, but does avoid the issue of having the newly
introduced variable name precede the original variable. Well, that and the whole
weird <code>if let</code> in the first place.<p>
With anonymous block arguments, we actually don't have to name the parameter at all:
<p>
<hr>
<code>
<pre>
value ifNotNil:{ stdout println:$0. }
</pre>
</code>
<hr>
Alternatively, we can just take advantage of some conveniences and use a HOM
instead:
<p>
<hr>
<code>
<pre>
value ifNotNil printOn:stdout.
</pre>
</code>
<hr>
Of course, Objective-S currently doesn't care about optionality, and with the current
nil-eating behavior, the <code>ifNotNil</code> is
not strictly necessary, you could just write it as follows:
<p>
<hr>
<code>
<pre>
value printOn:stdout.
</pre>
</code>
<hr>
I haven't really done much thinking about
it, but the whole idea of optionality shouldn't really be handled in the
space of values, but in the space of references. Which are first class
objects in Objective-S.<p>
So you don't ask a value if it is nil or not, you ask the variable if
it contains a value:
<p>
<hr>
<code>
<pre>
ref:value ifBound:{ :value | ... }
</pre>
</code>
<hr>
To me that makes a lot more sense than having every type be accompanied
by an optional type.<p>
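To make the idea concrete, here is a minimal C sketch; every name is hypothetical, and this is not how Objective-S implements bindings, just the shape of the idea:<p>

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch: optionality lives in the *reference*, a first-class binding
   that knows whether it currently holds a value, rather than in the
   value's type. */
typedef struct {
    bool bound;       /* does this variable hold a value right now? */
    int  value;       /* the plain, non-optional value */
} Binding;

/* You ask the binding, not the value: the block runs only when the
   binding is bound; the return value reports whether it ran. */
bool ifBound(const Binding *ref, void (*block)(int value)) {
    if (ref->bound) {
        block(ref->value);
        return true;
    }
    return false;
}

/* a sample block that records the value it was given */
static int lastSeen = -1;
static void remember(int value) { lastSeen = value; }
```

Note that the value itself stays a plain, non-optional <code>int</code>; the question "is there a value?" is answered only by the binding.<p>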
So if we do come to care about optionality in the future, we have the
tools to create a sensible solution. And we can let <code>if let</code>
just be.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com2tag:blogger.com,1999:blog-8397311766319215218.post-73725906066646139692021-06-13T10:05:00.001+02:002021-06-15T11:07:52.545+02:00Asynchronous Sequences and Polymorphic StreamsBrowsing the WWDC '21 session videos, I came across the <a href="https://developer.apple.com/wwdc21/10058">session on Asynchronous Sequences</a>.
The preview image showcased some code for asynchronously fetching and massaging current earthquake data from the U.S. Geological Survey:
<hr>
<code>
<pre style="word-break: break-all;">
@main
struct QuakesTool {
static func main() async throws {
let endpointURL = URL(string: "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv")!
for try await event in endpointURL.lines.dropFirst() {
let values = event.split(separator: ",")
let time = values[0]
let latitude = values[1]
let longitude = values[2]
let magnitude = values[4]
print("Magnitude \(magnitude) on \(time) at \(latitude) \(longitude)")
}
}
}
</pre>
</code>
<hr>
This is nice, clean code, and it certainly looks like it serves as a good showcase for the benefits of
asynchronous coding with async/await and asynchronous sequences built on top of async/await.<p>
Or does it?<p>
Here is the equivalent code in <a href="http://objective.st">Objective-S</a>:
<p>
<hr>
<code>
<pre>
#!env stsh
stream ← ref:https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv linesAfter:1.
stream do: { :theLine |
values ← theLine componentsSeparatedByString:','.
time ← values at:0.
latitude ← values at:1.
longitude ← values at:2.
magnitude ← values at:4.
stdout println:"Quake: magnitude {magnitude} on {time} at {latitude} {longitude}".
}.
stream awaitResultForSeconds:20.
</pre>
</code>
<hr>
<p>
Objective-S does not (and will not) have async/await, but it can nevertheless provide the equivalent functionality easily and elegantly. How? Two features:
<ol>
<li><a href="https://conf.researchr.org/details/dls-2019/dls-2019/7/Standard-Object-Out-Streaming-Objects-with-Polymorphic-Write-Streams">Polymorphic Write Streams</a></li>
<li>Messaging</li>
</ol>
Let's see how these two conspire to make adding something equivalent to <code>for try await</code> trivial.
<h3>Polymorphic Write Streams</h3>
In the Objective-S implementation, <code>https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv</code> is not a string,
but an actual identifier, a Polymorphic Identifier; adding the <code>ref:</code> prefix turns it into a binding, a first-class variable.
You can ask a binding for its <code>value</code>, but for bindings that can also be regarded as collections of some kind, you can also
ask them for a <code>stream</code> of their values, in this particular case a <code><a href="https://github.com/mpw/MPWFoundation/blob/master/Streams.subproj/MPWURLStreamingStream.m">MPWURLStreamingStream</a></code>. This stream is a <a href="https://github.com/mpw/MPWFoundation/blob/master/Documentation/Streams.md">Polymorphic Write Stream</a> that can be easily composed with other filters to create pipelines. The <code>linesAfter:</code>
method is a convenience method that does just that: it composes the URL fetcher with a filter that converts from bytes to lines of
text and another filter that drops the first <em>n</em> items.<p>
Objective-S actually has convenient syntax for creating these compositions without having to do it via convenience methods, but I wanted
to keep differences in the surrounding scaffolding small for this example, which is about the <code>for try await</code> and <code>do:</code>.<p>
When I encountered the example, Polymorphic Write Streams actually did not have a <code>do:</code> for iteration, but it was trivial to add:<p>
<hr>
<code>
<pre>
-(void)do:aBlock
{
[self setFinalTarget:[MPWBlockTargetStream streamWithBlock:aBlock]];
[self run];
}
</pre>
</code>
<hr>
(This code lives in MPWFoundation, so it is in Objective-C, not Objective-S).<p>
Those 5 lines were all that was needed. I did not have to make substantive changes to the language or its implementation. One reason
for this is that Polymorphic Write Streams are <em>asynchrony-agnostic</em>: although they are mostly implemented as straightforward
synchronous code, they work just as well if parts of the pipeline they are in are asynchronous. It just doesn't make a difference,
because the semantics are in the data flow, not in the control flow.<p>
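What "asynchrony-agnostic" means here can be sketched in C; the names are hypothetical rather than the real MPWFoundation classes, but the push-based shape is the point:<p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sketch of a push-based write stream: each node transforms an element
   and writes it onward to its target. A node never pulls, so it cannot
   tell, and does not care, whether writes arrive from a synchronous
   loop or from an asynchronous callback: the semantics live in the
   data flow, not the control flow. */
typedef struct Stream Stream;
struct Stream {
    void   (*writeElement)(Stream *self, int element);
    Stream *target;                 /* next node in the pipeline */
};

/* a filter node: upcase the element, pass it on */
static void upcaseFilter(Stream *self, int c) {
    if (self->target) {
        self->target->writeElement(self->target, toupper(c));
    }
}

/* a final target: collect everything that was written */
static char  collected[64];
static size_t nCollected = 0;
static void collector(Stream *self, int c) {
    (void)self;
    if (nCollected < sizeof collected - 1) {
        collected[nCollected++] = (char)c;
    }
}
```

Nothing in either node would change if <code>writeElement</code> were invoked from a network callback instead of a synchronous loop; the pipeline's meaning is in how elements flow, not in who calls whom.<p>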
<h3>Messaging</h3>
The other big reason an asynchronous <code>do:</code> was easy to add is <em><a href="http://lists.squeakfoundation.org/pipermail/squeak-dev/1998-October/017019.html">messaging</a></em>.
<blockquote>
If you focus on just messaging -- and realize that a good metasystem can
late bind the various 2nd level architectures used in objects -- then much
of the language-, UI-, and OS based discussions on this thread are really
quite moot.
</blockquote>
One of the many really, really neat ideas in Smalltalk is how control structures, which in most other languages
are special language features, are just plain old messages and implemented in the library, not in the language.<p>
So the <code>for ... in</code> loop in Swift is just the <code>do:</code> message sent to a collection, and the
keyword syntax makes this natural:
<hr>
<code>
<pre>
for event in lines {
...
}
...
lines do: { :event |
...
}
</pre>
</code>
<hr>
Note how making loops regular like this also makes the special concept of "loop variable"
disappear. The "loop variable" is just the block argument. And I just realized the same
would go for a not-nil result of a nil test.<p>
Anyway, if "loops" are just messages, it's easy to add a method implementing iteration to some other
entity, for example a stream, the way that I did. (Smalltalk streams also support the
iteration messages).<p>
And when you can easily make stream processing, which can handle asynchrony naturally and
easily, just as convenient as imperative programming,
you don't need async/await, which tries to make asynchronous programming <em>look</em> like
imperative programming in order to make it convenient.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com3tag:blogger.com,1999:blog-8397311766319215218.post-63449425461654675402021-06-09T10:43:00.001+02:002021-06-13T12:46:26.666+02:00Glue: the Dark Matter of Software"Software seems 'large' and 'complicated' for what it does". I keep coming back to this <a href="https://youtu.be/ubaX1Smg6pY?t=113">quote</a> by Alan Kay.<p>
The same feeling has been nagging me pretty much ever since I started writing software. On the one hand, there is the magic, almost literally:
we write some text (spells) and the machine does things in the real world. On the other hand, it seems just way too much work to make the machine
do anything more complex than:
<br><blockquote> <code>10 PRINT "Hello"<br>20 GOTO 10</code></blockquote>
Almost like threading a needle with boxing gloves. And that's even if we are careful, if we avoid unnecessary complexity.<p>
And the numbers appear to back that up: Alan Kay mentions Microsoft Office at several hundred million lines of code. From my personal experience,
the Wunderlist iOS client was not quite 200 KLOC. For the latter, I can attest to the attention given by the team to <em>not</em> introduce
unnecessary bloat, and even to actively reduce it. (For example, we cut our core code by around 30KLOC thanks to some of the architectural
mechanisms such as <a href="https://2019.splashcon.org/details/splash-2019-Onward-papers/7/Storage-Combinators">Storage Combinators</a>).
I am fairly sure I am not the only one with this experience.<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">hexagonal architecture has enabled me to extract the business logic in the product i’m building and currently it’s less than 5% of all code</p>— 3. life out of balance (@infinitary) <a href="https://twitter.com/infinitary/status/934321320338313216?ref_src=twsrc%5Etfw">November 25, 2017</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
So why so much code? After all, Wunderlist was just a To Do List, albeit a really nice one. I can't really say much about Office;
I don't think anyone can, because 400 MLOC is just way too much code to comprehend. I think the answer is:<p>
Glue Code.<p>
It's the unglamorous, invisible code that connects two pieces of software, makes sure that data that's in location A reaches location B
unscathed (from the database to the UI, from the UI to the model, from the model to the backend and so on...). And like Dark Matter, it
is invisible and <em>massive</em>.<p>
Why do I say it is "invisible"? After all, the code is right there, isn't it? As far as I can tell, there are several related reasons:
<ol>
<li>Glue code is deemed not important. It's just a couple of lines here, and another couple of lines over there ... and soon enough you're talking
real MLOCs!</li>
<li>We cannot directly express glue code. Most of our languages are what I call "DSLs for Algorithms" (See <a href="https://en.wikipedia.org/wiki/ALGOL">ALGOL, the ALGOrithmic Language</a>), so glue can not be expressed intentionally, but only by describing algorithms for implementing the glue.</li>
</ol>
That's why it is invisible, and also partly why it is massive: not being able to express it directly means we cannot abstract and encapsulate it, we
keep repeating slight variations of that glue. There is another reason why it's massive:
<ol start="3">
<li>Glue is quadratic. If you have N features that interact with each other, you have O(N²) pieces of glue to get them to talk to each other.</li>
</ol>
<p>
This last point was illustrated quite nicely by Kevin Greer in a video comparing Multics and Unix development, with the crucial insight
being that you need to "program the perimeter, not the area":<p>
<iframe width="600" height="400" src="https://www.youtube.com/embed/3Ea3pkTCYx4" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
<p>
For him, the key difference is that Unix had the pipe, and I would agree. The pipe is one-character glue: "|". This is absolutely crucial.<p>
If you have to write even a little custom code every time you connect two modules, you will be in quadratic complexity, meaning that
as your features grow your glue code <em>will</em> overwhelm the core functionality. And you will only notice this when it's
far too late to do anything about it, because the initial growth rate will be low.<p>
So what can we do about it? I think we need to make glue first class so we can actually write down the glue itself, and not the algorithms that implement
the glue. Once we have that, we can and hopefully will create better kinds of glue, ones like the Unix pipe in that they can connect components generically,
without requiring custom glue per component pair.<p>
UPDATE<p>
There were some questions as to what to do about this. Well, I am working on it, with <a href="http://objective.st">Objective-S</a>, and I write
fairly frequently on this blog (and occasionally submit my <a href="http://objective.st/Publications/">writing</a> to scientific conferences), one post that would be immediately relevant is: <a href="https://blog.metaobject.com/2019/02/why-architecture-oriented-programming.html">Why Architecture Oriented Programming Matters</a>.<p>
I also don't see Unix Pipes and Filters as <em>The Answer</em>™, they just demonstrate the concept of minimized
and constant glue. Expanding on this, and as I wrote in <a href="https://blog.metaobject.com/2019/02/why-architecture-oriented-programming.html">Why Architecture Oriented Programming Matters</a>, I also don't see any one single connector
as "the" solution. We need different kinds of connectors, and we need to write them down, to abstract over them and
use them natively. Not simulate everything by calling procedures, methods or functions. See also <a href="https://longnow.org/seminars/02007/jan/26/why-foxes-are-better-forecasters-than-hedgehogs/">Foxes vs. Hedgehogs</a>.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com14tag:blogger.com,1999:blog-8397311766319215218.post-37876206323094253952021-06-01T15:06:00.001+02:002021-06-15T14:55:58.976+02:00Towards a ToDoMVC Backend in Objective-SA couple of weeks ago, I showed a little http <a href="https://blog.metaobject.com/2021/05/a-far-too-simple-harcoded-tasks-backend.html">backend</a>. Well, <em>tiny</em> is probably a more apt description, and also aptly describes its functionality,
which is almost non-existent. All it does is define a simplistic <code>Task</code> class, create an array with two
sample instances and then serves that array of tasks over http. And it serves the <code>-description</code> of those tasks
rather than anything useful like a JSON encoding.<p>
For reference, this is the original code, hacked up in maybe 15 minutes:
<hr>
<code>
<pre>
#!env stsh
framework:ObjectiveHTTPD load.
class Task {
var <bool> done.
var title.
-description { "Task: {this:title} done: {this:done}". }
}
taskList ← #( #Task{ #title: 'Clean my room', #done: false }, #Task{ #title: 'Check twitter feed', #done: true } ).
scheme todo {
var taskList.
/tasks {
|= {
this:taskList.
}
}
}.
todo := #todo{ #taskList: taskList }.
server := #MPWSchemeHttpServer{ #scheme: todo, #port: 8082 }.
server start.
shell runInteractiveLoop.
</pre>
</code>
<hr>
What would it take to make this borderline useful? First, we would probably need to encode the result as JSON, rather than
serving a description. This is where Storage Combinators come in. We (now) have a <code>MPWJSONConverterStore</code> that's
a mapping store, it passes its "REST" requests through while performing certain transformations on the data and/or the
references. In this case the transformation is serializing or deserializing objects from/to JSON, depending on which
way the request is going and which way the converter is pointing.<p>
In this case, the converter is pointing "up", that is, it serializes objects read from its source to JSON and deserializes
data written to its source from JSON to objects. We also tell it that it is dealing with <code>Task</code> objects.
When we have the converter we connect it to our <code>todo</code> scheme and tell the HTTP server to talk to the json
converter (which talks to our todo scheme):
<hr>
<code>
<pre>
todo := #todo{ #taskList: taskList, #store: persistence }.
json := #MPWJSONConverterStore{ #up: true, #class: class:Task }.
json → todo.
server := #MPWSchemeHttpServer{ #scheme: json, #port: 8082 }.
</pre>
</code>
<hr>
Second, we also want to be able to interact with individual tasks. No problem: just add a <code>/task/:id</code> property path to our store/scheme handler,
along with GET ("|=") and PUT ("=|") handlers. I am not fully sold yet on the "|=" syntax for this, but I would like to avoid names for this sort of
structural component. Maybe arrows?
<hr>
<code>
<pre>
/task/:id {
|= {
this:taskDict at:id .
}
=| {
this:taskDict at:id put:newValue.
}
}
</pre>
</code>
<hr>
In order to facilitate this, the <code>taskList</code> was changed to a dictionary. Once we make changes to our data, we probably also want to persist it.
One easy way to do this is to store the tasks as JSON on disk. This allows us to reuse the JSON converter from above, but this time pointing "down". We
connect this converter to the filesystem at the directory <code>/tmp/tasks</code> and to the store:
<hr>
<code>
<pre>
json → todo → #MPWJSONConverterStore{ #class: class:Task } → ref:file:/tmp/tasks/ asScheme.
</pre>
</code>
<hr>
In addition, we need to trigger saving in the PUT handler:
<hr>
<code>
<pre>
=| {
this:taskDict at:id put:newValue.
self persist.
}
-<void>persist {
source:tasks := this:taskDict allValues.
}
}
</pre>
</code>
<hr>
This will (synchronously) write the entire task list on every PUT.
The full code is here:
<hr>
<code>
<pre>
#!env stsh
framework:ObjectiveHTTPD load.
class Task {
var id.
var <bool> done.
var title.
-description { "Task: {this:title} done: {this:done} id: {this:id}". }
-<void>writeOnJSONStream:aStream {
aStream writeDictionaryLikeObject:self withContentBlock:{ :writer |
writer writeInteger: this:id forKey:'id'.
writer writeString: this:title forKey:'title'.
writer writeInteger: this:done forKey:'done'.
}.
}
}
taskList ← #( #Task{ #id: '1', #title: 'Clean Room', #done: false }, #Task{ #id: '2', #title: 'Check Twitter', #done: true } ).
scheme todo : MPWMappingStore {
var taskDict.
-<void>setTaskList:aList {
this:taskDict := NSMutableDictionary dictionaryWithObjects: aList forKeys: aList collect id.
}
/tasks {
|= {
this:taskDict allValues.
}
}
/task/:id {
|= {
this:taskDict at:id .
}
=| {
this:taskDict at:id put:newValue.
self persist.
}
}
-<void>persist {
source:tasks := this:taskDict allValues.
}
}.
todo := #todo{ #taskList: taskList }.
json := #MPWJSONConverterStore{ #up: true, #class: class:Task }.
json → todo → #MPWJSONConverterStore{ #class: class:Task } → ref:file:/tmp/tasks/ asScheme.
server := #MPWSchemeHttpServer{ #scheme: json, #port: 8082 }.
server start.
shell runInteractiveLoop.
</pre>
</code>
<hr>
The <code>writeOnJSONStream:</code> method is currently still needed by the serializer to encode the task object as JSON. The parser doesn't need
any support, it can figure things out by itself for simple mappings. Yes, this makes no sense, as serializing is easier than parsing, but I
haven't gotten around to the automation for serializing yet.<p>
<h3>Analysis</h3>
So there you have it, an almost functional Todo backend, in refreshingly little code, and with refreshingly little magic.
What I find particularly pleasing is that this conciseness can be achieved while keeping the architecture fully visible
and maintaining a hexagonal/ports-and-adapters style.<p>
What is the architecture of this app? It says so right at the end: the server is parametrized by its scheme, and that scheme
is a JSON serializer hooked up to my todo scheme handler, hooked up to another JSON serializer hooked up to the directory <code>/tmp/tasks</code>.<p>
Although a Rails <a href="https://github.com/doerfli/todo-backend-rails5-api">app</a> contains comparably little code, this code is scattered over different classes and is only comprehensible as a
plugin to Rails. All the architecture is hidden inside Rails, it is not at all visible in the code and simply cannot be
divined from looking at the code. Although there are many reasons for this, one fundamental one is that Ruby is a call/return
language, and Rails does its best to translate from the REST architectural style to something that is more natural in the
call/return style. And it does an admirable job at it.<p>
I do think that this example gives us a little glimpse into what I believe to be the power of Architecture Oriented Programming: the
power and succinctness of frameworks, but with the simplicity, straightforwardness and reusability of more library-oriented styles.
<h3>Performance</h3>
I obviously couldn't resist benchmarking this, and to my great joy found that <a href="https://github.com/wg/wrk">wrk</a> now
works on the M1. Since the interpreter isn't thread safe, I had to restrict it to a single connection and thread. My
expectation was that requests/s would be in the double to low triple digits; my fear was that it would be single
digits. (The reason for that fear is the <code>writeOnJSONStream:</code> method that is called for every object
serialized and is in interpreted Objective-S, probably one of the slowest language implementations currently in existence).
To say I was surprised is an understatement. Stunned is more like it:
<hr>
<code>
<pre>
wrk -c 1 -t 1 http://localhost:8082/task/1
Running 10s test @ http://localhost:8082/task/1
1 threads and 1 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 133.62us 14.45us 0.97ms 98.52%
Req/Sec 7.50k 311.09 7.62k 99.01%
75326 requests in 10.10s, 12.28MB read
Requests/sec: 7458.60
Transfer/sec: 1.22MB
</pre>
</code>
<hr>
More than 7K requests per second! Those M1 Macs really are fast. I wonder what it will be once I remove the need for the manually
written <code>writeOnJSONStream:</code> method.<p>
(NOTE: previous version said >12K requests/s, which is even more insane, but was with an incorrect URL that had the server returning 404s)Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-57827083575077473352021-05-21T08:43:00.001+02:002021-05-21T08:45:44.720+02:00Why are there no return statements in Objective-S?My <a href="https://blog.metaobject.com/2021/05/a-far-too-simple-harcoded-tasks-backend.html">previous example</a> raised a <a href="https://twitter.com/longmucholove/status/1395365345561661442?s=21">question</a>: why no return statements? I am assuming this was about this part of the example:
<hr>
<code>
<pre>
-description { "Task: {this:title} done: {this:done}". }
</pre>
</code>
<hr>
The answer is that I would like to do without return statements wherever I can. We will see
how far that goes. In general, I am in favor of <a href="https://en.wikipedia.org/wiki/Expression-oriented_programming_language">expression-orientation</a> in programming languages. A simple example is if-statements
vs. conditional expressions. In most languages today, like C, Objective-C and Swift, <code>if</code> is a statement.
That means I write something as follows:
<hr>
<code>
<pre>
if ( condition ) {
do something if true
} else {
do something different if false
}
</pre>
</code>
<hr>
This seems obvious and is general, but very often you don't want to <em>do</em> arbitrary stuff; you just want
a variable to have one value in one case and a different value in another.
<hr>
<code>
<pre>
int foo;
if ( condition ) {
foo = 1;
} else {
foo = 42;
}
</pre>
</code>
<hr>
In that case, it is annoying that the <code>if</code> is defined to be a statement and not an expression, because
you can't just write the following:
<hr>
<code>
<pre>
int foo;
foo = if ( condition ) { 1; } else { 42; }
</pre>
</code>
<hr>
In addition, as hinted at in the previous examples, you can't use a statement to initialize a variable; that
definitely has to be an expression. Which is why C and many derived languages have the "ternary" operator
(?:), which is really just an if/else in expression form.
<hr>
<code>
<pre>
int foo=condition ? 1 : 42;
</pre>
</code>
<hr>
That solves the problem, but now you have two conditionals. Why not have just one? LISP and most of the
FP languages, as well as Smalltalk and Objective-S, have an <code>if</code> that returns a value.
<hr>
<code>
<pre>
a := condition ifTrue:{ 1. } ifFalse:{ 42. }.
</pre>
</code>
<hr>
So that's why expression-orientation is useful in general. What about methods? The same general idea
applies. Whereas in Java, for example, a read accessor is called <code>getX()</code>, indicating
an action that is performed ("get the value of x"), in Objective-C, Smalltalk, and Objective-S it is
just called <code>x</code> ("the value of x").<p>
The same idea applies to dropping return statements where possible. It's not "get me the description
of this object", it is "the description of this object is...". And inside the method, it's not "this
statement now returns the following string as the description", but, again, "the description is...".<p>
Describing things that are, rather than actions to perform, is at the heart of Objective-S, as
discussed in <a href="https://2020.programming-conference.org/details/salon-2020-papers/5/Can-Programmers-Escape-the-Gentle-Tyranny-of-call-return-">Can Programmers Escape the Gentle Tyranny of Call/Return</a>.<p>
As Guy Steele <a href="https://dreamsongs.com/ObjectsHaveNotFailedNarr.html">put it</a>:
<blockquote cite="https://dreamsongs.com/ObjectsHaveNotFailedNarr.html">
Another weakness of procedural and functional programming is that their viewpoint assumes a process by which "inputs" are transformed into "outputs"; there is equal concern for correctness and for termination (and proofs thereof). But as we have connected millions of computers to form the Internet and the World Wide Web, as we have caused large independent sets of state to interact–I am speaking of databases, automated sensors, mobile devices, and (most of all) people–in this highly interactive, distributed setting, the procedural and functional models have failed, another reason why objects have become the dominant model. Ongoing behavior, not completion, is now of primary interest. Indeed, object-oriented programming had its origins in efforts to simulate the ongoing behavior of interacting real-world entities–thus the programming language SIMULA was born.
</blockquote>
So wherever possible, Objective-S tries to push towards expressing things as statically as possible, pushing
away from action-orientation. For example, hooking up a timed source to a pin:
<hr>
<code>
<pre>
#Blinker{ #seconds: 1, #active: true} → ref:gpio:17.
</pre>
</code>
<hr>
instead of executing a loop:
<hr>
<code>
<pre>
while True:
    GPIO.output(17, True)
    sleep(1)
    GPIO.output(17, False)
    sleep(1)
</pre>
</code>
<hr>
The same goes for many other relationships: instead of writing
procedural code that initiates and/or maintains the relationship, with the
actual relationship remaining implicit, describe the actual relationship,
make <em>that</em> explicit, and keep the procedural code that maintains
it as a hidden implementation detail.<p>
If the <code>return</code> statement comes back, and it very well might, I am hoping
it will be in a slightly more general form. I recall Smalltalk's "^" being described
as "send back". I've already taken that and generalised it to mean "send result", using it
in filter definitions, where "^" means "send a result to the next filter in the pipeline".
It is needed there because filters are not limited to sending a single result, they can
send zero or many.<p>
With those more general semantics, "^" might also be used to send back results to the
sender of an asynchronous message, which is obviously quite different from a "return".<p>
And of course it would be useful for early returns, which are currently not possible.<p>
<h3>What about void methods?</h3>
Objective-S does have void methods; after all, its procedural part is essentially identical
to Objective-C, which also has them. However, I agree with the FP folk that functions
(procedures, methods) should be as (side-)effect-free as possible, and void methods by
definition are effectful (or no-ops).<p>
So where do the effects go? Two places:
<ol>
<li>The left hand side of the "←". <p> In most current programming languages, assignment
is severely crippled, and therefore not really useful for generalised effects. With
Polymorphic Identifiers and Storage Combinators, there is enough expressive power and
ability to abstract that we should need far fewer void methods.</li>
<li>Connecting via "→"<p>Much of the need for effectful methods in OO is for
constructing and connecting objects. In Objective-S, you don't need to call
methods that result in a connection being established as a side effect of
munging on some state, you define connections between objects directly using "→".<p>
Well, and you define objects using object literals such as <code>#Blinker{ #seconds: 1, #active: true}</code> instead of
setting instance variables procedurally.
</li>
</ol>
That's the plan, anyway, and a lot of it is coming true at the moment. Exciting times! (And it's one of the reasons I haven't been blogging all that much.)
Marcel Weiher<p>
<hr>
<h3>A far too simple (hardcoded) tasks backend in Objective-S (2021-05-20)</h3>
Recently there was a question as to what one should use to create a backend for an iOS/macOS app these days. I couldn't
resist mentioning Objective-S, and just to check for myself whether that's feasible, I quickly jotted down the following
tiny backend that returns a hardcoded list of tasks via HTTP:
<hr>
<code>
<pre>
#!env stsh
framework:ObjectiveHTTPD load.
class Task {
    var <bool> done.
    var title.
    -description { "Task: {this:title} done: {this:done}". }
}
taskList ← #( #Task{ #title: 'Clean my room', #done: false }, #Task{ #title: 'Check twitter feed', #done: true } ).
scheme todo {
    var taskList.
    /tasks {
        |= {
            this:taskList.
        }
    }
}.
todo := #todo{ #taskList: taskList }.
server := #MPWSchemeHttpServer{ #scheme: todo, #port: 8082 }.
server start.
shell runInteractiveLoop.
</pre>
</code>
<hr>
After loading the HTTP framework, we define a <code>Task</code> and a list of two example tasks. Then we define
a scheme with a single path, just <code>/tasks</code>, which returns said tasks list. We then instantiate the
scheme and serve it via HTTP on port 8082. Since this is a shell script and starting the server does not block,
we finally start up the REPL.<p>
Details such as coding the tasks as JSON, accessing a single task and modifying tasks are left as exercises
for the reader.<p>
<hr>
<h3>Talking to pins (2021-05-09)</h3>
The last few weeks, I spent a little time getting <a href="http://objective.st">Objective-S</a> working well on the
Raspberry Pi, specifically my Pi400. It's a really wonderful little machine, and the form factor and price remind
me very much of the early personal computers.<p>
What's missing, IMHO, is an experience akin to the early BASICs. And I really mean "akin", not a nostalgia project, but
recovering a real quality that has been lost: not really "simplicity",
more "straightforwardness".<p>
Of course, one of the really cool things about the Pi is its GPIO interface that lets you do all sorts of electronics
experiments, and I hear that the equivalent of "Hello World" for the Raspi is making an LED blink.
<hr>
<blockquote>
<code>
<pre>
import RPi.GPIO as GPIO
from time import sleep
GPIO.setwarnings(False)
GPIO.setmode(GPIO.BCM)
GPIO.setup(17, GPIO.OUT)
while True:
    GPIO.output(17, True)
    sleep(1)
    GPIO.output(17, False)
    sleep(1)
</pre>
</code>
</blockquote>
<hr>
Hmm. That's a <em>lot</em> of <a href="https://blog.metaobject.com/2009/01/semantic-noise.html">semantic noise</a> for
something so conceptually simple. All we want to do is set the value of a pin. As soon as I saw this,
I knew it would be ideal for <a href="https://www.hpi.uni-potsdam.de/hirschfeld/publications/media/WeiherHirschfeld_2013_PolymorphicIdentifiersUniformResourceAccessInObjectiveSmalltalk_AcmDL.pdf">Polymorphic Identifiers</a>, because a pin is the ultimate state, and PIs and their stores are made for abstracting over state.<p>
Of course, I first had to get Objective-S running on the Pi, which meant getting <a href="http://gnustep.org">GNUstep</a> to run. While
there is a wonderful set of <a href="https://github.com/plaurent/gnustep-build">install scripts</a>, the one for the
Raspi only worked with an ancient clang version and libobjc 1.9. Alas, that version
has some bugs on the Raspi, for example with the <code>imp_implementationWithBlock()</code> runtime function
that Objective-S uses to define methods.<p>
Long story short, after learning about GNUstep installs and waiting for the wonderful David Chisnall to remove
some obsolete 32 bit exception-version detection code from libobjc, we now have a script that installs current
GNUstep with a reasonably current clang: <a href="https://github.com/plaurent/gnustep-build/tree/master/raspbian-10-clang-9.0-runtime-2.1-ARM">https://github.com/plaurent/gnustep-build/tree/master/raspbian-10-clang-9.0-runtime-2.1-ARM</a>.
With that in hand, and after a few bug fixes in MPWFoundation and Objective-S, I could add a really rudimentary
<a href="https://github.com/mpw/Objective-Smalltalk/blob/master/raspi/MPWBCMStore.m">Store</a> that manages talking to the pins. And this allows me to write the following in an interactive shell to drive the customary GPIO pin 17 that I
connected to the LED via a resistor:
<hr>
<blockquote>
<code>
<pre>
gpio:17 ← 1.
</pre>
</code>
</blockquote>
<hr>
Now that's what I am talking about!<p>
Of course, we're supposed to make it blink, not just turn it on. We could use the same looping approach
as the Python script, or convenience methods like the ones provided, but the breadboard and pins
make me think of wanting to connect components to do the job instead.<p>
So let's connect some components, software architecture style! The following <a href="https://github.com/mpw/Objective-Smalltalk/blob/master/raspi/blink.st">script</a> creates an instance of
a <code>Blinker</code> object (using an <a href="http://objective.st/Language/">object literal</a>), which emits alternating ones and zeros and connects it to the pin.
<hr>
<blockquote>
<code>
<pre>
blinker ← #Blinker{ #seconds: 1 }.
blinker → ref:gpio:17.
blinker run.
gpio:17 ← 0.
</pre>
</code>
</blockquote>
<hr>
Once connected, it tells the blinker to start running, which creates an <code>NSTimer</code>, adds it to the
current run loop, and then runs the run loop. That run is interruptible, so Ctrl-C breaks out and runs the
cleanup code.<p>
What about setting up the pin for output? Happens automatically when you first output to it, but I will add
code so you can do it manually.<p>
Where does the Blinker come from? That's actually an <em>object-template</em> based on an <a href="https://github.com/mpw/MPWFoundation/blob/master/Streams.subproj/MPWFixedValueSource.m">MPWFixedValueSource</a>.
<hr>
<blockquote>
<code>
<pre>
object Blinker : #MPWFixedValueSource{ #values: #(0,1) }
</pre>
</code>
</blockquote>
<hr>
You can, of course, hook up a fixed-value source to any kind of stream.<p>
While getting here took a lot of work, and resulted in me (re-)learning a lot about GNUstep, the result,
even this intermediate one, is completely worth it and makes me very happy. This stuff really works
even better than I thought it would.<p>
<hr>
<h3>M1 Memory and Performance (2020-11-13)</h3>
The M1 Macs are out now, and not only does Apple claim they're absolutely smokin', early benchmarks
seem to confirm those claims. I don't find this surprising, Apple has been highly focused on
performance ever since Tiger, and as far as I can tell hasn't let up since.<p>
One maybe somewhat surprising aspect of the M1s is the limitation to "only"
16 Gigabytes of memory. As someone who bought a 16 <em>Kilo</em>byte language card to run the Merlin
6502 assembler on his Apple ][+ and expanded his NeXT cube, which isn't <em>that</em> different from
a modern Mac, to a
whopping 16 <em>Mega</em>bytes, this doesn't actually seem that much of a limitation, but it did
cause a bit of consternation.<p>
I have a bit of a theory as to how this "limitation" might tie in to how Apple's outside-the-box
approach to memory and performance has contributed to the remarkable achievement that is the M1.<p>
The M1 is apparently a multi-die package that contains both the actual processor die and the
DRAM. As such, it has a very high-speed interface between the DRAM and the processors.
This high-speed interface, in addition to the absolutely humongous caches, is key to keeping the various functional
units fed. Memory bandwidth and latency are probably <em>the</em> determining factors for many
of today's workloads, with a single access to main memory taking easily hundreds of clock cycles
and the CPU capable of doing a good number of operations in each of these clock cycles.
As Andrew Black <a href="http://web.cecs.pdx.edu/~black/publications/O-JDahl.pdf">wrote</a>: "[..] computation is essentially free, because it happens 'in the cracks' between data fetch and data store; ..".<p>
The tradeoff is that you can only fit so much DRAM in that package for now, but if it fits,
it's going to be super fast.<p>
So how do we make sure it all fits? Well, where Apple might have been "focused" on performance
for the last 15 years or so, they have been completely <em>anal</em> about memory consumption.
When I was there, we were fixing 32 <em>byte</em> memory leaks. Leaks that happened <em>once</em>.
So not an ongoing consumption of 32 bytes again and again, but a one-time leak of 32 bytes.<p>
That dedication verging on the obsessive is one of the reasons iPhones have been besting
top-of-the-line Android phones that have twice the memory. And not by a little, either.<p>
Another reason is the iOS team's steadfast refusal to adopt tracing garbage collection as
most of the rest of the industry did,
and macOS's later abandonment of that technology in favor of the reference counting (RC) they've
been using since NeXTStep 4.0. With increased automation of those reference counting operations
and the addition of weak references, the convenience level for developers is essentially
indistinguishable from a tracing GC now.<p>
The benefit of sticking to RC is much-reduced memory consumption. It <a href="https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf">turns out</a> that for
a tracing GC to achieve performance comparable with manual allocation, it needs several
times the memory (different studies find different overheads, but at least 4x is a conservative
lower bound). While I haven't seen a study comparing RC, my personal experience is that the
overhead is much lower, much more predictable, and can usually be driven down with little
additional effort if needed.<p>
So Apple can afford to live with more "limited" total memory because they need much less
memory for the system to be fast. And so they can do a system design that imposes this
limitation, but allows them to make that memory wicked fast. <em>Nice</em>.<p>
Another "well-known" limitation of RC that has made it the second choice compared to tracing
GC is the fact that updating those reference counts all the time is expensive, particularly
in a multi-threaded environment where those updates need to be atomic. Well...<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">fun fact: retaining and releasing an NSObject takes ~30 nanoseconds on current gen Intel, and ~6.5 nanoseconds on an M1</p>— David Smith (@Catfish_Man) <a href="https://twitter.com/Catfish_Man/status/1326238434235568128?ref_src=twsrc%5Etfw">November 10, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
How?
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">We got that working on x86-64 too :) this further improvement is because uncontended acquire-release atomics are about the same speed as regular load/store on A14</p>— David Smith (@Catfish_Man) <a href="https://twitter.com/Catfish_Man/status/1326298205034696705?ref_src=twsrc%5Etfw">November 10, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
Problem solved. I guess it helps if you can make your own Silicon ;-)<p>
So Apple's focus on keeping memory consumption under control, which includes but is not limited
to going all-in on reference counting where pretty much the rest of the industry has adopted
tracing garbage collection, is now paying off in a major way ("bigly"? Too soon?). They can get away
with putting less memory in the system, which makes it possible to make that memory really fast.
And that locks in an advantage that'll be hard to duplicate.<p>
It also means that native development will have a bigger advantage compared to web technologies,
because native apps benefit from the speed and don't have a problem with the memory limitations,
whereas web-/electron apps will fill up that memory much more quickly.<p>
<hr>
<h3>Pointers are Easy, Optimization is Complicated (2020-09-15)</h3>
I just recently came across Ralf Jung's 2018 post titled <a href="https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html">Pointers are Complicated</a>. The central thesis is that the model that most C (and
assembly language) programmers have that a pointer is just an integer that happens to be a machine
address is wrong, in fact the author flat out states: "Pointers are definitely not integers."<p>
That's a strong statement. I like strong statements, because they make a discussion possible. So
let's respond in kind: the claim that pointers are <em>definitely</em> not integers is wrong.<p>
<h3>The example</h3>
The example the author uses to show that pointers are definitely not integers is the following:
<hr>
<figure class="highlight"><pre><code class="language-c--" data-lang="c++"><span class="kt">int</span> <span class="nf">test</span><span class="p">()</span> <span class="p">{</span>
<span class="k">auto</span> <span class="n">x</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="mi">8</span><span class="p">];</span>
<span class="k">auto</span> <span class="n">y</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="mi">8</span><span class="p">];</span>
<span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">42</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="cm">/* some side-effect-free computation */</span><span class="p">;</span>
<span class="k">auto</span> <span class="n">x_ptr</span> <span class="o">=</span> <span class="o">&</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
<span class="o">*</span><span class="n">x_ptr</span> <span class="o">=</span> <span class="mi">23</span><span class="p">;</span>
<span class="k">return</span> <span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="p">}</span></code></pre></figure>
<hr>
And this is the crux of the reasoning:
<blockquote>
It would be beneficial to be able to optimize the final read of y[0] to just return 42. The justification for this optimization is that writing to x_ptr, which points into x, cannot change y.
</blockquote>
So pointers are "hard" and "not integers" because they conflict with this optimization that "would be
beneficial".<p>
I find this fascinating: a "nice to have" optimization is so obviously more important than a simple
and obvious pointer model that it doesn't even need to be explained as a possible tradeoff, never
mind justified as to why the tradeoff is resolved in favor of the nice-to-have optimization.<p>
I prefer the simple and obvious pointer model. Vastly.<p>
This way of placing the optimizer's concerns far ahead of the programmer's is not unique, if
you check out Chris Lattner's <a href="https://blog.llvm.org/posts/2011-05-13-what-every-c-programmer-should-know/">What Every C Programmer Should Know About Undefined Behavior</a>, you will note
the frequent occurrence of the phrase "enables ... optimizations". It's pretty much the only
justification ever given.<p>
I call this now industry-dominating style of programming <em>Compiler Optimizer Creator Oriented
Programming</em> (COCOP). It was thoroughly critiqued in <a href="http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf">What every compiler writer should know about programmers or “Optimization” based on undefined behaviour hurts performance (pdf)</a>.<p>
<h3>Pointers as Integers</h3>
There are certainly machines where pointers are not integers, the most prominent being 8086/80286 16 bit
segmented mode, where a (far) pointer consists of a segment and an offset. On 8086, the segment
is simply shifted left 4 bits and added to the offset, on 80286 the segment can be located anywhere
in memory or not be resident, implementing a segmented virtual memory. AFAIK, these modes are
simplified variants of the iAPX 432 object memory model.<p>
What's important to note in this context is that the iAPX 432 and its memory model failed horribly,
and industry actively and happily moved away from the x86 segmented model to what is called a
"flat address space", common on other architectures and finally also adopted by Intel with the
386.<p>
The salient feature of a "flat address space" is that a pointer is an integer, and in fact this
equivalence is also rather influential on CPU architecture, with address space almost universally
tied to the CPU's integer size. So although the 68K was billed as a 16 bit CPU (or 16/32), its
registers were actually 32 bits, and IIRC its address ALUs were fully 32 bit, so if you wanted
to do some kinds of 32 bit arithmetic, the LEA (Load Effective Address) instruction was your
friend. The reason for the segmented architecture on the 8086 was that it was a true 16 bit
machine, with 16 bit registers, but Intel wanted to have a 20 bit address space.<p>
So not only was and is there an equivalence of pointers and integers, this state of affairs
was one that was actively sought and joyously received once we achieved it again. Giving
it up for nice-to-have optimizations seems at <em>best</em> debatable, but at the very
least it is something that should be discussed/debated, rather than simply assumed away.<p>
<hr>
<h3>Beyond Faster JSON Support for iOS/macOS, Part 9: CSV and SQLite (2020-06-21)</h3>
When looking at the <code>MPWPlistStreaming</code> protocol that I've been using for my
JSON parsing series, one thing that was probably noticeable is that it isn't particularly
JSON-focused. In fact, it wasn't even initially designed for parsing, but for generating.<p>
So could we use this for other de-serialization tasks? Glad you asked!<p>
<h3>CSV parsing</h3>
One of the examples in my performance book involves parsing Comma Separated Values
quickly, within the context of getting the time to convert a 139Mb
<a href="https://en.wikipedia.org/wiki/General_Transit_Feed_Specification">GTFS</a> file
to something usable on the phone down from 20 minutes using
CoreData/SQLite to slightly less than a second using custom in-memory data structures
that are also several orders of magnitude faster to query on-device.<p>
<iframe width="640" height="360" src="https://www.youtube-nocookie.com/embed/kHG_zw75SjE?start=1273" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
The original project's CSV parser took around 18 seconds, which wasn't a significant
part of the 20 minutes, but when the rest only took a couple of hundred milliseconds,
it was time to make that part faster as well. The result, slightly generalized,
is <code>MPWDelimitedTable</code> ( <a href="https://github.com/mpw/MPWFoundation/blob/master/Collections.subproj/MPWDelimitedTable.h">.h</a> <a href="https://github.com/mpw/MPWFoundation/blob/master/Collections.subproj/MPWDelimitedTable.m">.m</a> ).<p>
The basic interface is block-based, with the block being called for every row in the
table, called with a dictionary composed of the header row as keys and the contents
of the row as values.<p>
<hr>
<code>
<pre>
-(void)do:(void(^)(NSDictionary* theDict, int anIndex))block;
</pre>
</code>
<hr>
Adapting this to the <code>MPWPlistStreaming</code> protocol is straightforward:
<hr>
<code>
<pre>
-(void)writeOnBuilder:(id <MPWPlistStreaming>)builder
{
    [builder beginArray];
    [self do:^(NSDictionary* theDict, int anIndex){
        [builder beginDictionary];
        for (NSString *key in self.headerKeys) {
            [builder writeObject:theDict[key] forKey:key];
        }
        [builder endDictionary];
    }];
    [builder endArray];
}
</pre>
</code>
<hr>
This is a quick-and-dirty implementation based on the existing API that is clearly
sub-optimal: the API we call first constructs a dictionary from the row and the
header keys and then we iterate over it. However, it works with our existing set
of builders and doesn't build an in-memory representation of the entire CSV.<p>
It will also be relatively straightforward to invert this API usage, modifying the
low-level API to use <code>MPWPlistStreaming</code> and then creating a higher-level
block- and dictionary-based API on top of that, in a way that will also work with
other <code>MPWPlistStreaming</code>
clients.
<h3>SQLite</h3>
Another tabular data format is SQL data bases. On macOS/iOS, one very common database
is SQLite, usually accessed via CoreData or the excellent and much more light-weight
<a href="https://github.com/ccgus/fmdb">fmdb</a>.<p>
Having used fmdb myself before, and being quite delighted with it, my first impulse was
to write a <code>MPWPlistStreaming</code> adapter for it, but after looking at the code
a bit more closely, it seemed that it was doing quite a bit that I would not need for
<code>MPWPlistStreaming</code>.<p>
I also think I saw the same trade-off there between a convenient but slow API based
on <code>NSDictionary</code> and a much more complex but potentially faster API
based on pulling individual typed values.<p>
So instead I decided to try and do something ultra simple that sits directly on
top of the SQLite C API, and the implementation is really quite simple and
compact:<p>
<hr>
<code>
<pre>
@interface MPWStreamQLite()

@property (nonatomic, strong) NSString *databasePath;

@end

@implementation MPWStreamQLite
{
    sqlite3 *db;
}

-(instancetype)initWithPath:(NSString*)newpath
{
    self=[super init];
    self.databasePath = newpath;
    return self;
}

-(int)exec:(NSString*)sql
{
    sqlite3_stmt *res;
    int rc = sqlite3_prepare_v2(db, [sql UTF8String], -1, &res, 0);
    @autoreleasepool {
        [self.builder beginArray];
        int step;
        int numCols=sqlite3_column_count(res);
        NSString* keys[numCols];
        for (int i=0; i < numCols; i++) {
            keys[i]=@(sqlite3_column_name(res, i));
        }
        while ( SQLITE_ROW == (step = sqlite3_step(res))) {
            @autoreleasepool {
                [self.builder beginDictionary];
                for (int i=0; i < numCols; i++) {
                    const char *text=(const char*)sqlite3_column_text(res, i);
                    if (text) {
                        [self.builder writeObject:@(text) forKey:keys[i]];
                    }
                }
                [self.builder endDictionary];
            }
        }
        sqlite3_finalize(res);
        [self.builder endArray];
    }
    return rc;
}

-(int)open
{
    return sqlite3_open([self.databasePath UTF8String], &db);
}

-(void)close
{
    if (db) {
        sqlite3_close(db);
        db=NULL;
    }
}
</pre>
</code>
<hr>
Of course, this doesn't do a lot, chiefly it only reads, no updates, inserts or deletes.
However, the code is striking in its brevity and simplicity, while at the same time
being both convenient and fast, though with still some room for improvement.<p>
In my experience, you tend to not get all three of these properties at the same time:
code that is simple and convenient tends to be slow, code that is convenient and
fast tends to be rather tricky and code that's simple and fast tends to be inconvenient
to use.<p>
How easy to use is it? The following code turns a table into an array of dictionaries:
<hr>
<code>
<pre>
#import <MPWFoundation/MPWFoundation.h>
int main(int argc, char* argv[]) {
    MPWStreamQLite *db=[[MPWStreamQLite alloc] initWithPath:@"chinook.db"];
    db.builder = [MPWPListBuilder new];
    if( [db open] == 0 ) {
        [db exec:@"select * from artists;"];
        NSLog(@"results: %@",[db.builder result]);
        [db close];
    } else {
        NSLog(@"Can't open database: %s\n", [db error]);
    }
    return(0);
}
</pre>
</code>
<hr>
This is pretty good, but probably roughly par for the course for returning a generic
data structure such as array of dictionaries, which is not going to be particularly
efficient. (One of my first clues that CoreData's predecessor EOF wasn't particularly
fast was when I read that fetching raw dictionaries was an optimization, much faster than
fetching objects.)<p>
What if we want to get objects instead? Easy, just replace the <code>MPWPListBuilder</code>
with an <code>MPWObjectBuilder</code>, parametrized with the class to create. Well, and
define the class, but presumably you already have that if the task is to convert to
objects of that class. And it could obviously also be automated.<p>
<hr>
<code>
<pre>
#import <MPWFoundation/MPWFoundation.h>
@interface Artist : NSObject { }

@property (assign) long ArtistId;
@property (nonatomic,strong) NSString *Name;

@end

@implementation Artist

-(NSString*)description
{
    return [NSString stringWithFormat:@"<%@:%p id: %ld name: %@>",[self class],self,self.ArtistId,self.Name];
}

@end

int main(int argc, char* argv[]) {
    MPWStreamQLite *db=[[MPWStreamQLite alloc] initWithPath:@"chinook.db"];
    db.builder = [[MPWObjectBuilder alloc] initWithClass:[Artist class]];
    if( [db open] == 0) {
        [db exec:@"select * from artists"];
        NSLog(@"results: %@",[db.builder result]);
        [db close];
    } else {
        NSLog(@"Can't open database: %s\n", [db error]);
    }
    return(0);
}
</pre>
</code>
<hr>
Note that this does <em>not</em> generate a plist representation as an intermediate
step, it goes straight from database result sets to objects. The generic intermediate
"format" is the <code>MPWPlistStreaming</code> protocol, which is a dematerialized
representation, both plist and objects are peers.<p>
<h3>TOC</h3>
<a href="https://blog.metaobject.com/2020/04/somewhat-less-lethargic-json-support.html">Somewhat Less Lethargic JSON Support for iOS/macOS, Part 1: The Status Quo</a><br>
<a href="https://blog.metaobject.com/2020/04/somewhat-less-lethargic-json-support_12.html">Somewhat Less Lethargic JSON Support for iOS/macOS, Part 2: Analysis</a><br>
<a href="https://blog.metaobject.com/2020/04/somewhat-less-lethargic-json-support_14.html">Somewhat Less Lethargic JSON Support for iOS/macOS, Part 3: Dematerialization</a><br>
<a href="https://blog.metaobject.com/2020/04/equally-lethargic-json-support-for.html">Equally Lethargic JSON Support for iOS/macOS, Part 4: Our Keys are Small but Legion</a><br>
<a href="https://blog.metaobject.com/2020/04/less-lethargic-json-support-for.html">Less Lethargic JSON Support for iOS/macOS, Part 5: Cutting out the Middleman</a><br>
<a href="https://blog.metaobject.com/2020/04/somewhat-faster-json-support-for.html">Somewhat Faster JSON Support for iOS/macOS, Part 6: Cutting KVC out of the Loop</a><br>
<a href="https://blog.metaobject.com/2020/04/faster-json-support-for-iosmacos-part-7.html">Faster JSON Support for iOS/macOS, Part 7: Polishing the Parser</a><br>
<a href="https://blog.metaobject.com/2020/04/faster-json-support-for-iosmacos-part-8.html">Faster JSON Support for iOS/macOS, Part 8: Dematerialize All the Things!</a><br>
<a href="https://blog.metaobject.com/2020/06/beyond-faster-json-support-for-iosmacos.html">Beyond Faster JSON Support for iOS/macOS, Part 9: CSV and SQLite</a><br>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-6773580767725802862020-06-14T14:05:00.001+02:002020-06-14T17:13:02.241+02:00The Curious Case of Swift's Adoption of Smalltalk Keyword SyntaxI was really surprised to learn that Swift recently adopted Smalltalk keyword syntax: <a href="https://forums.swift.org/t/accepted-se-0279-multiple-trailing-closures/36141">[Accepted] SE-0279: Multiple Trailing Closures</a>. That is: a keyword terminated by a colon, followed by an argument and without
any surrounding braces.<p>
The mind boggles.<p>
A little.<p>
Of course, Swift wouldn't be Swift if this weren't a special case of a special case, specifically
the case of <em>multiple</em> trailing closures, which is a special case of <em>trailing closures</em>,
which are weird and special-casey enough by themselves. Below is an example:<p>
<hr>
<code>
<pre>
UIView.animate(withDuration: 0.3) {
    self.view.alpha = 0
} completion: { _ in
    self.view.removeFromSuperview()
}
</pre>
</code>
<hr>
Note how the arguments to <code>animate()</code> would seem to terminate at the closing parenthesis,
but that's actually not the case. The curly braces after the closing paren start a closure that is
actually also an argument to the method, a so-called trailing closure. I have a little bit of
sympathy for this construct, because closures inside of the parentheses look really, really
awkward. (Of course, <em>all</em> params apart from a sole <code>x</code> inside <code>f(x)</code>
look awkward, but let's not quibble. For now.)<p>
Another thing this enables is methods that reasonably resemble control structures, which I heard
is a really great idea.<p>
The problem is that sometimes you have more than one closure argument, and then just stacking them
up behind what appears to be end of the function/method call gets really, really awkward, and
you can't tell which block is which argument, because the trailing closure doesn't get a keyword.<p>
Well, now it does. And we now have 4 different method syntaxes in one!<p>
<ol>
<li>Traditional C/Pascal/C++/Java function call syntax <code>x.f()</code></li>
<li>The already weird-ish addition of Smalltalk/Objective-C keywords inside the <code>f(x)</code> syntax: <code>f(arg:x)</code></li>
<li>Original trailing-closure syntax, which is just its own thing, for the first closure</li>
<li>Smalltalk non-bracketed keyword syntax for the 2nd and subsequent closures.</li>
</ol>
That is impressive, in a scary kind of way.
<blockquote>
Swift is a crescendo of special cases stopping just short of the general; the result is complexity in the semantics, complexity in the behaviour (i.e. bugs), and complexity in use (i.e. workarounds).
<footer>
— <cite><a href="https://www.quora.com/Which-features-overcomplicate-Swift-What-should-be-removed?share=1">Which features overcomplicate Swift, Rob Rix</a></cite>
</footer>
</blockquote>
I understand that this proposal was quite controversial, with heated discussion between opponents
and proponents. I understand and sympathize with both sides. On the one hand, this <em>is</em>
markedly better than the alternatives. On the other hand, it is a special case of a special case that
is difficult to justify as an addition to all that is already there.<p>
Special cases beget special cases beget special cases.<p>
Of course the answer was always there: Smalltalk keyword syntax is not just the only reasonable
solution in this case, it also solves all the other cases. It is the general solution. Here's
how this could look in Objective-Smalltalk (which uses curly braces for closures instead
of Smalltalk-80's square brackets):<p>
<hr>
<code>
<pre>
UIView animate:{ self.view.alpha ← 0. } withDuration:0.3 completion:{ self view removeFromSuperview. }.
</pre>
</code>
<hr>
No special cases, every argument is labeled, no syntax mush of brackets inside parentheses etc.
And yes, this also handles user-defined control structures, <code>to:do:</code> is just a
method on <code>NSNumber</code>:<p>
<hr>
<code>
<pre>
1 to:10 do:{:i | stdout println:"I will not introduce {i} special cases willy nilly.".}.
</pre>
</code>
<hr>
And since keywords naturally go between their arguments, there is no need for "operators",
as a very different and special syntax form. You just allow some "binary" keywords to look
a little different, so instead of <code>2 multiply:3</code> you can write <code>2 * 3</code>.
And when you have <code>2 raisedTo:3</code> instead of <code>pow(2,3)</code> (with the
signature: <code>func pow(_ x: Decimal, _ y: Int) -> Decimal</code>), do you
really need to go to the <a href="https://gist.github.com/steakknife/d629fa0c398f75682d00">trouble</a> of defining an "operator"?
<p>
Or Swift's <code>a as b</code>, another special kind of syntax. How about <code>a as:b</code>?
(Yes, I know there are details, but those are ... details.) And so on and so forth.<p>
But of course, it's too late now. When I chose Smalltalk as the base syntax for the language
that has turned into Objective-Smalltalk, it wasn't just because I like it or had gotten
used to it via Objective-C. Smalltalk's syntax is surprisingly flexible and general;
Smalltalk APIs look a lot like DSLs, without any of the tooling or other overheads.<p>
And that's the frustrating part: this stuff was and is available and well-known. At least
if you bother to look and/or ask. But instead, we just choose these things willy-nilly
and everybody has to suffer the consequences.<p>
UPDATE:<p>
I guess what I am trying to get at is that if you'd thought things through just a little bit, you
could have had almost the entire syntax of your language for the cost (complexity,
implementation size and brittleness, cognitive load, etc.) of this one special case of a special case.
And it would have been overall better to boot.<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com6tag:blogger.com,1999:blog-8397311766319215218.post-52459783032066042692020-06-01T13:07:00.001+02:002020-06-01T13:10:28.034+02:00MPWTest Only Tests FrameworksIt should be noted, if it wasn't obvious, that <a href="https://blog.metaobject.com/2020/05/mpwtest-reducing-test-friction-by-going.html">MPWTest</a> is <em>opinionated software</em>, meaning it
achieves some of its smoothness by gleefully embracing constraints that some might view as
potentially crippling limitations.<p>
Maybe the biggest of these constraints, mentioned in the previous post, is that MPWTest only tests
frameworks. This means that the following workflow is not supported out of the box:
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">The original OCUnit (that I wrote after <a href="https://twitter.com/KentBeck?ref_src=twsrc%5Etfw">@KentBeck</a>'s paper and was out at about the same time as JUnit) friction was not "high" IMHO: add the test framework, write a subclass of TestCase, launch your app with a -Test argument, results are logged in ProjectBuilder' console.</p>— Marco Scheurer (@phink0) <a href="https://twitter.com/phink0/status/1266803201447268352?ref_src=twsrc%5Etfw">May 30, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
The point being that this is a workflow I not just somewhat indifferently do not want, but rather
emphatically and actively <em>want to avoid</em>. Tests that are run (only?) when launching the
app are application tests. My perspective is that unit tests are an integral part of the
class. This may seem a subtle distinction, but subtle differences in something you do
constantly can have huge impacts. "Steter Tropfen höhlt den Stein." ("Constant dripping wears away the stone.")<p>
Another aspect is that launching the app for testing as a permanent and fixed part of your build
process seems highly annoying at best. Linker finishes, app pops up, runs for a couple of seconds,
shuts down again. I don't see that as viable. For testing to be integral and pervasive, it has
to be invisible when the tests succeed.<p>
The <a href="https://www.agilecoachjournal.com/wp-content/uploads/2014/01/AgileTestingPyramid1.jpg?189db0&189db0">testing pyramid</a> is helpful here: my contention is that you want to be
at the bottom of that pyramid, ideally <em>all</em> of the time. Realistically, you're
probably not going to get there, but you should push really, really hard, even making
sacrifices that appear to be unreasonable to achieve that goal.<p>
<h3>Framework-oriented programming</h3>
Only testing frameworks raises the question of how to test those parts of the application
not in frameworks. For me the answer is simple: there isn't any production code outside
of frameworks.<p>
None. Not the UI, not the application delegate. Only the auto-generated <code>main()</code>.<p>
The benefits of this approach are plentiful, the effort minimal. And if you think this is
an, er, eccentric position to take, the program you almost certainly use to create apps
for iOS/macOS etc. takes the same eccentric position: Xcode's main executable is 45K in size and
only contains a <code>main()</code> function and some Swift boilerplate.<p>
<img src="https://dl.dropbox.com/s/8osjcuto8jevqhk/xcode-main.png?dl=0" alt="" title="" border="0" width="500" height="" /><p>
If all your code is in frameworks, only testing frameworks is not a problem. That may seem
like a somewhat extreme case of sour grapes, with the arbitrary limitations of a one-off
unit testing framework driving major architectural decisions, but the causality is the
other way around: I embraced framework-oriented programming before and independently of
MPWTest.<p>
<h3>iOS</h3>
Another issue is iOS. Running a command-line tool that dynamically loads and tests frameworks
is at least tricky and may be impossible, so that approach currently does not work. My current
approach is that I view on-device and on-simulator tests as higher-up in the testing hierarchy:
they are more costly, less numerous and run less frequently.<p>
The vast majority of code lives in cross-platform frameworks (see: Ports and Adapters) and is
developed and tested primarily on macOS. I have found this to be much faster than
using the simulator or a device in day-to-day programming, and have used this "mac-first"
technique even on projects where we were using XCTest.<p>
Although not testing on the target platform may be seen as a problem, I have found
discrepancies to be between exceedingly rare and non-existent, with "normal" code
trending towards the latter. One of the few exceptions in the not-quite-so-normal
code that I sometimes create was the change of calling conventions on arm64, which
meant that plain method pointers (IMPs) no longer worked, but had to be cast to
the "correct" pointer type, only on device. Neither macOS nor the simulator
would show the problem.<p>
For that purpose, I hacked together a small iOS <a href="https://github.com/mpw/MPWFoundation/blob/master/TestMPWFoundation/AppDelegate.m">app</a> that runs the tests specified
in a plist in the app bundle. There is almost certainly a better way to handle this, but
I haven't had the cycles or motivation to look into it.<p>
<h3>How to approximate</h3>
So you can't or don't want to adopt MPWTest. That doesn't mean you can't get at least some
of the benefits of the approach. As a start, instead of using Cmd-B in Xcode to build, just
use Cmd-U instead. That's what I did when working on Wunderlist, where we used XCTest.<p>
Second, adopt framework-oriented programming and the Ports and Adapters style as much as possible.
Put all your code in frameworks, and as much as possible in cross-platform frameworks that you
can test/run on macOS, and even if you are developing exclusively for iOS, create a macOS target
for that framework. This makes using Cmd-U to build much less painful.<p>
Third, adhere to a strict 1:1 mapping between production classes and test classes, and place
your test classes in the same file as the class they are testing.<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Thanks for sharing! I do this all the time in Rust, where tests are at the bottom of the source. I love it. Some people complain that now we're mingling testing code and prod code, but I think that doesn't hold: we're also adding logs and assertions to our prod code.</p>— Benedikt Terhechte @ 🏠 (@terhechte) <a href="https://twitter.com/terhechte/status/1267380695896403968?ref_src=twsrc%5Etfw">June 1, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">With OCUnit you didn't have to but your tests in a separate bundle, target etc. You could have them at the bottom of your class file in a subclass of TestCase instead of a "(testing) category". The difference is not that big.</p>— Marco Scheurer (@phink0) <a href="https://twitter.com/phink0/status/1267079516603777024?ref_src=twsrc%5Etfw">May 31, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
My practical experience with both JUnit and XCTest on medium-sized projects does not
square with the assertion that the difference is not that big: you still have to
create these additional classes, they have to communicate with the class under test
(<code>self</code> in MPWTest), you have to track changes etc. And of course, you
have to know to configure and use the framework differently from the way it was
built, intended and documented. And what I've seen of OCUnit use was that the
tests were not co-located with the class, but in a separate part of the project.<p>
A final note is that the trick of interchangeably using the class as the test fixture is only
really possible in a language like Objective-C where classes are first-class objects.
It simply wouldn't be possible in Java. This is how the class can test itself, and the
tests become an integral part of the class, rather than something that's added
somewhere else.<p>
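The shape of this mechanism can be sketched even in C, with an explicit "class object" that publishes its own list of test functions and a generic runner that asks for them. All names here are invented for illustration; this is not MPWTest code:

```c
#include <stdio.h>
#include <string.h>

typedef void (*test_fn)(int *failures);

/* A "class object" that, like an MPWTest class answering +testSelectors,
   carries its own list of tests. */
typedef struct {
    const char *name;
    const test_fn *tests;
    int test_count;
} TestableClass;

static void test_addition(int *failures)
{
    if (2 + 2 != 4) (*failures)++;
}

static void test_strings(int *failures)
{
    if (strcmp("abc", "abc") != 0) (*failures)++;
}

static const test_fn arithmetic_tests[] = { test_addition, test_strings };
static const TestableClass Arithmetic = { "Arithmetic", arithmetic_tests, 2 };

/* The runner knows nothing about individual tests; it just asks the
   class object for them. */
static int run_tests(const TestableClass *cls)
{
    int failures = 0;
    for (int i = 0; i < cls->test_count; i++)
        cls->tests[i](&failures);
    printf("%s: %d failure(s)\n", cls->name, failures);
    return failures;
}
```

In Objective-C the struct and explicit registration are unnecessary: the runtime already makes every class a first-class object that the runner can enumerate and ask whether it responds to <code>+testSelectors</code>.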
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-44196906384742281592020-05-30T19:35:00.001+02:002020-05-31T08:59:55.964+02:00MPWTest: Reducing Test Friction by Going Beyond the xUnit ModelBy popular demand, a quick rundown of <a href="https://github.com/mpw/mpwtest">MPWTest</a> (“<em>The Simplest Testing Framework That Could Possibly Work</em>”), my own personal unit testing framework, and
how it makes TDD fast, fun, and frictionless.<p>
I created MPWTest because once I had been bitten by the TDD bug, I definitely did not
want to write software without TDD ever again, if I could help it. This was long before XCTest, and even its
precursor SenTestKit was at best in parallel development; I certainly
wasn't aware of it.<p>
It is a bit different, and the differences make it sufficiently better that I
much prefer it to the xUnit variants that I've worked with (JUnit, some SUnit, XCTest). All of these are
vastly better than not doing TDD, but they introduce significant amounts of
overhead, friction, that make the testing experience much more cumbersome
than it needs to be, and to me at least partly explains some of the antipathy
I see towards unit testing from developers.<p>
The attitude I see is that testing is like eating your vegetables, you know
it's supposed to be good for you and you do it, grudgingly, but it really
is rather annoying and the benefits are more something you know intellectually.<p>
For me with MPWTest, TDD is also still intellectually a Good Thing™, but also
viscerally <em>fun</em>, less like vegetables and more like tasty snacks, except
that those snacks are not just yummy, but also healthy. It helps me stay in
the flow and get things done.<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">I would second MPWTest.<br><br>Mind you, I don't know how it's normally done—but the fact that everything is in one file && part of the build process makes it WAY WAY WAY faster.<br><br>Given the advent of SwiftUI—it's like Xcode previews—but for tests.<br><br>1/</p>— 𝔾𝕦𝕤𝕥𝕒𝕧𝕠 𝕄𝕦𝕔𝕙𝕠 𝕃𝕠𝕧𝕖 👌🏻 (@LongMuchoLove) <a href="https://twitter.com/LongMuchoLove/status/1266085192931868678?ref_src=twsrc%5Etfw">May 28, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>
What it does is let me change code quickly and safely, the key to agile:
<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Contrary to what you may have read in the Agile literature, the key to agility is the ability to change code quickly and safely. And the key to that is the ability to re-test code quickly and effectively. Fast-running automated tests ("unit tests") are the key to agility.</p>— Jason Gorman (only, more indoors than usual) (@jasongorman) <a href="https://twitter.com/jasongorman/status/1251408890505420800?ref_src=twsrc%5Etfw">April 18, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
Here is how it works.<p>
<h3>Setup</h3>
First you need to build the <code>testlogger</code> binary of the <a href="https://github.com/mpw/mpwtest">MPWTest</a> project. I put mine in <code>/usr/local/bin</code> and forget about it. You can put it anywhere you like, but will have to adjust the paths
in what follows.<p>
Next, add a "Script" build phase to your (framework) project. MPWTest currently only
tests frameworks.<p>
<hr>
<code>
<pre>
tester=/usr/local/bin/testlogger
framework=${TARGET_BUILD_DIR}/${FULL_PRODUCT_NAME}
if [ -f ${tester} ] ; then
    $tester ${framework}
else
    echo "projectfile:0:1: warning: $tester or $framework not found, tests not run"
fi
</pre>
</code>
<hr>
The bottom of the Build Phases pane of your project should then look something roughly like the following:<p>
<img src="https://dl.dropbox.com/s/wmgvqsnc9fqen8z/test-phase.png?dl=0" alt="" title="" border="0" width="600" height="" />
<p>
There is no separate test bundle, no extra targets, nada. This may not seem such
a big deal when you have just a single target, but once you start having
a few frameworks, having an additional test target for each really starts to
add up. And it adds a decision point: should I really create an additional
test bundle for this project? Maybe I can just repurpose this existing one?<p>
<h3>Code</h3>
In the class to be tested, add the <code>+(NSArray*)testSelectors</code> method,
returning the list of tests to run/test methods to execute. Here is an example
from the JSON parser I've been writing about:<p>
<hr>
<code>
<pre>
+testSelectors
{
    return @[
        @"testParseJSONString",
        @"testParseSimpleJSONDict",
        @"testParseSimpleJSONArray",
        @"testParseLiterals",
        @"testParseNumbers",
        @"testParseGlossaryToDict",
        @"testDictAfterNumber",
        @"testEmptyElements",
        @"testStringEscapes",
        @"testUnicodeEscapes",
        @"testCommonStrings",
        @"testSpaceBeforeColon",
    ];
}
</pre>
</code>
<hr>
You could also determine these names automagically, but I prefer the explicit list
as part of the specification: these are the tests that should be run.
Otherwise it is too easy to just lose a test to editing mistakes and be
none the wiser for it.<p>
Then just implement a test, for example <code>testUnicodeEscapes</code>:
<hr>
<code>
<pre>
+(void)testUnicodeEscapes
{
    MPWMASONParser *parser=[MPWMASONParser parser];
    NSData *json=[self frameworkResource:@"unicodeescapes" category:@"json"];
    NSArray *array=[parser parsedData:json];
    NSString *first = [array objectAtIndex:0];
    INTEXPECT([first length],1,@"length of parsed unicode escaped string");
    INTEXPECT([first characterAtIndex:0], 0x1234, @"expected value");
    IDEXPECT([array objectAtIndex:1], @"\n", @"second is newline");
}
</pre>
</code>
<hr>
Yes, this is mostly old code. The macros do what you, er, expect: <code>INTEXPECT()</code> expects integer equality (or other scalars, to be honest), <code>IDEXPECT()</code> expects object equality. There are
also some conveniences for nil, not nil, true and false, as well as a specialized one
for floats that sets an acceptable range.<p>
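The mechanics behind such macros are simple preprocessor fare. Here is a hedged C sketch of what an <code>INTEXPECT()</code>-style macro might look like; this is illustrative only, not the actual MPWTest implementation (the real macros are Objective-C and take NSString messages):

```c
#include <stdio.h>

static int failures = 0;

/* Compare two integer expressions; on mismatch, emit a diagnostic in
   the "file:line: error: message" format that Xcode knows how to parse,
   and count the failure rather than aborting the run. */
#define INTEXPECT(actual, expected, msg) \
    do { \
        long _a = (long)(actual), _e = (long)(expected); \
        if (_a != _e) { \
            printf("%s:%d: error: %s expected %ld, got %ld\n", \
                   __FILE__, __LINE__, (msg), _e, _a); \
            failures++; \
        } \
    } while (0)
```

A passing expectation is silent; a failing one prints a line that an IDE can attribute to the exact file and line of the failed expectation, which is how the test results can show up like compiler errors.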
In theory, you can put these methods anywhere, but I tend to place them in a
<code>testing</code> category at the bottom of the file.<p>
<hr>
<code>
<pre>
...
@end
#import "DebugMacros.h"
@implementation MPWMASONParser(testing)
</pre>
</code>
<hr>
The <code>DebugMacros.h</code> header has the various <code>EXPECT()</code> macros.
The header is the only dependency in your code, you do not need to link anything.<p>
Even more than not having a separate test bundle, not having a separate test class
(-hierarchy) really simplifies things. A lot.<p>
First, there is no question as to where to find the tests for a particular class:
at the bottom of the file, just scroll down. Same for the class for some tests:
scroll up. I find this incredibly useful, because the tests serve as specification,
documentation and example code for the class.<p>
There is also no need to maintain parallel class hierarchies, which are widely
regarded as a fairly serious <a href="http://mikamantyla.eu/BadCodeSmellsTaxonomy.html">code-smell</a>, for the obvious reasons: the need to keep those hierarchies in sync along
with the problems once they do get out of sync, which they will, etc.<p>
<h3>Use</h3>
After the setup, you just build your projects, the tests will be run automatically
as part of the build. If there are test failures, they are reported by Xcode
as you would expect:<p>
<img src="https://dl.dropbox.com/s/e75j3m71yqgs6ei/xcode-integration.png?dl=0" alt="" title="" border="0" width="650" height="" />
<p>
My steps tend to be:
<ol>
<li>add name of test to <code>+testSelectors</code>,</li>
<li>hit build to ensure tests are red,</li>
<li>while Xcode builds, add empty test method,</li>
<li>hit build again to ensure tests are now green,</li>
<li>either add an actual <code>EXPECT()</code> for the test, </li>
<li>or an <code>EXPECTTRUE(false,@"implemented")</code> as a placeholder.</li>
</ol>
This may seem like a lot of steps, but it's really mostly just letting
Xcode check things while I am doing the edits that need to be done
anyhow. Hitting Cmd-B a couple of times while editing doesn't hurt.<p>
The fact that tests run as part of every build, because you cannot
build without running the tests, gives you a completely different
level of confidence in your code, which translates to <a href="http://www.extremeprogramming.org/values.html">courage</a>.<p>
Running the tests all the time is also splendid motivation to keep those
tests green, because if the tests fail, the build fails. And if the
build fails, you cannot run the program. Last but not least, running the
tests on every build also is strong motivation to keep those tests
fast. Testing just isn't this separate activity, it's as integral
a part of the development process as writing code and compiling it.<p>
<h3>Caveats</h3>
There are some drawbacks to this approach. One is that the pretty Xcode
unit test integration doesn't work: by the time that integration was built, Apple had
already left the platform idea behind and was focused solely on making
an integrated solution.<p>
As noted above, displaying test failures as errors and jumping to the
line of the failed test-expectation <em>does</em> work. This hooks
into the mechanism Xcode uses to get that information from compilers,
which simply output the line number and error message on <code>stdout</code>.
Any tool that formats its output the same way will work with Xcode.<p>
In the end, while I do enjoy the blinkenlights of Xcode's unit test
integration, and being able to run tests individually with a simple
mouse-click, all this bling really just reinforces that idea of
tests as a separate entity. If my tests are always run and
are always green, and are always fast, then I don't need or
even want UI for them, the UI is a distraction, the tests
should fade into the background.<p>
Another slightly more annoying issue is debugging: as the tests are run
as part of the build, a test failure is a build failure and will
block any executables from running. However, Xcode only debugs
executables, so you can't actually get to a debuggable run session.<p>
As I don't use debuggers all that much, and failure in TDD usually
manifests itself in test failure rather than something you need the
debugger to track, this hasn't been much of a problem. In the past,
I would then just revert to the command line, for example with
<code>lldb testlogger MPWFoundation</code> to debug my foundation framework,
as you can't actually run a framework.
Or so I thought. Only recently
did I find out that you can set an executable parameter in your target's
build scheme. I now set that to <code>testlogger</code> and can debug the
framework to my heart's content.
<p>
<img src="https://dl.dropbox.com/s/ewbb36osopmjxm3/set-executable.png?dl=0" alt="" title="" border="0" width="500" height="" />
<p>
That leaves the problem of Xcode not actually letting me run the executable, due to
the build failing, and, as far as I know, having no facility for debugging
build phases.<p>
The workaround for that is temporarily disabling the Test build phase,
which can be accomplished by misusing the "Run script only when installing" flag.
<p>
<img src="https://dl.dropbox.com/s/6ll82zialx77chf/disabled-tests.png?dl=0" alt="" title="" border="0" width="600" height="" />
<p>
While these issues aren't actually all that significant, they are somewhat more
jarring than you might expect because the experience is so buttery smooth the
rest of the time.<p>
Of course, if you want a pure test class, you can do that: just create a
class that only has tests. Furthermore, each class is actually asked for
a test fixture object for each test. The default is just to return the
class object itself, but you can also return an instance, which can have
setup and teardown methods the way you expect from xUnit.<p>
The code to enumerate and probe all classes in the system in order to find
tests is also interesting, if straightforward, and needs to be updated from
time to time, as there are a few classes in the system that do not like to be probed.
<h3>Outlook</h3>
I'd obviously be happy if people try out MPWTest and find it useful. Or find
it not so useful and provide good feedback. I currently have no specific
plans for Swift support. Objective-C compatible classes should probably work;
the rest of the language probably isn't dynamic enough to support this kind
of transparent integration, certainly not without more compiler work.
But I am currently investigating Swift interop more generally, and now
that I am no longer restricted to C/Objective-C, more might be possible.<p>
I will almost certainly use the lessons learned here to create linguistically
integrated testing in Objective-Smalltalk. As with many other aspects of
Objective-Smalltalk, the gap to be bridged for super-smoothness is actually not
that large.<p>
Another takeaway is that unit testing is really, really simple. In fact,
when I asked Kent Beck about it, his response was that everyone should
build their own. So go and build wonderful things!<p>
Marcel Weiherhttp://www.blogger.com/profile/11651004661887001433noreply@blogger.com0tag:blogger.com,1999:blog-8397311766319215218.post-45932217058983386862020-05-14T08:57:00.001+02:002020-05-14T11:08:36.542+02:00Embedding Objective-SmalltalkIlja just asked for embedded scripting-language suggestions,
presumably for his <a href="https://www.iwascoding.com/GarageSale/">GarageSale</a> eBay listings manager, and
so of course I suggested <a href="http://objective.st">Objective-Smalltalk</a>.<p>
Unironically :-)<p>
This is a bit scary. On the one hand, Objective-Smalltalk has
been in use in my
own applications for well over a decade and runs the
<a href="http://objective.st">http://objective.st</a> site, both
without a hitch, and the latter shrugging off a Hacker News
"Hug of Death" without even the hint of a glitch. On the
other hand, well, it's scary.<p>
As for usability, you include two frameworks in
your application bundle, and the code to start up and
interact with the interpreter or interpreters is also
fairly minimal, not least of all because I've been
doing so in quite a number of applications now, so
inconvenience gets whittled away over time.<p>
In terms of suitability, I of course can't answer that
except for saying it is absolutely the best ever. I can
also add that another macOS embeddable Smalltalk, F-Script,
was used successfully in a number of products.<p>
Anyway, Ilja was kind enough to at least pretend to take
my suggestion seriously, and responded with the following
question as to how code would look in practice:<p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">It is not obvious to me how such a custom script in Objective-SmallTalk would look like. Here is a JavaScript pseudo-code example that ends & restarts certain listings. How would this look like in Obj-ST? <a href="https://t.co/UjbwUD0Q4b">pic.twitter.com/UjbwUD0Q4b</a></p>— Ilja A. Iwas (@iljawascoding) <a href="https://twitter.com/iljawascoding/status/1260662815125377029?ref_src=twsrc%5Etfw">May 13, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
I am only too happy to answer that question, but the
answer is a bit beyond the scope of twitter, hence
this blog post.<p>
First, we can keep things very close to the original,
just replacing the loop with a <code>-select:</code>
and of course changing the syntax to Objective-Smalltalk.
<hr>
<code>
<pre>
runningListings := context getAllRunningListings.
listingsToRelist := runningListings select:{ :listing |
listing daysRunning > 30 and: listing watchers < 3 .
}
ebay endListings:listingsToRelist ended:{ :ended |
ebay relistListings:ended relisted: { :relisted |
ui alert:"Relisted: {relisted}".
}
}
</pre>
</code>
<hr>
Note the use of "and:" instead of "&&" and the general
reduction of sigils. Although I personally don't like the <a href="https://zäta.com/en/blog/async_javascript.htm">pyramid of doom</a>, the
keyword message syntax makes it significantly less odious.<p>
So much in fact, that Swift recently <a href="https://forums.swift.org/t/accepted-se-0279-multiple-trailing-closures/36141/7">adopted</a> open keyword
syntax for the special case of multiple trailing closures. Of
course the mind boggles a bit, but that's a topic for a
separate post.<p>
So how else can we simplify? Well, the <code>context</code>
seems a little unspecific, and <code>getAllRunningListings</code>
a bit specialized, it probably has lots of friends that
result from mapping a website with lots of resources onto
a procedural interface.<p>
Let's instead use URLs for this, so an <code>ebay:</code>
scheme that encapsulates the resources that EBay lets
us play with.
<hr>
<code>
<pre>
listingsToRelist := ebay:listings/running select:{ :listing |
listing daysRunning > 30 and: listing watchers < 3 .
}
ebay endListings:listingsToRelist ended:{ :ended |
ebay relistListings:ended relisted: { :relisted |
ui alert:"Relisted {relisted} listings".
}
}
</pre>
</code>
<hr>
I have to admit I also don't really understand the use
of callbacks in the relisting process, as we are waiting
for everything to complete before moving to the next stage.
So let's just implement this as plain sequential code:
<hr>
<code>
<pre>
listingsToRelist := ebay:listings/running select:{ :listing |
listing daysRunning > 30 and: listing watchers < 3 .
}
ended := ebay endListings:listingsToRelist.
relisted := ebay relistListings:ended.
ui alert:"Relisted: {relisted}".
</pre>
</code>
<hr>
(In scripting contexts, Objective-Smalltalk currently
allows defining variables by assigning to them. This
can be turned off.)<p>
However, it seems odd and a bit non-OO that the listings
shouldn't know how to do stuff, so how about just having
<code>relist</code> and <code>end</code> be methods on
the listings themselves? That way the code simplifies
to the following:
<hr>
<code>
<pre>
listingsToRelist := ebay:listings/running select:{ :listing |
listing daysRunning > 30 and: listing watchers < 3 .
}
ended := listingsToRelist collect end.
relisted := ended collect relist.
ui alert:"Relisted: {relisted}".
</pre>
</code>
<hr>
If batch operations are typical, it probably makes sense
to have a listings collection that understands about
those operations:
<hr>
<code>
<pre>
listingsToRelist := ebay:listings/running select:{ :listing |
listing daysRunning > 30 and: listing watchers < 3 .
}
ended := listingsToRelist end.
relisted := ended relist.
ui alert:"Relisted: {relisted}".
</pre>
</code>
<hr>
Here I am assuming that ending and relisting can fail
and therefore these operations need to return the
listings that succeeded.<p>
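As a rough sketch of those partial-failure semantics, here is what such a batch-aware listings collection might look like in Python; the <code>locked</code> flag standing in for whatever makes ending fail is a made-up assumption.

```python
# Sketch: batch operations that return only the listings they
# succeeded on, so each stage feeds the next with its successes.
# The "locked" failure condition is hypothetical.

class Listings:
    def __init__(self, items):
        self.items = list(items)

    def end(self):
        # Ending fails for locked listings; keep only the successes.
        return Listings(l for l in self.items if not l.get("locked"))

    def relist(self):
        # Relisting marks each (already ended) listing as relisted.
        return Listings(dict(l, relisted=True) for l in self.items)

listings = Listings([
    {"id": 1, "locked": False},
    {"id": 2, "locked": True},
    {"id": 3, "locked": False},
])
relisted = listings.end().relist()
```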
Oh, and you might want to give that predicate a name,
which then makes it possible to replace the last bit of
gobbledygook with a clean, "do what I mean" Higher
Order Message. And since we've had Unicode
for a while now, you can also use '←' for assignment,
if you want.
<hr>
<code>
<pre>
extension EBayListing {
    -&lt;bool&gt;shouldRelist {
        self daysRunning > 30 and: self watchers < 3.
    }
}
listingsToRelist ← ebay:listings/running select shouldRelist.
ended ← listingsToRelist end.
relisted ← ended relist.
ui alert:"Relisted: {relisted}".
</pre>
</code>
<hr>
To my obviously completely unbiased eyes, this looks
pretty close to a high-level, pseudocode specification
of the actions to be taken, except that it is executable.<p>
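For readers unfamiliar with Higher Order Messaging, here is a rough Python approximation of how <code>select shouldRelist</code> can work: <code>select</code> answers a proxy, and the next message sent to that proxy is forwarded to each element as the predicate. This illustrates the mechanism only; the class and method names are invented, and none of this is Objective-S itself.

```python
# Rough illustration of Higher Order Messaging: the message sent to
# the proxy returned by `select` becomes the predicate applied to
# every element. All names below are hypothetical.

class SelectProxy:
    def __init__(self, items):
        self._items = items

    def __getattr__(self, message):
        # Forward the message to each element; keep those answering True.
        return [item for item in self._items if getattr(item, message)()]

class HOMList(list):
    @property
    def select(self):
        return SelectProxy(self)

class Listing:
    def __init__(self, days_running, watchers):
        self.days_running = days_running
        self.watchers = watchers

    def should_relist(self):
        return self.days_running > 30 and self.watchers < 3

listings = HOMList([Listing(45, 1), Listing(10, 7), Listing(60, 2)])
to_relist = listings.select.should_relist
```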
This is a nice step-by-step script, but with everything so compact now, we can get
rid of the temporary variables (assuming the
extension) and make it a one-liner (plus the alert):
<hr>
<code>
<pre>
relisted ← ebay:listings/running select shouldRelist end relist.
ui alert:"Relisted: {relisted}".
</pre>
</code>
<hr>
It should be noted that the one-liner came to be not as a result
of sacrificing readability in order to maximally compress the code,
but rather as an indirect result of improving readability by
removing the cruft that's not really part of the problem being
solved.<p>
Although not needed in this case (the precedence rules of unary
message sends make things unambiguous), some pipe separators
may make things a bit clearer.<p>
<hr>
<code>
<pre>
relisted ← ebay:listings/running select shouldRelist | end | relist.
ui alert:"Relisted: {relisted}".
</pre>
</code>
<hr>
Whether you prefer the one-liner or the step-by-step is probably a matter
of taste.<p>