I've been making good progress on Objective-Smalltalk recently. Apart from the port to GNUstep
that allowed me to run the official site on it (shrugging off the HN hug of death in the process),
I've also been concentrating on not just pushing the concepts further, but also on adding some of
the more mundane bits that are just needed for a programming language.
And so my programs have been getting longer and more useful, and I am starting to actually see the effect of
"I'd rather write this in Objective-Smalltalk than in Objective-C". And with that, I thought I'd share
one of these slightly larger examples, show how it works, what's cool and possibly a bit weird, and where
work is still needed (lots!).
The code I am showing is a script that implements a generic scheme handler for SQLite databases and then
uses that scheme handler to access the database given on the command line. When you run it, in this case
with the sample database Chinook.db, it allows you to interact with the database through URIs that use
the db scheme. For example, db:. lists the available tables:
> db:.
( "albums","sqlite_sequence","artists","customers","employees","genres","invoices",
"invoice_items","media_types","playlists","playlist_track","tracks","sqlite_stat1")
You can then access entire tables, for example db:albums would show all the albums, or you
can access a specific album:
> db:albums/3
{ "AlbumId" = 4;
"Title" = "Let There Be Rock";
"ArtistId" = 1;
}
With that short intro and without much further ado, here's the code:
#!/usr/local/bin/stsh
#-sqlite:<ref>dbref

framework:FMDB load.

class ColumnInfo {
    var name.
    var type.
    -description {
        "Column: {var:self/name} type: {var:self/type}".
    }
}

class TableInfo {
    var name.
    var columns.
    -description {
        cd := self columns description.
        "Table {var:self/name} columns: {cd}".
    }
}

class SQLiteScheme : MPWScheme {
    var db.

    -initWithPath: dbPath {
        self setDb:(FMDatabase databaseWithPath:dbPath).
        self db open.
        self.
    }

    -dictionariesForResultSet:resultSet
    {
        results := NSMutableArray array.
        { resultSet next } whileTrue: { results addObject:resultSet resultDictionary. }.
        results.
    }

    -dictionariesForQuery:query {
        self dictionariesForResultSet:(self db executeQuery:query).
    }

    /. {
        |= {
            resultSet := self dictionariesForQuery: 'select name from sqlite_master where [type] = "table" '.
            resultSet collect at:'name'.
        }
    }

    /:table/count {
        |= { self dictionariesForQuery: "select count(*) from {table}" | firstObject | at:'count(*)'. }
    }

    /:table/:index {
        |= { self dictionariesForQuery: "select * from {table}" | at: index. }
    }

    /:table {
        |= { self dictionariesForQuery: "select * from {table}". }
    }

    /:table/:column/:index {
        |= { self dictionariesForQuery: "select * from {table}" | at: index. }
    }

    /:table/where/:column/:value {
        |= { self dictionariesForQuery: "select * from {table} where {column} = {value}". }
    }

    /:table/column/:column {
        |= { self dictionariesForQuery: "select {column} from {table}"| collect | at:column. }
    }

    /schema/:table {
        |= {
            resultSet := self dictionariesForQuery: "PRAGMA table_info({table})".
            columns := resultSet collect: { :colDict |
                #ColumnInfo{
                    #'name': (colDict at:'name') ,
                    #'type': (colDict at:'type')
                }.
            }.
            #TableInfo{ #'name': table, #'columns': columns }.
        }
    }

    -tables {
        db:. collect: { :table| db:schema/{table}. }.
    }

    -<void>logTables {
        stdout do println: scheme:db tables each.
    }
}

extension NSObject {
    -initWithDictionary:aDict {
        aDict allKeys do:{ :key |
            self setValue: (aDict at:key) forKey:key.
        }.
        self.
    }
}

scheme:db := SQLiteScheme alloc initWithPath: dbref path.
stdout println: db:schema/artists
shell runInteractiveLoop.
Let's walk through the code in detail, starting with the header:
#!/usr/local/bin/stsh
#-sqlite:<ref>dbref
This is a normal Unix shell script invoking stsh, the Smalltalk Shell. The Smalltalk Shell is a
bigger topic for another day, but for now let's focus on the second line, which looks
like a method declaration, and that's exactly what it is! In order to ease the transition
between small scripts and larger systems (successful scripts tend to get larger, and successful
large systems evolve from successful small systems), scripts have a dual nature, being at the
same time callable from the Unix command line and also usable as a method (or filter) from
a program.
Since this script is interactive, that part is not actually that important, but a nice side effect
is that the declaration of a parameter gets us automatic command-line parameter parsing, conversion, and
error checking. Specifically, stsh knows that the script takes a single parameter of
type <ref> (a reference, so a filename or URL) and will put that in the dbref variable as a reference.
If the script is invoked without that parameter, it will exit with an error message, all
without any further work by the script author. These declarations are optional; without them, parameters
will go into an args array without further interpretation.
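For readers who want to see what declaration-driven argument handling amounts to, here is a rough Python sketch. It is not how stsh is implemented: the helper names `parse_declaration` and `bind_arguments`, the converter table, and the treatment of `<ref>` as a file path are all assumptions made for illustration, and the parser only handles the simple single-keyword form used in this script.

```python
import re
from pathlib import Path

# Hypothetical converters keyed by declared type; "ref" is treated as a path here.
CONVERTERS = {"ref": Path, "int": int, None: str}

def parse_declaration(header):
    """Parse a header such as '#-sqlite:<ref>dbref' into (name, [(type, param)])."""
    body = header.lstrip("#-")
    name, _, rest = body.partition(":")
    params = re.findall(r"(?:<(\w+)>)?(\w+)", rest)
    return name, [(declared or None, param) for declared, param in params]

def bind_arguments(header, argv):
    """Convert argv according to the declaration, or exit with a usage message."""
    name, params = parse_declaration(header)
    if len(argv) != len(params):
        usage = " ".join(f"<{param}>" for _, param in params)
        raise SystemExit(f"usage: {name} {usage}")
    return {param: CONVERTERS[declared](arg)
            for (declared, param), arg in zip(params, argv)}
```

Calling `bind_arguments('#-sqlite:<ref>dbref', ['Chinook.db'])` yields a dictionary with a single `dbref` entry holding a `Path`, while an empty argument list ends the program with a usage message.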
Next up, we load a dependency, Gus Mueller's wonderful FMDB
wrapper for SQLite.
framework:FMDB load.
The framework scheme looks for frameworks on the default framework path, and the
load message is sent to the NSBundle that is returned.
The next bit is fairly straightforward, defining the ColumnInfo class with two instance
variables, name and type, and a -description method.
class ColumnInfo {
    var name.
    var type.
    -description {
        "Column: {var:self/name} type: {var:self/type}".
    }
}
Again, this is very straightforward, with maybe the missing superclass
specification being slightly unusual. Different constructs may have different
implicit superclasses; for class it is assumed to be NSObject. The description
method, introduced by "-" just like in Objective-C, uses string interpolation with curly braces.
(I currently need to use fully qualified names like var:self/name to access instance
variables; that should be fixed in the near future.)
It also doesn't have a return statement or the like; a method return can be specified
by just writing out the return value.
To me, this has the great effect of putting the focus on the essential "this is the description" rather
than on the incidental, procedural "this is how you build the description". It is obviously only a very
small instance of this shift, but I think even this small example speaks to what that shift can look like
in the large.
The way instance variables are defined is far from being done, but for now the var syntax
does the job. The TableInfo class follows the same pattern as ColumnInfo,
and of course these two classes are used to represent the metadata of the database.
So on to the main attraction, the scheme-handler itself, which is just a plain old class inheriting
from MPWScheme, with an instance variable and an initialisation method:
class SQLiteScheme : MPWScheme {
    var db.
    -initWithPath: dbPath {
        self setDb:(FMDatabase databaseWithPath:dbPath).
        self db open.
        self.
    }
Having advanced language features largely defined as/by plain old classes goes back to the need
for a stable starting point. However, it has turned out to be a little bit more than that, because
the mapping to classes is not just the trivial one of "since this is written in an OO language,
obviously the implementation of features is somehow done with classes". Instead, the classes map
onto the language features very much in an Open Implementation kind of way, except
that in this case it is Open Language Implementation.
That means that unlike a typical MOP,
the classes actually make sense outside the specific language implementation, making their features usable
from other languages, albeit less conveniently. Being easily accessible from other languages is important
for an architectural language.
With this mapping, a very narrow set of syntactic language mechanisms
can be used to map a large and extensible (thus infinite) set of semantic features into the language. This
is of course similar to language features like procedures, methods and classes, but is expanded to things
that usually haven't been as extensible.
The next two methods handle the interface to FMDB. They are mostly straightforward
and, I think, understandable to both Smalltalk and
Objective-C programmers without much explanation.
-dictionariesForResultSet:resultSet
{
    results := NSMutableArray array.
    { resultSet next } whileTrue: { results addObject:resultSet resultDictionary. }.
    results.
}
-dictionariesForQuery:query {
    self dictionariesForResultSet:(self db executeQuery:query).
}
Smalltalk programmers may balk a little at the use of curly braces rather than square
brackets to denote blocks. To me, this is a very easy concession to "the mainstream";
I have bigger fish to fry. To Objective-C programmers, the fact that the condition of
the while-loop is implemented as a message sent to a block rather than as syntax might
take a little getting used to, but I don't think it presents any fundamental difficulties.
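As a point of comparison for readers who know neither Smalltalk nor Objective-C, the same "run a query, collect each row as a dictionary" step can be sketched in Python with the standard `sqlite3` module. The function name mirrors the method above, but the in-memory table and its two rows are made up for the example and are not taken from Chinook.db.

```python
import sqlite3

def dictionaries_for_query(db, query, params=()):
    """Run a query and return every row as a plain dict keyed by column name."""
    cursor = db.execute(query, params)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]

# Illustrative data only; a real run would open the database file instead.
db = sqlite3.connect(":memory:")
db.execute("create table artists (ArtistId integer primary key, Name text)")
db.executemany("insert into artists (Name) values (?)", [("AC/DC",), ("Accept",)])

rows = dictionaries_for_query(db, "select * from artists order by ArtistId")
```

The explicit loop over `resultSet next` in the Objective-Smalltalk version corresponds to iterating the cursor here.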
Next up we have some property path definitions, the meat of the scheme handler.
Each property path lets you define code that will run for a specific subset of the
scheme's namespace, with the subset defined by the property path's URI pattern.
As the name implies, property paths can be regarded as a generalisation of Objective-C
properties, extended to handle both entire sets of properties, sub-paths and the
combination of both.
/. {
    |= {
        resultSet := self dictionariesForQuery: 'select name from sqlite_master where [type] = "table" '.
        resultSet collect at:'name'.
    }
}
The first property path definition is fairly straightforward as it only applies to a single
path, the period (so the db:. example from above). Property path definitions
start with the forward slash ("/"), similar to the way that instance methods start with "-"
and class methods with "+" in Objective-C (and Objective-Smalltalk). The slash seemed natural
to indicate the definition of paths/URIs.
Like C# or Swift property
definitions, you need to be able to deal with (at least) "get" and/or "set" access to a property. I
really dislike having random keywords like "get" or "set" for program structure, I prefer to see names
reserved for things that have domain meaning.
So instead of keywords, I am using constraint connector
syntax: "|=" means the left hand side is constrained to be the same as the right hand side
(aka "get"). "=|" means the right hand side is constrained to be the same as the left hand
side (aka "set"). The idea is that the "left hand side" in this case is the interface, the
outside of the object/scheme handler, whereas the "right hand side" is the inside of the
object, with properties mediating between the outside and the inside of the object.
As most everything, this is currently experimental, but so far I like it more than I
expected to, and again, it shifts us away from being action-oriented to
describing relationships. For example, delegating both get and set to some other
object could then be described by using the bidirectional constraint connector:
/myProperty =|= var:delegate/otherProperty
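For comparison, this is roughly what delegating both "get" and "set" looks like in a language without constraint connectors. The Python sketch below is only an analogy: the `delegated` helper and the `Facade` and `Other` classes are hypothetical names invented for the example, and a Python property is action-oriented in exactly the way the connector syntax tries to avoid.

```python
def delegated(attribute, to="delegate"):
    """Build a property whose get and set are both forwarded to another object."""
    def getter(self):
        return getattr(getattr(self, to), attribute)

    def setter(self, value):
        setattr(getattr(self, to), attribute, value)

    return property(getter, setter)

class Other:
    def __init__(self):
        self.other_property = 1

class Facade:
    # Reads and writes of my_property both go to delegate.other_property.
    my_property = delegated("other_property")

    def __init__(self, delegate):
        self.delegate = delegate
```

The two inner functions spell out separately what the single "=|=" line states as one relationship.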
Getting the result set is a straightforward message-send with the SQL query as a constant,
non-interpolated string (single quotes; double quotes are for interpolation). We then
need to extract the name of the table from the returned dictionaries, which we do via the
collect HOM and the Smalltalk-y -at: message, which in this
case maps to Foundation's -objectForKey:.
The next property paths map URIs to queries on the tables. Unlike the previous
example, which had a constant, single element path and so was largely equivalent
to a classic property, these all have variable path elements, multiple path
segments or both.
/:table/count {
    |= { self dictionariesForQuery: "select count(*) from {table}" | firstObject | at:'count(*)'. }
}
/:table/:index {
    |= { self dictionariesForQuery: "select * from {table}" | at: index. }
}
/:table {
    |= { self dictionariesForQuery: "select * from {table}". }
}
Starting at the back, /:table returns the data from the entire table specified
in the URI using the parameter :table. The leading colon means that this
path segment is a parameter that will match any single string and deliver it to the method under
the parameter name used, in this case "table". Wildcards are also possible.
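To illustrate the matching rule just described, here is a small Python sketch of how a path could be matched against patterns with ":" parameters. It is not the Objective-Smalltalk implementation. In particular, the text does not say how competing patterns are prioritised, so the rule used here (the match that binds the fewest parameters wins) is purely an assumption, and wildcards are left out.

```python
def match_path(pattern, path):
    """Return a dict of bound parameters if path matches pattern, else None."""
    pattern_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(pattern_parts) != len(path_parts):
        return None
    bindings = {}
    for expected, actual in zip(pattern_parts, path_parts):
        if expected.startswith(":"):
            bindings[expected[1:]] = actual   # parameter: matches any single segment
        elif expected != actual:
            return None                       # constant segment must match exactly
    return bindings

def resolve(patterns, path):
    """Pick the matching pattern that binds the fewest parameters (assumed rule)."""
    matches = []
    for pattern in patterns:
        bindings = match_path(pattern, path)
        if bindings is not None:
            matches.append((pattern, bindings))
    if not matches:
        return None
    return min(matches, key=lambda match: len(match[1]))
```

With the patterns from the script, `resolve` sends `albums/3` to `/:table/:index` with `table` bound to `"albums"` and `index` bound to the string `"3"`, and sends `schema/artists` to `/schema/:table` rather than to `/:table/:index`.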
Yes, the
SQL query is performed using simple string interpolation without any sanitisation. DON'T DO THAT.
At least not in production code. For experimenting in an isolated setting it's OK.
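For code that does need to be safe, the usual approach is to bind values as query parameters and to check identifiers against the database's own metadata, since table and column names cannot be bound. The Python sketch below shows one way to do that for the "where" path; the function names are made up for the example, and it is a sketch of the general technique rather than anything the scheme handler currently does.

```python
import sqlite3

def table_names(db):
    """Names of all tables, taken from SQLite's own catalogue."""
    rows = db.execute("select name from sqlite_master where type = 'table'")
    return {name for (name,) in rows}

def rows_where(db, table, column, value):
    """select * from table where column = value, without trusting the inputs."""
    if table not in table_names(db):
        raise KeyError(f"no such table: {table}")
    # table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in db.execute(f'pragma table_info("{table}")')}
    if column not in columns:
        raise KeyError(f"no such column: {table}.{column}")
    # Identifiers are now known to exist; the value travels as a bound parameter.
    query = f'select * from "{table}" where "{column}" = ?'
    return db.execute(query, (value,)).fetchall()
```

An injection attempt in the table or column segment is rejected as an unknown name, and one in the value segment is simply compared as data.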
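The second query retrieves a specific row of the table specified. The index is the row's position
in the result array (counting from zero), not its primary key, which is why db:albums/3 above
returned the album with AlbumId 4. The pipe "operator" is
for method chaining with keyword syntax without having to bracket like crazy: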
self dictionariesForQuery: "select count(*) from {table}" | firstObject | at:'count(*)'
((self dictionariesForQuery: "select count(*) from {table}") firstObject) at:'count(*)'
I find the "pipe" version to be much easier to both just visually scan and to understand,
because it replaces nested (recursive) evaluation with linear piping. And yes, it is
at least partly a by-product of integrating pipes/filters, which is a part of the larger
goal of smoothly integrating multiple architectural styles. That this integration would
lead to an improvement in the procedural part was an unexpected but welcome side effect.
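The difference between nested and linear evaluation can be shown in any language. Here is a minimal Python sketch with a hypothetical `pipe` helper that feeds a value through a sequence of steps from left to right; the row data is invented for the example.

```python
from functools import reduce
from operator import itemgetter

def pipe(value, *steps):
    """Feed value through one-argument callables, left to right."""
    return reduce(lambda result, step: step(result), steps, value)

rows = [{"count(*)": 3}]          # stand-in for a query result

# Nested: read from the inside out.
nested = itemgetter("count(*)")(itemgetter(0)(rows))

# Piped: read in the order the steps happen.
piped = pipe(rows, itemgetter(0), itemgetter("count(*)"))
```

Both expressions compute the same value; only the reading order differs.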
The first property path, /:table/count, returns the size of the given table,
using the optimised SQL code select count(*). This shows an interesting
effect of property paths. In a standard ORM, getting the count of a table might look
something like this: db.artists.count. Implemented naively, this code
retrieves the entire "artists" table, converts that to an array and then counts
the array, which is incredibly inefficient. And of course, this was/is a real problem
of many ORMs, not just a made up example.
The reason it is a real problem is that it isn't trivial to solve, due to the fact that
OOPLs are not structurally polymorphic. If I have something like db.artists.count,
there has to be some intermediate object returned by artists so I can
send it the count message. The natural thing for that is the artists table, but
that is inefficient. I can of course solve this by returning some clever proxy that doesn't
actually materialise the table unless it has to, or I can have count handled
completely separately, but neither of these solutions is straightforward, which is why
this has traditionally been a problem.
With property paths, the problem just goes away, because any scheme handler (or object) has
control over its sub-structure to an arbitrary depth.
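A small Python sketch may make the point more tangible. Because the hypothetical `get` function below receives the whole path at once, it can notice the trailing `count` segment before touching any rows and let SQLite do the counting. The function and its two supported path shapes are assumptions made for illustration, not a port of the scheme handler.

```python
import sqlite3

def get(db, path):
    """Resolve 'table' or 'table/count' by looking at the whole path at once."""
    table, _, rest = path.partition("/")
    known = {name for (name,) in
             db.execute("select name from sqlite_master where type = 'table'")}
    if table not in known:
        raise KeyError(table)
    if rest == "count":
        # No intermediate table object is needed: one number comes back.
        return db.execute(f'select count(*) from "{table}"').fetchone()[0]
    if rest == "":
        return db.execute(f'select * from "{table}"').fetchall()
    raise KeyError(path)
```

There is no proxy object standing in for the table: `get(db, "artists/count")` and `get(db, "artists")` are simply two different paths handled by the same function.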
Queries are handled in a similar manner, so db:albums/where/ArtistId/6
retrieves the two albums by band Apocalyptica. This is obviously very manual;
for any specific database you'd probably want to specialise this generic
scheme handler to give you named relationships and also to return actual objects,
rather than just dictionaries. A step in that direction is the /schema/:table
property path:
/schema/:table {
    |= {
        resultSet := self dictionariesForQuery: "PRAGMA table_info({table})".
        columns := resultSet collect: { :colDict |
            #ColumnInfo{
                #'name': (colDict at:'name') ,
                #'type': (colDict at:'type')
            }.
        }.
        #TableInfo{ #'name': table, #'columns': columns }.
    }
}
This property path returns the SQL schema in terms of the objects we defined at the top.
First is a simple query of the SQLite table that holds the schema information, returning
an array of dictionaries. These individual dictionaries are then converted to
ColumnInfo objects using object literals.
Similar to defining the -description method above as simply a parametrized
string literal instead of as instructions to build the result string, object literals
allow us to simply write down general objects instead of constructing them. The example
inside the collect defines a ColumnInfo object literal with
the name and type attributes set from the column dictionary
retrieved from the database.
Similarly, the final TableInfo is defined by its name and the column info
objects. Object literals are a fairly trivial extension of Objective-Smalltalk dictionary
literals, #{ #'key': value }, with a class-name specified between the
"#" and the opening curly brace. Being able to just write down objects is, I think,
one of the better and under-appreciated features of WebObjects .wod files (though it's
not 100% the same thing), as well as QML
and I think also part of what makes React "declarative".
Not entirely coincidentally, the "configurations" of architectural description languages
can also be viewed as literal object definitions.
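For readers unfamiliar with PRAGMA table_info, the Python sketch below performs the same schema-to-objects step with the standard `sqlite3` module and dataclasses. The class names mirror the ones in the script, while `schema_for` is a made-up helper; keyword construction of a dataclass is only a loose analogue of an object literal, and the table name is assumed to come from a trusted source.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class ColumnInfo:
    name: str
    type: str

@dataclass
class TableInfo:
    name: str
    columns: list

def schema_for(db, table):
    """Describe one table as a TableInfo holding ColumnInfo objects."""
    # table_info rows are (cid, name, type, notnull, dflt_value, pk).
    rows = db.execute(f'pragma table_info("{table}")').fetchall()
    return TableInfo(
        name=table,
        columns=[ColumnInfo(name=row[1], type=row[2]) for row in rows],
    )
```

Each `ColumnInfo(name=..., type=...)` expression states what the object is, much like the `#ColumnInfo{ ... }` literal, rather than listing steps to build it.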
With that information in hand, and with the Objective-Smalltalk runtime providing
class definition objects that can be used to install objects with their methods
in the runtime, we now have enough information to create some classes straight from
the SQL table definitions, without going through the intermediate steps of
generating source code, adding that to a project and compiling it.
That isn't implemented, and it's also something that you don't really want, but
it's a stepping stone towards creating a general mechanism for orthogonal modular
persistence.
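To give a feel for what "classes straight from the SQL table definitions" could mean, here is a Python sketch that builds a class at runtime from a table's column names using the standard library's `make_dataclass`. This is an illustration in a different language and runtime, not a preview of how Objective-Smalltalk would do it; `class_for_table` and `load_all` are hypothetical helpers, and column names are assumed to be valid identifiers.

```python
import sqlite3
from dataclasses import make_dataclass

def class_for_table(db, table):
    """Create a class at runtime with one field per column of the table."""
    rows = db.execute(f'pragma table_info("{table}")').fetchall()
    field_names = [row[1] for row in rows]        # row[1] is the column name
    return make_dataclass(table.capitalize(), field_names)

def load_all(db, cls, table):
    """Instantiate cls once per row; relies on fields being in column order."""
    return [cls(*row) for row in db.execute(f'select * from "{table}"')]
```

No source code is generated or compiled along the way: the class exists only because the database described it.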
The final two utility methods are not really all that noteworthy, except that they
do show how expressive and yet straightforward even ordinary Objective-Smalltalk code is.
-tables {
    db:. collect: { :table| db:schema/{table}. }.
}
-<void>logTables {
    stdout do println: scheme:db tables each.
}
The -tables method just gets all the schema information for all the tables. The
-logTables method prints all the tables to stdout, but individually, not as
an array.
Finally, there is a class extension to NSObject that supports the literal syntax on all objects and
the script code that actually initialises the scheme with a database and starts an interactive session.
This last feature has also been useful in Smalltalk scripting: creating specialized shells that
configure themselves and then run the normal interactive prompt.
So that's it!
It's not a huge revelation, yet, but I do hope this example gives at least a tiny glimpse
of what Objective-Smalltalk already is and of what it is poised to become. There is a lot that I
couldn't cover here, for example that scheme-handlers aren't primarily for interactive
exploration, but for composition. I also only mentioned pipes-and-filters in passing,
and of course there "is" a lot more that just "isn't" there, quite yet.
As always, but even more than usual, I would love to get feedback! And of course the code is
on GitHub.