metablog: Swift

Wednesday, June 7, 2023

Mojo is a much better "Objective-C without the C" than Swift ever was

One of the primary things that people don't understand about Objective-C is that it is a solution to the two language problem, or more precisely a generalisation of the two language problem to the scripted component pattern.

The scripted component pattern itself is a (common) solution to the problem, first identified in the 70s, that programming-in-the-large is not the same as programming-in-the-small, and that module implementation languages are not necessarily suitable as module interconnection languages.

And so we have all sorts of flexible connection languages, often interpreted (aka glue, scripting, and orchestration languages), starting with the Unix shell, in addition to fast, compiled component languages such as C, C++ and Rust, and a system will usually incorporate at least one of each kind.

But then you run into the two language problem: you have to deal with these two distinct languages, with how they integrate, and with the boundaries of the integration often not matching up very well with the boundaries of the problem you're trying to solve.

Objective-C solved the two language problem by just jamming the two languages into one: Smalltalk for the scripting/integration and C for the component language. Interoperability is smooth and at the statement level, though there is some friction due to overlaps caused by integrating two existing languages that were not designed to be integrated.

Mojo essentially uses the Objective-C approach of jamming the two languages into one. Except it doesn't repeat Objective-C's mistake of using the component language as the base (which, inexplicably, Swift didn't just repeat, but actually doubled down on by largely deprecating objects). The reason this is a mistake is that it turns out that the connection language is actually the more general one, the component language is a specialisation of the connection language.

With this realisation, Mojo's approach of making the connection language the base language makes sense. In addition, the fact that the component language is a specialisation also means that you don't actually need to jam a full second language into your base; a few syntactic markers to indicate the specialisations are sufficient.

This is pretty much exactly stage 2 of the 4 stages of Objective-S, so I think they are using exactly the right approach for this. Except of course for the use of Python as the base instead of Smalltalk, which is a pragmatic choice given what they are trying to accomplish, but means your connection language is unduly limited.

Objective-S has the same basic structure, but with a much more capable connection language as the base.

Tuesday, June 15, 2021

if let it be

One of the funkier aspects of Swift syntax is the if let statement. As far as I can tell, it exists pretty much exclusively to check that an optional variable actually does contain a value and, if it does, work with a no-longer-optional version of that variable.

Swift packages this functionality in a combination if statement and let declaration:


if let value = value {
   print("value is \(value)")
}

This has a bunch of problems that are explained nicely in a Swift Evolution thread (via Michael Tsai) together with some proposals to fix it. One of the issues is the idiomatic repetition of the variable name, because typically you do want the same variable, just with less optionality. Alas, code-completion apparently doesn't handle this well, so the temptation is to pick a non-descriptive variable name.

In my previous post (Asynchronous Sequences and Polymorphic Streams) I noted how the fact that iteration in Smalltalk and Objective-S is done via messages and blocks means that there is no separate concept of a "loop-variable": that is just an argument to the block.

Conditionals are handled the same way, with blocks and messages, but normally don't pass arguments to their argument blocks, because in normal conditionals those arguments would always be just the constants true or false. Not very interesting.

When I added ifNotNil: some time ago, I used the same logic, but it turns out the object is now actually potentially interesting. So ifNotNil: now passes the now-known-to-be-non-nil value to the block and can be used as follows:


value ifNotNil:{ :value |
    stdout println:value.
}

This doesn't eliminate the duplication, but does avoid the issue of having the newly introduced variable name precede the original variable. Well, that and the whole weird if let in the first place.

With anonymous block arguments, we actually don't have to name the parameter at all:


value ifNotNil:{ stdout println:$0. }

Alternatively, we can just take advantage of some conveniences and use a HOM instead:


value ifNotNil printOn:stdout.

Of course, Objective-S currently doesn't care about optionality, and with the current nil-eating behavior, the ifNotNil is not strictly necessary; you could just write it as follows:


value printOn:stdout.

I haven't really done much thinking about it, but the whole idea of optionality shouldn't really be handled in the space of values, but in the space of references. Which are first class objects in Objective-S.

So you don't ask a value if it is nil or not, you ask the variable if it contains a value:


ref:value ifBound:{ :value | ... }

To me that makes a lot more sense than having every type be accompanied by an optional type.

So if we were to care about optionality in the future, we have the tools to create a sensible solution. And we can let if let just be.

Sunday, June 13, 2021

Asynchronous Sequences and Polymorphic Streams

Browsing the WWDC '21 session videos, I came across the session on Asynchronous Sequences. The preview image showcased some code for asynchronously fetching and massaging current earthquake data from the U.S. Geological Survey:


@main
struct QuakesTool {
   static func main() async throws {
      let endpointURL = URL(string: "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv")!

      for try await event in endpointURL.lines.dropFirst() {
         let values = event.split(separator: ",")
         let time = values[0]
         let latitude = values[1]
         let longitude = values[2]
         let magnitude = values[4]
         print("Magnitude \(magnitude) on \(time) at \(latitude) \(longitude)")
      }
   }
}

This is nice, clean code, and it certainly looks like it serves as a good showcase for the benefits of asynchronous coding with async/await and asynchronous sequences built on top of async/await.

Or does it?

Here is the equivalent code in Objective-S:


#!env stsh
stream ← ref:https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv linesAfter:1.

stream do: { :theLine |
   values ← theLine componentsSeparatedByString:','.
   time ← values at:0.
   latitude ← values at:1.
   longitude ← values at:2.
   magnitude ← values at:4.
   stdout println:"Quake: magnitude {magnitude} on {time} at {latitude} {longitude}".
}. 
stream awaitResultForSeconds:20.

Objective-S does not (and will not) have async/await, but it can nevertheless provide the equivalent functionality easily and elegantly. How? Two features:

Polymorphic Write Streams
Messaging

Let's see how these two conspire to make adding something equivalent to for try await trivial.

Polymorphic Write Streams

In the Objective-S implementation, https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv is not a string, but an actual identifier, a Polymorphic Identifier. Adding the ref: prefix turns it into a binding, a first class variable. You can ask a binding for its value, but for bindings that can also be regarded as collections of some kind, you can also ask them for a stream of their values, in this particular case a MPWURLStreamingStream. This stream is a Polymorphic Write Stream that can be easily composed with other filters to create pipelines. The linesAfter: method is a convenience method that does just that: it composes the URL fetcher with a filter that converts from bytes to lines of text and another filter that drops the first n items.

Objective-S actually has convenient syntax for creating these compositions without having to do it via convenience methods, but I wanted to keep differences in the surrounding scaffolding small for this example, which is about the for try await and do:.

When I encountered the example, Polymorphic Write Streams actually did not have a do: for iteration, but it was trivial to add:


-(void)do:aBlock
{
    [self setFinalTarget:[MPWBlockTargetStream streamWithBlock:aBlock]];
    [self run];
}

(This code lives in MPWFoundation, so it is in Objective-C, not Objective-S).

Those 5 lines were all that was needed. I did not have to make substantive changes to the language or its implementation. One reason for this is that Polymorphic Write Streams are asynchrony-agnostic: although they are mostly implemented as straightforward synchronous code, they work just as well if parts of the pipeline they are in are asynchronous. It just doesn't make a difference, because the semantics are in the data flow, not in the control flow.
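To make the data-flow point a little more concrete, here is a minimal sketch in plain C. It is not MPWFoundation code; the Stage type and every function name in it are hypothetical. Each stage only receives items and pushes them to its target, so it makes no difference whether the first write comes from a synchronous loop or from an asynchronous completion handler.

```c
#include <stddef.h>

/* Hypothetical push-style stream stage: it receives items via write()
   and forwards them to its target. Nothing here knows or cares whether
   the caller is synchronous or an asynchronous completion handler. */
typedef struct Stage Stage;
struct Stage {
    void (*write)(Stage *self, const char *item);
    Stage *target;         /* next stage in the pipeline, or NULL */
    int    remainingSkip;  /* used by the drop-first-n filter */
    void (*callback)(const char *item, void *context); /* used by the final stage */
    void  *context;
};

/* Filter: swallow the first n items, forward the rest. */
static void dropFirstWrite(Stage *self, const char *item)
{
    if (self->remainingSkip > 0) {
        self->remainingSkip--;
    } else if (self->target) {
        self->target->write(self->target, item);
    }
}

/* Final target: hand each item to a user-supplied callback,
   roughly the role that do: plays above. */
static void callbackWrite(Stage *self, const char *item)
{
    self->callback(item, self->context);
}

static Stage makeDropFirst(int n, Stage *target)
{
    Stage s = { dropFirstWrite, target, n, NULL, NULL };
    return s;
}

static Stage makeCallbackTarget(void (*callback)(const char *, void *), void *context)
{
    Stage s = { callbackWrite, NULL, 0, callback, context };
    return s;
}

/* Example callback: count the items that reach the end of the pipeline. */
static void countItem(const char *item, void *context)
{
    (void)item;
    (*(int *)context)++;
}
```

Whoever produces the lines just calls write on the head of the pipeline as data arrives; adding the iteration is only a matter of installing a different final target.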

Messaging

The other big reason an asynchronous do: was easy to add is messaging.

If you focus on just messaging -- and realize that a good metasystem can late bind the various 2nd level architectures used in objects -- then much of the language-, UI-, and OS based discussions on this thread are really quite moot.

One of the many really, really neat ideas in Smalltalk is how control structures, which in most other languages are special language features, are just plain old messages and implemented in the library, not in the language.

So the for ... in loop in Swift is just the do: message sent to a collection, and the keyword syntax makes this natural:


for event in lines {
...
}
...
lines do: { :event |
...
}

Note how making loops regular like this also makes the special concept of "loop variable" disappear. The "loop variable" is just the block argument. And I just realized the same would go for a not-nil result of a nil test.

Anyway, if "loops" are just messages, it's easy to add a method implementing iteration to some other entity, for example a stream, the way that I did. (Smalltalk streams also support the iteration messages).

And when you can easily make stream processing, which can handle asynchrony naturally and easily, just as convenient as imperative programming, you don't need async/await, which tries to make asynchronous programming look like imperative programming in order to make it convenient.

Sunday, June 14, 2020

The Curious Case of Swift's Adoption of Smalltalk Keyword Syntax

I was really surprised to learn that Swift recently adopted Smalltalk keyword syntax: [Accepted] SE-0279: Multiple Trailing Closures. That is: a keyword terminated by a colon, followed by an argument and without any surrounding braces.

The mind boggles.

A little.

Of course, Swift wouldn't be Swift if this weren't a special case of a special case, specifically the case of multiple trailing closures, which is a special case of trailing closures, which are weird and special-casey enough by themselves. Below is an example:


UIView.animate(withDuration: 0.3) {
  self.view.alpha = 0
} completion: { _ in
  self.view.removeFromSuperview()
}

Note how the arguments to animate() would seem to terminate at the closing parenthesis, but that's actually not the case. The curly braces after the closing paren start a closure that is actually also an argument to the method, a so-called trailing closure. I have a little bit of sympathy for this construct, because closures inside of the parentheses look really, really awkward. (Of course, all params apart from a sole x inside f(x) look awkward, but let's not quibble. For now.).

Another thing this enables is methods that reasonably resemble control structures, which I heard is a really great idea.

The problem is that sometimes you have more than one closure argument, and then just stacking them up behind what appears to be end of the function/method call gets really, really awkward, and you can't tell which block is which argument, because the trailing closure doesn't get a keyword.

Well, now it does. And we now have 4 different method syntaxes in one!

Traditional C/Pascal/C++/Java function call syntax x.f()
The already weird-ish addition of Smalltalk/Objective-C keywords inside the f(x) syntax: f(arg:x)
Original trailing-closure syntax, which is just its own thing, for the first closure
Smalltalk non-bracketed keyword syntax for the 2nd and subsequent closures.

That is impressive, in a scary kind of way.

Swift is a crescendo of special cases stopping just short of the general; the result is complexity in the semantics, complexity in the behaviour (i.e. bugs), and complexity in use (i.e. workarounds).
— Which features overcomplicate Swift, Rob Rix

I understand that this proposal was quite controversial, with heated discussion between opponents and proponents. I understand and sympathize with both sides. On the one hand, this is markedly better than the alternatives. On the other hand, it is a special case of a special case that is difficult to justify as an addition to all that is already there.

Special cases beget special cases beget special cases.

Of course the answer was always there: Smalltalk keyword syntax is not just the only reasonable solution in this case, it also solves all the other cases. It is the general solution. Here's how this could look in Objective-Smalltalk (which uses curly braces for closures instead of Smalltalk-80's square brackets):


UIView animate:{ self.view.alpha ← 0. } withDuration:0.3 completion:{ self view removeFromSuperview. }.

No special cases, every argument is labeled, no syntax mush of brackets inside parentheses etc. And yes, this also handles user-defined control structures, to:do: is just a method on NSNumber:


1 to:10 do:{:i | stdout println:"I will not introduce {i} special cases willy nilly.".}.

And since keywords naturally go between their arguments, there is no need for "operators", as a very different and special syntax form. You just allow some "binary" keywords to look a little different, so instead of 2 multiply:3 you can write 2 * 3. And when you have 2 raisedTo:3 instead of pow(2,3) (with the signature: func pow(_ x: Decimal, _ y: Int) -> Decimal), do you really need to go to the trouble of defining an "operator"?

Or Swift's a as b, another special kind of syntax. How about a as:b? (Yes I know there are details, but those are ... details.). And so on and so forth.

But of course, it's too late now. When I chose Smalltalk as the base syntax for the language that has turned into Objective-Smalltalk, it wasn't just because I just like it or have gotten used to it via Objective-C. Smalltalk's syntax is surprisingly flexible and general, Smalltalk APIs look a lot like DSLs, without any of the tooling or other overheads.

And that's the frustrating part: this stuff was and is available and well-known. At least if you bother to look and/or ask. But instead, we just choose these things willy-nilly and everybody has to suffer the consequences.

UPDATE:

I guess what I am trying to get at is that if you'd thought things through just a little bit, you could have had almost the entire syntax of your language for the cost (complexity, implementation size and brittleness, cognitive load, etc.) of this one special case of a special case. And it would have been overall better to boot.

Friday, April 24, 2020

Faster JSON Support for iOS/macOS, Part 7: Polishing the Parser

A convenient setback

One thing that you may have noticed last time around was that we were getting the instance variable names from the class, but then also still setting the common keys manually. That's a bit of duplicated and needlessly manual effort, because the common keys are exactly those ivar names.

However, the two pieces of information are in different places, the ivar names in the builder and the common strings in the parser itself. One way of consolidating this information is by creating a convenience initializer for decoding to objects as follows:



-initWithClass:(Class)classToDecode
{
    self = [self initWithBuilder:[[[MPWObjectBuilder alloc] initWithClass:classToDecode] autorelease]];
    [self setFrequentStrings:(NSArray*)[[[classToDecode ivarNames] collect] substringFromIndex:1]];
    return self;
}

We still compute the ivar names twice, but that's not really such a big deal, so it's something we can fix later. The same goes for the issue that we should probably be using property names instead of instance variable names, which in the case of properties we have to post-process to get rid of the underscores added by ivar synthesis.

With that, the code to parse to objects simplifies to the following, very similar to what you would see in Swift with JSONDecoder.


-(void)decodeMPWDirect:(NSData*)json
{
    MPWMASONParser *parser=[[MPWMASONParser alloc] initWithClass:[TestClass class]];
    NSArray* objResult = [parser parsedData:json];
}

So, quickly verifying that performance is still the same (always do this!) and...oops! Performance dropped significantly, from 441ms to over 700ms. How could such an innocuous change lead to a 50% performance regression?

The profile shows that we are now spending significantly more time in MPWSmallStringTable's objectForKey: method, where it gets the bytes out of the NSString/CFString, but why that should be the case is a bit mysterious, since we changed virtually nothing.

A little further sleuthing revealed that the strings in question are now instances of NSTaggedPointerString, where previously they were instances of __NSCFConstantString. The latter has a pointer to its byte-oriented character representation, which it can simply return, while the former cleverly encodes the characters in the pointer itself, so it first has to reconstruct that byte representation. The method of constructing that representation and computing the size of such a representation also appears to be fairly generic and slow, going via a stream.

This isn't really easy to solve, since the creation of NSTaggedPointerString instances is hardwired pretty deep in CoreFoundation with no way to disable this "optimization". Although it would be possible to create a new NSString subclass with a byte buffer and make sure to convert to that class before putting instances in the lookup table, that seems like a lot of work. Or we could just revert this convenience.

Damn the torpedoes and full speed ahead!

Alternatively, we really wanted to get rid of this whole process of packing character data into NSString instances just to immediately unpack them again, so let's leave the regression as is and do that instead.

Where previously the builder had an NSString *key instance variable, it now has a char *keyStr and an int keyLen. The string-handling case in the JSON parser is now split between the key and the non-key case, with the non-key case still doing the conversion, but the key case directly sending the char* and length to the builder.


            case '"':
                parsestring( curptr , endptr, &stringstart, &curptr  );
                if ( curptr[1] == ':' ) {
                    [_builder writeKeyString:stringstart length:curptr-stringstart];
                    curptr++;
                } else {
                    curstr = [self makeRetainedJSONStringStart:stringstart length:curptr-stringstart];
                    [_builder writeString:curstr];
                }
                curptr++;
                break;

This means that, at least temporarily, JSON escape handling is disabled for keys. It's straightforward to add back, since makeRetainedJSONStringStart:length: does all its processing in a character buffer, only converting to a string object at the very end.


-(void)writeString:(NSString*)aString
{
    if ( keyStr ) {
        MPWValueAccessor *accesssor=OBJECTFORSTRINGLENGTH(self.accessorTable, keyStr, keyLen);
        [accesssor setValue:aString forTarget:*tos];
        keyStr=NULL;
    } else {
        [self pushObject:aString];
    }
}

If there is a key, we are in a dictionary, otherwise an array (or top-level). In the dictionary case, we can now fetch the ValueAccessor via the OBJECTFORSTRINGLENGTH() macro.
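The idea behind looking up by pointer and length can be sketched in plain C as follows. This is only an illustration under simplifying assumptions: the TableEntry and SmallTable types and the lookupBytes function are hypothetical, and the linear scan stands in for whatever MPWSmallStringTable actually does internally. What matters is that the key bytes are compared in place, without first being wrapped in a string object.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a small string table: keys are stored with
   their lengths so lookup can compare raw bytes from the parse buffer. */
typedef struct {
    const char *key;
    size_t      length;
    void       *value;
} TableEntry;

typedef struct {
    const TableEntry *entries;
    size_t            count;
} SmallTable;

/* Look up a key given as pointer + length. The bytes need not be
   NUL-terminated, so they can point straight into the JSON input. */
static void *lookupBytes(const SmallTable *table, const char *bytes, size_t length)
{
    for (size_t i = 0; i < table->count; i++) {
        const TableEntry *e = &table->entries[i];
        if (e->length == length && memcmp(e->key, bytes, length) == 0) {
            return e->value;
        }
    }
    return NULL; /* unknown key */
}
```

Because the length check comes first, most non-matching entries are rejected without touching their bytes at all.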

The results are encouraging: 299ms, or 147 MB/s.

The MPWPlistBuilder also needs to be adjusted: as it builds an NSDictionary and not an object, it actually needs the NSString key, but the parser no longer delivers those. So it just creates them on the fly:


-(NSString*)key
{
    NSString *key=nil;
    if ( keyStr) {
        if ( _commonStrings ) {
            key=OBJECTFORSTRINGLENGTH(_commonStrings, keyStr, keyLen);
        }
        if ( !key ) {
            key=[[[NSString alloc] initWithBytes:keyStr length:keyLen encoding:NSUTF8StringEncoding] autorelease];
        }
    }
    return key;
}

Surprisingly, this makes the dictionary parsing code slightly faster, bringing it up to par with NSJSONSerialization at 421ms.

Eliminating NSNumber

Our use of NSNumber/CFNumber values is very similar to our use of NSString for keys: the parser wraps the parsed number in the object, the builder then unwraps it again.

Changing that, initially just for integers, is straightforward: add an integer-valued message to the builder protocol and implement it.


-(void)writeInteger:(long)number
{
    if ( keyStr ) {
        MPWValueAccessor *accesssor=OBJECTFORSTRINGLENGTH(_accessorTable, keyStr, keyLen);
        [accesssor setIntValue:number forTarget:*tos];
        keyStr=NULL;
    } else {
        [self pushObject:@(number)];
    }
}

The actual integer parsing code is not in MPWMASONParser but in its superclass, and as we don't want to touch that for now, let's just copy-paste that code, modifying it to return a C primitive type instead of an object.


-(long)longElementAtPtr:(const char*)start length:(long)len
{
    long val=0;
    int sign=1;
    const char *end=start+len;
    if ( start[0] =='-' ) {
        sign=-1;
        start++;
    } else if ( start[0]=='+' ) {
        start++;
    }
    while ( start < end && isdigit(*start)) {
        val=val*10+ (*start)-'0';
        start++;
    }
    val*=sign;
    return val;
}

I am sure there are better ways to turn a string into an int, but it will do for now. Similarly to the key/string distinction, we now special case integers.
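For comparison, here is one possible stricter variant in plain C, offered as a sketch rather than as what the parser uses. The function name parseLongChecked is hypothetical. Unlike the method above, it rejects empty input and non-digit characters instead of stopping silently, and it reports overflow instead of wrapping around. It accumulates the value as a negative number so that the most negative long can be parsed as well.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Parse a decimal integer from a buffer that is not NUL-terminated.
   Returns false on empty input, a non-digit character, or overflow. */
static bool parseLongChecked(const char *start, size_t len, long *result)
{
    const char *end = start + len;
    bool negative = false;
    if (start < end && (*start == '-' || *start == '+')) {
        negative = (*start == '-');
        start++;
    }
    if (start == end) {
        return false;                 /* no digits at all */
    }
    long val = 0;                     /* accumulated as a non-positive number */
    for (; start < end; start++) {
        if (*start < '0' || *start > '9') {
            return false;             /* not a digit */
        }
        int digit = *start - '0';
        if (val < (LONG_MIN + digit) / 10) {
            return false;             /* val * 10 - digit would overflow */
        }
        val = val * 10 - digit;
    }
    if (!negative) {
        if (val < -LONG_MAX) {
            return false;             /* magnitude does not fit a positive long */
        }
        val = -val;
    }
    *result = val;
    return true;
}
```

The extra checks cost a comparison per digit, so whether they are worth it in a hot parsing loop is something to measure rather than assume.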


                if ( isReal) {
                    number = [self realElement:numstart length:curptr-numstart];

                    [_builder writeString:number];
                } else {
                    long n=[self longElementAtPtr:numstart length:curptr-numstart];
                    [_builder writeInteger:n];
                }

Again, not pretty, but we can clean it up later.

Together with using direct instance variable access instead of properties to get to the accessorTable, this yields a very noticeable speed boost:

229 ms, or 195 MB/s.

Nice.

Discussion

What happened here? Just random hacking on the profile and replacing nice object-oriented programming with ugly but fast C?

Although there is obviously some truth in that, profiles were used and more C primitive types appeared, I would contend that what happened was a move away from objects, and particularly away from generic and expensive Foundation objects ("Foundation oriented programming"?) towards message oriented programming.

I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea.
The big idea is "messaging" -- that is what the kernal of Smalltalk/Squeak is all about (and it's something that was never quite completed in our Xerox PARC phase). The Japanese have a small word -- ma -- for "that which is in between" -- perhaps the nearest English equivalent is "interstitial". The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be.
prototypes vs classes was: Re: Sun's HotSpot, Alan Kay, Squeak Mailing List (1998)

It turns out that message oriented programming (or should we call it Protocol Oriented Programming?) is where Objective-C shines: coarse-grained objects, implemented in C, that exchange messages, with the messages also as primitive as you can get away with. That was the idea, and when you follow that idea, Objective-C just hums, you get not just fast, but also flexible and architecturally nicely decoupled objects: elegance.

The combination of objects + primitive messages is very similar to another architecturally elegant and productive style: Unix pipes and filters. The components are in C and can have as rich an internal structure as you want, but they have to talk to each other via byte-streams. This can also be made very fast, and also prevents or at least reduces coupling between the components.

Another aspect is the tension between an API for use and an API for reuse, particularly within the constraints of call/return. When you get tasked with "Create a component + API for parsing JSON", something like NSJSONSerialization is something you almost have to come up with: feed it JSON, out comes parsed JSON. Nothing could be more convenient to use for "parsing JSON".

MPWMASONParser on the other hand is not convenient at all when viewed in isolation, but it's much more capable of being smoothly integrated into a larger processing chain. And most of the work that NSJSONSerialization did in the name of convenience is now just wasted, it doesn't make further processing any easier but sucks up enormous amounts of time.

Anyway, let's look at the current profile:

First, times are now small enough that high-resolution (100µs) sampling is now necessary to get meaningful results. Second, the NSNumber/CFNumber and NSString packing and unpacking is gone, with an even bigger chunk of the remaining time now going to object creation. objc_msgSend() is now starting to actually become noticeable, as is the (inefficient) character level parsing. The accessors of our test objects start to appear, if barely.

With the work we've done so far, we've improved speed around 5x from where we started, and at 195 MB/s are almost 20x faster than Swift's JSONDecoder.

Can we do better? Stay tuned.

Note

I can help not just Apple, but also you and your company/team with performance and agile coaching, workshops and consulting. Contact me at info at metaobject.com.

Monday, April 20, 2020

Somewhat Faster JSON Support for iOS/macOS, Part 6: Cutting KVC out of the Loop

Last time, we actually made some significant headway by taking advantage of the dematerialisation of the plist intermediate representation. So instead of first producing an array of dictionaries, we went directly from JSON to the final object representation.

This got us down from around 1.1 seconds to a little over 600 milliseconds.

It was accomplished by using the Key Value Coding method setValue:forKey: to directly set the attributes of the objects from the parsed JSON. Oh, and instantiating those objects in the first place, instead of dictionaries.

That this should be so much faster than most other methods, for example beating Swift's JSONDecoder() by a cool 7x, is a little surprising, given that KVC is, as I mentioned in the first article of the series, the slowest mechanism for getting data in and out of objects short of deliberate Rube Goldberg mechanisms.

What is KVC and why is it slow?

Key Value Coding was a core part of NeXT's Enterprise Object Framework, introduced in 1994.

Key-value coding is a data access mechanism in which the properties of an object are accessed indirectly by key or name, rather than directly as fields or by invocation of accessor methods. It is used throughout Enterprise Objects but is perhaps most useful to you when accessing data in relationships between enterprise objects.
Key-value coding enables the use of keypaths to traverse relationships. For example, if a Person entity has a relationship called toPhoto whose destination entity (called PersonPhoto) contains an attribute called photo, you could access that data by traversing the keypath toPhoto.photo from within a Person object.
Keypaths are just one way key-value coding is an invaluable feature of Enterprise Objects. In general, though, it is most useful in providing a consistent way to access an object's data. Rather than needing to know if an object's data members have accessor methods, what the names of those accessor methods are, or if the data is accessible through fields, all you need to know are the keys that represent an object’s data. Key-value coding automatically finds the data, regardless of how the object provides its data. In this context, key-value coding satisfies the classic design pattern principle to “encapsulate the things that varies.”
Key Concepts of Object-Oriented Programming, NeXT/Apple EOF Overview

It still is an extremely powerful programming technique that lets us write algorithms that work generically with any object properties, and is currently the basis for CoreData, AppleScript support, Key Value Observing and Bindings. (Though I am somewhat skeptical of some of these, not least for performance reasons, see The Siren Call of KVO and (Cocoa) Bindings). It was also part of the inspiration for Polymorphic Identifiers.

The core of KVC are the valueForKey: and setValue:forKey: messages, which have default implementations in NSObject. These default implementations take the NSString key, derive an accessor message from that key and then send the message, either setting or returning a value. If the value that the underlying message takes/returns is a non-object type, then KVC wraps/unwraps as necessary.

If this sounds expensive, then that's because it is. To derive the set accessor from the key, the first character of the key has to be capitalized, then the string "set" prepended and the string converted to an Objective-C selector (SEL). In theory, this has to be done on every call to one of the KVC methods, and it has to be done with NSString objects, which do a fantastic job of representing human-visible text, but are a bit heavy-weight for low-level work.
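The string manipulation described here can be sketched in plain C as follows. The function name deriveSetterName is hypothetical, and the sketch covers only the basic "set" + capitalized key + ":" pattern, operating on C strings rather than NSString; the real KVC lookup considers more than this one pattern and finishes by turning the name into a selector.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Derive the conventional setter name for a key: "name" -> "setName:".
   Returns 0 on success, -1 if the key is empty or the buffer is too small. */
static int deriveSetterName(const char *key, char *buffer, size_t bufferSize)
{
    size_t keyLength = strlen(key);
    /* Space needed: "set" + key + ":" + terminating NUL. */
    if (keyLength == 0 || bufferSize < keyLength + 5) {
        return -1;
    }
    snprintf(buffer, bufferSize, "set%c%s:",
             toupper((unsigned char)key[0]), key + 1);
    return 0;
}
```

Even in this stripped-down form there is a length computation, a character conversion and a formatted copy for every single property write, all before any lookup or message send has happened.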

Doing the full computation on every invocation would be way too expensive, so Apple caches some of the intermediate results. As there is no obvious place to put those intermediate results, they are placed in global hash tables, keyed by class and property/key name. However, even those lookups are still significantly more expensive than the final set or get property access, and we have to do multiple lookups. Since these tables have to be global, locking is also required.

ValueAccessor

All this expense could be avoided if we had a custom object to mediate the access, rather than a naked NSString. That object could store those computed values, and then provide fast and generic access to arbitrary properties. Enter MPWValueAccessor (.h .m).

A word of warning: unlike MPWStringtable, MPWValueAccessor is mostly experimental code. It does have tests and largely works, but it is incomplete in many ways and also contains a bunch of extra and probably extraneous ideas. It is sufficient for our current purpose.

The core of this class is the AccessPathComponent struct.


typedef struct {
    Class   targetClass;
    int     targetOffset;
    SEL     getSelector,putSelector;
    IMP0    getIMP;
    IMP1    putIMP;
    id      additionalArg;
    char    objcType;
} AccessPathComponent;

This struct contains a number of different ways of getting/setting the data:

the integer offset into the object where the ivar is located
a pair of Objective-C selectors/message names, one for getting, one for setting.
a pair of function pointers to the Objective-C methods that the respective selectors resolve to
the additional arg is the key, to be used for keyed access

The getIMP and putIMP are initialized to objc_msgSend(), so they can always be used. If we bind the ValueAccessor to a class, those function pointers get resolved to the actual getter/setter methods. In addition the objcType gets set to the type of the instance variable, so we can do automatic conversions like KVC. (This was some code I actually had to add between the last instalment and the current one.)

The key takeaway is that all the string processing and lookup that KVC needs to do on every call is done once during initialization; after that it's just a few messages and/or pre-resolved function calls.
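The same resolve-once, use-many idea can be sketched without the Objective-C runtime. The following plain C is an illustration only, not MPWValueAccessor's implementation: TestRecord, FieldAccessor, bind_accessor and set_long are hypothetical names, and the one-character type tags are arbitrary. All string comparison happens in bind_accessor; set_long is just a store at a precomputed offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the TestClass used in this series (hypothetical). */
typedef struct {
    long        hi;
    long        there;
    const char *comment;
} TestRecord;

/* Everything needed to write one field, computed once up front. */
typedef struct {
    const char *name;    /* key as it appears in the JSON */
    size_t      offset;  /* byte offset of the field inside TestRecord */
    char        type;    /* arbitrary tag: 'l' = long, '*' = C string */
} FieldAccessor;

static const FieldAccessor kFields[] = {
    { "hi",      offsetof(TestRecord, hi),      'l' },
    { "there",   offsetof(TestRecord, there),   'l' },
    { "comment", offsetof(TestRecord, comment), '*' },
};

/* Slow path, run once per key: string comparison to find the field. */
const FieldAccessor *bind_accessor(const char *name)
{
    for (size_t i = 0; i < sizeof kFields / sizeof kFields[0]; i++) {
        if (strcmp(kFields[i].name, name) == 0) {
            return &kFields[i];
        }
    }
    return NULL;
}

/* Fast path, run once per value: no string work, just a store at a known offset. */
void set_long(const FieldAccessor *accessor, TestRecord *target, long value)
{
    assert(accessor != NULL && accessor->type == 'l');
    memcpy((char *)target + accessor->offset, &value, sizeof value);
}
```

A caller would bind the accessors it needs once, before the parse starts, and then only ever call the fast path inside the loop.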

Hooking up the ValueAccessor

Adapting the MPWObjectBuilder (.h .m) to use MPWValueAccessor was much easier than I had expected. The following shows the changes made:


@property (nonatomic, strong) MPWSmallStringTable *accessorTable;

...

-(void)setupAccessors:(Class)theClass
{
    NSArray *ivars=[theClass ivarNames];
    ivars=[[ivars collect] substringFromIndex:1];
    NSMutableArray *accessors=[NSMutableArray arrayWithCapacity:ivars.count];
    for (NSString *ivar in ivars) {
        MPWValueAccessor *accessor=[MPWValueAccessor valueForName:ivar];
        [accessor bindToClass:theClass];
        [accessors addObject:accessor];
    }
    MPWSmallStringTable *table=[[[MPWSmallStringTable alloc] initWithKeys:ivars values:accessors] autorelease];
    self.accessorTable=table;
}

-(void)writeObject:anObject forKey:aKey
{
    MPWValueAccessor *accessor=[self.accessorTable objectForKey:aKey];
    [accessor setValue:anObject forTarget:*tos];
}

The bulk of the changes come as part of the new -setupAccessors: method. It first asks the class what its instance variables are, creates a value accessor for each instance variable (name), binds the accessor to the class and finally puts the accessors in a lookup table keyed by name.

The -writeObject:forKey: method is modified to look up and use a value accessor instead of using KVC.

Results

The parsing driver code didn't have to be changed; re-running it on our non-representative 44 MB JSON file yields the following time:

441 ms.

Now we're really starting to get somewhere! This is just shy of 100 MB/s and 10x faster than Swift's JSONDecoder, and within 5% of raw NSJSONSerialization.

Analysis and next steps

Can we do better? Why yes, glad you asked. Let's have a look at the profile.

First thing to note is that object-creation (beginDictionary) is now the #1 entry under the parse, as it should be. This is another indicator that we are not just moving in the right direction, but also closing in on the endgame.

However, there is still room for improvement. For example, although actually searching the SmallStringTable for the ValueAccessor (offsetOfCStringWithLengthInTableOfLength()) takes only 2.7% of the time, about the same as getting the internal char* out of a CFString via the fast-path (CFStringGetCStringPtr()), the total time for the -objectForKey: is a multiple of that, at 13%. This means that unwrapping the NSString takes more time than doing the actual work. Wrapping the char* and length into an NSString also takes significant time, and all of this work is redundant... we would be better off just passing along the char* and length.
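What passing along the char* and length could look like, as a rough sketch in plain C. index_of_key is a hypothetical stand-in, not the actual offsetOfCStringWithLengthInTableOfLength() implementation, and a linear scan is assumed purely for brevity (a real table would at least precompute the key lengths). The point is only that the key bytes are compared in place inside the JSON buffer, with no NSString created along the way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the key that matches the `len` bytes at `bytes`, or -1.
   The input does not need to be NUL-terminated, so it can point straight
   into the JSON buffer. Hypothetical helper for illustration. */
int index_of_key(const char *const *keys, int keyCount, const char *bytes, size_t len)
{
    assert(keys != NULL && bytes != NULL);
    for (int i = 0; i < keyCount; i++) {
        if (strlen(keys[i]) == len && memcmp(keys[i], bytes, len) == 0) {
            return i;
        }
    }
    return -1;
}
```

The returned index could then be used to pick a pre-bound accessor out of a parallel array.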

A similar wrap/unwrap situation occurs with integers, which we first turn into NSNumbers, only to immediately get the integer out again so we can set it.
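The integer case can avoid the round trip in the same way. A minimal sketch in plain C, assuming the parser already knows where the digits start and how many bytes they span; long_from_bytes is a hypothetical helper and does no overflow checking.

```c
#include <assert.h>
#include <stddef.h>

/* Convert the digits in [bytes, bytes+len) to a long, with an optional
   leading '-'. Stops at the first non-digit. No overflow checking:
   this is a sketch, not a validator. */
long long_from_bytes(const char *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    size_t i = 0;
    int negative = 0;
    long value = 0;
    if (len > 0 && bytes[0] == '-') {
        negative = 1;
        i = 1;
    }
    for (; i < len && bytes[i] >= '0' && bytes[i] <= '9'; i++) {
        value = value * 10 + (bytes[i] - '0');
    }
    return negative ? -value : value;
}
```

The resulting long can go straight into an integer setter, without an NSNumber ever being created.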

objc_msgSend() also starts getting noticeable, so looking at a bit of IMP-caching and just eliminating unnecessary indirection also seems like a good idea.

That's another aspect of optimization work: while the occasional big win is welcome, getting to truly outstanding performance means not being satisfied with that, but slogging through all the small-ish seeming detail.

Note

I can help not just Apple, but also you and your company with performance and agile coaching, workshops and consulting. Contact me at info at metaobject.com.

Tuesday, April 14, 2020

Somewhat Less Lethargic JSON Support for iOS/macOS, Part 3: Dematerialization

In the previous instalments, we looked at and analysed the status quo for JSON parsing on Apple platforms in general and Swift in particular, and it wasn't all that promising: we know that parsing to an intermediate representation of Foundation plist types (dictionaries, arrays, strings, numbers) is one of the worst possible ideas, yet it is the fastest we have. We know that creating objects from JSON is, or at least should be, the slowest part of this, yet it is by far the fastest. And, last but not least, we also know that Key-Value Coding is the slowest possible way to transfer values to those objects, yet Swift Coding somehow manages to be several times slower.

So either we're wrong about all of these things we know, always a distinct possibility, or there is something fishy going on. My vote is on the latter, and while figuring out exactly what fishy thing is going on would probably be a fascinating investigation for an Apple performance engineer, I prefer proof by creation:

Just make something that doesn't have these problems. In that case you not only know where the problem is, you also have a better alternative to use.

MASON

Without much further ado, here is the definition of the MPWMASONParser class:


@class MPWSmallStringTable;
@protocol MPWPlistStreaming;

@interface MPWMASONParser : MPWXmlAppleProplistReader {
	BOOL inDict;
	BOOL inArray;
	MPWSmallStringTable *commonStrings;
}

@property (nonatomic, strong) id  builder;

-(void)setFrequentStrings:(NSArray*)strings;

@end

What it does is send messages of the MPWPlistStreaming protocol to its builder property. So a Message-oriented parser for JaSON, just like MAX is the Message oriented API for XML.

The implementation-history is also reflected in the fact that it is a subclass of MPWXmlAppleProplistReader, which itself is a subclass of MPWMAXParser. The core of the implementation is a loop that handles JSON syntax and sends one-way messages for the different elements to the builder. It looks very similar to loops in other simple parsers (and probably not at all like the crazy SIMD contortions of simdjson). When done, it returns whatever the builder constructed.


-parsedData:(NSData*)jsonData
{
	[self setData:jsonData];
	const char *curptr=[jsonData bytes];
	const char *endptr=curptr+[jsonData length];
	const char *stringstart=NULL;
	NSString *curstr=nil;
	while (curptr < endptr ) {
		switch (*curptr) {
			case '{':
				[_builder beginDictionary];
				inDict=YES;
				inArray=NO;
				curptr++;
				break;
			case '}':
				[_builder endDictionary];
				curptr++;
				break;
			case '[':
				[_builder beginArray];
				inDict=NO;
				inArray=YES;
				curptr++;
				break;
			case ']':
				[_builder endArray];
				curptr++;
				break;
			case '"':
                parsestring( curptr , endptr, &stringstart, &curptr  );
                curstr = [self makeRetainedJSONStringStart:stringstart length:curptr-stringstart];
				curptr++;
				if ( curptr < endptr && *curptr == ':' ) {
					[_builder writeKey:curstr];
					curptr++;
					
				} else {
					[_builder writeString:curstr];
				}
				break;
			case ',':
				curptr++;
				break;
			case '-':
			case '0':
			case '1':
			case '2':
			case '3':
			case '4':
			case '5':
			case '6':
			case '7':
			case '8':
			case '9':
			{
				BOOL isReal=NO;
				const char *numstart=curptr;
				id number=nil;
				if ( *curptr == '-' ) {
					curptr++;
				}
				while ( curptr < endptr && isdigit(*curptr) ) {
					curptr++;
				}
				if ( curptr < endptr && *curptr == '.' ) {
					curptr++;
					while ( curptr < endptr && isdigit(*curptr) ) {
						curptr++;
					}
					isReal=YES;
				}
				if ( curptr < endptr && (*curptr=='e' || *curptr=='E') ) {
					curptr++;
					while ( curptr < endptr && isdigit(*curptr) ) {
						curptr++;
					}
					isReal=YES;
				}
                number = isReal ?
                            [self realElement:numstart length:curptr-numstart] :
                            [self integerElementAtPtr:numstart length:curptr-numstart];

				[_builder writeString:number];
				break;
			}
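			case 't':
				if ( (endptr-curptr) >=4  && !strncmp(curptr, "true", 4)) {
					curptr+=4;
					[_builder pushObject:true_value];
				} else {
					// not the literal "true": raise instead of looping forever on the same byte
					[NSException raise:@"invalidcharacter" format:@"JSON invalid character %x/'%c' at %td",*curptr,*curptr,curptr-(char*)[data bytes]];
				}
				break;
			case 'f':
				if ( (endptr-curptr) >=5  && !strncmp(curptr, "false", 5)) {
					curptr+=5;
					[_builder pushObject:false_value];
				} else {
					// not the literal "false": raise instead of looping forever on the same byte
					[NSException raise:@"invalidcharacter" format:@"JSON invalid character %x/'%c' at %td",*curptr,*curptr,curptr-(char*)[data bytes]];
				}
				break;
			case 'n':
				if ( (endptr-curptr) >=4  && !strncmp(curptr, "null", 4)) {
					[_builder pushObject:[NSNull null]];
					curptr+=4;
				} else {
					// not the literal "null": raise instead of looping forever on the same byte
					[NSException raise:@"invalidcharacter" format:@"JSON invalid character %x/'%c' at %td",*curptr,*curptr,curptr-(char*)[data bytes]];
				}
				break;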
			case ' ':
			case '\n':
				while (curptr < endptr && isspace(*curptr)) {
					curptr++;
				}
				break;

			default:
				[NSException raise:@"invalidcharacter" format:@"JSON invalid character %x/'%c' at %td",*curptr,*curptr,curptr-(char*)[data bytes]];
				break;
		}
	}
    return [_builder result];

}

It almost certainly doesn't correctly handle all edge-cases, but doing so is unlikely to impact overall performance.
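One example of such an edge case is a signed exponent such as 1.5e-3, which the number branch above stops short of. A hedged sketch of a slightly more careful scan, in plain C: scan_number is a hypothetical helper, not part of MPWMASONParser, and it still is not a validator (it accepts a lone "-" or leading zeros, for instance).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Scan one JSON number starting at p and return a pointer just past it.
   *isReal is set to 1 if a fraction or exponent was seen, else 0.
   Every dereference is guarded by p < end, and the exponent may carry
   a '+' or '-' sign. Hypothetical helper for illustration. */
const char *scan_number(const char *p, const char *end, int *isReal)
{
    assert(p != NULL && end != NULL && isReal != NULL);
    *isReal = 0;
    if (p < end && *p == '-') {
        p++;
    }
    while (p < end && isdigit((unsigned char)*p)) {
        p++;
    }
    if (p < end && *p == '.') {
        p++;
        while (p < end && isdigit((unsigned char)*p)) {
            p++;
        }
        *isReal = 1;
    }
    if (p < end && (*p == 'e' || *p == 'E')) {
        p++;
        if (p < end && (*p == '+' || *p == '-')) {
            p++;                      /* signed exponent, e.g. 1.5e-3 */
        }
        while (p < end && isdigit((unsigned char)*p)) {
            p++;
        }
        *isReal = 1;
    }
    return p;
}
```

As the text says, none of this is likely to matter for overall performance: it adds a couple of comparisons per number.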

Dematerializing Property Lists with MPWPlistStreaming

Above, I mentioned that MASON is message-oriented, and that its main purpose is sending messages of the MPWPlistStreaming protocol to its builder. Here is that protocol:


@protocol MPWPlistStreaming

-(void)beginArray;
-(void)endArray;
-(void)beginDictionary;
-(void)endDictionary;
-(void)writeKey:aKey;
-(void)writeString:aString;
-(void)writeNumber:aNumber;
-(void)writeObject:anObject forKey:aKey;
-(void)pushContainer:anObject;
-(void)pushObject:anObject;

@end

What this enables is using property lists as an intermediate format without actually instantiating them, instead sending the messages we would have sent if we had a property list. Protocol Oriented Programming, anyone? Oh, I forgot, you can only do that in Swift...

The same protocol can also be used on the output side, then you get something like Standard Object Out.
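To illustrate the idea of streaming without materializing, here is a small self-contained sketch in plain C. It is not MASON: StreamingBuilder, stream_json and HiSummer are invented for this example, the callbacks only loosely mirror three of the protocol's messages, and the scanner handles just enough JSON (objects, arrays, unescaped strings, non-negative integers) for the test data used in this series. The "builder" never creates a dictionary; it just counts them and sums the values stored under "hi".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Callbacks playing the role of the builder's messages (hypothetical). */
typedef struct {
    void (*beginDictionary)(void *ctx);
    void (*writeKey)(void *ctx, const char *bytes, size_t len);
    void (*writeInteger)(void *ctx, long value);
} StreamingBuilder;

/* Walk a restricted JSON subset and report events instead of building a tree. */
void stream_json(const char *json, const StreamingBuilder *b, void *ctx)
{
    assert(json != NULL && b != NULL);
    const char *p = json;
    const char *end = json + strlen(json);
    while (p < end) {
        if (*p == '{') {
            b->beginDictionary(ctx);
            p++;
        } else if (*p == '"') {
            const char *start = ++p;
            while (p < end && *p != '"') {
                p++;
            }
            size_t len = (size_t)(p - start);
            if (p < end) {
                p++;                       /* skip closing quote */
            }
            if (p < end && *p == ':') {    /* a string followed by ':' is a key */
                b->writeKey(ctx, start, len);
                p++;
            }                              /* string values are ignored in this sketch */
        } else if (isdigit((unsigned char)*p)) {
            long value = 0;
            while (p < end && isdigit((unsigned char)*p)) {
                value = value * 10 + (*p - '0');
                p++;
            }
            b->writeInteger(ctx, value);
        } else {
            p++;                           /* '}', '[', ']', ',', whitespace */
        }
    }
}

/* A "builder" that never materialises anything: it counts dictionaries
   and sums the integers stored under the key "hi". */
typedef struct {
    long dictCount;
    long hiSum;
    int  lastKeyWasHi;
} HiSummer;

void summer_beginDictionary(void *ctx)
{
    ((HiSummer *)ctx)->dictCount++;
}

void summer_writeKey(void *ctx, const char *bytes, size_t len)
{
    ((HiSummer *)ctx)->lastKeyWasHi = (len == 2 && memcmp(bytes, "hi", 2) == 0);
}

void summer_writeInteger(void *ctx, long value)
{
    HiSummer *s = (HiSummer *)ctx;
    if (s->lastKeyWasHi) {
        s->hiSum += value;
    }
    s->lastKeyWasHi = 0;
}
```

Run over two of our test objects with hi set to 1 and 2, it reports two dictionaries and a sum of 3, and nothing resembling a property list was ever allocated.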

Trying it out

By default, MPWMASONParser sets its builder to an instance of MPWPlistBuilder, which, as the name hints, builds property lists. Just like NSJSONSerialization.

So let's give it a whirl:


-(void)decodeMPWDicts:(NSData*)json
{
    MPWMASONParser *parser=[MPWMASONParser parser];
    NSArray* plistResult = [parser parsedData:json];
    NSLog(@"MPWMASON %@ with %ld dicts",[plistResult firstObject],[plistResult count]);
}

And the time is, drumroll, ... 0.621 seconds.

Hmm...that's disappointing. We didn't do anything wrong, yet we are almost 50% slower than NSJSONSerialization. Well, those dang Apple engineers do know what they're doing after all, and we should probably just give up.

Well, not so fast. Let's at least check out what we did wrong. Unleash the Cracken...er...Instruments!

So that's interesting: the vast majority of time is actually spent in Apple code building the plist. And we have to build the plist. So how does NSJSONSerialization get the same job done faster? Last I checked (with NSPropertyListSerialization, but close enough), they actually use specialised CoreFoundation-based dictionaries that are optimized for the case of having a lot of string keys and having them all in one place during initialization. These are not exposed; CoreFoundation being C-based means non-exposure is very effective, and apparently Apple stopped open-sourcing CFLite a while ago.

So how can we do better? Tune in for the next exciting instalment :-)

Sunday, April 12, 2020

Somewhat Less Lethargic JSON Support for iOS/macOS, Part 2: Analysis

In Part 1: The Status Quo, we saw that something isn't quite right with JSON processing in Apple land: while something like simdjson can accomplish the basic parsing task at a rate of 2.5 GB/s and creating objects happens at an equivalent rate of 310 MB/s, Swift's JSON Codable support manages a measly 10 MB/s, underperforming the MacBook Pro's built-in SSD by at least 200x and a Gigabit network connection still by a factor of 10.

Some of the feedback I got indicated that the implications of the data presented in "Status Quo" were not as clear as they should have been, so a little analysis before we dive into code.

The MessagePack decoder is the only "pure" Swift Codable decoder. As it is so slow as to make the rest of the graph almost unreadable and was only included for comparison, not actually being a JSON decoder, let's leave it out for now. In addition, let's show how much time of each result is the underlying parser and how much time is spent in object creation.

This chart immediately lays to rest two common hypotheses for the performance issues of Swift Codable:

It's the object creation.
No.
That is, yes, object creation is slow compared to many other things, but here it represents only around 3% of the total runtime. Yes, finding a way to reduce that final 3% would also be cool (watch this space!), but how about tackling the 97% first?
It's the fact that it is using NSJSONSerialization and therefore Objective-C under the hood that makes it slow.
No.
Again, yes, parsing something to a dictionary-based representation that is more expensive than the final representation is not ideal and should be avoided. This is one of the things we will be doing. However:
- The NSJSONSerialization part of decoding makes up only 13% of the running time, the remaining 87% are in the Swift decoder part.
- Turning the dictionaries into objects using Key-Value-Coding, which to me is just about the slowest imaginable mechanism for getting data into an object that's not deliberately adding Rube-Goldberg elements, "only" adds 740ms to the basic NSJSONSerialization's parse from JSON to dictionaries. While this is ~50% more time than the parse to dictionaries and 5x the pure object creation time, it is still 5x less than the Codable overhead.
- All the pure Swift parsers are also this slow or slower.

It also shows that stjson is not a contender (not that it ever claimed to be), because it is slower than even Swift's JSONDecoder without actually going to full objects. JASON is significantly faster, but also doesn't go to objects, and for not going to objects is still significantly slower than NSJSONSerialization. That really only leaves the NSJSONSerialization variants as useful comparison points for what is to come, the rest is either too slow, doesn't do what we need it to do, or both.

Here we can see fairly clearly that creating objects instead of dictionaries would be better. Better than creating dictionaries and certainly much better than first creating dictionaries and then objects, as if that weren't obvious. It is also clear that the actual parsing of JSON text doesn't add all that much extra overhead relative to just creating the dictionaries. In fact, just adding the -copy to convert from mutable dictionaries to immutable dictionaries appears to take more time than the parse!

In truth, it's actually not quite that way, because as far as I know, NSJSONSerialization, like its companion NSPropertyListSerialization, uses special dictionaries that are cheaper to create from a textual representation.

simdjson

With all that in mind, it should be clear that simdjson, although it would likely take the pure parse time for that down to around 17 ms, is not that interesting, at least at this stage. What it optimizes is the part that already takes the least time, and is already overwhelmed by even small changes in the way we create our objects.

What this also means is that simdjson will only be useful if it doesn't make object creation slower. This is also a lesson I learned when creating the MAX XML parser: you can't just make the XML parser part as fast as possible, sometimes it makes sense to make the parser itself somewhat slower if that means other parts, such as object creation, become significantly faster. Or more generally: it's not enough to have fast components, they have to play well together. Optimization is about systems and architecture. If you want to do it well.

MASON

In the next installment, we will start looking at the actual parser.

Friday, April 10, 2020

Somewhat Less Lethargic JSON Support for iOS/macOS, Part 1: The Status Quo

I just finished watching Daniel Lemire's talk on the current iteration of simdjson, a JSON parser that clocks in at 2.5GB/s! I've been following Daniel's work for some time now and can't really recommend it highly enough.

This reminded me of a recent twitter conversation where I had offered to contribute a fast, Swift-compatible JSON parser loosely based on MAX, my fast and convenient XML parser. Due to various factors most of which are not under my control, I can't really offer anything that's fast when compared to simdjson, but I can manage something quite a bit less lethargic than what's currently on offer in the Apple and particularly the Swift world.

Environmental assumptions and constraints

My first assumption is that we are going to operate in the Apple ecosystem, and for simplicity's sake I am going to use macOS. Next, I will assume that what we want from our parse(r) are domain objects for further processing within our application (or structs, the difference is not important in this context).

We are going to use the following class with a mix of integer and string instance variables, in Swift:


@objc class TestClass: NSObject, Codable {
    let hi:Int
    let there:Int
    let comment:String
...
}

and the same in Objective-C:


@interface TestClass : NSObject

@property (nonatomic) long hi,there;
@property (nonatomic,strong) NSString *comment;

@end

To make it all easy to measure, we are going to use one million objects, which we are going to initialise with increasing integers and the constant string "comment". This yields the same 44MB JSON file with different serialisation methods, which can be correctly parsed by all the parsers tested. This is obviously a very simple class and file structure, but I think it gives a reasonable approximation for real-world use.

The first thing to check is how quickly we can create these objects straight in code, without any parsing.

That should give us a good upper bound for the performance we can achieve when parsing to domain objects.


#define COUNT 1000000
-(void)createObjects
{
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:COUNT+20];
    for ( int i=0;i<COUNT;i++ ) {
        TestClass *cur=[TestClass new];
        cur.hi=i;
        cur.there=i;
        cur.comment=@"comment";
        [objResult addObject:cur];
    }
    NSLog(@"Created objects in code w/o parsing %@ with %ld objects",objResult[0],[objResult count]);
}

On my Quad Core, 2.7 GHz MBP '18, this runs in 0.141 seconds. Although we aren't actually parsing, it would mean that just creating all the objects that would result from parsing our 44MB JSON file would yield a rate of 312 MB/s.
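The rates quoted throughout this series are simply file size divided by elapsed time. As a sketch in plain C (mb_per_second is a name chosen here, and the 44 MB figure is taken at face value):

```c
#include <assert.h>

/* Throughput in MB/s for processing `megabytes` of input in `seconds`.
   Hypothetical helper for illustration. */
double mb_per_second(double megabytes, double seconds)
{
    assert(seconds > 0.0);
    return megabytes / seconds;
}
```

mb_per_second(44.0, 0.141) comes out at roughly 312, the "rate" for pure object creation quoted above.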

Wait a second! 312MB/s is almost 10x slower than Daniel Lemire's parser, the one that actually parses JSON, and we are only creating the objects that would result if we were parsing, without doing any actual parsing.

This is one of the many unintuitive aspects of parsing performance: the actual low-level, character-level parsing is generally the least important part for overall performance. Unless you do something crazy like use NSScanner. Don't use NSScanner. Please.

One reason this is unintuitive is that we all learned that performance is dominated by the innermost loop, and character level processing is the innermost loop. But when the performance differences span orders of magnitude and the inner and outer loops differ by less than that amount, the stuff happening in the outer loop can dominate.
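A back-of-the-envelope version of that argument, as a sketch in plain C. It assumes the 44 MB file is about 44 bytes per object (44 million bytes over one million objects) and takes the 2.5 GB/s figure as 2.5e9 bytes per second; both function names are made up here.

```c
#include <assert.h>

/* Nanoseconds spent per item when `count` items take `seconds` in total. */
double ns_per_item(double seconds, double count)
{
    assert(count > 0.0);
    return seconds * 1e9 / count;
}

/* Nanoseconds needed to scan `bytes` bytes at `gbPerSecond` (1 GB = 1e9 bytes).
   bytes / (gbPerSecond * 1e9) seconds, times 1e9 ns per second. */
double ns_to_scan(double bytes, double gbPerSecond)
{
    assert(gbPerSecond > 0.0);
    return bytes / gbPerSecond;
}
```

With the numbers above, scanning one object's worth of text costs about 17.6 ns (ns_to_scan(44, 2.5)), while creating the object costs about 141 ns (ns_per_item(0.141, 1e6)), so the outer-loop work is roughly eight times the inner-loop work.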

NSJSONSerialization

Apple's JSON story very much revolves around NSJSONSerialization, very much like most of the rest of its serialization story revolves around the very similar NSPropertyListSerialization class. It has a reasonably quick implementation, turning the 44 MB JSON file into an NSArray of NSDictionary instances in 0.421 seconds when called from Objective-C, for a rate of 105 MB/s. From Swift, it takes 0.562 seconds, for 78 MB/s.

Of course, that gets us to a property list (array of dicts, in this case), not to the domain objects we actually want.

If you read my book (did I mention my book? Oh, I think I did), you will know that this type of dictionary representation is fairly expensive: expensive to create, expensive in terms of memory consumption and expensive to access. Just creating dictionaries equivalent to the objects we created before takes 0.321 seconds, so around 2.5x the time for creating the equivalent objects and a "rate" of 137 MB/s relative to our 44 MB JSON file.


-(void)createDicts
{
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:COUNT+20];
    for ( int i=0;i<COUNT;i++ ) {
        NSMutableDictionary *cur=[NSMutableDictionary dictionary];
        cur[@"hi"]=@(i);
        cur[@"there"]=@(i);
        cur[@"comment"]=@"comment";
        [objResult addObject:cur];
    }
    NSLog(@"Created dicts in code w/o parsing %@ with %ld objects",objResult[0],[objResult count]);
}

Creating the dict in a single step using a dictionary literal is not significantly faster, but creating an immutable copy of the mutable dict after we're done filling brings the time to half a second.

Getting from dicts to objects is typically straightforward, if tedious: just fetch the entry of the dictionary and call the corresponding setter with the value thus retrieved from the dictionary. As this isn't production code and we're just trying to get some bounds of what is possible, there is an easier way: just use Key Value Coding with the keys found in the dictionary. The combined code, parsing and then creating the objects is shown below:


-(void)decodeNSJSONAndKVC:(NSData*)json
{
    NSArray *keys=@[ @"hi", @"there", @"comment"];
    NSArray *plistResult=[NSJSONSerialization JSONObjectWithData:json options:0 error:nil];
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:plistResult.count+20];
    for ( NSDictionary *d in plistResult) {
        TestClass *cur=[TestClass new];
        for (NSString *key in keys) {
            [cur setValue:d[key] forKey:key];
        }
        [objResult addObject:cur];
    }
    NSLog(@"NSJSON+KVC %@ with %ld objects",objResult[0],[objResult count]);
}

Note that KVC is slow. Really slow. Order-of-magnitude slower than just sending messages kind of slow, and so it has significant impact on the total time, which comes to a total of 1.142 seconds including parsing and object creation, or just shy of 38 MB/s.

Swift JSON Coding

For the first couple of releases of Swift, JSON support by Apple was limited to a wrapped NSJSONSerialization, with the slight performance penalty already noted. As I write in my book (see sidebar), many JSON "parsers" were published, but none of these, with the notable exception of Big Nerd Ranch's Freddy, were actual parsers; they all just transformed the arrays and dictionaries returned by NSJSONSerialization into Swift objects. Performance was abysmal, with around 25x overhead in addition to the basic NSJSONSerialization parse.

Apple's Swift Codable promised to solve all that, and on the convenience front it certainly does a great job.


    func readJSONCoder(data:Data) -> [TestClass] {
        NSLog("Swift Decoding")
        let coder=JSONDecoder( )
        let array=try! coder.decode([TestClass].self, from: data)
        return array
    }

(All the forcing is because this is just test code, please don't do this in production!). Alas, performance is still not great: 4.39 seconds, or 10 MB/s. That's 10x slower than the basic NSJSONSerialization parse and 4x slower than our slow but simple complete parse via NSJSONSerialization and KVC.

However, it is significantly faster than the previous third-party JSON to Swift objects "parsers", to the tune of 3-4x. This is the old "first mark up 400% then discount 50%" sales trick applied to performance, except that the relative numbers are larger.

Third Party JSON Parsers

I looked a little at third party JSON parsers, particularly JASON, STJSON and ZippyJSON.

STJSON does not make any claims to speed and manages to clock in at 5 seconds, or just under 10 MB/s. JASON bills itself as a "faster" JSON parser (they compare to SwiftyJSON), and does reasonably well at 0.75 seconds or 59 MB/s. However both of these parse to their own internal representation, not to domain objects (or structs), and so should be compared to NSJSONSerialization, at which point they both disappoint.

Probably the most interesting of these is ZippyJSON, as it uses Daniel Lemire's simdjson and is Codable compatible. Alas, I couldn't get ZippyJSON to compile, so I don't have numbers, but I will keep trying. They claim around 3x faster than Apple's JSONDecoder, which would make it the only parser to be at least in the same ballpark as the trivial NSJSONSerialization + KVC method I showed above.

Another interesting tidbit comes from ZippyJSON's README, under the heading "Why is it so much faster".

Apple's version first converts the JSON into an NSDictionary using NSJSONSerialization and then afterwards makes things Swifty. The creation of that intermediate dictionary is expensive.

This is true by itself: first converting to an intermediate representation is slow, particularly one that's as heavy-weight as property lists. However, it cannot be the primary reason, because creating that expensive representation only takes 1/8th of the total running time. The other 7/8ths is Codable apparently talking to itself. And speaking very s-l-o-w-l-y while doing that.

To corroborate, I also tried the Flight-School implementation of Codable for MessagePack, which obviously does not use NSJSONSerialization. It makes no performance claims and takes 18 seconds to decode the same objects we used in the JSON files, of course with a different file that's 34 MB in size. Normalized to our 44 MB file that would be 2.4 MB/s.

MAX and MASON

So where does that leave us? Considering what simdjson shows is theoretically possible with JSON parsing, we are not in a good place, to put it mildly. 2.5 GB/s vs. 10 MB/s with Apple's JSONDecoder, several times slower than NSJSONSerialization, which isn't exactly a speed demon, and around 30x slower than pure object creation. Comically bad might be another way of putting it. At least we're being entertained.

What can I contribute? Well, I've been through most of this once before with XML and the result was/is MAX (Messaging API for XML), a parser that is not just super-fast itself (though no SIMD), but also presents APIs that make it both super-convenient and also super-fast to go directly from the XML to an object-representation, either as a tree or a stream of domain objects while using mostly constant memory. Have I mentioned my book? Yeah, it's in the book, in gory detail.

Anyway, XML has sorta faded, so the question was whether the same techniques would work for a JSON parser. The answer is yes, roughly, though with some added complexity and less convenience because JSON is a less informative file format than XML. Open- and close-tags really give you a good heads-up as to what's coming that "{" just does not.

The goal will be to produce domain objects at as close to the theoretical maximum of slightly more than 300 MB/s as possible, while at the same time making the parser convenient to use, close to Swift Codable in convenience. It won't support Codable per default, as the overheads seem to be too high, but ZippyJSON suggests that an adapter wouldn't be too hard.

That parser is MPWMASONParser, and no, it isn't done yet. In its initial state, it parses JSON to dictionaries in 0.58 seconds, or 76 MB/s and slightly slower than NSJSONSerialization.

So we have a bit of a way to go; come join me on this little parsing performance journey!

Wednesday, April 8, 2020

Swift Initialization, SwiftUI and Function Builders: Called It!

Back in 2014, I wrote a post titled Remove features for greater power, aka: Swift and Objective-C initializers. In this post, I compared the IMHO insane language rules for initialisation in Swift (at the time 14 pages in the Swift book) with the complete lack of such rules in Objective-C, or Smalltalk for that matter.

Chris was so kind to leave a comment stating that my desire for simplicity was incompatible with some specific goals they had for the language. My response was that maybe those goals were incompatible with simplicity. It's a matter of priorities.

A prediction I made was that these rules, despite or more likely because of their complexity, would not be sufficient. And that turned out to be correct: as predicted, people turned to workarounds, just like they did with C++ and Java constructors.

Well, turns out I was correct beyond my wildest dreams: what are SwiftUI Function Builders if not a way to create/initialize complex object structures?

So I'll just come out and say that I called it. :-)

And while I obviously agree that a way to write down complex object structures is useful and important, and the mechanism is once again very clever, I will go out on a limb and claim that the pain that people are encountering now due to weird interactions with the language and type-system is not just due to an immature implementation and growing pains. Of course things will get better, but the fundamental problems of complexity, restrictions, non-obvious interactions with the type-system etc. are essential, not accidental, and therefore can be expected to be with us for good.

UPDATE (2024)

I guess the Swift team finally cottoned on to it: "By formalizing Objective-C's initialization conventions, we've ended up with a tower of complexity where users find it easier to do the wrong thing..."

Thursday, November 7, 2019

Instant Builds

One of the goals I am aiming for in Objective-Smalltalk is instant builds and effective live programming.

A month ago, I got a package from an old school friend: my old Apple ][+, which I thought I had given as a gift, but he insisted had been a long-term loan. That machine featured 48KB of DRAM and a 1 MHz, 8 bit 6502 processor that took multiple cycles for even the simplest instructions, had no multiply instructions and almost no registers. Yet, when I turn it on it becomes interactive faster than the CRT warms up, and the programming experience remains fully interactive after that. I type something in, it executes. I change the program, type "RUN" and off it goes.

Of course, you can also get that experience with more complex systems, Smalltalk comes to mind, but the point is that it doesn't take the most advanced technology or heroic effort to make systems interactive, what it takes is making it a priority.

Didn't the build time continuous increase over the year?
Build time at my work jump 2.5x to almost an hour in 3 years (Granted, it's a 2014 Mac mini, but still)
Even a iMac Pro takes 8 minutes now 🤦‍♂️
— 👽 (@et_rc1) November 7, 2019

But here we are indeed.

Now Swift is only one example of this, it's a current trend, and of course these systems do claim that they provide benefits that are worth the wait. From optimizations to static type-checking with type-inference, so that "once it compiles, it works". This is deemed to be (a) 100% worthwhile despite the fact that there is no scientific evidence backing up these claims (a paper which claimed that it had the evidence was just shredded at this year's OOPSLA) and (b) essentially cost-free. But of course it isn't cost free:

Minimum Viable Program:

"A running program, even if not correct, feels closer to working than a program that doesn't run at all"

(from a paper about Scratch) https://t.co/pNzDJYL9Gc pic.twitter.com/Nh2sBFnDvB
— Geoffrey Litt (@geoffreylitt) November 6, 2019

So when everyone zigs, I zag, it's my contrarian nature. Where Swift's message was, essentially "there is too much Smalltalk in Objective-C", my contention is that there is too little Smalltalk in Objective-C (and also that there is too little "Objective" in Smalltalk, but that's a different topic).

Smalltalk was perfectly interactive in its own environment on high end late 70s and early 80s hardware. With today's monsters of computation, there is no good reason, or excuse for that matter, to not be interactive even when taken into the slightly more demanding Unix/macOS/iOS development world. That doesn't mean there aren't loads of reasons, they're just not any good.

So Objective-Smalltalk will be fast, it will be live or near-live at all times, and it will have instant builds. This isn't going to be rocket science, mostly, the ingredients are as follows:

An interpreter
Late binding
Separate compilation
A fast and simple native compiler

Let's look at these in detail.

An interpreter

The basic implementation of Objective-Smalltalk is an AST-walking interpreter. No JIT, not even a simple bytecode interpreter. Which is about as pessimal as possible, but our machines are so incredibly fast, and so many of our tasks are either simple enough or enough a matter of computational steering, that it actually does a decent enough job for many of those tasks. (For more on this dynamic, see The Death of Optimizing Compilers by Daniel J. Bernstein.)

And because it is just an interpreter, it has no problems doing its thing on iOS:

(Yes, this is in the simulator, but it works the same on an actual device)
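To make "AST-walking" concrete, here is a minimal sketch in C of the general technique: the program is kept as a tree and evaluated by recursing over it. This is not Objective-Smalltalk's interpreter; the toy expression language and the names `Node`, `NodeKind` and `eval` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds for a toy expression language. */
typedef enum { NODE_NUM, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    long value;                 /* used by NODE_NUM only */
    const struct Node *left;    /* used by NODE_ADD and NODE_MUL */
    const struct Node *right;
} Node;

/* Evaluate by recursively walking the tree: no bytecode, no JIT. */
long eval(const Node *n)
{
    assert(n != NULL);
    switch (n->kind) {
    case NODE_NUM: return n->value;
    case NODE_ADD: return eval(n->left) + eval(n->right);
    case NODE_MUL: return eval(n->left) * eval(n->right);
    }
    return 0; /* not reached for well-formed trees */
}
```

Every evaluation step chases pointers and dispatches on a tag, which is why this style is about as slow as interpreters get, and also why it is so simple to build and to keep live.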

Late Binding

Late binding nicely decouples the parts of our software. This means that the compiler has very little information about what happens and can't help a lot in terms of optimization or checking, something that always drove the compiler folks a little nuts ("but we want to help and there's so much we could do"). In return, late binding enables strong modularity and separate compilation. Objective-Smalltalk is as late-bound in its messaging as Objective-C or Smalltalk are, but goes beyond them by also late-binding identifiers, storage and dataflow with Polymorphic Identifiers (ACM, pdf), Storage Combinators (ACM, pdf) and Polymorphic Write Streams (ACM, pdf).

Allowing this level of flexibility while still not requiring a Graal-level Helden-JIT to burn away all the abstractions at runtime will require careful design of the meta-level boundaries, but I think the technically desirable boundaries align very well with the conceptually desirable boundaries: use meta-level facilities to define the language you want to program in, then write your program.

It is the failure to make these boundaries clear, and the free mixing of meta-level and base-level programming, that gets us into not just conceptual trouble, but also into the kinds of technical trouble that the Heldencompilers and Helden-JITs have to bail us out of.
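For readers who haven't looked under the hood of a late-bound message send, here is a minimal sketch in C. It is a deliberately naive illustration of the idea (look the implementation up by selector at send time), not how the Objective-C runtime or Objective-Smalltalk actually do it; `send_message`, `MethodEntry`, `Class` and `FlagClass` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Object Object;
typedef int (*Method)(Object *self);

/* One selector-name -> implementation entry in a class's method table. */
typedef struct { const char *selector; Method imp; } MethodEntry;

typedef struct { const MethodEntry *methods; size_t count; } Class;

struct Object { const Class *isa; int flag; };

/* Late-bound send: the implementation is looked up at call time by name,
   so the caller needs no compile-time knowledge of the receiver's class. */
int send_message(Object *receiver, const char *selector)
{
    assert(receiver != NULL && selector != NULL);
    const Class *cls = receiver->isa;
    for (size_t i = 0; i < cls->count; i++) {
        if (strcmp(cls->methods[i].selector, selector) == 0)
            return cls->methods[i].imp(receiver);
    }
    return -1; /* toy stand-in for "does not understand" */
}

/* Example methods for a toy class. */
static int set_flag(Object *self) { self->flag = 1; return 0; }
static int get_flag(Object *self) { return self->flag; }

static const MethodEntry flag_methods[] = {
    { "setMsgFlag", set_flag },
    { "msgFlag",    get_flag },
};
static const Class FlagClass = { flag_methods, 2 };
```

The caller only ever mentions a selector, so the class behind `receiver` can be swapped or recompiled without touching the call site. That is also exactly the information the compiler no longer has for optimization or checking.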

Separate Compilation

When you have good module boundaries, you can get separate compilation, meaning a change in one file (or other code-containing entity if you don't like files) does not require recompiling other files. Smalltalk had this. Unix-style C programming had this too, along with the concept of binary libraries (with the generalization to frameworks on macOS etc.). For some reason, this has taken more and more of a back-seat in macOS and iOS development, with full source inclusion and full builds becoming the norm in the community (see CocoaPods) and for a long time being enforced by Apple by not allowing user-defined dynamic libraries on iOS.

While Swift allows separate compilation, this can have such severe negative effects on both performance and compile times that compiling everything on any change has become a "best practice". In fact, we now have a build option "whole module optimization with optimizations turned off" for debugging. I kid you not.

Objective-Smalltalk is designed to enable "Framework-oriented-programming", so separate compilation is and will remain a top priority.

A fast and simple native compiler

However, even with an interpreter for interactive adjustments, separate compilation due to good modularity and late binding, you sometimes want to do a full build, or need to rebuild a large part of the codebase.

Even that shouldn't take forever, and in fact it doesn't need to. I am totally with Jonathan Blow on this subject when he says that compiling a medium size project shouldn't really take more than a second or so.

My current approach for getting there is using TinyCC's backend as the starting point of the backend for Objective-Smalltalk. After all, the semantics are (mostly) Objective-C and Objective-C's semantics are just C. What I really like about tcc is that it goes so brutally directly to outputting CPU opcodes as binary bytes.


static void gcall_or_jmp(int is_jmp)
{
    int r;
    if ((vtop->r & (VT_VALMASK | VT_LVAL)) == VT_CONST &&
        ((vtop->r & VT_SYM) && (vtop->c.i-4) == (int)(vtop->c.i-4))) {
        /* constant symbolic case -> simple relocation */
        greloca(cur_text_section, vtop->sym, ind + 1, R_X86_64_PLT32, (int)(vtop->c.i-4));
        oad(0xe8 + is_jmp, 0); /* call/jmp im */
    } else {
        /* otherwise, indirect call */
        r = TREG_R11;
        load(r, vtop);
        o(0x41); /* REX */
        o(0xff); /* call/jmp *r */
        o(0xd0 + REG_VALUE(r) + (is_jmp << 4));
    }
}

No layers of malloc()ed intermediate representations here! This aligns very nicely with the streaming/messaging approach to high-performance I've taken elsewhere with Polymorphic Write Streams (see above), so I am pretty confident I can make this (a) work and (b) simple/elegant while keeping it (c) fast.
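To illustrate what "directly to binary bytes" means, here is a small stand-alone sketch in C. It is not tcc code: `CodeBuf`, `emit_byte`, `emit_le32` and `emit_return_constant` are hypothetical helpers that append x86-64 machine code for a function returning a constant straight into a byte buffer, with no intermediate representation in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size output buffer standing in for a text section. */
typedef struct {
    uint8_t bytes[64];
    size_t  len;
} CodeBuf;

/* Append one byte of machine code. */
static void emit_byte(CodeBuf *cb, uint8_t b)
{
    assert(cb->len < sizeof cb->bytes);
    cb->bytes[cb->len++] = b;
}

/* Append a 32-bit immediate in little-endian order, as x86 expects. */
static void emit_le32(CodeBuf *cb, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        emit_byte(cb, (uint8_t)(v >> (8 * i)));
}

/* Emit the body of `int f(void) { return value; }` for x86-64:
   mov eax, imm32 (opcode 0xB8) followed by ret (opcode 0xC3). */
void emit_return_constant(CodeBuf *cb, uint32_t value)
{
    emit_byte(cb, 0xB8);
    emit_le32(cb, value);
    emit_byte(cb, 0xC3);
}
```

Calling `emit_return_constant(&cb, 42)` on an empty buffer leaves the six bytes `B8 2A 00 00 00 C3` in it. Actually running them would additionally require copying them into executable memory, which is platform-specific and left out here.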

How fast? I obviously don't know yet, but tcc is a fantastic starting point. The following is the current (=wrong) ObjectiveTcc code to drive tcc to build a function that sends a single message:


-(void)generateMessageSendTestFunctionWithName:(char*)name
{
    SEL flagMsg=@selector(setMsgFlag);
    [self functionOnlyWithName:name returnType:VT_INT argTypes:"" body:^{
        [self pushFunctionPointer:objc_msgSend];
        [self pushObject:self];
        [self pushPointer:flagMsg];
        [self call:2];
    }];
}

How often can I do this in one second? On my 2018 high spec but 13" MBP: 300,000 times. That includes in-memory linking (though not much of that is happening in this example), but not Mach-O generation, which isn't implemented yet, or writing the whole shebang to disk. I don't anticipate either of these taking appreciable additional time.

If we consider this 2 "lines" of code, one for the function/method header and one for the message, then we can generate binary for 600KLOC/s. So having a medium size program compile and link in about a second or so seems eminently doable, even if I manage to slow the raw Tcc performance down by about an order of magnitude.

(For comparison: the Swift code base that motivated the Rome caching system for Carthage was clocking in at around 60 lines per second with the then Swift compiler. So even with an anticipated order of magnitude slowdown we'd still be 1000x faster. 1000x is good enough, it's the difference between 3 seconds and an hour.)
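The back-of-the-envelope numbers above can be checked with a few lines of C. All inputs are the figures quoted in the text; the function and constant names are just labels chosen for this sketch.

```c
#include <assert.h>

/* Figures taken from the text. */
enum {
    FUNCTIONS_PER_SECOND = 300000, /* measured tcc-driven code generation */
    LINES_PER_FUNCTION   = 2,      /* function header + one message send */
    ASSUMED_SLOWDOWN     = 10,     /* anticipated order-of-magnitude loss */
    SWIFT_LINES_PER_SEC  = 60      /* reported figure for the Swift code base */
};

/* Raw code-generation throughput in lines per second. */
long raw_lines_per_second(void)
{
    return (long)FUNCTIONS_PER_SECOND * LINES_PER_FUNCTION;
}

/* Throughput after the assumed slowdown. */
long expected_lines_per_second(void)
{
    return raw_lines_per_second() / ASSUMED_SLOWDOWN;
}

/* How many times faster than the Swift figure that would be. */
long speedup_over_swift(void)
{
    return expected_lines_per_second() / SWIFT_LINES_PER_SEC;
}

/* Seconds needed at the expected rate for a build that takes
   `swift_seconds` at the Swift rate. */
double equivalent_build_seconds(double swift_seconds)
{
    assert(swift_seconds >= 0.0);
    return swift_seconds / (double)speedup_over_swift();
}
```

With these inputs the raw rate is 600,000 lines/s, the slowed-down rate is 60,000 lines/s, and the ratio to 60 lines/s is 1000. An hour-long build (3600 seconds) then maps to 3.6 seconds, which is the "3 seconds and an hour" comparison, rounded.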

What's the downside? Tcc doesn't do a lot of optimization. But that's OK as (a) the sorts of optimizations C compilers and backends like LLVM do aren't much use for highly polymorphic and late-bound code, (b) the basics get you around 80% of the way, (c) most code doesn't need that much optimization (see above) and (d) machines have become really fast.

And it helps that we aren't doing crazy things like initially allocating function-local variables on the heap or doing function argument copying via vtables, things that require leaning on the optimizer to get adequate performance (as in: not 100x slower).

Defense in Depth

While any of these techniques might be adequate some of the time, it's the combination that I think will make the Objective-Smalltalk tooling a refreshing, pleasant and highly productive alternative to existing toolchains, because it will be reliably fast under all circumstances.

And it doesn't really take (much) rocket science, just a willingness to make this aspect a priority.

Saturday, April 27, 2019

What's Going Down at the TIOBE Index? Swift, Surprisingly

Last month I expressed my surprise at the fact that Objective-C was recovering its rankings in the TIOBE index, not quite to the lofty #3 spot it enjoyed a while ago, but to a solid 10, once again surpassing Swift, which had dropped to #17.

This month, Swift has dropped to #19, almost looking like it's going to fall out of the top 20 altogether.

Strange times.

Saturday, March 9, 2019

Software-ICs, Binary Compatibility, and Objective-Swift

Swift recently achieved ABI stability, meaning that we can now ship Swift binaries without having to ship the corresponding Swift libraries. While it's been a long time coming, it's also great to have finally reached this point. However, it turns out that this does not mean you can reasonably ship binary Swift frameworks, for reasons described very well by Peter Steinberger of PSPDFKit and the good folks at instabug.

To reach this not-quite-there-yet state took almost 5 years, which is pretty much the total time NeXT shipped their hardware, and it mirrors the state with C++, which is still not generally suitable for binary distribution of libraries. Objective-C didn't have these problems, and as it turns out this is not a coincidence.

Software ICs

Objective-C was created specifically to implement the concept of Software-ICs. I briefly referenced the concept in a previous article, and also mentioned its relationship to the scripted components pattern, but the comments indicated that this is no longer a concept people are familiar with.

As the name suggests the intention was to bring the benefits the hardware world had reaped from the introduction of the Integrated Circuits to the software world.

It is probably hard to overstate the importance of ICs to the development of the computer industry. Instead of assembling computers from discrete components, you could now put entire subsystems onto one component, and then compose these subsystems to form systems. The interfaces are standardised pins, and the relationship between the outside interface and the complexity hidden inside can be staggering. Although the socket of the CPU I am writing this on is a beast, with 1151 pins, the chip inside has 2.1 billion transistors. With a ratio of nearly two million to one, that's a very deep interface, even if you disregard the fact that the bulk of those pins are actually voltage supply and ground pins.

The important point is that you do not have to, and in fact cannot, look inside the IC. You get the pins, very much a binary interface, and the documentation, a specification sheet. With Software-ICs, the idea was the same: you get a binary, the interface and a specification sheet. Here are two BYTE articles that describe the concepts:

A lot of what they write seems quaint now, for example a MailFolder that inherits from Array(!), but the concepts are very relevant, particularly with a couple of decades' worth of perspective and the new circumstances we find ourselves in.

Although the authors pretty much equate Software-ICs with objects and object-oriented programming, it is a slightly different form of object-oriented programming than the one we mostly use today. They do write about object/message programming, similar to Alan Kay's note that 'The big idea is "messaging"'.

With messaging as the interconnect, similar to Unix pipes, our interfaces are sufficiently well-defined and dynamic that we really can deliver our Software-ICs in binary form and be compatible, something our more static languages like C++ and Swift struggle with.

ObjC is pretty awesome in how it manages to embed a “COM”. Swift doesn’t (currently) provide anything like it (and IMO it lost a chance not just using the ObjC runtime for that)
— Helge Heß (@helje5) January 6, 2019

Objective-C is middleware with language features.

Message Oriented Middleware

Virtually all systems based on static languages eventually grow an additional, separate and more dynamic component mechanism. Windows has COM, IBM has SOM, Qt has signals and slots and the meta-object system, Be had BMessages etc.

In fact, the problem of binary compatibility of C++ objects was one of the reasons for creating COM:

Unlike C++, COM provides a stable application binary interface (ABI) that does not change between compiler releases.
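The mechanism behind that stability can be sketched in plain C: the binary contract is nothing more than a pointer to a fixed-layout table of function pointers, so it doesn't depend on any compiler's C++ object layout or name mangling. The following is a simplified illustration in the spirit of COM, not the real COM headers; `ICounter`, `ICounterVtbl` and `counter_create` are hypothetical, and real COM interfaces also include `QueryInterface`.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ICounter ICounter;

/* The binary contract: a fixed-order table of function pointers.
   Clients compile against this layout only, never the implementation. */
typedef struct {
    unsigned (*add_ref)(ICounter *self);
    unsigned (*release)(ICounter *self);
    int      (*increment)(ICounter *self);
} ICounterVtbl;

/* An interface pointer points at a struct whose first field is the table. */
struct ICounter { const ICounterVtbl *vtbl; };

/* ---- one possible implementation, hidden from clients ---- */
typedef struct {
    ICounter iface;   /* must come first so the pointers are interchangeable */
    unsigned refs;
    int count;
} CounterImpl;

static unsigned counter_add_ref(ICounter *self)
{
    return ++((CounterImpl *)self)->refs;
}

static unsigned counter_release(ICounter *self)
{
    CounterImpl *impl = (CounterImpl *)self;
    unsigned remaining = --impl->refs;
    if (remaining == 0)
        free(impl);   /* last reference gone: destroy the object */
    return remaining;
}

static int counter_increment(ICounter *self)
{
    return ++((CounterImpl *)self)->count;
}

static const ICounterVtbl counter_vtbl = {
    counter_add_ref, counter_release, counter_increment
};

/* Factory: the only symbol a client needs besides the interface layout. */
ICounter *counter_create(void)
{
    CounterImpl *impl = malloc(sizeof *impl);
    assert(impl != NULL);
    impl->iface.vtbl = &counter_vtbl;
    impl->refs = 1;
    impl->count = 0;
    return &impl->iface;
}
```

A client calls through the table, e.g. `c->vtbl->increment(c)`, and never sees `CounterImpl`, so the implementation can grow fields or be rebuilt with a different compiler without breaking anyone. The price is the boilerplate visible above, which is part of why a language with messaging built in feels so much lighter.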

COM has been incredibly successful; it enables(-ed?) much of the Windows and Office ecosystems. In fact, there is even a COM implementation on macOS: CFPlugIn, part of CoreFoundation.

CFPlugIn provides a standard architecture for application extensions. With CFPlugIn, you can design your application as a host framework that uses a set of executable code modules called plug-ins to provide certain well-defined areas of functionality. This approach allows third-party developers to add features to your application without requiring access to your source code. You can also bundle together plug-ins for multiple platforms and let CFPlugIn transparently load the appropriate plug-in at runtime. You can use CFPlugIn to add plug-in capability to, or write a plug-in for, your application.

That COM implementation is still in use, for example for writing Spotlight importers. However, there are, er, issues:

Creating a new Spotlight importer is tricky because they are based on CFPlugIn, and CFPlugIn is… well, how to say this diplomatically?… super ugly )-: One option here is to use Xcode 9 to create your plug-in based on the old template. Honestly though, I don’t recommend that because the old template… again, diplomatically… well, let’s just say that the old template lets the true nature of CFPlugIn shine through! (-:

Having written both Spotlight importers and even a COM component on Windows (I think it was just for testing), I can confirm that COM's success is not due to the elegance or ease-of-use of the implementation, but due to the fact that having an interoperable, stable binary interface is incredibly enabling for a platform.

That said, all this talk of COM is a bit confusing, because we already have NSBundle.

Apple uses bundles to represent apps, frameworks, plug-ins, and many other specific types of content.

So NSBundle already does everything a CFPlugIn does and a lot more, but is really just a tiny wrapper around a directory that may contain a dynamic shared library. All the interfacing, introspection and binary compatibility features come automagically with Objective-C. In fact, NeXT had a Windows product called d'OLE that pretty automagically turned Objective-C libraries into COM-compatible OLE servers (.NET has similar capabilities). Again, this is not a coincidence: the Software-IC concept that Objective-C is based on is predicated on exactly this sort of interoperation scenario.

Objective-C is middleware with language features.

Frameworks and Microservices

To me, the idea of a Software-IC is actually somewhat higher level than a single object, I tend to see it at the level of a framework, which just so happens to provide all the trappings of a self-contained Software-IC: a binary, headers to define the interface and hopefully some documentation, which could even be provided in the Resources directory of the bundle. In addition, frameworks are instances of NSBundle, so they aren't limited to being linked into an application, they can also be loaded dynamically.

I use that capability in Objective-Smalltalk, particularly together with stsh, the Smalltalk Scripting Shell. By loading frameworks, this shell can easily be transformed into an application-specific scripting language. An example of this is pdfsh, a shell for examining and manipulating PDF files using EGOS, the Extensible Graphical Object System.


#!/usr/local/bin/stsh
#-<void>pdfsh:<ref>file
framework:EGOS_Cocoa load.
pdf := MPWPDFDocument alloc initWithData: file value.
shell runInteractiveLoop

The same binary framework is also used in PdfCompress, PostView and BookLightning. With this framework, my record for creating a drag-and-drop application to do something useful with a PDF file was 5 minutes, and the only reason I was so slow was that I thought I had remembered the PDF dictionary entry...and had not.

Framework-oriented programming is awesome, alas it was very much deprecated by Apple for quite some time, in fact even impossible on iOS until dynamic libraries were allowed. Even now, though, the idea is that you create an app, which consists of all the source-code needed to create it (exception: Apple code!), even if some of that code may be organised into framework units that otherwise don't have much meaning to the build.

Apps, however, are not Software-ICs; they aren't the right packaging technology for reuse (AppleScript notwithstanding). And so iOS and macOS development shops routinely get themselves into big messes, also known as the Big Ball of Mud architectural pattern.

Of course, there are reasons that things didn't quite work out the way we would have liked. Certainly Apple's initial Mac OS X System Architecture book showed a much more flexible arrangement, with groups of applications able to share a set of frameworks, for example. However, DLL hell is a thing, and so we got a much more restricted approach where every app is a little fortress and frameworks in general and binary frameworks in particular are really something for Apple to provide and for the rest to use. However, the fact that we didn't manage to get this right doesn't mean that the need went away.

Swift has been making this worse, by strongly "suggesting" that everything be compiled together and leading to such wonderful oxymorons as "whole module optimisation in debug mode", meaning without optimisation. That and not having a binary modularity story for going on half a decade. The reason for compiling whole modules together is that the modularity mechanism is, practically speaking, very much source-code based, with generics and specialisation etc. (Ironically, Swift also does some pretty crazy things to enable separate compilation, but that hasn't really panned out so far).

On the other hand, Swift's compiler is so slow that teams are rediscovering forms of framework-oriented programming as a self-defense mechanism. In order to get feedback cycles down from ludicrously bad to just plain awful, they split up their projects into independent frameworks that they then compile and run independently during development. So in a somewhat roundabout way, Swift is encouraging good development practices.

I find it somewhat interesting that the industry is rediscovering variants of the Software-IC, in this case on the backend in the form of Microservices. Why do I say that Microservices are a form of Software-IC? Well, they are a binary unit of deployability, fairly loosely coupled and dynamically typed. In fact, Fred George, one of the people who came up with the idea, refers to them as Smalltalk objects:

Of course, there are issues with this approach, one being that reliable method calls are replaced with unreliable network calls. Stepping back for a second should make it clear that the purported benefits of Microservices also largely apply to Software-ICs. At least real Software-ICs. Objective-C made the mistake of equating Software-ICs with objects, and while the concepts are similar with quite a bit of overlap, they are not quite the same. You certainly can use Objective-C to build and connect Software-ICs if you want to do that. It will also help you in this endeavour, but of course you have to know that this is something you want. It doesn't do this automatically and over time the usage of Objective-C has shifted to just a regular old object-oriented language, something it is OK but not that brilliant at.

Interoperability

One of the interesting aspects of Microservices is that they are language-agnostic: it doesn't matter what language a particular service is written in, as long as the services can somehow communicate via HTTP(S). This is another similarity to Software-ICs (and other middleware such as COM, SOM, etc.): there is a fairly narrowly defined, simple interface, and as long as you can somehow service that interface, you can play.

Microservices are pretty good at this, Unix filters are probably the most extreme example, and just about every language and every kind of application on Windows can talk to and via COM. NeXT only ever sold 50000 computers, but in a short number of years the NeXT community had bridges to just about every language imaginable. There were a number of Objective- languages, including Objective-Fortran. Apple alone has around 140K employees (though probably a large number of those in retail), and there are over 2.8 million iOS developers, yet the only language integration for Swift I know of is the Python support, and that took significant effort, compiler changes and the original Swift creator, Chris Lattner.

This is not a coincidence. Swift is designed as a programming language, not as middleware with language features. Therefore its modularity features are an add-on to the language, and try to transport the full richness of that programming model. And Swift's programming model is very rich.

Objective-Swift

The middlewares I've talked about use the opposite approach, from the outside in. For SOM, it is described as such:

SOM allows classes of objects to be defined in one programming language and used in another, and it allows libraries of such classes to be updated without requiring client code to be recompiled.

So you define interfaces separately from their implementations. I am guessing this is part of the reason we have @interface in Objective-C. Having to write things down twice can be a pain (and I've worked on projects that auto-generated Objective-C headers from implementation files), but having a concrete manifestation of the interface that precedes the implementation is also very valuable. (One of the reasons TDD is so useful is that it also forces you to think about the interface to your object before you implement it).

In Swift, a class is a single implementation-focused entity, with its interface at best a second-class and second-order effect. This makes writing components more convenient (no need to auto-generate headers...), but connecting components is more complicated.

Which brings us back to that other complication, the lack of stable binary compatibility for libraries and frameworks. One workaround is to write frameworks exclusively in Objective-C, which was particularly necessary before ABI stability had been reached. The other workaround, if you have Swift code, is to have an Objective-C wrapper as an interface to your framework. The fact that Swift interoperates transparently with the Objective-C runtime makes this fairly straightforward.

Did I mention that Objective-C is middleware with language features?

So maybe this supposed "workaround" is actually the solution? Have Objective-C or Objective- as our message-oriented middleware, the way it was always intended? Maybe with a bit of a tweak so that it loses most of the C legacy and gains support for pipes and filters, REST/Microservices and other architectural patterns?

Just sayin'.
