After initially disappointing results trying to get to faster JSON processing (parsing, for now), we
finally got parity with NSJSONSerialization, more or less, in the last instalment, with the
help of MPWSmallStringTable to unique our strings before turning them into
objects, string creation being surprisingly expensive even for tagged pointer strings.
Cutting out the Middleman: ObjectBuilder
In the first instalment of this series, we saw that we could fairly trivially
create objects from the plist created by NSJSONSerialization.
MPWObjectBuilder (.h.m) is a subclass of MPWPlistBuilder that changes just
a few things: instead of creating dictionaries, it creates objects, and instead of using
-setObject:forKey: to set values in that dictionary, it uses the KVC message
-setValue:forKey: (vive la petite différence!) to set values in that object.
That's it! Well, all that need concern us for now, the actual class has some additional
features that don't matter here. The _tos instance variable is the top
of a stack that MPWPlistBuilder maintains while constructing the result.
The MPWObjectCache is just a factory for creating objects.
Not the most elegant code in the universe, and not a complete parser by any stretch of the
imagination, but workable.
Result: 621 ms.
Not too shabby, only 50% slower than base NSJSONSerialization on our non-representative 44MB JSON file,
but creating the final objects, instead of just the intermediate representation, and around 7x faster than Apple's JSONDecoder.
Although still below 100 MB/s and nowhere near 2.5 GB/s we're also starting to close in on the performance level
that should be achievable given the context, with 140ms for basic object creation and 124ms for a mostly empty parse.
Analysis and next steps
Ignoring such trivialities as actually being useful for more than the most constrained situations
(array of single kind of object), how can we improve this? Well, make it
faster, of course, so let's have a look at the profile:
As expected, the KVC code is now the top contributor, with around 40% of total runtime.
(The locking functions that show up as siblings of -setValue:forKey: are almost
certainly part of that implementation. This slight misattribution of times is something you
should generally expect and be aware of with Instruments. I am guessing it has to do with missing frame-pointers
(-fomit-frame-pointer) but don't really feel any deep urge to investigate, as it doesn't
materially impact the outcome of the analysis.)
I guess that's another point: gather enough data to inform your next step, certainly no less, but also no more.
I see both mistakes, the more common one definitely being making things "fast" without enough data. Or any,
for that matter. If I had a €uro for every project that claims high performance without any (comparative)
benchmarking, simply because they did something the authors think should be fast, well, you know, ....
The other extreme is both less common and typically less bad, as at least you don't get the complete
nonsense of performance claims not backed by any performance testing, but running a huge battery of
benchmarks on every step of an optimization process is probably going to get in the way of achieving
results, and yes, I've seen this in practice.
In our last instalment, we started implementing our JSON parser with lots of good ideas, such as dematerialization via a property list protocol, but immediately fell flat on our face with our code being 50% slower than NSJSONSerialization. And what's worse, there wasn't an obvious way out, as the bulk of the time was spent in Apple code.
Nobody said this was going to be easy.
Analysis
Let's have another look at the profile:
The top 4 consumers of CPU are -setObject:forKey:, string creation, dictionary
creation and message sending. I don't really know what to do about either creating those
dictionaries we have to create or setting their contents, so what about string creation?
Although making string creation itself faster is unlikely, what we can do is reduce the
number of strings we create: since most of our JSON payload consists of objects né dictionaries,
the vast majority of our strings
are actually going to be string keys. So they will come from a small set of
known strings and be on the small-ish side. Particularly the former suggests that we
should re-use keys, rather than creating multiple new copies.
The usual way to look up something with a known key is an NSDictionary, but
alas that would require the keys we look up to already be objects, meaning we would have
to create string objects to look up our string object values, rather defeating the
purpose of the exercise.
What we would need is a way of looking up objects by raw C-String, an unadorned char*.
Fortunately, I've been here before, so the required class has been in MPWFoundation for a
little over 13 years. (What's the "Trump smug face" emoticon?)
MPWSmallStringTable
The MPWSmallStringTable (.h / .m ) class is exactly what it says
on the tin: a table for looking up objects by (small) string keys. And it is accessible
by char* (+length, don't want to require NUL termination) in addition to string objects.
Quite a bit of work went into making this fast, both the implementation and the interface. It
is not a hash table, it compares chars directly, using indexing and bucketing to
expend as little work as possible while discarding non-matching strings.
In fact, since performance is its primary reason for existing, its unit tests include
performance comparisons against an NSDictionary with NSString
keys, which currently clock in at 5-8x faster.
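To make the idea a little more concrete, here is a minimal C sketch of looking up values by char* plus length, without hashing and without requiring NUL termination of the lookup key. This is emphatically not the MPWSmallStringTable implementation: the names (small_string_table, table_add, table_lookup), the bucketing by length and the fixed size limits are all assumptions made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_KEY_LENGTH 15   /* assumption: only "small" keys are stored */
#define MAX_PER_BUCKET 8    /* assumption: few keys share the same length */

typedef struct {
    const char *key;        /* stored key, must outlive the table */
    void       *value;      /* whatever we want to find again, e.g. an object */
} table_entry;

typedef struct {
    /* one bucket per key length, so most non-matches are never compared */
    table_entry entries[MAX_KEY_LENGTH + 1][MAX_PER_BUCKET];
    int         counts[MAX_KEY_LENGTH + 1];
} small_string_table;

static void table_init(small_string_table *t) {
    memset(t, 0, sizeof *t);
}

/* Adds a key/value pair; returns 0 if the key is too long or its bucket is full. */
static int table_add(small_string_table *t, const char *key, void *value) {
    size_t len = strlen(key);
    if (len > MAX_KEY_LENGTH || t->counts[len] >= MAX_PER_BUCKET) {
        return 0;
    }
    table_entry *e = &t->entries[len][t->counts[len]++];
    e->key = key;
    e->value = value;
    return 1;
}

/* Looks up the bytes start[0..len-1]; they do not need to be NUL-terminated. */
static void *table_lookup(const small_string_table *t, const char *start, size_t len) {
    if (len > MAX_KEY_LENGTH) {
        return NULL;
    }
    for (int i = 0; i < t->counts[len]; i++) {
        const table_entry *e = &t->entries[len][i];
        /* cheap first-character check before comparing the whole key */
        if (len == 0 || (e->key[0] == start[0] && memcmp(e->key, start, len) == 0)) {
            return e->value;
        }
    }
    return NULL;
}
```

The point of the sketch is only that the lookup key can be a slice of the input buffer, so nothing has to be allocated just to ask the question.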
The interface includes two macros: OBJECTFORSTRINGLENGTH() and OBJECTFORCONSTANTSTRING(). You need to give the former a length, the
latter figures the size out at compile time using the sizeof operator, which for string constants really does
measure the character array itself (the length plus the terminating NUL). Don't use it with non-constant strings (so char*),
as there sizeof will return the size of the pointer.
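The sizeof difference is easy to demonstrate in plain C. CONSTANT_STRING_LENGTH below is a hypothetical stand-in written for this example, not the actual MPWFoundation macro; note that sizeof on a string constant counts the terminating NUL, so the sketch subtracts one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the MPWFoundation macro: length of a string
   constant without its terminating NUL, computed at compile time. */
#define CONSTANT_STRING_LENGTH(s) (sizeof(s) - 1)

/* Applied to a string constant, sizeof sees the whole char array. */
static size_t literal_length(void) {
    return CONSTANT_STRING_LENGTH("comment");
}

/* Applied to a char*, sizeof only sees the pointer, whatever it points to. */
static size_t pointer_length(const char *s) {
    return CONSTANT_STRING_LENGTH(s);
}

/* The runtime alternative that works for both, at the cost of a scan. */
static size_t runtime_length(const char *s) {
    return strlen(s);
}
```

With this, literal_length() and runtime_length("comment") agree, whereas pointer_length("comment") yields sizeof(char*) - 1 no matter how long the string is.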
Avoiding Allocation of Frequent Strings
With MPWSmallStringTable at hand, we can now use it in MPWMASONParser
to look up common strings like our keys without allocating them.
The -setFrequentStrings: method we saw declared in the interface takes an
array of strings, which the parser turns into a string table mapping from the
C-String versions of those to the NSString version.
The method that is supposed to create string objects from char*s starts as follows:
-(NSString*)makeRetainedJSONStringStart:(const char*)start length:(long)len
{
    NSString *curstr;
    if ( commonStrings ) {
        NSString *res=OBJECTFORSTRINGLENGTH( commonStrings, start, len );
        if ( res ) {
            return [res retain];
        }
    }
    ...
So we first check the common strings table, and only if we don't find it
there do we drop down to the code to allocate the string. (Yeah, the
-retain is probably questionable, though currently necessary)
Trying it out
Now all we need to do is tell the parser about those common strings before we
ask it to parse JSON.
While this seems a bit tacky, telling a JSON parser what to expect beforehand at least
a little seems par for the course, so whatever.
How does that fare? Well, 440ms, which is 180ms faster than before and anywhere from
as fast as NSJSONSerialization to 5% slower. Good enough for now.
This result is actually a bit surprising, because the keys that are created by both
NSJSONSerialization and MPWMASONParser happen to
be instances of NSTaggedPointerString. These strings do not
get allocated on the heap, the entire string contents are cleverly encoded in the object
pointer itself. Creating these should only be a couple of shifts and ORs, but
apparently that takes (significantly) longer than doing the lookup, or more
likely CF adds other overhead. This was certainly the case with the original
tagged CFNumber, where just doing the shift+OR yourself was massively
faster than calling CFNumberCreate().
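For readers who haven't met tagged pointers, the "couple of shifts and ORs" look roughly like the following C sketch. The layout (tag in bit 0, payload in the remaining bits) and the function names are made up for illustration and are not Apple's actual encoding for either tagged strings or tagged numbers.

```c
#include <stdint.h>

/* Hypothetical layout, NOT Apple's real format: bit 0 marks the word as
   "not a real pointer", the remaining bits carry an unsigned payload. */
#define TAG_BIT ((uintptr_t)1)

/* Encode: one shift and one OR, no allocation. */
static uintptr_t make_tagged(uintptr_t payload) {
    return (payload << 1) | TAG_BIT;
}

/* Real object pointers are at least 2-byte aligned, so bit 0 is clear. */
static int is_tagged(uintptr_t word) {
    return (word & TAG_BIT) != 0;
}

/* Decode: one shift. */
static uintptr_t tagged_payload(uintptr_t word) {
    return word >> 1;
}
```

That is all the work that is inherently required, which is why anything much slower than a table lookup has to be overhead added elsewhere.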
What next?
Having MPWSmallStringTable immediately suggests ways of tackling the
other expensive parts we identified in the profile, -setObject:forKey:
and dictionary creation: use a string table with pre-computed key space, then
set the objects via char* keys.
Another alternative is to use the MPWXmlAttributes class from MAX, which
is optimized for the parsing and use-once case.
However, all this loses sight of
the fact that we aren't actually interested in producing a plist. We want to
create objects, ideally without creating that plist. This is
a common pitfall I see in optimization work: getting so caught up in the
details (because there is a lot of detail, and it tends to be important)
that one loses sight of the context, the big picture so to speak.
Can this, creating objects from JSON, now be done more quickly? That will be in the next instalment. But as
a taste of what's possible, we can just set the builder to nil,
in order to see how the parser does when not having to create a plist.
In the previous instalments, we looked at and analysed the status quo for JSON parsing on Apple platforms in general and Swift in particular and it wasn't all that promising: we know
that parsing to an intermediate representation of Foundation plist types (dictionaries, arrays,
strings, numbers) is one of the worst possible ideas, yet it is the fastest we have. We know
that creating objects from JSON is, or at least should be, the slowest part of this, yet it
is by far the fastest, and last, not least, we also know that KVC is the slowest possible way
to transfer values to those objects, yet Swift Coding somehow manages to be several times slower.
So either we're wrong about all of these things we know, always a distinct possibility, or there is
something fishy going on. My vote is on the latter, and while figuring out exactly what
fishy thing is going on would probably be a fascinating investigation for an Apple performance
engineer, I prefer proof by creation:
Just make something that doesn't have these problems. In that case you not only know
where the problem is, you also have a better alternative to use.
MASON
Without much further ado, here is the definition of the MPWMASONParser class:
What it does is send messages of the MPWPlistStreaming protocol to
its builder property. So a Message-oriented parser for JaSON,
just like MAX is the Message oriented API for XML.
The implementation-history is also reflected in the fact that it is a subclass of
MPWXmlAppleProplistReader, which itself is a subclass of
MPWMAXParser.
The core of the implementation is a loop that handles JSON syntax and sends one-way messages for the
different elements to the builder. It looks very similar to loops in other simple parsers (and probably not at all like the crazy SIMD contortions of simdjson). When done, it returns whatever the builder constructed.
-parsedData:(NSData*)jsonData
{
    [self setData:jsonData];
    const char *curptr=[jsonData bytes];
    const char *endptr=curptr+[jsonData length];
    const char *stringstart=NULL;
    NSString *curstr=nil;
    while (curptr < endptr ) {
        switch (*curptr) {
            case '{':
                [_builder beginDictionary];
                inDict=YES;
                inArray=NO;
                curptr++;
                break;
            case '}':
                [_builder endDictionary];
                curptr++;
                break;
            case '[':
                [_builder beginArray];
                inDict=NO;
                inArray=YES;
                curptr++;
                break;
            case ']':
                [_builder endArray];
                curptr++;
                break;
            case '"':
                parsestring( curptr , endptr, &stringstart, &curptr );
                curstr = [self makeRetainedJSONStringStart:stringstart length:curptr-stringstart];
                curptr++;
                // a string directly followed by ':' is a key, otherwise a value
                if ( curptr < endptr && *curptr == ':' ) {
                    [_builder writeKey:curstr];
                    curptr++;
                } else {
                    [_builder writeString:curstr];
                }
                break;
            case ',':
                curptr++;
                break;
            case '-':
            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
            {
                BOOL isReal=NO;
                const char *numstart=curptr;
                id number=nil;
                if ( *curptr == '-' ) {
                    curptr++;
                }
                while ( curptr < endptr && isdigit(*curptr) ) {
                    curptr++;
                }
                if ( curptr < endptr && *curptr == '.' ) {
                    curptr++;
                    while ( curptr < endptr && isdigit(*curptr) ) {
                        curptr++;
                    }
                    isReal=YES;
                }
                if ( curptr < endptr && (*curptr=='e' || *curptr=='E') ) {
                    curptr++;
                    while ( curptr < endptr && isdigit(*curptr) ) {
                        curptr++;
                    }
                    isReal=YES;
                }
                number = isReal ?
                    [self realElement:numstart length:curptr-numstart] :
                    [self integerElementAtPtr:numstart length:curptr-numstart];
                [_builder writeString:number];
                break;
            }
            case 't':
                if ( (endptr-curptr) >=4 && !strncmp(curptr, "true", 4)) {
                    curptr+=4;
                    [_builder pushObject:true_value];
                }
                break;
            case 'f':
                if ( (endptr-curptr) >=5 && !strncmp(curptr, "false", 5)) {
                    // return false;
                    curptr+=5;
                    [_builder pushObject:false_value];
                }
                break;
            case 'n':
                if ( (endptr-curptr) >=4 && !strncmp(curptr, "null", 4)) {
                    [_builder pushObject:[NSNull null]];
                    curptr+=4;
                }
                break;
            case ' ':
            case '\n':
                while (curptr < endptr && isspace(*curptr)) {
                    curptr++;
                }
                break;
            default:
                [NSException raise:@"invalidcharacter" format:@"JSON invalid character %x/'%c' at %td",*curptr,*curptr,curptr-(char*)[data bytes]];
                break;
        }
    }
    return [_builder result];
}
It almost certainly doesn't correctly handle all edge-cases, but doing so is unlikely to impact
overall performance.
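As one example of such an edge case: JSON numbers may carry a signed exponent (1e-5, 2E+10), and a fraction or exponent must contain at least one digit. A stricter scan costs only a few extra comparisons per number, sketched here in plain C. The function name scan_json_number and its interface are assumptions for this example and not part of MPWMASONParser; the sketch also does not reject leading zeros, which the JSON grammar forbids.

```c
#include <ctype.h>
#include <stddef.h>

/* Scans one JSON number in [p, end). Returns a pointer just past the number,
   or NULL if the text is not a well-formed number. *is_real is set to 1 when
   a fraction or an exponent is present, 0 otherwise. */
static const char *scan_json_number(const char *p, const char *end, int *is_real) {
    *is_real = 0;
    if (p < end && *p == '-') {
        p++;
    }
    if (p >= end || !isdigit((unsigned char)*p)) {
        return NULL;                      /* need at least one integer digit */
    }
    while (p < end && isdigit((unsigned char)*p)) {
        p++;
    }
    if (p < end && *p == '.') {
        p++;
        if (p >= end || !isdigit((unsigned char)*p)) {
            return NULL;                  /* "3." is not valid JSON */
        }
        while (p < end && isdigit((unsigned char)*p)) {
            p++;
        }
        *is_real = 1;
    }
    if (p < end && (*p == 'e' || *p == 'E')) {
        p++;
        if (p < end && (*p == '+' || *p == '-')) {
            p++;                          /* optional exponent sign */
        }
        if (p >= end || !isdigit((unsigned char)*p)) {
            return NULL;                  /* "1e" is not valid JSON */
        }
        while (p < end && isdigit((unsigned char)*p)) {
            p++;
        }
        *is_real = 1;
    }
    return p;
}
```

Every read is guarded by the end pointer, so the scan also behaves when a number is the very last thing in a buffer that is not NUL-terminated.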
Dematerializing Property Lists with MPWPlistStreaming
Above, I mentioned that MASON is message-oriented, and that its main
purpose is sending messages of the MPWPlistStreaming protocol to its
builder. Here is that protocol:
What this enables is using property lists as an intermediate format without actually
instantiating them, instead sending the messages we would have sent if we had a
property list. Protocol Oriented Programming, anyone? Oh, I forgot, you can only
do that in Swift...
The same protocol can also be used on the output side, then you get something like
Standard Object Out.
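The dematerialization idea itself is not tied to Objective-C. As a rough C analogue, and only that, the sketch below replaces the protocol with a struct of callbacks: a producer emits "begin dictionary", "key", "value", "end dictionary" events, and the consumer decides what, if anything, to build. All names (plist_sink, stream_test_object and so on) are invented for this example and do not correspond to MPWFoundation API.

```c
/* Hypothetical C analogue of a streaming plist consumer: one callback per
   event, plus whatever state this particular consumer wants to keep. */
typedef struct plist_sink plist_sink;
struct plist_sink {
    void (*begin_dictionary)(plist_sink *self);
    void (*end_dictionary)(plist_sink *self);
    void (*write_key)(plist_sink *self, const char *key);
    void (*write_integer)(plist_sink *self, long value);
    long sum;           /* running total of all integers seen */
    int  dictionaries;  /* number of completed dictionaries */
};

/* A consumer that never builds a dictionary: it only keeps totals. */
static void sum_begin(plist_sink *self) { (void)self; }
static void sum_end(plist_sink *self) { self->dictionaries++; }
static void sum_key(plist_sink *self, const char *key) { (void)self; (void)key; }
static void sum_integer(plist_sink *self, long value) { self->sum += value; }

static plist_sink make_summing_sink(void) {
    plist_sink sink = { sum_begin, sum_end, sum_key, sum_integer, 0, 0 };
    return sink;
}

/* A producer: emits the events for {"hi": hi, "there": there}
   without ever materializing that dictionary. */
static void stream_test_object(plist_sink *sink, long hi, long there) {
    sink->begin_dictionary(sink);
    sink->write_key(sink, "hi");
    sink->write_integer(sink, hi);
    sink->write_key(sink, "there");
    sink->write_integer(sink, there);
    sink->end_dictionary(sink);
}
```

Swapping in a different set of callbacks gives a consumer that builds dictionaries, builds domain objects, or writes the events back out, all driven by the same producer.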
Trying it out
By default, MPWMASONParser sets its builder to an instance of
MPWPlistBuilder, which, as the name hints, builds property lists.
Just like NSJSONSerialization.
Hmm...that's disappointing. We didn't do anything wrong, yet almost 50% slower
than NSJSONSerialization. Well, those dang Apple engineers do
know what they're doing after all, and we should probably just give up.
Well, not so fast. Let's at least check out what we did wrong. Unleash
the Cracken...er...Instruments!
So that's interesting: the vast majority of time is actually spent in Apple code building the plist.
And we have to build the plist. So how does NSJSONSerialization get the same
job done faster? Last I checked, with NSPropertyListSerialization, but close enough,
they actually use specialised CoreFoundation-based dictionaries that
are optimized for the case of having a lot of string keys and having them all in one place
during initialization. These are not exposed, CoreFoundation being C-based means non-exposure
is very effective and apparently Apple stopped open-sourcing CFLite a while ago.
In Part 1: The Status Quo, we
saw that something isn't quite right with JSON processing in Apple land: while something like simdjson can accomplish
the basic parsing task at a rate of 2.5 GB/s and creating objects happens at an equivalent rate of 310 MB/s, Swift's
JSON Codable support manages a measly 10 MB/s, underperforming the MacBook Pro's built in SSD by at least 200x and
a Gigabit network connection still by a factor of 10.
Some of the feedback I got indicated that the implications of the data presented in "Status Quo" were not as clear
as they should have been, so a little analysis before we dive into code.
The MessagePack decode is the only "pure" Swift Codable decoder. As it is so slow as to make the rest of the graph almost
unreadable and was only included for comparison, not actually being a JSON decoder, let's leave it out
for now. In addition, let's show how much time of each result is the underlying parser and how much time is spent in
object creation.
This chart immediately lays to rest two common hypotheses for the performance issues of Swift Codable:
It's the object creation.
No.
That is, yes, object creation is slow compared to many other things, but here
it represents only around 3% of the total runtime. Yes, finding a way to reduce that final 3% would also be
cool (watch this space!), but how about tackling the 97% first?
It's the fact that it is using NSJSONSerialization and therefore Objective-C under the hood that makes it slow.
No.
Again, yes, parsing something to a dictionary-based representation that is more expensive than the
final representation is not ideal and should be avoided. This is one of the things we will be doing. However:
The NSJSONSerialization part of decoding makes up only 13% of the running time, the
remaining 87% are in the Swift decoder part.
Turning the dictionaries into objects using Key-Value-Coding, which to me is just about the slowest
imaginable mechanism for getting data into an object that's not deliberately adding Rube-Goldberg
elements, "only" adds 740ms to the basic NSJSONSerialization's parse from JSON to
dictionaries. While this is ~50% more time than the parse to dictionaries and 5x the pure object
creation time, it is still 5x less than the Codable overhead.
All the pure Swift parsers are also this slow or slower.
It also shows that stjson is not a contender (not that it ever claimed to be), because it is slower than even
Swift's JSONDecoder without actually going to full objects. JASON is significantly faster, but also doesn't
go to objects, and for not going to objects is still significantly slower than NSJSONSerialization.
That really only leaves the NSJSONSerialization variants as useful comparison
points for what is to come, the rest is either too slow, doesn't do what we need it to do, or both.
Here we can see fairly clearly that creating objects instead of dictionaries would be better. Better than
creating dictionaries and certainly much better than first creating dictionaries and then objects,
as if that weren't obvious. It is also clear that the actual parsing of JSON text doesn't add all that
much extra overhead relative to just creating the dictionaries. In fact, just adding the -copy to
convert from mutable dictionaries to immutable dictionaries appears to take more time than the parse!
In truth, it's actually not quite that way, because as far as I know, NSJSONSerialization, like
its companion NSPropertyListSerialization uses special dictionaries that are cheaper to
create from a textual representation.
simdjson
With all that in mind, it should be clear that simdjson, although it would likely take the pure parse time
for that down to around 17 ms, is not that interesting, at least at this stage. What it optimizes is the part that
already takes the least time, and is already overwhelmed by even small changes in the way we create our
objects.
What this also means is that simdjson will only be useful if it doesn't make object creation slower. This is
also a lesson I learned when creating the MAX XML parser: you can't just make the XML parser part as fast
as possible, sometimes it makes sense to make the parser itself somewhat slower if that means other parts,
such as object creation, significantly faster. Or more generally: it's not enough to have fast components,
they have to play well together. Optimization is about systems and architecture. If you want to do it well.
MASON
In the next instalment, we will start looking at the actual parser.
I just finished watching Daniel Lemire's talk on the current iteration of simdjson, a JSON parser that clocks in at 2.5GB/s! I've been following Daniel's work for some time now and can't really recommend it highly enough.
This reminded me of a recent twitter conversation where I had offered to contribute a fast, Swift-compatible JSON parser loosely based on MAX, my
fast and convenient XML parser. Due to various factors most of which are not under my control, I can't really offer anything that's
fast when compared to simdjson, but I can manage something quite a bit less lethargic than what's currently on offer
in the Apple and particularly the Swift world.
Environmental assumptions and constraints
My first assumption is that we are going to operate in the Apple ecosystem, and for simplicity's sake I am going to use macOS.
Next, I will assume that what we want from our parse(r) are domain objects for further processing within our application
(or structs, the difference is not important in this context).
We are going to use the following class with a mix of integer and string instance variables, in Swift:
@objc class TestClass: NSObject, Codable {
    let hi:Int
    let there:Int
    let comment:String
    ...
}
To make it all easy to measure, we are going to use one million objects, which we are going to initialise with increasing integers and the constant string "comment". This yields the same 44MB JSON file with different serialisation methods, which can be correctly parsed by all the parsers tested. This is obviously a very simple class and file structure, but I think it gives a reasonable approximation for real-world use.
The first thing to check is how quickly we can create these objects straight in code, without any parsing.
That should give us a good upper
bound for the performance we can achieve when parsing to domain objects.
#define COUNT 1000000
-(void)createObjects
{
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:COUNT+20];
    for ( int i=0;i<COUNT;i++ ) {
        TestClass *cur=[TestClass new];
        cur.hi=i;
        cur.there=i;
        cur.comment=@"comment";
        [objResult addObject:cur];
    }
    NSLog(@"Created objects in code w/o parsing %@ with %ld objects",objResult[0],(long)[objResult count]);
}
On my Quad Core, 2.7 GHz MBP '18, this runs in 0.141 seconds. Although we aren't actually parsing, it would mean that
just creating all the objects that would result from parsing our 44MB JSON file would yield a rate of 312 MB/s.
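The "rates" quoted throughout are simply the size of the JSON file divided by the elapsed time, even where no JSON is actually read. As a small C sketch of that arithmetic (the function names are mine, chosen for this example):

```c
/* Throughput in MB/s for `megabytes` of input processed in `seconds`. */
static double throughput_mb_per_s(double megabytes, double seconds) {
    return megabytes / seconds;
}

/* Time in milliseconds needed for `megabytes` of input at `mb_per_s`. */
static double time_ms_at_rate(double megabytes, double mb_per_s) {
    return 1000.0 * megabytes / mb_per_s;
}
```

So throughput_mb_per_s(44, 0.141) comes to roughly 312 MB/s for pure object creation, while time_ms_at_rate(44, 2500) says that a parser running at 2.5 GB/s would get through the same 44 MB in under 18 ms.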
Wait a second! 312MB/s is almost 10x slower than Daniel Lemire's parser, the one that actually parses JSON, and we are only
creating the objects that would result if we were parsing, without doing any actual parsing.
This is one of the many unintuitive aspects of parsing performance: the actual low-level, character-level parsing is generally the
least important part for overall performance. Unless you do something crazy like use NSScanner. Don't use NSScanner. Please.
One reason this is unintuitive is that we all learned that performance is dominated by the innermost loop, and character level processing
is the innermost loop. But when you have orders of magnitude in performance differences and inner and outer loops
differ by less than that amount, the stuff happening in the outer loop can dominate.
NSJSONSerialization
Apple's JSON story very much revolves around NSJSONSerialization, very much like most of the rest of
its serialization story revolves around the very similar NSPropertyListSerialization class. It has
a reasonably quick implementation, turning the 44 MB JSON file into an NSArray of NSDictionary
instances in 0.421 seconds when called from Objective-C, for a rate of 105 MB/s. From Swift, it takes 0.562 seconds, for 78 MB/s.
Of course, that gets us to a property list (array of dicts, in this case), not to the domain objects we actually want.
If you read my book (did I mention my book? Oh, I think I did), you will know that this type of dictionary
representation is fairly expensive: expensive to create, expensive in terms of memory consumption and
expensive to access. Just creating dictionaries equivalent to the objects we created before takes 0.321 seconds,
so around 2.3x the time for creating the equivalent objects and a "rate" of 137 MB/s relative to our 44 MB JSON file.
-(void)createDicts
{
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:COUNT+20];
    for ( int i=0;i<COUNT;i++ ) {
        NSMutableDictionary *cur=[NSMutableDictionary dictionary];
        cur[@"hi"]=@(i);
        cur[@"there"]=@(i);
        cur[@"comment"]=@"comment";
        [objResult addObject:cur];
    }
    NSLog(@"Created dicts in code w/o parsing %@ with %ld objects",objResult[0],[objResult count]);
}
Creating the dict in a single step using a dictionary literal is not significantly faster, but creating
an immutable copy of the mutable dict after we're done filling brings the time to half a second.
Getting from dicts to objects is typically straightforward, if tedious: just fetch the entry of the dictionary
and call the corresponding setter with the value thus retrieved from the dictionary. As this isn't production
code and we're just trying to get some bounds of what is possible, there is an easier way: just use Key Value
Coding with the keys found in the dictionary.
The combined code, parsing and then creating the objects is shown below:
-(void)decodeNSJSONAndKVC:(NSData*)json
{
    NSArray *keys=@[ @"hi", @"there", @"comment"];
    NSArray *plistResult=[NSJSONSerialization JSONObjectWithData:json options:0 error:nil];
    NSMutableArray *objResult=[NSMutableArray arrayWithCapacity:plistResult.count+20];
    for ( NSDictionary *d in plistResult) {
        TestClass *cur=[TestClass new];
        for (NSString *key in keys) {
            [cur setValue:d[key] forKey:key];
        }
        [objResult addObject:cur];
    }
    NSLog(@"NSJSON+KVC %@ with %ld objects",objResult[0],[objResult count]);
}
Note that KVC is slow. Really slow. Order-of-magnitude slower than just sending messages kind of slow, and so it has significant impact on the total time, which comes to a total of 1.142 seconds including parsing and object creation,
or just shy of 38 MB/s.
Swift JSON Coding
For the first couple of releases of Swift, JSON support by Apple was limited to a wrapped NSJSONSerialization, with the slight
performance penalty already noted. As I write in my book (see sidebar), many JSON "parsers" were published, but none of these
with the notable exception of the Big Nerd Ranch's Freddy were actual parsers, they all just transformed the
arrays and dictionaries returned by NSJSONSerialization into Swift objects. Performance was
abysmal, with around 25x overhead in addition to the basic NSJSONSerialization parse.
Apple's Swift Codable promised to solve all that, and on the convenience front it certainly does
a great job.
func readJSONCoder(data:Data) -> [TestClass] {
    NSLog("Swift Decoding")
    let coder=JSONDecoder( )
    let array=try! coder.decode([TestClass].self, from: data)
    return array
}
(All the forcing is because this is just test code, please don't do this in production!). Alas, performance is
still not great: 4.39 seconds, or 10 MB/s. That's 10x slower than the basic NSJSONSerialization
parse and 4x slower than our slow but simple complete parse via NSJSONSerialization and KVC.
However, it is significantly faster than the previous third-party JSON to Swift objects "parsers", to
the tune of 3-4x. This is the old "first mark up 400% then discount 50%" sales trick applied to performance,
except that the relative numbers are larger.
Third Party JSON Parsers
I looked a little at third party JSON parsers, particularly JASON, STJSON and ZippyJSON.
STJSON does not make any claims to speed and manages to clock in at 5 seconds, or just under 10 MB/s. JASON bills
itself as a "faster" JSON parser (they compare to SwiftyJSON), and does reasonably well at 0.75 seconds or 59 MB/s.
However both of these parse to their own internal representation, not to domain objects (or structs), and so should
be compared to NSJSONSerialization, at which point they both disappoint.
Probably the most interesting of these is ZippyJSON, as it uses Daniel Lemire's simdjson and is Codable
compatible. Alas, I couldn't get ZippyJSON to compile, so I don't
have numbers, but I will keep trying. They claim around 3x faster than Apple's JSONDecoder, which
would make it the only parser to be at least in the same ballpark as the trivial NSJSONSerialization + KVC method I showed above.
Another interesting tidbit comes from ZippyJSON's README, under the heading "Why is it so much faster".
Apple's version first converts the JSON into an NSDictionary using NSJSONSerialization and then afterwards makes things Swifty. The creation of that intermediate dictionary is expensive.
This is true by itself: first converting to an intermediate representation is slow, particularly one
that's as heavy-weight as property lists. However, it cannot be the primary reason, because creating that
expensive representation only takes 1/8th of the total running time. The other 7/8ths is Codable apparently
talking to itself. And speaking very s-l-o-w-l-y while doing that.
To corroborate, I also tried the Flight-School implementation of Codable for MessagePack, which obviously does not use NSJSONSerialization.
It makes no performance claims and takes 18 seconds to decode the same
objects we used in the JSON files, of course with a different file that's 34 MB in size. Normalized to our 44 MB
file that would be 2.4 MB/s.
MAX and MASON
So where does that leave us? Considering what simdjson shows is theoretically possible with JSON parsing, we are
not in a good place, to put it mildly. 2.5 GB/s vs. 10 MB/s with Apple's JSONDecoder, several times slower than
NSJSONSerialization, which isn't exactly a speed demon and around 30x slower than pure object creation. Comically bad might be another way of putting it. At least we're being entertained.
What can I contribute? Well, I've been through most of this once before with XML and the result was/is
MAX (Messaging API for XML), a parser that is not just super-fast itself (though no SIMD), but also
presents APIs that make it both super-convenient and also super-fast to go directly from the XML to
an object-representation, either as a tree or a stream of domain objects while using mostly constant
memory. Have I mentioned my book? Yeah, it's in the book, in gory detail.
Anyway, XML has sorta faded, so the question was whether the same techniques would work for a JSON parser.
The answer is yes, roughly, though with some added complexity and less convenience because JSON is a
less informative file format than XML. Open- and close-tags really give you a good heads-up as to what's
coming that "{" just does not.
The goal will be to produce domain objects at as close to the theoretical maximum of slightly more than 300 MB/s
as possible, while at the same time making the parser convenient to use, close to Swift Codable in convenience.
It won't support Codable per default, as the overheads seem to be too high, but ZippyJSON suggests that an
adapter wouldn't be too hard.
That parser is MPWMASONParser,
and no, it isn't done yet. In its initial state, it parses JSON to dictionaries in 0.58 seconds, or 76 MB/s and
slightly slower than NSJSONSerialization.
So we have a bit of a way to go, come join me on this little parsing performance journey!
Back in 2014, I wrote a post titled Remove features for greater power, aka: Swift and Objective-C initializers. In this post, I compared the IMHO insane language rules for initialisation in Swift (at the time 14 pages in the Swift book) with the complete lack of such rules in Objective-C, or Smalltalk for that matter.
Chris was so kind to leave a comment stating that my desire for simplicity was incompatible with some specific goals they had for the language. My response was that maybe those goals were incompatible with simplicity. It's a matter of priorities.
A prediction I made was that these rules, despite or more likely because of their complexity, would not be sufficient. And that turned out to be correct: as predicted, people turned to workarounds, just like they did
with C++ and Java constructors.
Well, turns out I was correct beyond my wildest dreams: what are SwiftUI Function Builders if not a way to create/initialize complex object structures?
So I'll just come out and say that I called it. :-)
And while I obviously agree that a way to write down complex object structures is useful and important, and the mechanism is once again very clever, I will go out on a limb and claim that the pain that people are encountering now due to weird interactions with the language and type-system is not just due to an immature
implementation and growing pains. Of course things will get better, but the fundamental problems of complexity, restrictions, non-obvious interactions with the type-system etc. are essential, not accidental, and therefore can be expected to be with us for good.
UPDATE (2024)
I guess the Swift team finally cottoned on to it:
"By formalizing Objective-C's initialization conventions, we've ended up with a tower of complexity where users find it easier to do the wrong thing..."
A dynamic I see playing out again and again when it comes to software is the tension between incrementalism and radical change. On the one hand, there is a justified sense, backed by a lot of experience, that just tweaking what we have really doesn't cut it, that it's just rearranging the deck chairs on the Titanic. We obviously need radical change.
On the other hand, radical change that assumes we need to throw away what we (think we) know doesn't really cut it either, and the problem of all
that existing software and the techniques and technology we used to create it isn't just the pragmatics of the situation, with huge investments
in code and know-how. The fact that we are actually capable of creating all this software means that the radical position of "throw it all away, it's
wrong" isn't really tenable. Yes, there is something wrong with it, but it cannot actually be completely wrong.
So we are faced with a dilemma: incremental change and radical change are both obviously right and both obviously wrong. And so we get a lot
of shouting at each other, a lot of "change", but not a whole lot of progress.
The only way out I see is that change has to be both radical while also including the status quo, and the only way I can see of achieving that is if
it is a generalisation, sort of like quantum mechanics generalised classical mechanics, superseding classical mechanics but still including it as a special case. (Or how circles were generalised to ellipses etc.)