Sunday, April 12, 2020

Somewhat Less Lethargic JSON Support for iOS/macOS, Part 2: Analysis

In Part 1: The Status Quo, we saw that something isn't quite right with JSON procsesing in Apple land: while something like simdjson can accomplish the basic parsing task at a rate of 2.5 GB/s and creating objects happens at an equivalent rate of 310 MB/s, Swift's JSON Codable support manages a measly 10 MB/s, underperforming the MacBook Pro's built in SSD by at least 200x and a Gigabit network connection still by factor 10.

Some of the feedback I got indicated that the implications of the data presented in "Status Quo" were not as clear as they should have been, so a little analysis before we dive into code.

The MessagePack decode is the only "pure" Swift Codable decoder. As it is so slow as to make the rest of the graph almost unreadable and was only included for comparison, not actually being a JSON decoder, let's leave it out for now. In addition, let's show how much time of each result is the underlying parser and how much time is spent in object creation.

This chart immediately lays to rest two common hypotheses for the performance issues of Swift Codable:

  1. It's the object creation.

    No.

    That is, yes, object creation is slow compared to many other things, but here it represents only around 3% of the total runtime. Yes, finding a way to reduce that final 3% would also be cool (watch this space!), but how about tackling the 97% first?

  2. It's the fact that it is using NSJSONSerialization and therefore Objective-C under the hood that makes it slow.

    No.

    Again, yes, parsing something to a dictionary-based representation that is more expensive than the final representation is not ideal and should be avoided. This is one of the things we will be doing. However:

    • The NSJSONSerialization part of decoding makes up only 13% of the running time, the remaining 87% are in the Swift decoder part.
    • Turning the dictionaries into objects using Key-Value-Coding, which to me is just about the slowest imaginable mechanism for getting data into an object that's not deliberately adding Rube-Goldberg elements, "only" adds 740ms to the basic NSJSONSerialization's parse from JSON to dictionaries. While this is ~50% more time than the parse to dictionaies and 5x the pure object creaton time, it is still 5x less than the Codable overhead.
    • All the pure Swift parsers are also this slow or slower.
It also shows that stjson is not a contender (not that it ever claimed to be), because it is slower than even Swift's JSONDecoder without actually going to full objects. JASON is significantly faster, but also doesn't go to objects, and for not going to objects is still significantly slower than NSJSONSerialization. That really only leaves the NSJSONSerialization variants as useful comparison points for what is to come, the rest is either too slow, doesn't do what we need it to do, or both.

Here we can see fairly clearly that creating objects instead of dictionaries would be better. Better than creating dictionaries and certainly much better than first creating dictionaries and then objects, as if that weren't obvious. It is also clear that the actual parsing of JSON text doesn't add all that much extra overhead relative to just creating the dictionaries. In fact, just adding the -copy to convert from mutable dictionaries to immutable dictionaries appears to take more time than the parse!

In truth, it's actually not quite that way, because as far as I know, NSJSONSerialization, like its companion NSPropertyListSerialization uses special dictionaries that are cheaper to create from a textual representation.

simdjson

With all that in mind, it should be clear that simdjson, although it would likely take the pure parse time for that down to around 17 ms, is not that interesting, at lest at this stage. What it optimizes is the part that already takes the least time, and is already overwhelmed by even small changes in the way we create our objects.

What this also means is that simdjson will only be useful if it doesn't make object creation slower. This is also a lesson I learned when creating the MAX XML parser: you can't just make the XML parser part as fast as possible, sometimes it makes sense to make the parser itself somewhat slower if that means other parts, such as object creation, significantly faster. Or more generally: it's not enough to have fast components, they have to play well together. Optimization is about systems and architecture. If you want to do it well.

MASON

In the next installment, we will start looking at the actual parser.

No comments: