Some of the feedback I got indicated that the implications of the data presented in "Status Quo" were not as clear as they should have been, so a little analysis before we dive into code.
The MessagePack decoder is the only "pure" Swift Codable decoder. It is so slow that it makes the rest of the graph almost unreadable, and it was only included for comparison since it is not actually a JSON decoder, so let's leave it out for now. In addition, let's show how much of each result's time is spent in the underlying parser and how much in object creation.
This chart immediately lays to rest two common hypotheses for the performance issues of Swift Codable:
- It's the object creation.
No.
That is, yes, object creation is slow compared to many other things, but here it represents only around 3% of the total runtime. Yes, finding a way to reduce that final 3% would also be cool (watch this space!), but how about tackling the 97% first?
- It's the fact that it is using NSJSONSerialization and therefore Objective-C under the hood that makes it slow.
No.
Again, yes, parsing something to a dictionary-based representation that is more expensive than the final representation is not ideal and should be avoided. This is one of the things we will be doing. However:
- The NSJSONSerialization part of decoding makes up only 13% of the running time; the remaining 87% is in the Swift decoder part.
- Turning the dictionaries into objects using Key-Value-Coding, which to me is just about the slowest imaginable mechanism for getting data into an object that's not deliberately adding Rube-Goldberg elements, "only" adds 740ms to the basic NSJSONSerialization parse from JSON to dictionaries. While this is ~50% more time than the parse to dictionaries and 5x the pure object creation time, it is still 5x less than the Codable overhead.
- All the pure Swift parsers are also this slow or slower.
- The
NSJSONSerialization
.
That really only leaves the NSJSONSerialization variants as useful comparison points for what is to come; the rest is either too slow, doesn't do what we need it to do, or both.
Here we can see fairly clearly that creating objects instead of dictionaries would be better. Better than creating dictionaries and certainly much better than first creating dictionaries and then objects, as if that weren't obvious. It is also clear that the actual parsing of JSON text doesn't add all that much extra overhead relative to just creating the dictionaries. In fact, just adding the -copy to convert from mutable dictionaries to immutable dictionaries appears to take more time than the parse! In truth, it's actually not quite that way, because as far as I know, NSJSONSerialization, like its companion NSPropertyListSerialization, uses special dictionaries that are cheaper to create from a textual representation.
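For readers who have not used both APIs, here is a minimal sketch of the paths being compared: Codable via JSONDecoder, a plain JSONSerialization parse to dictionaries, and dictionaries first followed by objects. The TestItem type, its field names, and the sample JSON are hypothetical stand-ins, not the benchmark's actual data, and the dictionary-to-object step is done by hand here rather than with Key-Value-Coding so that the sketch does not depend on the Objective-C runtime.

```swift
import Foundation

// Hypothetical record type; the field names are illustrative only.
struct TestItem: Codable, Equatable {
    var hi: Int
    var there: Int
    var comment: String
}

// Path 1: Swift Codable via JSONDecoder.
func decodeWithCodable(_ data: Data) throws -> [TestItem] {
    try JSONDecoder().decode([TestItem].self, from: data)
}

// Path 2: JSONSerialization (NSJSONSerialization in Objective-C),
// producing an untyped array of dictionaries and no model objects.
func parseToDictionaries(_ data: Data) throws -> [[String: Any]] {
    let parsed = try JSONSerialization.jsonObject(with: data, options: [])
    return parsed as? [[String: Any]] ?? []
}

// Path 3: dictionaries first, then objects. Built by hand here
// (an assumption of this sketch; the article's variant uses KVC).
func objectsFromDictionaries(_ dicts: [[String: Any]]) -> [TestItem] {
    dicts.compactMap { d in
        guard let hi = d["hi"] as? Int,
              let there = d["there"] as? Int,
              let comment = d["comment"] as? String else { return nil }
        return TestItem(hi: hi, there: there, comment: comment)
    }
}

// Tiny made-up input, just to exercise the three paths.
let sampleJSON = Data(#"[{"hi": 1, "there": 2, "comment": "first"}, {"hi": 3, "there": 4, "comment": "second"}]"#.utf8)

let viaCodable = try decodeWithCodable(sampleJSON)
let dicts = try parseToDictionaries(sampleJSON)
let viaDicts = objectsFromDictionaries(dicts)
print(viaCodable == viaDicts)
```

All paths end up with the same data; what differs is how many intermediate representations get built along the way, which is what the timings above are really measuring.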
simdjson
With all that in mind, it should be clear that simdjson, although it would likely take the pure parse time down to around 17 ms, is not that interesting, at least at this stage. What it optimizes is the part that already takes the least time, and is already overwhelmed by even small changes in the way we create our objects.
What this also means is that simdjson will only be useful if it doesn't make object creation slower. This is also a lesson I learned when creating the MAX XML parser: you can't just make the XML parser part as fast as possible; sometimes it makes sense to make the parser itself somewhat slower if that makes other parts, such as object creation, significantly faster. Or more generally: it's not enough to have fast components, they have to play well together. Optimization is about systems and architecture. If you want to do it well.
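To put rough numbers on "optimizing the part that already takes the least time", here is a small sketch that applies Amdahl's law to the shares quoted earlier (about 3% for object creation, about 13% for the NSJSONSerialization parse, about 87% for the Swift decoder part). The function name and the choice of a 2x speedup for the last case are mine, purely for illustration.

```swift
// Overall speedup when a stage that accounts for `fraction` of total
// runtime is made `stageSpeedup` times faster (Amdahl's law).
// Pass .infinity to model removing the stage's cost entirely.
func overallSpeedup(fraction: Double, stageSpeedup: Double) -> Double {
    1.0 / ((1.0 - fraction) + fraction / stageSpeedup)
}

// Object creation made free: at best about 1.03x overall.
let objectCreationFree = overallSpeedup(fraction: 0.03, stageSpeedup: .infinity)

// The parse to dictionaries made free: at best about 1.15x overall.
let parseFree = overallSpeedup(fraction: 0.13, stageSpeedup: .infinity)

// The Swift decoder part merely halved (hypothetical): about 1.77x overall.
let swiftDecoderHalved = overallSpeedup(fraction: 0.87, stageSpeedup: 2)

print(objectCreationFree, parseFree, swiftDecoderHalved)
```

Even an infinitely fast parser buys less than a modest improvement to the dominant stage, which is why the parser alone is not the interesting target yet.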
MASON
In the next installment, we will start looking at the actual parser.
TOC
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 1: The Status Quo
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 2: Analysis
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 3: Dematerialization
Equally Lethargic JSON Support for iOS/macOS, Part 4: Our Keys are Small but Legion
Less Lethargic JSON Support for iOS/macOS, Part 5: Cutting out the Middleman
Somewhat Faster JSON Support for iOS/macOS, Part 6: Cutting KVC out of the Loop
Faster JSON Support for iOS/macOS, Part 7: Polishing the Parser
Faster JSON Support for iOS/macOS, Part 8: Dematerialize All the Things!
Beyond Faster JSON Support for iOS/macOS, Part 9: CSV and SQLite