Monday, April 20, 2020

Somewhat Faster JSON Support for iOS/macOS, Part 6: Cutting KVC out of the Loop

Last time, we actually made some significant headway by taking advantage of the dematerialisation of the plist intermediate representation. So instead of first producing an array of dictionaries, we went directly from JSON to the final object representation.

This got us down from around 1.1 seconds to a little over 600 milliseconds.

It was accomplished by using the Key Value Coding method setValue:forKey: to directly set the attributes of the objects from the parsed JSON. Oh, and instantiating those objects in the first place, instead of dictionaries.

That this should be so much faster than most other methods, for example beating Swift's JSONDecoder() by a cool 7x, is a little surprising, given that KVC is, as I mentioned in the first article of the series, the slowest mechanism for getting data in and out of objects short of deliberate Rube Goldberg mechanisms.

What is KVC and why is it slow?

Key Value Coding was a core part of NeXT's Enterprise Objects Framework, introduced in 1994:
Key-value coding is a data access mechanism in which the properties of an object are accessed indirectly by key or name, rather than directly as fields or by invocation of accessor methods. It is used throughout Enterprise Objects but is perhaps most useful to you when accessing data in relationships between enterprise objects.

Key-value coding enables the use of keypaths to traverse relationships. For example, if a Person entity has a relationship called toPhoto whose destination entity (called PersonPhoto) contains an attribute called photo, you could access that data by traversing the keypath toPhoto.photo from within a Person object.

Keypaths are just one way key-value coding is an invaluable feature of Enterprise Objects. In general, though, it is most useful in providing a consistent way to access an object's data. Rather than needing to know if an object's data members have accessor methods, what the names of those accessor methods are, or if the data is accessible through fields, all you need to know are the keys that represent an object’s data. Key-value coding automatically finds the data, regardless of how the object provides its data. In this context, key-value coding satisfies the classic design pattern principle to “encapsulate the things that varies.”

It still is an extremely powerful programming technique that lets us write algorithms that work generically with any object properties, and is currently the basis for CoreData, AppleScript support, Key Value Observing and Bindings. (Though I am somewhat skeptical of some of these, not least for performance reasons, see The Siren Call of KVO and (Cocoa) Bindings). It was also part of the inspiration for Polymorphic Identifiers.

The core of KVC are the valueForKey: and setValue:forKey: messages, which have default implementations in NSObject. These default implementations take the NSString key, derive an accessor message from that key and then send the message, either setting or returning a value. If the value that the underlying message takes/returns is a non-object type, then KVC wraps/unwraps as necessary.

If this sounds expensive, then that's because it is. To derive the set accessor from the key, the first character of the key has to be capitalized, then the string "set" prepended and the string converted to an Objective-C selector (SEL). In theory, this has to be done on every call to one of the KVC methods, and it has to be done with NSString objects, which do a fantastic job of representing human-visible text, but are a bit heavy-weight for low-level work.
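
To make the derivation step concrete, here is a minimal sketch in plain C of turning a key into the conventional setter name. The helper setter_name_for_key is hypothetical and only mirrors the naming rule described above; the real KVC code works on NSString keys and then registers the result as a SEL, which this sketch does not attempt.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build the conventional setter name for a key, e.g. "name" -> "setName:".
 * Returns 0 on success, -1 if the key is empty or the buffer is too small.
 * Hypothetical helper for illustration only, not part of KVC or MPWFoundation. */
static int setter_name_for_key(const char *key, char *out, size_t outsize)
{
    size_t keylen = strlen(key);

    /* Space needed: "set" (3) + key + ":" (1) + terminating NUL (1). */
    if (keylen == 0 || outsize < keylen + 5) {
        return -1;
    }
    memcpy(out, "set", 3);
    out[3] = (char)toupper((unsigned char)key[0]);  /* capitalize the first character */
    memcpy(out + 4, key + 1, keylen - 1);           /* copy the rest of the key unchanged */
    out[3 + keylen] = ':';                          /* the setter takes one argument */
    out[4 + keylen] = '\0';
    return 0;
}
```

Even in this stripped-down form there is a length scan, a copy and a case conversion per key, and that is before the selector lookup and the message send itself.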

Doing the full computation on every invocation would be way too expensive, so Apple caches some of the intermediate results. As there is no obvious place to put those intermediate results, they are placed in global hash tables, keyed by class and property/key name. However, even those lookups are still significantly more expensive than the final set or get property access, and we have to do multiple lookups. Since these tables have to be global, locking is also required.

ValueAccessor

All this expense could be avoided if we had a custom object to mediate the access, rather than a naked NSString. That object could store those computed values, and then provide fast and generic access to arbitrary properties. Enter MPWValueAccessor (.h .m).

A word of warning: unlike MPWStringtable, MPWValueAccessor is mostly experimental code. It does have tests and largely works, but it is incomplete in many ways and also contains a bunch of extra and probably extraneous ideas. It is sufficient for our current purpose.

The core of this class is the AccessPathComponent struct.


typedef struct {
    Class   targetClass;
    int     targetOffset;
    SEL     getSelector,putSelector;
    IMP0    getIMP;
    IMP1    putIMP;
    id      additionalArg;
    char    objcType;
} AccessPathComponent;

This struct contains a number of different ways of getting/setting the data:
  1. the integer offset into the object where the ivar is located
  2. a pair of Objective-C selectors/message names, one for getting, one for setting.
  3. a pair of function pointers to the Objective-C methods that the respective selectors resolve to
  4. the additional arg is the key, to be used for keyed access
The getIMP and putIMP are initialized to objc_msgSend(), so they can always be used. If we bind the ValueAccessor to a class, those function pointers get resolved to the actual getter/setter methods. In addition, the objcType gets set to the type of the instance variable, so we can do automatic conversions like KVC. (This was some code I actually had to add between the last instalment and the current one.)

The key takeaway is that all the string processing and lookup that KVC needs to do on every call is done once during initialization; after that it's just a few messages and/or pre-resolved function calls.
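
The bind-once, call-many idea can be sketched in plain C without the Objective-C runtime. Everything here is a stand-in invented for illustration: Person plays the role of the model class, person_properties plays the role of the runtime's class metadata, and Accessor is a much reduced cousin of AccessPathComponent. It is not how MPWValueAccessor is implemented.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a model class with two properties (hypothetical). */
typedef struct {
    long        id;
    const char *name;
} Person;

typedef void (*Setter)(void *target, const void *value);

static void person_set_id(void *target, const void *value)
{
    ((Person *)target)->id = *(const long *)value;
}

static void person_set_name(void *target, const void *value)
{
    ((Person *)target)->name = (const char *)value;
}

/* Stand-in for per-class metadata that a runtime would provide. */
typedef struct {
    const char *key;
    size_t      offset;
    Setter      setter;
} PropertyInfo;

static const PropertyInfo person_properties[] = {
    { "id",   offsetof(Person, id),   person_set_id   },
    { "name", offsetof(Person, name), person_set_name },
};

/* What gets cached: enough to set the value without looking at the key again. */
typedef struct {
    size_t offset;
    Setter put;
} Accessor;

/* Slow part, done once per key: search by name and remember the results.
 * Returns 0 on success, -1 if the key is unknown. */
static int accessor_bind(Accessor *accessor, const char *key)
{
    size_t count = sizeof person_properties / sizeof person_properties[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(person_properties[i].key, key) == 0) {
            accessor->offset = person_properties[i].offset;
            accessor->put    = person_properties[i].setter;
            return 0;
        }
    }
    return -1;
}

/* Fast part, done per value: a single call through the cached function pointer. */
static void accessor_set(const Accessor *accessor, void *target, const void *value)
{
    accessor->put(target, value);
}
```

After accessor_bind() has run for a key, each accessor_set() costs one indirect call, with no string comparison and no shared table to lock.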

Hooking up the ValueAccessor

Adapting the MPWObjectBuilder (.h .m) to use MPWValueAccessor was much easier than I had expected. The following shows the changes made:
@property (nonatomic, strong) MPWSmallStringTable *accessorTable;

...

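-(void)setupAccessors:(Class)theClass
{
    // Names of the class's instance variables.
    NSArray *ivars=[theClass ivarNames];
    // Drop the first character of each name (presumably the leading underscore).
    ivars=[[ivars collect] substringFromIndex:1];
    NSMutableArray *accessors=[NSMutableArray arrayWithCapacity:ivars.count];
    for (NSString *ivar in ivars) {
        // One accessor per name, resolved against the class once, up front.
        MPWValueAccessor *accessor=[MPWValueAccessor valueForName:ivar];
        [accessor bindToClass:theClass];
        [accessors addObject:accessor];
    }
    // Lookup table from name to its bound accessor.
    MPWSmallStringTable *table=[[[MPWSmallStringTable alloc] initWithKeys:ivars values:accessors] autorelease];
    self.accessorTable=table;
}

-(void)writeObject:anObject forKey:aKey
{
    // Look up the pre-bound accessor and use it on *tos (presumably the object currently being built).
    MPWValueAccessor *accessor=[self.accessorTable objectForKey:aKey];
    [accessor setValue:anObject forTarget:*tos];
}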



The bulk of the changes come as part of the new -setupAccessors: method. It first asks the class for the names of its instance variables, creates a value accessor for each of those names, binds each accessor to the class and finally puts the accessors in a lookup table keyed by name.

The -writeObject:forKey: method is modified to look up and use a value accessor instead of using KVC.

Results

The parsing driver code didn't have to be changed; re-running it on our non-representative 44 MB JSON file yields the following time:

441 ms.

Now we're really starting to get somewhere! This is just shy of 100 MB/s and 10x faster than Swift's JSONDecoder, and within 5% of raw NSJSONSerialization.

Analysis and next steps

Can we do better? Why yes, glad you asked. Let's have a look at the profile.

First thing to note is that object-creation (beginDictionary) is now the #1 entry under the parse, as it should be. This is another indicator that we are not just moving in the right direction, but also closing in on the endgame.

However, there is still room for improvement. For example, although actually searching the SmallStringTable for the ValueAccessor (offsetOfCStringWithLengthInTableOfLength()) takes only 2.7% of the time, about the same as getting the internal char* out of a CFString via the fast-path (CFStringGetCStringPtr()), the total time for the -objectForKey: is a multiple of that, at 13%. This means that unwrapping the NSString takes more time than doing the actual work. Wrapping the char* and length into an NSString also takes significant time, and all of this work is redundant...we would be better off just passing along the char* and length.
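
As a rough illustration of what passing along the char* and length could look like, here is a plain C sketch of a lookup that takes the key as a pointer into the JSON buffer plus a length, so nothing has to be wrapped, copied or NUL-terminated. The names Entry, example_table and index_of_key are made up for this example, and the linear scan is only a placeholder; it is not the MPWSmallStringTable implementation.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *key;
    size_t      length;
} Entry;

/* Hypothetical table of known keys; the index stands in for "which accessor". */
static const Entry example_table[] = {
    { "id",    2 },
    { "name",  4 },
    { "title", 5 },
};

/* Returns the index of the entry matching key[0..length), or -1 if none does.
 * Lengths are compared first, so most mismatches are rejected without
 * looking at the bytes. The key does not need to be NUL-terminated. */
static int index_of_key(const Entry *table, int count, const char *key, size_t length)
{
    for (int i = 0; i < count; i++) {
        if (table[i].length == length && memcmp(table[i].key, key, length) == 0) {
            return i;
        }
    }
    return -1;
}
```

With this shape, a parser that has just scanned a key inside the JSON text can call index_of_key(example_table, 3, start, length) directly on the bytes it is already looking at.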

A similar wrap/unwrap situation occurs with integers, which we first turn into NSNumbers, only to immediately get the integer out again so we can set it.

objc_msgSend() also starts getting noticeable, so looking at a bit of IMP-caching and just eliminating unnecessary indirection also seems like a good idea.

That's another aspect of optimization work: while the occasional big win is welcome, getting to truly outstanding performance means not being satisfied with that, but slogging through all the small-ish seeming detail.

Note

I can help not just Apple, but also you and your company with performance and agile coaching, workshops and consulting. Contact me at info at metaobject.com.
