Monday, January 31, 2011
Objective-XML and MPWFoundation now available on github
Thanks to Todd Blanchard for providing the necessary impetus to learn git.
Tuesday, January 18, 2011
On switching away from CoreData
The issues we had with CoreData were additional complexity and, more importantly, gratuitous dependencies that, at least for our application, were not offset by noticeable benefits.
One of the most significant structural dependencies is that CoreData requires all your model classes to be subclasses of NSManagedObject, a class provided by CoreData. This may not seem like a big problem at first, but it gets in the way of defining a proper DomainModel, which should always be independent. The Java community actually figured this out a while ago, which is why there was a recent move to persistence frameworks supporting POJOs (Plain Old Java Objects). (Of course, POOO doesn't have quite the same ring to it, and the Java frameworks were also a lot more heavy-handed than CoreData.) The model is where your value is; it should be unencumbered. For example, when we started looking at the iPhone, there was no CoreData there, so we faced the prospect of duplicating all our model code.
In addition to initially not having CoreData, the iPhone app also used (and still uses) a completely different persistence mechanism (more feed oriented), and there were other applications where yet a third persistence mechanism was used (more document centric than DB-centric, with an externally defined file format). A proper class hierarchy would have had an abstract superclass without any reference to a specific persistence mechanism, but capturing the domain knowledge of our model. With CoreData, this hierarchy was impossible.
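To make this concrete, here is a minimal sketch of the kind of hierarchy we wanted; Article and its subclasses are hypothetical illustrations, not our actual model:

#import <Foundation/Foundation.h>

// Persistence-independent domain class: a plain NSObject subclass
// with no reference to CoreData or any other storage mechanism.
@interface Article : NSObject
{
    NSString *title;
}
- (NSString *)title;
- (void)setTitle:(NSString *)newTitle;
- (BOOL)isPublishable;   // domain logic lives here, available everywhere
@end

@implementation Article

- (NSString *)title { return title; }

- (void)setTitle:(NSString *)newTitle
{
    [newTitle retain];
    [title release];
    title = newTitle;
}

- (BOOL)isPublishable { return [title length] > 0; }

- (void)dealloc
{
    [title release];
    [super dealloc];
}

@end

// Each application then layers its own persistence on top:
@interface FeedArticle : Article          // feed-oriented iPhone app
@end
@interface DocumentArticle : Article      // document-centric file format
@end

With CoreData, Article would have had to inherit from NSManagedObject instead, dragging the CoreData dependency into every application that only wanted the domain logic.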
Since we had externally defined file formats in every case, we had to write an Atomic Store adapter, and thus also couldn't really benefit from CoreData's change management. When we made the move, it turned out that the Atomic Store adapter we had written was significantly more code than just serializing and de-serializing the XML ourselves.
Another benefit of CoreData is its integration with Bindings, but that also turned out to be of little use to us. The code we managed to save with Bindings was small and trivial, whereas the time and effort to debug bindings when they went wrong or to customize them for slightly specialized needs was very, very large. So we actually ditched Bindings a long time before we got rid of CoreData.
So why was CoreData chosen in the first place? Since I wasn't around for that decision, I don't know 100%, but as far as I can tell it was mostly "Shiny Object Syndrome". CoreData and Bindings were new Apple technologies at the time, therefore they had to be used.
So are there any lessons here? The first would be to avoid Shiny Object Syndrome: by all means have fun and play around, but not in production code. Second, and related: really examine your needs. CoreData is probably highly appropriate in many contexts; it just wasn't in ours. Finally, it would be a huge improvement if CoreData were to support Plain Old Objective-C Objects. In fact, if that were the case, we probably would not have had to ditch it.
Monday, January 10, 2011
Little Message Dispatch
I feel much the same way: although I think Grand Central Dispatch is awesome, I simply haven't been able to justify spending much time with it, because my threading needs so far have been far more modest than what GCD provides. In fact, I find that an approach even more constrained than the NSOperationQueue-based one Brent describes has been working really well in a number of projects.
Instead of queueing up operations and letting them unwind however they may, I just spawn a single I/O thread (at most a few) and have it perform the I/O deterministically. This is paired with a downloader that uses the NSURL loading system to download any number of requests in parallel.
The code below loads three types of objects: first the thumbnails, then the article content, then the images associated with the articles. The sequencing is both deliberate (thumbnails first; article images cannot be loaded before the article content is present) and simply expressed in the code by the well-known means of writing the actions one after the other, rather than encoding those dependencies in callbacks, completion blocks, or NSOperation subclasses.
- (void)downloadNewsContent
{
    id pool = [NSAutoreleasePool new];
    [[self downloader] downloadRequests:[self thumbnailRequests]];
    [[self downloader] downloadRequests:[self contentRequests]];
    [[self downloader] downloadOnlyRequests:[self imageRequests]];
    [pool release];
}
So work is done semi-sequentially in the background, while coordination is done on the main thread, with liberal use of performSelectorOnMainThread:. Of course, I make that a little simpler with a couple of HOMs (Higher Order Messages) that dispatch messages to threads; a sketch of one possible implementation follows the list:
- async runs the message on a new thread; I use it for long-running, intrinsically self-contained work. It is equivalent to performSelectorInBackground:, except that it can take an arbitrary message.
- asyncOnMainThread and syncOnMainThread are the equivalents of performSelectorOnMainThread: with the waitUntilDone: flag set to NO or YES, respectively.
- afterDelay: sends the message after the specified delay.
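Such HOMs are classically built on NSProxy trampolines. Here is a minimal sketch of what -async might look like, assuming manual reference counting as in the code above; AsyncTrampoline and its method names are made up for illustration, and MPWFoundation's actual implementation may well differ:

#import <Foundation/Foundation.h>

// Hypothetical trampoline: captures any message sent to it as an
// NSInvocation and replays it against the real target on a freshly
// spawned thread. Only useful for messages whose return value can
// be ignored.
@interface AsyncTrampoline : NSProxy
{
    id target;
}
+ (id)trampolineWithTarget:(id)aTarget;
@end

@implementation AsyncTrampoline

+ (id)trampolineWithTarget:(id)aTarget
{
    AsyncTrampoline *trampoline = [self alloc];
    trampoline->target = [aTarget retain];
    return [trampoline autorelease];
}

- (NSMethodSignature *)methodSignatureForSelector:(SEL)sel
{
    return [target methodSignatureForSelector:sel];
}

- (void)invokeInBackground:(NSInvocation *)invocation
{
    // A detached thread needs its own autorelease pool.
    NSAutoreleasePool *pool = [NSAutoreleasePool new];
    [invocation invoke];
    [pool release];
}

- (void)forwardInvocation:(NSInvocation *)invocation
{
    [invocation setTarget:target];
    [invocation retainArguments];   // keep arguments alive across threads
    [NSThread detachNewThreadSelector:@selector(invokeInBackground:)
                             toTarget:self
                           withObject:invocation];
}

- (void)dealloc
{
    [target release];
    [super dealloc];
}

@end

// -async then becomes a one-line category on NSObject:
@interface NSObject (AsyncSketch)
- (id)async;
@end

@implementation NSObject (AsyncSketch)
- (id)async
{
    return [AsyncTrampoline trampolineWithTarget:self];
}
@end

With these in place, the actual coordination code becomes almost declarative: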
-(void)loadSections
{
    [[self asyncOnMainThread] showSyncing];
    [[[self sections] do] downloadNewsContent];
    [[self asyncOnMainThread] showDoneSyncing];
}
...
-(IBAction)syncButtonClicked
{
    [[self async] loadSections];
}

Brent sums it up quite well in his post: "Here's the thing about code: the better it is, the more it looks and reads like a children's book."

Yep.
Monday, January 3, 2011
Node.js performance? µhttpd performance!
Of course, there is also a significant body of research on this topic, showing for example that user-level thread implementations tend to get very similar performance to event-based servers. Furthermore, the purity of "no blocking APIs" is somewhat naive on a modern Unix, because blocking on I/O can happen in lots of non-obvious places. At the very least, you may encounter a page fault, and that may even be desirable, for example in order to use memory-mapped files.
In those cases, the fact that you have purified all your APIs makes no difference: you are still blocked on I/O, and if you've completely foregone kernel threads, as node.js appears to do, then your entire server is now blocked!
Anyway, having seen some interesting node.js benchmarking, I was obviously curious to see how my little embedded Objective-C http-server, based on the awesome GNU libmicrohttpd, stacked up.
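The post doesn't include MPWHttp's source, so purely for context, here is roughly what a minimal embedded server on top of libmicrohttpd looks like (libmicrohttpd is a plain C library; the handler name, port, and threading flag below are illustrative, not MPWHttp's actual code):

#include <microhttpd.h>
#include <string.h>
#include <stdio.h>

// Serve the same small static string for every request, as in the
// benchmark below.
static int handleRequest(void *cls, struct MHD_Connection *connection,
                         const char *url, const char *method,
                         const char *version, const char *upload_data,
                         size_t *upload_data_size, void **con_cls)
{
    static const char *page = "Hello!";
    struct MHD_Response *response =
        MHD_create_response_from_buffer(strlen(page), (void *)page,
                                        MHD_RESPMEM_PERSISTENT);
    int ret = MHD_queue_response(connection, MHD_HTTP_OK, response);
    MHD_destroy_response(response);
    return ret;
}

int main(void)
{
    // MHD runs its own select loop on an internal thread; other
    // threading models (e.g. thread-per-connection) are a flag away.
    struct MHD_Daemon *daemon =
        MHD_start_daemon(MHD_USE_SELECT_INTERNALLY, 8080,
                         NULL, NULL, &handleRequest, NULL,
                         MHD_OPTION_END);
    if (daemon == NULL) return 1;
    getchar();              // keep serving until a key is pressed
    MHD_stop_daemon(daemon);
    return 0;
}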
The baseline is a typical static-serving test, where Apache (out-of-the-box configuration on Mac OS X client) serves a small static file and the two app servers serve a small static string.
Platform            | requests/sec
Static (via Apache) | 6651.58
Node.js             | 5793.44
MPWHttp             | 8557.83
Platform            | requests/sec
Static (via Apache) | -
Node.js             | 88.48
MPWHttp             | 47.04
Platform            | requests/sec
Static (via Apache) | -
Node.js             | 9.62
MPWHttp             | 7698.65
The MPWHttp number above is suspiciously close to static-serving speed: the compiler had simply removed the loop. To make the comparison a little bit fairer, I added an xor with a randomly initialized value so that the optimizer could not remove the loop (verified by varying the loop count).
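The post doesn't show the loop itself; as an illustrative sketch of the trick (function name and loop body are hypothetical):

#include <stdlib.h>

// Xor a randomly initialized value with the loop counter so the
// compiler cannot prove the loop body is dead and eliminate it.
unsigned long burnCycles(unsigned long loopCount)
{
    unsigned long result = (unsigned long)random(); // unpredictable start value
    for (unsigned long i = 0; i < loopCount; i++) {
        result ^= i;                                // observable effect per iteration
    }
    return result; // fold into the response so the work stays live
}

With the loop actually executing, the numbers look like this: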
Platform            | requests/sec
Static (via Apache) | -
Node.js             | 9.62
MPWHttp             | 222.9
Cross-checking on my 8-core Mac Pro gave the following results:
Platform            | requests/sec
Static (via Apache) | -
Node.js             | 10.72
MPWHttp             | 1011.86
In conclusion, I think it is fair to say that node.js succeeds admirably in a certain category of tasks: lots of concurrency, lots of blocked I/O, very little computation, and very little memory use, so it doesn't page-fault. In more typical mixes, with some concurrency, some computation, some I/O, and a bit of memory use (and therefore a chance of paging), a more balanced approach may be better.