Friday, December 9, 2011

Ruby (and Rails) scalability?

Recently I wrote about Node.js performance, comparing it to my (still non-public, sorry!) Objective-C based libµhttp wrapper and Apache on my main development machine of the time, a MacBook Pro.

Node.js did really well on tasks that have lots of concurrent requests that are mostly waiting, did OK on basic static serving tasks and not so well on compute-intensive tasks.

Having developed an interest in minimal web-servers, I wondered how Sinatra and, by association, Ruby on Rails would do.

For Sinatra I used the scanty blog engine and the basic "Hello World" example:

require 'sinatra'

get '/hi' do
  "Hello World!"
end

For Ruby on Rails, I used the blog tutorial "out of the box", invoking it with "rails s" to start the server. In addition I also had RoR just serving a static file instead of the database-backed blog. All this on my new dev machine, a 2011 MacBook Air with 1.8 GHz Intel Core i7 and 4 GB of DRAM. I also discovered that httperf is a much better benchmark program for my needs than ab. I used it with 100 requests per connection, a burst length of 100 and a sufficient number of connections to get stable results without taking all day.

Platform                     # requests/sec
Sinatra Hello World          357
Ruby on Rails static page    312
Sinatra scanty blog          101
Ruby on Rails blog           17

This seems really slow, even when doing virtually nothing, and absolutely abysmal when using the typical RoR setup, serving records from an SQL database via templates. I distinctly remember my G5 serving in the thousands of requests/s using WebObjects 6-7 years ago.

Is this considered normal or am I doing something wrong?

Thursday, July 21, 2011

The surprising thing about Objective-C...

Alex Payne, when asked "Are you surprised at the popularity of any current languages?" during the preview of emerging languages for OSCON 2011's emerging languages track:
Alex Payne: I'm constantly surprised at the popularity and success of Objective-C. Almost everyone I know tells the same story about Objective-C: they started learning it and they hated it. They thought it was the worst of C and the worst of dynamic languages. And then eventually, they learned to love it. Most of the time, that software works pretty darn well, so who am I to judge? I'm pleasantly surprised by the continued success of Objective-C, and I think it should be taken as a lesson for the language designers out there.
This is echoed by the first (and as of this writing only) comment to the post:
Alasdair Allan [18 July 2011 10:09 AM] I certainly agree with Alex about Objective-C, when I was initially learning the language I deeply despised it. Now I love it, and think it's one of the more elegant and powerful of the (many) languages I know. Definitely a lesson to language designers, do what you think is right and ignore the crowds. If you are right people will grow to love your language, just as soon as they figure it out.
I actually liked Objective-C pretty much from the start, but then again at that time (1986) there simply wasn't anything close that I had access to, and writing an Objective-C pre-processor and runtime on my Amiga was simply more feasible than a C++ frontend or a complete Smalltalk VM.

Modifying the sentiment expressed slightly, I'd say that from a theoretical point of view, I hate Objective-C and think it's a bad joke, a trainwreck. However, from practical experience, I love it and find it's one of the most productive languages out there for actually building stuff. And no, it's not just about the frameworks, as I've used Objective-C in non-NeXT, non-Apple environments where we had to build most of our own frameworks.

So while I support Alasdair's comment, my lesson for language designers is that our theory appears to not be particularly good at predicting reality. In other words: our theory sucks, er, has many research opportunities.

Wednesday, April 27, 2011

Lazy Initialization

Travis cautions against lazy initialization. Spooky coincidence: I just managed to fix an extremely mysterious memory smasher in an Objective-C program's exception handling code by moving the lazily initialized localization code to app startup. Not sure whether localizing exceptions is really such a good idea in the first place, but having the localization code run inside the exception handling code does seem to be pushing it a bit.

So couldn't agree more.

Saturday, March 12, 2011

Speed matters

Greg Linden recounts Marissa Mayer's talk at Web 2.0 showing how even very small decreases in performance have highly measurable impacts on users, and for web businesses on the bottom line.

The change was an increase from 10 to 30 search results, which was expected to produce an increase in user satisfaction, because users had asked for more search results. Instead, there was a completely unexpected and at first inexplicable 20% drop in traffic after the change was implemented. Only after some time did the team discover that the new results page took half a second longer to display, and in further testing they found that every 100 ms delay caused a measurable drop in clicks.

While I am not aware of similar research on desktop apps, I am sure that the same principle applies: speed matters, a lot; and it matters pre-consciously, that is, long before users will mention speed as an issue.

Thursday, February 17, 2011

The experienced craftsman plans less

Christopher Alexander via 37signals:
The essence of this process is very fundamental indeed. We may understand it best by comparing the work of a fifty-year-old carpenter with the work of a novice. The experienced carpenter keeps going. He doesn’t have to keep stopping, because every action he performs, is calculated in such a way that some later action can put it right to the extent that it is imperfect now. What is critical here, is the sequence of events. The carpenter never takes a step which he cannot correct later; so he can keep working, confidently, steadily.

The novice by comparison, spends a great deal of his time trying to figure out what to do. He does this essentially because he knows that an action he takes now may cause unretractable problems a little further down the line; and if he is not careful, he will find himself with a joint that requires the shortening of some crucial member – at a stage when it is too late to shorten that member. The fear of these kinds of mistakes forces him to spend hours trying to figure ahead: and it forces him to work as far as possible to exact drawings because they will guarantee that he avoids these kinds of mistakes.

The difference between the novice and the master is simply that the novice has not learnt, yet, how to do things in such a way that he can afford to make small mistakes. The master knows that the sequence of his actions will always allow him to cover his mistakes a little further down the line. It is this simple but essential knowledge which gives the work of a master carpenter its wonderful, smooth, relaxed, and almost unconcerned simplicity.

Mac App Store won't let me buy apps: solution

Just tried to buy an app via the Mac App Store and it was absolutely refusing to take my money. Various suggestions I've seen on the web such as clearing caches, resetting them via iTunes advanced preferences, rebooting, retrying, using slight variations of my account name all made no difference whatsoever.

The solution turned out to be manually signing in using the Store menu (manually sign out if you are already signed in). At that point I was allowed to update/verify my billing information and subsequent purchase attempts worked.

In previous attempts, I had not signed in manually, but rather had the App Store do the sign-in after I attempted to purchase.

So needs a little more work...

Train wreck management

I wish the article Train Wreck Management by Poppendieck didn't strike such a chord.

Tuesday, February 15, 2011

Only 1 GHz

Tim Bray wonders (well, wondered a while ago) about the excellent perceived performance of the iPad at 'only' 1 GHz.


1 GHz actually seems like quite a lot to me. 1000 times an Apple ][, 40 times a NeXT, and the latter was driving a megapixel display. I guess we have gotten used to wasted cycles.


One thing that's been tripping me up a bit when writing code that's supposed to be portable between iOS and Cocoa is the removal of NSPoint, NSSize, NSRect and their associated functions from Foundation in iOS. This is a real shame, because otherwise the Foundations are highly compatible.

One way to rectify this situation would be to start using CG* structs and functions on the desktop as well. However, this introduces a dependency on CoreGraphics that shouldn't be there for Foundation-based code.

My alternative is to standardize on NSPoint and friends, and map those to their CG alternatives on iOS. That way, I have minimized my dependencies, with only a small header file to pay for it: PhoneGeometry.h

This is now part of MPWFoundation (on github).

//  PhoneGeometry.h
//  MPWFoundation
//  Created by Marcel Weiher on 11/11/10.
//  Copyright 2010-2011 Marcel Weiher. All rights reserved.

#import <TargetConditionals.h>

// Only needed on iOS, where Foundation does not provide the NS geometry types.
#if TARGET_OS_IPHONE
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Map the NS types and functions onto their CoreGraphics equivalents.
typedef CGRect NSRect;
typedef CGPoint NSPoint;
typedef CGSize NSSize;
#define NSMakeRect  CGRectMake
#define NSMakePoint CGPointMake
#define NSMakeSize  CGSizeMake
#define NSEqualPoints  CGPointEqualToPoint
#define NSEqualRects   CGRectEqualToRect
#define NSIntersectsRect  CGRectIntersectsRect
static inline NSString *NSStringFromRect( CGRect r ) { return [NSString stringWithFormat:@"(%g,%g - %g,%g)",r.origin.x,r.origin.y,r.size.width,r.size.height]; }
static inline NSString *NSStringFromPoint( CGPoint p ) { return [NSString stringWithFormat:@"(%g,%g)",p.x,p.y]; }
static inline NSString *NSStringFromSize( CGSize s ) { return [NSString stringWithFormat:@"(%g,%g)",s.width,s.height]; }
#endif


Tuesday, February 1, 2011

Objective-XML and MPWFoundation now available on github

By popular demand, both Objective-XML and MPWFoundation are now available on github. Which means I am finally learning git. So far it looks very nice and I am impressed with the performance.

Thanks to Todd Blanchard for providing the necessary impetus to learn git.

Tuesday, January 18, 2011

On switching away from CoreData

Like Brent Simmons, I have a project where I am currently in the process of switching away from CoreData. Unlike Brent, and somewhat surprisingly given my proclivities, the reason is not performance.

Rather, the issues we have had with CoreData were additional complexity and, more importantly, gratuitous dependencies that, at least for our application, were not offset by noticeable benefits.

One of the most significant structural dependencies is that CoreData requires all your model classes to be subclasses of NSManagedObject, a class provided by CoreData. This may not seem like a big problem at first, but it gets in the way of defining a proper DomainModel, which should always be independent. The Java community actually figured this out a while ago, which is why there was a recent move to persistence frameworks supporting POJOs. (Of course, POOO doesn't have quite the same ring to it, and also the Java frameworks were a lot more heavy-handed than CoreData). The model is where your value is, it should be unencumbered. For example, when we started looking at the iPhone, there was no CoreData there, so we faced the prospect of duplicating all our model code.

In addition to initially not having CoreData, the iPhone app also used (and still uses) a completely different persistence mechanism (more feed oriented), and there were other applications where yet a third persistence mechanism was used (more document centric than DB-centric, with an externally defined file format). A proper class hierarchy would have had an abstract superclass without any reference to a specific persistence mechanism, but capturing the domain knowledge of our model. With CoreData, this hierarchy was impossible.

Since we had externally defined file formats in every case, we had to write an Atomic Store adapter and thus also couldn't really benefit from CoreData's change management. When we did the move, it turned out that the Atomic Store adapter we had written was significantly more code than just serializing and de-serializing the XML ourselves.

Another benefit of CoreData is its integration with Bindings, but that also turned out to be of little use to us. The code we managed to save with Bindings was small and trivial, whereas the time and effort to debug bindings when they went wrong or to customize them for slightly specialized needs was very, very large. So we actually ditched Bindings a long time before we got rid of CoreData.

So why was CoreData chosen in the first place? Since I wasn't around for that decision, I don't know 100%, but as far as I can tell it was mostly "Shiny Object Syndrome". CoreData and Bindings were new Apple technologies at the time, therefore they had to be used.

So are there any lessons here? The first would be to avoid Shiny Object Syndrome. By all means have fun and play around, but not in production code. Second and related is to really examine your needs. CoreData is probably highly appropriate in many contexts, it just wasn't in ours. Finally, it would be a huge improvement if CoreData were to support Plain Old Objective-C Objects. In fact, if that were the case we probably would not have had to ditch it.

Monday, January 10, 2011

Little Message Dispatch

Brent Simmons's recent notes on threading show a great, limited approach to threading that appears to work well in practice. If you haven't read it and are at all interested in threading on OS X or iOS, I suggest you head over there right now.

I feel much the same way: although I think Grand Central Dispatch is awesome, I simply haven't been able to justify spending much time with it, because my own threading needs so far have usually turned out to be far more modest than what GCD provides. In fact, I find that an approach that's even more constrained than the one based on NSOperationQueue that Brent describes has been working really well in a number of projects.

Instead of queueing up operations and letting them unwind however, I just spawn a single I/O thread (at most a few) and then have that perform the I/O deterministically. This is paired with a downloader that uses the NSURL loading system to download any number of requests in parallel.

- (void)downloadNewsContent
{
        id pool=[NSAutoreleasePool new];
        [[self downloader] downloadRequests:[self thumbnailRequests]];
        [[self downloader] downloadRequests:[self contentRequests]];
        [[self downloader] downloadOnlyRequests:[self imageRequests]];
        [pool release];
}

This loads 3 types of objects: first the thumbnails, then article content, then images associated with the articles. The sequencing is both deliberate (thumbs first, article images cannot be loaded before the article content is present) and simply expressed in the code by the well-known means of just writing the actions one after the other, rather than having those dependencies expressed in call-backs, completion blocks or NSOperation subclasses.
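For readers without Cocoa at hand, the same idea can be sketched in plain C with pthreads. This is only an illustration under stated assumptions: SyncTrace, run_phase and sync_in_background are hypothetical names, and the phases merely record that they ran instead of downloading anything.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the three download phases. */
enum { PHASE_THUMBNAILS, PHASE_CONTENT, PHASE_IMAGES, PHASE_COUNT };

typedef struct {
    int order[PHASE_COUNT];  /* phases in the order they actually ran */
    int done;                /* number of phases completed so far */
} SyncTrace;

/* A real phase would block here until its downloads have finished. */
static void run_phase(SyncTrace *trace, int phase) {
    trace->order[trace->done++] = phase;
}

/* Thread entry point: the dependencies between the phases are expressed
   purely by the order of the statements. */
static void *download_news_content(void *arg) {
    SyncTrace *trace = arg;
    run_phase(trace, PHASE_THUMBNAILS);
    run_phase(trace, PHASE_CONTENT);
    run_phase(trace, PHASE_IMAGES);
    return NULL;
}

/* Spawns the single background thread and waits for it to finish.
   Returns 0 on success, -1 on failure. An interactive app would not
   join on the main thread; it would have the worker report back instead. */
static int sync_in_background(SyncTrace *trace) {
    pthread_t worker;
    trace->done = 0;
    if (pthread_create(&worker, NULL, download_news_content, trace) != 0) {
        return -1;
    }
    return pthread_join(worker, NULL) == 0 ? 0 : -1;
}
```

Calling sync_in_background with a fresh SyncTrace should leave the three phases recorded in the order they were written, with no callbacks or dependency objects involved.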

So work is done semi-sequentially in the background, while coordination is done on the main thread, with liberal use of performSelectorOnMainThread. Of course, I make that a little simpler with a couple of HOMs that dispatch messages to threads:

  • async runs the message on a new thread; I use it for long-running, intrinsically self-contained work. It is equivalent to performSelectorInBackground: except for being able to take an arbitrary message.
  • asyncOnMainThread and syncOnMainThread are the equivalents of performSelectorOnMainThread, with the waitUntilDone flag set to NO and YES, respectively
  • afterDelay: sends the message after the specified delay

Here is a bit of code that shows how to dispatch a long-running thread and have it communicate status to the main thread.

-(void)loadSections {
	[[self asyncOnMainThread] showSyncing];
	[[[self sections] do] downloadNewsContent];
	[[self asyncOnMainThread] showDoneSyncing];
}

-(IBAction)syncButtonClicked {
	[[self async] loadSections];
}

Brent sums it up quite well in his post:
Here’s the thing about code: the better it is, the more it looks and reads like a children’s book.

Tuesday, January 4, 2011

Node.js performance? µhttpd performance!

There's been a lot of hoopla recently about node.js. Being an object-head, I've always liked the idea of reactive (event-driven) web servers, after all, that means it's just like a typical object, sitting there waiting for something to happen and then reacting to it.

Of course, there is also a significant body of research on this topic, showing for example that user-level thread implementations tend to get very similar performance to event-based servers. There is also the issue that the purity of "no blocking APIs" is somewhat naive on a modern Unix, because blocking on I/O can happen in lots of different non-obvious places. At the very least, you may encounter a page-fault, and this may even be desirable in order to use memory mapped files.

In those cases, the fact that you have purified all your APIs makes no difference, you are still blocked on I/O, and if you've completely foregone kernel threads like node.js appears to do, then your entire server is now blocked!
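To make the page-fault point concrete, here is a minimal C sketch that reads a file through a memory mapping. The function name sum_mapped_file is my own invention for illustration. The thing to notice is that the loop contains no I/O call at all, yet any of its memory reads may fault and wait for the disk.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sums all bytes of a file via mmap().
   Returns 0 on success and -1 on error (including an empty file). */
static int sum_mapped_file(const char *path, unsigned long *sum) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    struct stat info;
    if (fstat(fd, &info) != 0 || info.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t length = (size_t)info.st_size;
    const unsigned char *bytes =
        mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (bytes == MAP_FAILED) {
        return -1;
    }

    unsigned long total = 0;
    for (size_t i = 0; i < length; i++) {
        /* A plain memory read: no blocking API in sight, but the first
           touch of each page may page-fault and block on disk I/O. */
        total += bytes[i];
    }

    munmap((void *)bytes, length);
    *sum = total;
    return 0;
}
```

In a single-threaded event loop, a handler calling something like this would stall every other connection for as long as the faults take to resolve.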

Anyway, having seen some interesting node.js benchmarking, I was obviously curious to see how my little embedded Objective-C http-server based on the awesome GNU microhttp stacked up.

The baseline is a typical static serving test, where Apache (out-of-the-box configuration on Mac OS X client) serves a small static file and the two app servers serve a small static string.

Platform               # requests/sec
Static (via Apache)    6651.58
Node.js                5793.44
MPWHttp                8557.83

The sleep(2) example showed node.js at its best. Here, each request sleeps for 2 seconds before returning a small result.

Platform               # requests/sec
Static (via Apache)    -
Node.js                88.48
MPWHttp                47.04

The compute example is where MPWHTTP shines. The task is trivial, just counting up from 1 to 10000000 (ten million).

Platform               # requests/sec
Static (via Apache)    -
Node.js                9.62
MPWHttp                7698.65

So counting up, libµhttp with MPWHTTP is roughly 800 times faster? The reason is of course that such a simple task is taken care of by strength reduction in the optimizer, which replaces the loop of 10 million increments with a single addition of 10 million. Cheating? On a benchmark, probably, but on the other hand that's the sort of benefit you get from a good optimizing compiler.

To make the comparison a little bit more fair, I added an xor with a randomly initialized value so that the optimizer could not remove the loop (verified by varying the loop count).
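The handler code itself isn't shown here, so the following is only a sketch of the kind of loop I mean, in plain C. The names count_up and count_up_xor and the exact way the xor is mixed in are assumptions for illustration, not the benchmarked code.

```c
#include <stdint.h>

/* Plain counting loop. An optimizing compiler can usually collapse this
   into a single assignment, so timing it measures almost nothing. */
static uint64_t count_up(uint64_t n) {
    uint64_t total = 0;
    for (uint64_t i = 0; i < n; i++) {
        total += 1;
    }
    return total;
}

/* Same loop, but each step is mixed with a value that is only known at
   run time (for example one taken from random() at startup). Mixing
   addition with xor should keep the optimizer from finding a closed
   form, so the loop really has to execute n times. */
static uint64_t count_up_xor(uint64_t n, uint64_t seed) {
    uint64_t total = 0;
    for (uint64_t i = 0; i < n; i++) {
        total = (total + 1) ^ seed;
    }
    return total;
}
```

Whether the loop survives optimization depends on the compiler and its flags, so it is worth checking the same way as above: vary n and confirm that the running time scales with it.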

Platform               # requests/sec
Static (via Apache)    -
Node.js                9.62
MPWHttp                222.9

So still more than 20 times faster. It was also using both cores of my MacBook Pro, whereas node.js was limited to 1 core (so roughly a 10x single-core speed difference).

Cross-checking on my 8 core Mac Pro gave the following results:

Platform               # requests/sec
Static (via Apache)    -
Node.js                10.72
MPWHttp                1011.86

Due to utilizing the available cores, MPWHTTP/µhttp is now almost 100 times faster than node.js on the compute-bound task.

In conclusion, I think it is fair to say that node.js succeeds admirably in a certain category of tasks: lots of concurrency, lots of blocked I/O, very little computation, very little memory use so we don't page fault. In more typical mixes with some concurrency, some computation, some I/O and a bit of memory use (so chances of paging), a more balanced approach may be better.