Tuesday, January 4, 2011

Node.js performance? µhttpd performance!

There's been a lot of hoopla recently about node.js. Being an object-head, I've always liked the idea of reactive (event-driven) web servers, after all, that means it's just like a typical object, sitting there waiting for something to happen and then reacting to it.

Of course, there is also a significant body of research on this topic, showing for example that user-level thread implementations tend to get very similar performance to event-based servers. There is also the issue that the purity of "no blocking APIs" is somewhat naive on a modern Unix, because blocking on I/O can happen in lots of different non-obvious places. At the very least, you may encounter a page-fault, and this may even be desirable in order to use memory mapped files.

In those cases, the fact that you have purified all your APIs makes no difference, you are still blocked on I/O, and if you've completely foregone kernel threads like node.js appears to do, then your entire server is now blocked!

Anyway, baving seen some interesting node.js benchmarking, I was obviously curious to see how my little embedded Objective-C http-server based on the awesome GNU microhttp stacked up.

The baseline is a typical static serving test, where Apache (out-of-the box configuration on Mac OS X client) serves a small static file and the two app servers serve a small static string.

Platform # requests/sec
Static (via Apache) 6651.58
Node.js 5793.44
MPWHttp 8557.83
The sleep(2) example showed node.js at it's best. Here, each requests sleeps for 2 seconds before returning a small result.
Platform # requests/sec
Static (via Apache) -
Node.js 88.48
MPWHttp 47.04
The compute example is where MPWHTTP shines. The task is trivial, just counting up from 1 to 10000000 (ten million).
Platform # requests/sec
Static (via Apache) -
Node.js 9.62
MPWHttp 7698.65
So counting up, libµhttp with MPWHTTP is almost a thousand times faster? The reason is of course that such a simple task is taken care of by strength reduction in the optimizer, which replaces the loop 10 million increments with a single addition of 10 million. Cheating? On a benchmark, probably, but on the other hand that's the sort of benefit you get from a good optimizing compiler.

To make the comparison a little bit more fair, I added an xor with a randomly initialized value so that the optimizer could not remove the loop (verified by varying the loop count).

Platform # requests/sec
Static (via Apache) -
Node.js 9.62
MPWHttp 222.9
So still around 20 times faster. It was also using both cores of my Mac Book Pro, whereas node.js was limited to 1 core (so 10x single core speed difference).

Cross-checking on my 8 core Mac Pro gave the following results:

Platform # requests/sec
Static (via Apache) -
Node.js 10.72
MPWHttp 1011.86
Due to utilzing the available cores, MPWHTTP/µhttp is now 100 times faster than node.js on the compute-bound task.

In conclusion, I think it is fair to say that node.js succeeds admirably in a certain category of tasks: lots of concurrency, lots of blocked I/O, very little computation, very little memory use so we don't page fault. In more typical mixes with some concurrency, some computation some I/O and a bit of memory use (so chances of paging), a more balanced approach may be better.

2 comments: