There's been a lot of hoopla recently about node.js. Being an object-head, I've always liked the idea of reactive (event-driven) web servers. After all, that means the server is just like a typical object, sitting there waiting for something to happen and then reacting to it.
Of course, there is also a significant body of research on this topic,
showing for example that user-level thread implementations tend to
get very similar performance to event-based servers. There is also
the issue that the purity of "no blocking APIs" is somewhat naive on
a modern Unix, because blocking on I/O can happen in lots of different
non-obvious places. At the very least, you may encounter a page fault,
and this may even be desirable in order to use memory-mapped files.
In those cases, the fact that you have purified all your APIs makes
no difference: you are still blocked on I/O, and if you've completely
forgone kernel threads, as node.js appears to do, then your entire
server is now blocked!
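To make the page-fault point concrete, here is a minimal sketch in Python (my choice for illustration, not what either server uses). The helper names `make_sample_file` and `touch_each_page` are hypothetical. The memory read inside the loop looks like plain indexing, yet the kernel may have to fetch the page from disk first, and no non-blocking API can intercept that.

```python
import mmap
import os
import tempfile


def make_sample_file(pages: int) -> str:
    """Write a file of `pages` memory pages, every byte set to 1, and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\x01" * (mmap.PAGESIZE * pages))
        return f.name


def touch_each_page(path: str) -> int:
    """Read the first byte of every page of a memory-mapped file and sum them."""
    total = 0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        for offset in range(0, len(mapped), mmap.PAGESIZE):
            # No read() call here: if the page is not resident, the thread
            # simply stalls in a page fault until the kernel has loaded it.
            total += mapped[offset]
    return total
```

In a single-threaded event loop, a stall on that indexing line would hold up every other connection until the page arrives.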
Anyway, having seen some interesting node.js benchmarking, I was obviously curious to see how
my little embedded Objective-C HTTP server, based on the awesome GNU libmicrohttpd, stacked up.
The baseline is a typical static serving test, where Apache
(out-of-the-box configuration on Mac OS X client)
serves a small static file and the two app servers serve a small
static string.
Platform | # requests/sec |
Static (via Apache) | 6651.58 |
Node.js | 5793.44 |
MPWHttp | 8557.83 |
The sleep(2) example showed node.js at its best. Here,
each request sleeps for 2 seconds before returning a
small result.
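Why an event loop does well on this kind of workload can be sketched with Python's asyncio standing in for the node.js event loop. This is an illustration only, not the benchmark code, and `handle_request` and `serve_batch` are hypothetical names. All the sleeping requests overlap on one thread, so a batch takes about one delay in total rather than one delay per request.

```python
import asyncio
import time


async def handle_request(delay: float) -> str:
    """Pretend to be a request handler that waits `delay` seconds, then replies."""
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking the thread
    return "ok"


async def serve_batch(count: int, delay: float):
    """Run `count` sleeping handlers concurrently; return their replies and the elapsed time."""
    start = time.perf_counter()
    replies = await asyncio.gather(*(handle_request(delay) for _ in range(count)))
    return replies, time.perf_counter() - start
```

For example, `asyncio.run(serve_batch(100, 2.0))` should finish in a little over 2 seconds, not 200, because none of the handlers does any real work while waiting.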
Platform | # requests/sec |
Static (via Apache) | - |
Node.js | 88.48 |
MPWHttp | 47.04 |
The compute example is where MPWHTTP shines. The task is trivial,
just counting up from 1 to 10000000 (ten million).
Platform | # requests/sec |
Static (via Apache) | - |
Node.js | 9.62 |
MPWHttp | 7698.65 |
So counting up, libµhttp with MPWHTTP is roughly 800 times faster? The reason is of course that such a simple task is taken care of by
strength reduction in the optimizer, which replaces the loop's 10 million increments with a single addition of 10 million. Cheating? On a benchmark, probably, but
on the other hand that's the sort of benefit you get from a good optimizing compiler.
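What the optimizer does here can be modeled as follows. This is a sketch in Python purely to show the two equivalent forms. Python itself performs no such optimization, and the function names are hypothetical rather than taken from the benchmark.

```python
def count_up(limit: int) -> int:
    """The loop as written in the benchmark handler: one increment per iteration."""
    total = 0
    for _ in range(limit):
        total += 1
    return total


def count_up_reduced(limit: int) -> int:
    """What an optimizing compiler can turn the loop into: a single addition."""
    return 0 + max(limit, 0)
```

Both functions return the same value for any `limit`, but the second one costs the same whether `limit` is ten or ten million, which is why the compiled handler barely notices the "work".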
To make the comparison a little bit more fair, I added an xor with
a randomly initialized value so that the optimizer could not remove
the loop (verified by varying the loop count).
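One possible shape of that modified loop, again sketched in Python: the exact expression used in the benchmark is not shown in this post, so the placement of the xor and the names `count_up_xor` and `mask` are assumptions.

```python
import random


def count_up_xor(limit: int, mask: int) -> int:
    """Count up while mixing in `mask` on every iteration.

    Because `mask` is only known at run time, a compiler cannot simply
    substitute the plain closed form for the loop, as it can for bare counting.
    """
    total = 0
    for _ in range(limit):
        total = (total + 1) ^ mask
    return total


def random_mask() -> int:
    """Pick the run-time value; a 32-bit range is an arbitrary choice for this sketch."""
    return random.getrandbits(32)
```

Varying the loop count and checking that the run time scales with it, as described above, is a reasonable way to confirm that the loop really survived optimization.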
Platform | # requests/sec |
Static (via Apache) | - |
Node.js | 9.62 |
MPWHttp | 222.9 |
So still more than 20 times faster. It was also using both cores of my
MacBook Pro, whereas node.js was limited to 1 core (so roughly a 10x single-core
speed difference).
Cross-checking on my 8-core Mac Pro gave the following results:
Platform | # requests/sec |
Static (via Apache) | - |
Node.js | 10.72 |
MPWHttp | 1011.86 |
Due to utilizing the available cores, MPWHTTP/µhttp is now
almost 100 times faster than node.js on the compute-bound task.
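For anyone who wants to check the ratios, here is the arithmetic on the figures from the compute tables above. The function name `speedup` is mine, and the per-core number assumes the work was spread evenly over both laptop cores.

```python
def speedup(fast_rps: float, slow_rps: float) -> float:
    """Ratio of two requests-per-second figures."""
    return fast_rps / slow_rps


# Requests/sec taken from the tables (MPWHttp first, Node.js second).
optimized_away = speedup(7698.65, 9.62)   # loop removed by the optimizer
laptop_xor = speedup(222.9, 9.62)         # xor variant, 2-core laptop
laptop_per_core = laptop_xor / 2          # assumes both cores were evenly used
mac_pro_xor = speedup(1011.86, 10.72)     # xor variant, 8-core Mac Pro
```

These come out at roughly 800, 23, 12 and 94 respectively.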
In conclusion, I think it is fair to say that node.js succeeds
admirably in a certain category of tasks: lots of concurrency,
lots of blocked I/O, very little computation, very little memory
use so we don't page fault. In more typical mixes with some
concurrency, some computation, some I/O, and a bit of memory use
(so chances of paging), a more balanced approach may be better.