Saturday, May 12, 2012

Found another reason for code signing to fail

I usually build only in Release mode, even when developing, both because of potential differences between Release and Debug modes and because of difficulties with dependencies in Xcode. I was always a little worried that this would bite me, especially with iOS, because there Release means building for distribution, right?

Surprisingly, it never did, until I installed an Xcode upgrade just now. Suddenly, my in-development iPad app stopped building for my attached device (development mode). The error was:

"warning: Application failed codesign verification. The signature was invalid, contains disallowed entitlements, or it was not signed with an iPhone Distribution Certificate. (-19011)"
After going through all the usual certificate song-and-dance routines, I finally figured out that a flag was (now?) set in the build settings: "Validate Built Product", which was set to "Yes" for "Release" and "No" for "Debug".

Never having seen it before, I can't say whether the setting itself had changed or something in the environment had, but setting it to "No" for "Release" as well fixed the problem.
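
For reference, that checkbox corresponds to the VALIDATE_PRODUCT build setting. If you manage your settings in xcconfig files, the workaround would look something like this (a sketch; VALIDATE_PRODUCT is the real setting name, the file name is just an assumption):

// Release.xcconfig (hypothetical file name)
// Skip codesign validation of the built product in Release builds,
// since it was rejecting development builds for an attached device.
VALIDATE_PRODUCT = NO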

Sunday, April 8, 2012

Why Testing Matters for Performance

While performance certainly matters for testing, because fast tests don't break the feedback cycle that is so important to being agile (rather than "Doing Agile", which just isn't the same thing), the reverse is also true: with a trusted set of unit tests, you are much more likely to show the courage needed to make bold changes to working code in order to make it go faster.

So performance and testing are mutually dependent.

Friday, March 30, 2012

Why Performance Matters for Testing

While speed certainly matters in a production/e-commerce environment, I think it is also a significant factor for unit testing and especially TDD. While my tests haven't been tuned and certainly aren't the fastest in the world, my frameworks test themselves in around 3 seconds, using at present pretty much exactly 1000 tests.

That level of performance makes testing qualitatively different from having tests that run in 7 minutes, down from 15 after the serious performance tuning described in the article, or in 5 minutes, down from 10. It means running the unit tests can be a normal part of the edit-compile-run cycle, rather than a separate activity, supporting the idea that the tests are simply an intrinsic part of the code.

These are not lightweight tests: they do things like set up and tear down PostScript interpreters and run PDF and PostScript documents through text-extraction engines. However, they run almost exclusively in memory and mostly in a compiled language.

However, even 3 seconds is still too long a delay: feedback should be instantaneous to give a true interactive programming experience, or at least not get in the way of that experience. I saw a nice approach to this at the Hasso Plattner Institute, which uses code-coverage analysis to interactively run tests sorted by relevance to the code being edited (in Smalltalk). A simpler approach might be to just run the unit tests in the background while editing, as sketched below.
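
A minimal sketch of that simpler approach as a polling shell loop (the "make test" command and the *.m/*.h patterns are assumptions; substitute whatever your project actually uses):

#!/bin/sh
# Re-run the test suite in the background whenever a source file changes.
touch .last-test-run
while true; do
  if [ -n "$(find . -name '*.[mh]' -newer .last-test-run)" ]; then
    touch .last-test-run
    make test
  fi
  sleep 1
done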

Wednesday, March 28, 2012

Three Kinds of Agile

William Edwards writes that Agile Is a Sham. He got quite a few intelligent comments, including some pointing out that the values he espouses are exactly those of the Agile Manifesto:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

In my working life, I have encountered "Agile" three times. The most recent was at a startup based in Oakland. It was Scrum, it was imposed from the top, a tool was introduced first in order to do agile tracking and planning, and none of the actual technical practices was ever seriously considered. In short, exactly the kind of sham that William complains about. It had zero benefit.

Before that, we had two groups at BBC News Interactive starting to do their respective interpretations of XP. One did the simplest thing that could possibly work, for example never getting around to putting in the database, and did it in a test-first style. It didn't do standup meetings, paired at times, and every once in a while did a planning game. This team replaced a system that failed almost every single day and had abysmal performance despite running on 12 machines with one that had two failures in 3 years and performed 100-1000 times better on a single machine. The second team did standup meetings and planning games, but never managed to adopt the technical practices such as TDD or DTSTTCPW/YAGNI. This team failed to deliver, and the project was canceled/reset.

Finally, I first encountered what I would now recognize as XP before I knew that XP existed: hard-core YAGNI/DTSTTCPW, mixing pairing and working alone both organically and usefully. We introduced unit testing later, and I became an immediate convert, having been responsible for shipping this product for some time and having felt the uncertainty over the question "when will it be ready?". I later discovered that a bundle of the practices we as coders had worked out for ourselves was known as XP, and I eagerly learned what I could from the XP community's experience.

So is Agile a sham? Maybe the Dadaist manifesto put it best: if you are against this manifesto, you are a Dadaist!

Tuesday, March 27, 2012

CocoaHeads Berlin Performance Talk

Last Wednesday, March 21st 2012, I gave a talk on Cocoa Performance at the monthly Berlin CocoaHeads meeting. Thanks to everyone for the kind reception and for benignly overlooking the fact that the end of the talk was a bit hurried.

Although the slides were primarily there to support the talk and do not necessarily stand on their own, they are now available online by popular request (2.3MB, Keynote).

And a PDF version (6.3MB).

30k requests/s, aka wrk is fast

I just discovered wrk, a small and very fast HTTP load-testing tool. My previous experiments with the µhttp-based MPWSideWeb first maxed out at around 5K requests per second, and after switching to httperf, I got to around 10K per second. Using wrk on the same machine as before, I now get this:
marcel@nomad[~]wrk  -c 100 -r 300000 http://localhost:8082/hi
Making 300000 requests to http://localhost:8082/hi
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.60ms    2.51ms  11.47ms   62.40%
    Req/Sec    14.98k     0.99k   16.00k    52.80%
  300002 requests in 9.71s, 21.46MB read
Requests/sec:  30881.96
Transfer/sec:      2.21MB
marcel@nomad[~]curl http://localhost:8082/hi
Hello World!


So that's a nice 30K requests per second on a MacBook Air, and wrk was only running at around 60% CPU, whereas httperf tended to be pegged at 100%. The web server in question is a minimal server in the style of sinatra.rb, set up to return a simple "Hello World!":
marcel@localhost[scripts]cat memhttpserver.stsh 
#!/usr/local/bin/stsh
context loadFramework:'MPWSideWeb'
server := MPWHTTPServer new.
server setPort: 8082.
stdout println: 'memhttpserver listening on port ',server port stringValue.
server start:nil.

scheme:base := MPWSiteMap scheme.
base:/hi := 'Hello World!'.
server setDelegate: scheme:base .

shell runInteractiveLoop

marcel@localhost[scripts]./memhttpserver.stsh 
memhttpserver listening on port 8082
>
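
For comparison, the earlier httperf numbers came from invocations roughly along these lines (a sketch; the exact connection and call counts are not recorded, so these parameters are assumptions):

httperf --server localhost --port 8082 --uri /hi --num-conns 100 --num-calls 1000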

So I definitely have a new favorite http performance tester!

30k entries (II), aka computers have RAM, and they can do I/O, too...

Let's assume a document storage system with a maximum working set of 30K documents. Let's also assume we want to store some tags, maybe 10 per document, encoded as 32-bit integers (8 bits of tag type, 24 bits of tag value used as an index). That would be:
    30K documents x 10 tags/document x 4 bytes/tag = 
                           300K tags x 4 bytes/tag =
                                      1200 K bytes = 1.2 MB
Even assuming 2:1 bloat due to overhead gives us 2.4 MB, which doesn't just fit comfortably into the RAM of a modern computer or a cellphone, it actually fits comfortably into the L3 cache of an Intel Core i7, which at 8MB or more leaves over 5MB to spare.
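
Packing and unpacking such a tag is trivial; a quick sketch in shell arithmetic (the 8/24-bit split is the one assumed above, the concrete values are made up):

# pack an 8-bit type and a 24-bit value into one 32-bit tag
type=3; value=123456
tag=$(( (type << 24) | (value & 0xFFFFFF) ))
printf 'tag = 0x%08x\n' "$tag"                      # tag = 0x0301e240
# and unpack it again
printf 'type=%d value=%d\n' $(( tag >> 24 )) $(( tag & 0xFFFFFF ))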

What about getting that data into RAM? The slowest hard drives (non-SSD) I could find using a quick web search had a transfer rate of better than 48MB/s and a seek time of around 10ms, so the 2.4MB in question should be in memory in around:

 10ms + 2.4MB / (48MB/s) = 
           10ms + 0.05 s =
           10ms +  50 ms =  60 ms
So it takes less than 1/10th of a second to read it all in, and a moderately fast SSD reduces that to around 10ms.
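
If you want to play with the assumptions, the whole estimate fits in two lines of bc (figures as assumed above: 10ms seek, 48MB/s transfer, 2:1 bloat):

echo '30000 * 10 * 4' | bc                    # raw tag bytes: 1200000
echo 'scale=3; 0.010 + (2 * 1.2) / 48' | bc   # total seconds: .060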

EDIT: fixed embarrassing typo (L1 -> L3 cache).