Wednesday, May 28, 2014

Why I don't mock

Well, it's impolite, isn't it? But seriously, when I first heard about mock object testing, I was excited, because it certainly sounded like The Right Thing™: message-based, checking relationships instead of state, and the new hip thing.

However, when I looked at actual examples, they looked sophisticated and obscure, the opposite of what I feel unit tests should be: obvious and simple, simplistic to the point of stupidity. I couldn't figure out at a glance what the expected behavior was, what was being tested and what was environment.

So I never used mocks in practice, meaning my opinions could not go beyond being superficial. Fortunately, I was given the task of porting a fairly large Objective-C project to OS X (yes, you read that right: "to OS X"), and it was heavily mock-tested.

As far as I could tell, most of the vague premonitions I had about mock testing were borne out in that project: obscure mock tests, mock tests that didn't actually test anything except their own expectations and mock tests that were deeply coupled to implementation details.

Again, though, that could just be my misunderstanding. Certainly, people for whom I have a great deal of respect advocate for mock tests. But I was heartened when I heard in the recent DHH/Fowler/Beck TDD death-matches, er, friendly conversations that neither Kent nor Martin is a great fan of mocking, and certainly not of deeply nested mocks.

However, it was DHH's comments that finally made me realize that what really bothered me was something more subtle, and much more pervasive. The talk is about "mocking the database", or mocking some other component. While not proof positive, this kind of mocking seems indicative of not letting the tests drive the design towards simplicity, because the design is already set in stone.

As a result, you're going to have constant pain, because the tests will continuously try to drive you towards simplifying your design, which you resist by putting in mocks.

Instead of putting in mocks of presumed components, let the tests tell you what counterparts they want. Then build those counterparts, again in the simplest way possible. You will likely discover that a lot of your assumptions about the required environment for your application turn out not to be true.

For example, when building SportStats v2 at the BBC we thought we needed a database for persistence. But we didn't build it in until we needed it, and we didn't mock it out either. We waited until the code told us that we now needed a database.

It never did.

So we discovered that our problem was simpler than we had originally thought, and therefore our architecture could be as well. Mocking eliminates that feedback.

So don't mock. Because it's impolite to not listen to what your code is trying to tell you.

Tuesday, May 27, 2014

Live objects vs. static types for code completion in Objective-Smalltalk

Objective-Smalltalk is now getting into a very nice virtuous cycle of being more useful, therefore being used more and therefore motivating changes to make it even more useful. One of the recent additions was autocomplete, for both the tty-based and the GUI-based REPLs.

I modeled the autocomplete after the one in bash and other Unix shells: it will insert partial completions without asking, up to the point that they become ambiguous. If there is no unambiguous partial completion, it displays the alternatives. So a usual sequence is <TAB> -> something is inserted, <TAB> again -> list is displayed, type one character to disambiguate, <TAB> again and so on. I find that I get to my desired result much quicker and with fewer backtracks than with the mechanism Xcode uses.
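The bash-style behavior described above can be sketched in a few lines. This is a minimal Python illustration, not the Objective-Smalltalk implementation; the function name `complete` and its return shape (text to insert, alternatives to display) are assumptions made for the example.

```python
import os


def complete(prefix, candidates):
    """Bash-style completion step (hypothetical helper, for illustration).

    Returns a pair (text_to_insert, alternatives_to_display).
    """
    matches = sorted(c for c in candidates if c.startswith(prefix))
    if not matches:
        return "", []
    # Longest prefix shared by all matches; always at least `prefix` itself.
    common = os.path.commonprefix(matches)
    extension = common[len(prefix):]
    if extension:
        # An unambiguous (possibly partial) completion exists: insert it
        # without asking.
        return extension, []
    if len(matches) == 1:
        # The prefix already is the full, only match: nothing left to do.
        return "", []
    # Nothing more can be inserted unambiguously: show the alternatives.
    return "", matches
```

With the candidates `stringValue` and `stringByAppendingString:`, a first <TAB> after `str` inserts `ing`, a second <TAB> lists both alternatives, and typing `V` followed by <TAB> completes to `stringValue`.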

Fortunately, I was able to wrestle NSTextView's completion mechanism (in ShellView, borrowed from the excellent FScript) to provide these semantics rather than the built-in ones.

Another cool thing about the autocomplete is that it is very precise, unlike for example FScript which as far as I can tell just offers all possible symbols. How can this be, when Objective-Smalltalk is (currently) dynamically typed and we all know that good autocomplete requires static types? The reason is simply that there is one thing that's even better than having the static types available: having the actual objects themselves available!

The two REPLs aren't just syntax-aware, they also evaluate the expression as much as needed and possible to figure out what a good completion might be. So instead of having to figure out the type of the object, we can just ask the object what messages it understands. This was very easy to implement, almost comically trivial compared to a full-blown static type system.
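To give a feel for how little is needed, here is a rough Python analog of the idea: evaluate the receiver expression to get the live object, then ask that object which names it responds to. The function `complete_message`, the dot syntax and the `namespace` dictionary are assumptions for this sketch; the actual REPLs work on Objective-Smalltalk expressions and Objective-C objects.

```python
def complete_message(source, namespace):
    """Complete the partial name after the last '.' by asking the live object.

    `source` is the text typed so far, e.g. "greeting.up"; `namespace` maps
    variable names to the objects currently alive in the (hypothetical) REPL.
    """
    receiver_src, _, partial = source.rpartition(".")
    if not receiver_src:
        return []  # no receiver expression yet, nothing to ask
    # Evaluate the receiver to obtain the actual object. Note that evaluating
    # arbitrary expressions can have side effects; a real REPL has to decide
    # how much it is willing to evaluate.
    receiver = eval(receiver_src, {"__builtins__": {}}, namespace)
    # Ask the object itself what it understands; no static types involved.
    return sorted(
        name for name in dir(receiver)
        if name.startswith(partial) and not name.startswith("_")
    )
```

Because the receiver is evaluated, completion also works after a nested call such as `greeting.upper().is`, where a purely static approach would first have to infer the return type.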

So while static types are good for this purpose, live objects are even better! The Self team made a similar discovery when they were working on their optimizing compiler, trying both static type inference and dynamic type feedback. Type feedback was both simpler and performed vastly better and is currently used even for optimizing statically typed languages such as Java.

Finally, autocomplete also works with Polymorphic Identifiers, for example file:./a<TAB> will autocomplete files in the current directory starting with the letter 'a' (and just fi<TAB> will autocomplete to the file: scheme). Completion is scheme-specific, so any schemes you add can provide their own completion logic.

Like all of Objective-Smalltalk, this is still a work in progress: not all syntactic constructs support completions, for example Polymorphic Identifiers don't support complex paths and there is no bracket matching. However, just like Objective-Smalltalk, what is there is quite useful and often already better than what else is out there in small areas.

HN

Sunday, May 4, 2014

Satisfying the hunger for type safety?

Tom Adriaenssen riffs on the id subset in show me some id:
Let me explain: even though you might assume that all those objects are actually going to be DataPoint objects, there’s no actual guarantee that they will actual be DataPoint objects at runtime. Casting them only satisfies your hunger for type safety, but nothing else really.
More importantly, it only seems to satisfy your hunger for type safety, it doesn't actually provide any. It's less nutritious than sugar water in that respect, not even calories, never mind the protein, fiber, vitamins and other goodness. More like a pacifier, really, or the product of a cargo cult.

Saturday, May 3, 2014

The sp(id)y subset, or Avoiding Copeland 2010 with Objective-C 1984

In my recent post on Cargo Cult Typing, I mentioned a concept I called the id subset. Briefly, it is the subset of Objective-C that deals only with object pointers, or id's. There has been some misunderstanding that I am opposed to types. I am not, but more on that another time.

One of the many nice properties of the (transitive) id subset is that it is dynamically (memory) safe, just like Smalltalk. That is, as long as all arguments and return values of your message are objects, you can never dereference a pointer incorrectly; the worst that can happen is that you get a "Message not understood" that can be caught and handled by the object in question or raised as an exception. The reason this is safe is that objc_msgSend() will make sure that methods will only ever be invoked on objects of the correct class, no matter what the (possibly incorrect, or unavailable) static type says.

So no de-referencing an incorrect pointer, no scribbling over random bits of memory. In fact, this is the vaunted "pointer safety" that John Siracusa says requires ditching native compiled languages like Objective-C for VM based languages. The idea that a VM with an interpreter or a JIT was required for pointer safety was never true, of course, and it's interesting that both Google and Microsoft are turning to Ahead of Time (AOT) compilation in their newest SDKs, for performance reasons.

Did someone mention "performance"? :-)

Another nice aspect of the id subset is that it makes reflective code a lot simpler. And simplicity usually also translates to speed. How much speed? Apple's NSInvocation class has to deal with interpreting C type information at runtime to then construct proper stack frames dynamically for all possible C types. I think it uses libffi, though it may be some equivalent library. This is slow, around 340.1ns per message send on my 13" MBPR. By restricting itself to the id subset, my own MPWFastInvocation class's dispatch is much simpler, just a switch invoking objc_msgSend() with a different number of arguments.

The simplicity of MPWFastInvocation also pays off in speed: 6.2ns per message-send on the same machine. That's 50 times faster than NSInvocation and only 2-3x slower than a normal message send. In fact, once you're that close, things like IMP-caching (4 ns) start to make sense, especially since they can be hidden behind a nice interface. Using a C Macro and the IMP stashed in a public instance var takes the time down to 3 ns, making the reflective call via an object effectively as fast as the non-reflective code emitted by the compiler. Which is nice, because it makes reflective techniques much more feasible for wider varieties of code, which would be a good thing.
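The caching idea can be illustrated outside of Objective-C as well. The following is a loose Python analog, not how MPWFastInvocation is written: the class name `CachedInvocation` is made up for this sketch, and Python's bound-method lookup stands in for obtaining an IMP. It shows only the structure (look the method up once, then call the cached result behind a small interface) and makes no claim about timings.

```python
class CachedInvocation:
    """Reflective call object that resolves its method once (illustrative)."""

    def __init__(self, target, selector):
        self.target = target
        self.selector = selector
        # The lookup by name happens once, here. This plays the role of
        # stashing the IMP; later changes to the method are not picked up.
        self._bound = getattr(target, selector)

    def invoke(self, *args):
        # No lookup by name on the hot path, just a direct call.
        return self._bound(*args)


inv = CachedInvocation("hello", "upper")
print(inv.invoke())
```

The trade-off is the usual one for caching: if the target's method is replaced after the invocation object is created, the cached reference is stale.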

The speed improvement is not because MPWFastInvocation is better than NSInvocation (it is decidedly not), but because it is solving a much, much simpler problem, by sticking to the safe id subset.

On HN.