Damien Pollet thinks my
comparison between Objective-C blocks and HOM is not completely fair:
"… from my (Smalltalk) experience, the block passed to #collect: is often not a single message send, but rather a small adhoc expression, for which it does not really make sense to define a named method. Or you might need both the element and its key/index… how does HOM deal with that?"
These are certainly valid observations, and were some of the reasons
that I didn't really think that much of HOM for the first couple of
years after coming up with it back in 1997 or so. Since then, I've
become less and less convinced that the problems raised are a big concern, for a number of reasons.
Inline vs. Named
One reason is that I actually looked at usage of blocks in the Squeak
image, and found that the majority of blocks with at least one argument
(so not ifTrue:, whileTrue: and other control structures) actually did
contain just a single message send, and so could be immediately expressed
as HOMs. Second, I noticed that there were a lot of fairly large (3+ LOC)
blocks that
should have been separate methods but weren't.
That's when I discovered that the presence of blocks actually
encourages bad code, and the 'limitation' of HOMs actually was
encouraging better(-factored) code.
Of course, I wasn't particularly convinced by that line of reasoning,
because it smelled too much like "that's not a bug, that's a feature".
Until, that is, I saw others with less of a vested interest reporting the same
observation:
"But are these really limitations? After using higher order messages for a while I've come to think that they are not. The first limitation encourages you move logic that belongs to an object into that object's implementation instead of in the implementation of methods of other objects. The second limitation encourages you to represent application concepts as objects rather than procedural code. Both limitations have the surprising effect of guiding the code away from a procedural style towards better object-oriented design."
My experience has been that Nat is right: having a mechanism that
pushes you towards factoring and naming is better for your code
than one that pushes you towards inlining and anonymizing.
Objective-C I
In fact, the Cocoa
example that Apple gives for blocks illustrates this idea
very well. They implement a "Finder like" sorting mechanism using blocks:
    static NSStringCompareOptions comparisonOptions = NSCaseInsensitiveSearch | NSNumericSearch |
        NSWidthInsensitiveSearch | NSForcedOrderingSearch;
    NSLocale *currentLocale = [NSLocale currentLocale];
    NSComparator finderSort = ^(id string1, id string2) {
        NSRange string1Range = NSMakeRange(0, [string1 length]);
        return [string1 compare:string2 options:comparisonOptions range:string1Range locale:currentLocale];
    };
    NSLog(@"finderSort: %@", [stringsArray sortedArrayUsingComparator:finderSort]);
The block syntax is so verbose that there is no hope of actually defining the block inline, which is the supposed raison d'être for blocks. So we need to take the
block out-of-line and name it, at which point it looks suspiciously like an
equivalent implementation using functions:
    static NSStringCompareOptions comparisonOptions = NSCaseInsensitiveSearch | NSNumericSearch |
        NSWidthInsensitiveSearch | NSForcedOrderingSearch;
    // sortedArrayUsingFunction:context: expects an NSInteger return and a context parameter.
    static NSInteger finderSort(id string1, id string2, void *context) {
        // A file-scope variable cannot be initialized by a message send, so fetch the locale here.
        NSLocale *currentLocale = [NSLocale currentLocale];
        NSRange string1Range = NSMakeRange(0, [string1 length]);
        return [string1 compare:string2 options:comparisonOptions range:string1Range locale:currentLocale];
    }
    NSLog(@"finderSort: %@", [stringsArray sortedArrayUsingFunction:finderSort context:NULL]);
Of course, something as useful as a Finder-like comparison sort
really deserves to be exposed and made available for reuse, rather
than hidden inside one specific sort. Objective-C categories are
just the mechanism for this sort of thing:
    @implementation NSString(finderCompare)
    -(NSComparisonResult)finderCompare:(NSString*)string2 {
        NSRange myRange = NSMakeRange(0, [self length]);
        return [self compare:string2 options:(NSCaseInsensitiveSearch | NSNumericSearch |
            NSWidthInsensitiveSearch | NSForcedOrderingSearch) range:myRange locale:[NSLocale currentLocale]];
    }
    @end
    NSLog(@"finderSort: %@", [stringsArray sortedArrayUsingSelector:@selector(finderCompare:)]);
Note that some of these criticisms are specific to Apple's implementation of blocks; they do not apply in the same way to
Smalltalk blocks, which are a lot less noisy.
Objective-C II
Objective-C has at least one other pertinent difference from
Smalltalk, which is that it already contains control structures
in the basic language, without blocks. (Of course, those control
structures also take blocks as arguments, but these are a
different kind of block: compound statements delimited by curly braces that
cannot be passed around as first-class objects.)
This means that in Objective-C, we already have the ability to
do all the iterating we need; mechanisms such as blocks and
HOM are mostly conveniences, not required building blocks. If
we need indices, we use a for loop. If we require keys, we use a
key enumerator and iterate over that.
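As a minimal sketch of those two cases, using only stock Foundation calls: the helper names IndexedDescriptions and KeyedDescriptions are made up for illustration and are not part of any framework.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: pairs every element with its index using a plain C for loop.
static NSArray *IndexedDescriptions(NSArray *items) {
    NSMutableArray *lines = [NSMutableArray arrayWithCapacity:[items count]];
    for (NSUInteger i = 0; i < [items count]; i++) {
        [lines addObject:[NSString stringWithFormat:@"%lu: %@",
                          (unsigned long)i, [items objectAtIndex:i]]];
    }
    return lines;
}

// Hypothetical helper: walks a dictionary through its key enumerator,
// so both the key and its value are available in the loop body.
static NSArray *KeyedDescriptions(NSDictionary *table) {
    NSMutableArray *lines = [NSMutableArray arrayWithCapacity:[table count]];
    NSEnumerator *keys = [table keyEnumerator];
    id key;
    while ((key = [keys nextObject]) != nil) {
        [lines addObject:[NSString stringWithFormat:@"%@ = %@",
                          key, [table objectForKey:key]]];
    }
    return lines;
}
```

Neither helper needs a block or a HOM: the index and the key are simply local variables of an ordinary loop.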
In fact, I remember when my then colleagues started working
with enum-filters, a HOM precursor that's strikingly similar
to the Google Toolbox's GTMNSEnumerator+Filter.m. They really took to
the elegance, but then also wanted to use it for various special
cases. They laughed when they realized that those special cases
were actually already handled better by existing C control structures
such as for loops.
FP, HANDs and Aggregate Operations
While my dislike of blocks is easy to put down to the usual
inventor's pride (your child must be ugly for mine to be pretty),
that interpretation actually reverses the causation: I came
up with HOM because I was never very fond of blocks. In fact,
when I first encountered Smalltalk during my university
years I was enthralled until I saw the iteration methods.
That's not to say that do:, collect: and friends were not light-years
ahead of Algol-type control structures, they most definitely were
and still are. Having some sort of higher-order mechanism is
vastly superior to not having one.
I do wish that "higher order mechanism" and "blocks" weren't
used as synonyms quite as much, because they are not, in fact,
synonymous.
When I first encountered Smalltalk blocks, I had just previously been
exposed to Backus's FP, and that was just so much prettier! In
FP, functions are composed using functionals without ever talking
about actual data, and certainly without talking about individual
elements. I have always been on the lookout for higher levels
of expression, and this was such a higher level. Now taking
things down to "here's another element, what do you want to
do with that" was definitely a step back, and quite frankly
a bit of a let-down.
The fundamental difference I see is that in Smalltalk there
is still an iteration, even if it is encapsulated: we iterate
over some collection and then execute some code for each element.
In FP, and in HOM, there is instead an aggregate operation: we
take an existing operation and lift it up as applying to an entire collection.
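A rough way to see the two styles side by side, assuming nothing beyond stock Foundation: the first function iterates and handles one element at a time, while the second names the operation once and lets NSArray's -valueForKey: apply it to every element. The key-value coding call is only a stand-in for the idea of lifting; it is not HOM, and the function names are invented for this sketch.

```objc
#import <Foundation/Foundation.h>

// Iteration: visit each element and say what to do with it.
static NSArray *UppercasedByIteration(NSArray *names) {
    NSMutableArray *result = [NSMutableArray arrayWithCapacity:[names count]];
    for (NSString *name in names) {
        [result addObject:[name uppercaseString]];
    }
    return result;
}

// Aggregate: name the operation once and apply it to the whole collection.
// NSArray's -valueForKey: asks every element for the given key and collects
// the answers, so no individual element is ever mentioned here.
static NSArray *UppercasedAsAggregate(NSArray *names) {
    return [names valueForKey:@"uppercaseString"];
}
```

Both functions produce the same array; the difference is whether the code talks about elements or about the collection as a whole.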
This difference might seem contrived, but the research done with
the HANDS system demonstrates that it is very real:
"After creating HANDS, I conducted another user study to examine the effectiveness of three features of HANDS: queries, aggregate operations, and data visibility. HANDS was compared with a limited version that lacked these features. In the limited version, programmers were able to achieve the desired results but had to use more traditional programming techniques. Children using the full-featured HANDS system performed significantly better than their peers who used the limited version."
I also find this difference to be very real.
The difference between iterating with blocks and lifting operations
to be aggregate operations also shows up in the fact that the lifting can be done on any
combination of the involved parameters, whereas you tend to only
iterate over one collection at a time, because the collection and
the iteration are in focus.
Symmetry
Finally, the comparison to functional languages shows a couple of
interesting asymmetries: in a functional language, higher order
functions can be applied both to named functions and to anonymous
functions. In essence, the higher order mechanism just takes
functions and doesn't care whether they are named or not. Also,
the higher order mechanism uses the same mechanism (functions)
as the base system.
With block-based higher order mechanisms, on the other hand,
we must make the argument an anonymous function (that's what
a block is), and we cannot use a named function, bringing
us back to the conundrum mentioned at the start that this
mechanism encourages bad code. Not only that, it also turns
out that the base mechanism (messages and methods) is different
from the higher order mechanism, which requires anonymous functions,
rather than methods.
HOM currently solves only the latter part of this asymmetry, making
the higher order mechanism the same as the base mechanism, that
mechanism being messaging in both cases. However, it currently
cannot solve the other asymmetry: where blocks support unnamed,
inline code and not named code, HOM supports named but not unnamed
code. While I think that this is the better choice in the larger
number of cases, it would be nice to actually support both.
One solution to this problem might be to simply support both blocks
and Higher Order Messaging, but it seems to me that the more
elegant solution would be to support inline definition of more-or-less
anonymous methods that could then be integrated into the Higher Order
Messaging framework.