I like static (manifest) typing. This may come as a shock to those who
have read other posts of mine, but it is true. I certainly am more
comfortable with having an MPWType1FontInterpreter *interpreter
rather than an id interpreter. Much more comfortable, in fact, and
this feeling extends to
Xcode saying "0 warnings" and the clang static analyzer agreeing.
Safety
The question though is: are those feelings actually justified? The rhetoric
on the subject is certainly strong, and very rigid/absolute. I recently had a Professor
of Computer Science state unequivocally that anyone who doesn't use static typing should
have their degree revoked. In a room full of Squeakers. And that's not an
extreme or isolated case. Just about any discussion on the subject seems to
quickly devolve into proponents of static typing claiming absolutely that
dynamic typing invariably leads to programs that are steaming piles of bugs and crash left
and right in production, whereas statically typed programs have their bugs
caught by the compiler and are therefore safe and sound. In fact, Milner supposedly claimed that "well-typed programs cannot go wrong". Hmmm...
That the compiler is capable of catching (some) bugs using static type checks
is undeniably true. However, what is also obviously true is that not all bugs are type
errors (for example, most of the 25 top software errors don't look like type errors
to me, and neither goto fail; nor Heartbleed look like type errors either, and neither
do the top errors in my different projects),
so having the type-checker give our programs a clean bill of health does not
make them bug-free; it merely eliminates a certain class of bugs.
With that, we can take the question from the realm of religious zealotry to the
realm of reasoned inquiry: how many bugs does static type checking catch?
Alas, this is not an easy question to answer, because we are looking for something
that is not there. However, we can invert the question: what is the incidence
of type-errors in dynamically typed programs, ones that do not benefit from the
bug-removal that the static type system gives us and should therefore be steaming
piles of those type errors?
With the advent of public source repositories, we now have a way of answering that
question, and Robert Smallshire did the grunt work to come up with an answer: 2%.
So all those nasty type errors were actually not
having any negative impact on debug times; in fact, the reverse was true. Which of
course makes sense if the incidence of type errors is anywhere near 2%, because then other factors
are almost certain to dominate. Completely.
Some people are completely religious about type systems and as a mathematician I love the idea of type systems, but nobody has ever come up with one that has enough scope. If you combine Simula and Lisp—Lisp didn’t have data structures, it had instances of objects—you would have a dynamic type system that would give you the range of expression you need.
Even stringent advocates of strong typing such as Uncle Bob Martin, with whom I sparred
many a time on that and other subjects in comp.lang.object have now come around to this
point of view: yeah, it's nice, maybe, but just not that important, and in fact he
has actually reversed his position, as seen in this video of him debating static typing with Chad Fowler.
Truthiness and Safyness
What I find interesting is not so much whether one or the other is right/wronger/better/whatever, but rather
the disparity between the vehemence of the rhetoric, at least on one side of the
debate ("revoke degrees!", "can't go wrong!") and both the complete lack of empirical evidence
for the claims (there is some against) and the small magnitude of any actual effect.
Stephen Colbert coined the term "truthiness" for "a 'truth' that a person making an argument or assertion claims to know intuitively 'from the gut' or because it 'feels right' without regard to evidence, logic, intellectual examination, or facts" [Wikipedia].
To me it looks like a similar effect is at play here: as I notice myself, it just feels
so much safer if the computer tells you that there are no type errors. Especially if it
is quite a bit of effort to get to that state, which it is. As I wrote, I notice that
effect myself, despite the fact that I actually know the evidence is not there,
and have been a long-time friendly skeptic.
So it looks like static typing is "safy": people just know intuitively that it
must be safe, without regard to evidence. And that makes the debate both so
heated and so impossible to decide rationally, just like the political debate on
"truth" subjects.
One of the things I find curious is how Apple's new Swift language rehashes mistakes that were made in other languages. Let's take construction, or
initializers.
Objective-C/Smalltalk
These are the rules for initializers in Smalltalk and Objective-C:
An "initializer" is a normal method and a normal message send.
There is no second rule.
There's really nothing more to it, the rest follows organically and naturally
from this simple fact and various things you like to see happen. For example, is
there a rule that you have to send the initial initializer (alloc
or new) to the class?
No there isn't, it's just a convenient and obvious place to put it since we don't
have the instance yet and the class exists and is an obvious place to go to for
instances of that class. However, we could just
as well ask a different class to create the object for us.
The same goes with calling super. Yes, that's usually
a good idea, because usually you want the superclass's behavior,
but if you don't want the superclass's behavior, then don't call.
Again, this is not a special rule for initializers, it usually
follows from what you want to achieve. And sometimes it doesn't, just
like with any other method you override: sometimes you call super,
sometimes you do not.
The same goes for assigning the return value, doing the
self=[super init]; dance. Again, this is not
at all required by the language or the frameworks, although
apparently it is a common misconception that it is, a
misconception that is, IMHO, promoted by careless creation
of "best practices" as "immutable rules", something I wrote
about earlier when talking about the useless typing out of the
id type in method declarations.
However, returning self and using the returned value is a useful
convention, because it makes it possible for init
methods to return a different object than what they started
with (for example a specific subclass or a singleton).
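The convention can be sketched in plain C (hypothetical names; a minimal sketch of the "init may return a different object" idea, not actual Objective-C):

```c
typedef struct Widget { int kind; } Widget;

/* a shared instance that init can substitute for the freshly
   allocated one -- the "singleton" case mentioned above */
static Widget sharedWidget = { .kind = 0 };

/* Because the caller uses the returned pointer (the self = [super init]
   dance), init is free to hand back a different object. */
Widget *widget_init(Widget *self, int kind) {
    if (kind == 0) {
        return &sharedWidget;   /* substitute the singleton */
    }
    self->kind = kind;
    return self;                /* the common case */
}
```

The point is that nothing in the language has to know about this: the caller simply uses whatever the message returns.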
Swift initializers
Apple's new Swift language has taken a page from the C++ and Java
playbooks and made initialization a special case. Well, lots
of special cases actually. The Swift book has 30 pages on
initialization, and they aren't just illustration and explanation,
they are dense with rules and special cases. For example:
You can set a default value of a property in the variable definition.
Or you can set the default value in an initializer.
Designated initializers are now a first class language construct.
Parameterized initializers have local and external parameter names, like methods.
Except that the first parameter name is treated differently, and so Swift automatically provides an external parameter name for all arguments, which
it doesn't with methods.
Constant properties aren't constant in initializers.
Swift creates a default initializer for both classes and structs.
Swift also creates a default memberwise initializer, but only
for structs.
Initializers can (only) call other initializers, but there are special
rules for what is and is not allowed and these rules are different
for structs and classes.
Providing specialized initializers removes the automatically-provided
default initializers.
Initializers are different from other methods in that they are
not inherited, usually.
Except that there are specific circumstances where they are
inherited.
Confused yet? There's more!
If your subclass provides no initializers itself, it inherits
all the superclass's initializers.
If your subclass overrides all the superclass's designated
initializers, it inherits all the convenience initializers (that's
also a language construct). How does this not break if the
superclass adds initializers? I think we've just re-invented
the fragile-base-class problem.
Oh, and you can initialize instance variables with the values
returned by closures or functions.
Well, that was easy, but that's probably only because I missed a few.
Having all these rules means that
this new way of initialization is less powerful than the one
before it, because all of these rules restrict the power that
a general method has.
Particularly, it is not possible to substitute a different value
or return nil to indicate failure to initialize, nor
is it possible to call other methods (as far as I can tell).
To actually provide these useful features, we need something else:
Use the Factory method pattern to actually
do the powerful stuff you need to do ...
...which gets you back to where we were at the beginning
with Objective-C or Smalltalk, namely sending a normal message.
Of course, we are familiar with this because both C++ and Java also have
special constructor language features, plagued by the same
problems. They are also the source of the Factory method pattern,
at least as a separate "pattern". Smalltalk and Objective-C simply
made that pattern the default for object creation, in fact Brad Cox
called classes "Factory Objects", long long before the GOF patterns book.
First rule of baking programming conventions into the language: Don't do it! The second rule of baking programming conventions into the language (experts only): Don't do it yet!
Well, it's impolite, isn't it? But seriously,
when I first heard about mock object testing, I was excited, because it
certainly sounded like The Right Thing™: message-based, checking
relationships instead of state, and the new hip thing.
However, when I looked at actual examples, they looked sophisticated
and obscure, the opposite of what I feel unit tests should be:
obvious and simple, simplistic to the point of stupidity. I couldn't figure
out at a glance what the expected behavior was, what was being
tested and what was environment.
So I never used mocks in practice, meaning my opinions could not
go beyond being superficial. Fortunately, I was given the
task of porting a fairly large Objective-C project to OS X
(yes, you read that right: "to OS X"), and it was heavily
mock-tested.
As far as I could tell, most of the vague premonitions I had
about mock testing were borne out in that project: obscure
mock tests, mock tests that didn't actually test anything except their
own expectations and mock tests that were deeply coupled to
implementation details.
Again, though, that could just be my misunderstandings, certainly
people for whom I have a great deal of respect advocate for
mock tests, but I was heartened when I heard in the recent
DHH/Fowler/Beck TDD death-matches, er, friendly conversations, that neither Kent nor Martin
is a great fan of mocking, and certainly not of deeply nested mocks.
However, it was DHH's comments that finally made me realize that
what really bothered me was something more subtle, and
much more pervasive. The talk is about "mocking the database",
or mocking some other component. While not proof positive, this
kind of mocking seems indicative of not letting the tests drive
the design towards simplicity, because the design is already
set in stone.
As a result, you're going to have constant pain, because the
tests will continuously try to drive you towards simplifying
your design, which you resist by putting in mocks.
Instead of putting in mocks of presumed components, let the
tests tell you what counterparts they want. Then build those
counterparts, again in simplest way possible. You will likely
discover that a lot of your assumptions about the required
environment for your application turn out not to be true.
For example, when building SportStats v2 at the BBC we thought
we needed a database for persistence. But we didn't build it
in until we needed it, and we didn't mock it out either. We
waited until the code told us that we now needed a database.
It never did.
So we discovered that our problem was simpler than we had
originally thought, and therefore our architecture could
be as well. Mocking eliminates that feedback.
So don't mock. Because it's impolite to not listen to what
your code is trying to tell you.
Objective-Smalltalk is now getting into a very nice virtuous cycle of
being more useful, therefore being used more and therefore motivating changes
to make it even more useful. One of the recent additions was autocomplete,
for both the tty-based and the GUI based REPLs.
I modeled the autocomplete after the one in bash and other Unix shells:
it will insert partial completions without asking, up to the point where they
become ambiguous. If there is no unambiguous partial completion, it
displays the alternatives. So a usual sequence is: <TAB> -> something
is inserted, <TAB> again -> list is displayed, type one character to disambiguate, <TAB> again, and so on. I find that I get to my
desired result much quicker and with fewer backtracks than with the
mechanism Xcode uses.
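The "insert until ambiguous" step boils down to computing the longest common prefix of the matching candidates; a minimal sketch in C (hypothetical helper, not the actual Objective-Smalltalk code):

```c
#include <string.h>

/* Length of the longest prefix shared by all candidates: this is
   exactly what gets inserted before completion becomes ambiguous. */
size_t commonPrefixLength(const char *candidates[], size_t count) {
    if (count == 0) return 0;
    size_t len = strlen(candidates[0]);
    for (size_t i = 1; i < count; i++) {
        size_t j = 0;
        while (j < len && candidates[i][j] == candidates[0][j]) {
            j++;
        }
        len = j;    /* shrink to what this candidate still shares */
    }
    return len;
}
```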
Fortunately, I was able to wrestle NSTextView's
completion mechanism (in a ShellView borrowed from
the excellent F-Script) into providing these semantics rather than the
built-in ones.
Another cool thing about the autocomplete is that it is very precise,
unlike for example F-Script, which as far as I can tell just offers all
possible symbols.
How can this be, when Objective-Smalltalk is (currently) dynamically
typed and we all know that good autocomplete requires static types?
The reason is simply that there is one thing that's even better
than having the static types available: having the actual objects
themselves available!
The two REPLs aren't just syntax-aware, they also evaluate the
expression as much as needed and possible to figure out what
a good completion might be. So instead of having to figure
out the type of the object, we can just ask the object what
messages it understands. This was very easy to implement,
almost comically trivial compared to a full blown static type-system.
So while static types are good for this purpose, live objects are
even better! The Self team made a similar discovery when they
were working on their optimizing compiler, trying both static
type inference and dynamic type feedback. Type feedback was
both simpler and performed vastly better and is currently used
even for optimizing statically typed languages such as Java.
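A sketch in C of the "ask the live object" idea (hypothetical structures standing in for the Objective-C runtime):

```c
#include <string.h>

typedef struct { const char *name; } Method;
typedef struct { const Method *methods; size_t methodCount; } Class;
typedef struct { const Class *isa; } Object;

/* Instead of inferring a static type, walk the method list of the
   object's actual class and keep the selectors matching the prefix. */
size_t completionsFor(const Object *obj, const char *prefix,
                      const char **out, size_t max) {
    size_t n = 0, plen = strlen(prefix);
    const Class *cls = obj->isa;
    for (size_t i = 0; i < cls->methodCount && n < max; i++) {
        if (strncmp(cls->methods[i].name, prefix, plen) == 0) {
            out[n++] = cls->methods[i].name;
        }
    }
    return n;
}
```

No type inference needed: the object is right there, and it knows what it responds to.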
Finally, autocomplete also works with Polymorphic Identifiers, for
example file:./a<TAB> will autocomplete files
in the current directory starting with the letter 'a' (and just
fi<TAB> will autocomplete to the file:
scheme). Completion is scheme-specific, so any schemes you add
can provide their own completion logic.
Like all of Objective-Smalltalk, this is still a work in progress:
not all syntactic constructs support completions, for example
Polymorphic Identifiers don't support complex paths and there
is no bracket matching. However, just like Objective-Smalltalk,
what is there is quite useful and, in small areas, often already better
than what else is out there.
Let me explain: even though you might assume that all those objects are actually going to be DataPoint objects, there’s no actual guarantee that they will actually be DataPoint objects at runtime. Casting them only satisfies your hunger for type safety, but nothing else really.
More importantly, it only seems to satisfy your hunger for type safety,
it doesn't actually provide any. It's less nutritious than sugar water in
that respect, not even calories, never mind the protein, fiber, vitamins and
other goodness. More like a pacifier, really, or the product of a
cargo cult.
In my recent post on Cargo Cult Typing, I mentioned a
concept I called the id subset. Briefly, it is the subset of
Objective-C that deals only with object pointers, or id's.
There has been some misunderstanding that I am opposed to types. I am
not, but more on that another time.
One of the many nice properties of the (transitive) id subset is that it
is dynamically (memory) safe, just like Smalltalk. That is, as long as all arguments and return values
of your message are objects, you can never dereference a pointer incorrectly,
the worst that can happen is that you get a "Message not understood" that can
be caught and handled by the object in question or raised as an exception.
The reason this is safe is that objc_msgSend() will make sure that methods
will only ever be invoked on objects of the correct class, no matter what the
(possibly incorrect, or unavailable) static type says.
So no de-referencing an incorrect pointer, no scribbling over random bits
of memory.
In fact, this is the vaunted "pointer safety" that John Siracusa says requires
ditching native compiled languages like Objective-C for VM based languages. The idea
that a VM with an interpreter or a JIT was required for pointer safety
was never true, of course, and it's interesting that both Google and
Microsoft are turning to Ahead of Time (AOT) compilation in their newest
SDKs, for performance reasons.
Did someone mention "performance"? :-)
Another nice aspect of the id subset is that it makes reflective code
a lot simpler. And simplicity usually also translates to speed. How
much speed? Apple's NSInvocation class has to deal with
interpreting C type information at runtime to then construct proper stack
frames dynamically for all possible C types. I think it uses libffi, though
it may be some equivalent library. This is slow, around 340.1ns
per message send on my 13" MBPR. By restricting itself to the id subset,
my own MPWFastInvocation class's dispatch is
much simpler, just a switch invoking objc_msgSend() with
a different number of arguments.
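The shape of that dispatch can be sketched in C (hypothetical types and names; the real code calls objc_msgSend, here stand-in function pointers):

```c
#include <stddef.h>

typedef void *id;
typedef id (*IMP0)(id);
typedef id (*IMP1)(id, id);
typedef id (*IMP2)(id, id, id);

/* In the id subset all arguments and returns are object pointers,
   so no libffi-style frame construction is needed: a switch on the
   argument count suffices. */
id invokeWithArgs(void (*imp)(void), id self, id args[], int nargs) {
    switch (nargs) {
    case 0: return ((IMP0)imp)(self);
    case 1: return ((IMP1)imp)(self, args[0]);
    case 2: return ((IMP2)imp)(self, args[0], args[1]);
    default: return NULL;   /* extend for more arguments */
    }
}

/* a sample "method": returns its single argument */
static id returnArg(id self, id a) { (void)self; return a; }
```

Uniform argument types are what make the whole problem collapse to a few lines.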
The simplicity of MPWFastInvocation also pays off in
speed: 6.2ns per message-send on the same machine. That's 50 times
faster than NSInvocation and only 2-3x slower than
a normal message send. In fact, once you're that close, things like
IMP-caching (4 ns) start to make sense, especially since they can
be hidden behind a nice interface. Using a C Macro and the IMP
stashed in a public instance var takes the time down to 3 ns, making
the reflective call via an object effectively as fast as the
non-reflective code emitted by the compiler. Which is nice, because
it makes reflective techniques much more feasible for wider varieties
of code, which would be a good thing.
The speed improvement is not because MPWFastInvocation is better
than NSInvocation, it is decidedly not, it is because it is solving
a much, much simpler problem. By sticking to the safe id subset.
I have to admit I am a bit startled to see people seriously (?) advocate exploitation of "undefined behavior" in the C standard to just eliminate that code altogether, arguing that
undefined means literally anything is OK. I've certainly seen it justified
many times. Apart from being awful, this idea smacks of hubris on the part of the compiler writers.
The job of the compiler is to do the best job it can at turning the
programmer's intent into executable machine code, as expressed by
the program. It is not to
show how clever the optimizer writer is, how good at lawyering the language
standard, or to wring out a 0.1% performance
improvement on <benchmark-of-choice>, at least not when it
conflicts with the primary goal.
For let's not pretend that these optimizations are actually useful
or significant: Proebsting's law shows that all compiler optimizations
have been at best 1/10th as effective at improving performance as hardware
advances, and recent research suggests that even that may be optimistic.
That doesn't mean that I don't like my factor 2 or 3 improvement in
code performance for code where basic optimizations apply. But almost
all of those performance gains come at the lowest levels of optimization,
the more sophisticated stuff just doesn't bring much if any additional
benefit. (There's a reason Apple recommends -Os and not -O3 as default).
So don't get ahead of yourselves: non-compiler optimizations can often
achieve 2-3 orders of magnitude improvement, and for a lot of
Objective-C code, for example,
the compiler's optimizations barely register at all. Again: perspective!
Furthermore, the purpose of "undefined behavior" was (not sure it still is)
to be inclusive, so for example compilers for machines with slightly odd
architectures could still be called ANSI-C without having to do unnatural
things on that architecture in order to conform to over-specification.
Sometimes, undefined behavior is needed for programs to work.
So when there is integer overflow, for example, that's not a license to
silently perform dead code elimination at certain optimization levels, it's
license to do the natural thing on the platform, which on most platforms
these days is let the integer overflow, because that is what a C programmer
is likely to expect. In addition, feel free to emit a warning. The
same goes for optimizing away an out of bounds array access that is
intended to terminate a loop. If you are smart enough to figure out
the out-of-bounds access, warn about it and then proceed to emit the
code. Eliminating the check and turning a terminating loop into an
infinite loop is never the right answer.
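To make the overflow example concrete (a sketch; what actually happens to the naive check depends on compiler and optimization level):

```c
#include <limits.h>

/* Signed overflow is undefined, so an optimizer may assume
   x + 1 > x always holds and delete this guard entirely. */
int overflowCheckNaive(int x) {
    return x + 1 < x;           /* relies on wraparound: UB */
}

/* A fully defined check compares against the limit instead. */
int overflowCheckDefined(int x) {
    return x == INT_MAX;
}
```

The programmer who wrote the naive version expected the platform's natural wraparound; silently deleting the check is exactly the behavior complained about above.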
So please don't do this, you're not producing value: those optimizations
will cease to "help" as soon as programmers "fix" their code. Nor are the
gains worth it: any additional speedups are extremely modest compared
to the cost. So please stop, certainly stop doing it on purpose,
and please carefully evaluate the cost/benefit ratio when introducing optimizations that cause this to happen as a side effect...and then
don't. Or do, and label them appropriately.
This quote from Steve Jobs is one that's been an inspiration to me for some time:
[...] when you first attack a problem it
seems really simple because you
don't understand it. Then when you
start to really understand it, you
come up with these very complicated
solutions because it's really hairy.
Most people stop there. But a few
people keep burning the midnight oil
and finally understand the underlying
principles of the problem and
come up with an elegantly simple
solution for it. But very few people
go the distance to get there.
In other words:
Naive Simplicity
Sophisticated Complexity
Sophisticated Simplicity
It's from the February 1984 Byte Interview introducing the Macintosh.
UPDATE: Well, it seems that Heinlein got there first:
Every technology goes through three stages: first, a crudely simple and quite unsatisfactory gadget; second, an enormously complicated group of gadgets designed to overcome the shortcomings of the original and achieving thereby somewhat satisfactory performance through extremely complex compromise; third, a final stage of smooth simplicity and efficient performance [..]
I like bindings. I also like Key Value Observing. What they do is undeniably cool: you do some initial setup, and presto: magic! You change a value over here, and another
value over there changes as well. Action at a distance. Power.
What they do is also undeniably valuable. I'd venture that nobody actually
likes writing state
maintenance and update code such as the following: when the user clicks this button, or finishes entering
text in that textfield, take the value and put it over here. If the underlying
value changes, update the textfield. If I modify this value, notify
these clients that the value has changed so they can update themselves accordingly.
That's boring. There is no glory in state maintenance code, just potential for
failure when you screw up something this simple.
Finally, their implementation is also undeniably cool: observing an attribute
of a generic
object creates a private subclass for that object (who says we can't do
prototype-based programming in Objective-C?), swizzles the object's
class pointer to that private subclass and then replaces the attribute's
(KVO-compliant) accessor methods with new ones that hook into the
KVO system.
Despite these positives, I have actively removed bindings code from
projects I have worked on, don't use either KVO or bindings myself and
generally recommend staying away from them. Why on earth would I
do that?
Excursion: Constraint Solvers
Before I can answer that question, I have to go back a little and talk about
constraint solvers.
The idea of setting up relationships once and then having the system maintain them
without manually shoveling values back and forth is not exactly new, the first variant
I am aware of was Sketchpad,
Ivan Sutherland's PhD Thesis from 1961/63 (here with narration by Alan Kay):
I still love Ivan's answer to the question as to how he could invent computer graphics,
object orientation and constraint solving in one fell swoop: "I didn't know it was hard".
The first system I am aware of that integrated constraint solving with an object-oriented
programming language was ThingLab, implemented on top of Smalltalk by Alan Borning at Xerox PARC around 1978 (where else...):
While the definition
of a path is simple, the idea behind it has proved quite powerful and has been essential
in allowing constraint- and object-oriented metaphors to be integrated. [..] The notion
of a path helps strengthen [the distinction between inside and outside of an object] by
providing a protected way for an object to provide external reference to its parts and
subparts.
Yes, that's a better version of KVC. From 1981.
Alan Borning's group at the University of Washington continued working on constraint solvers
for many years, with the final result being the Cassowary linear constraint solver (based on the simplex
algorithm) that was picked up by Apple for Autolayout. The papers on Cassowary and
constraint hierarchies should help with understanding why Autolayout does what it does.
A simpler form of constraints are one-way dataflow constraints.
A one-way dataflow constraint is an equation of the form y = f(x1,...,xn) in which the formula on the right side
is automatically re-evaluated and assigned to the variable y whenever any variable xi changes.
If y is modified from
outside the constraint, the equation is left temporarily unsatisfied, hence the attribute “one-way”. Dataflow constraints are recognized as a powerful programming methodology in a variety of contexts because of their versatility and simplicity. The most widespread application of dataflow constraints is perhaps embodied by spreadsheets.
The most important lessons they found were the following:
constraints should be allowed to contain arbitrary code that is written in the underlying toolkit language and does not require any annotations, such as parameter declarations
constraints are difficult to debug and better debugging tools are needed
programmers will readily use one-way constraints to specify the graphical layout of an application, but must be carefully and time-consumingly trained to use them for other purposes.
However, these really are just the headlines, and particularly for Cocoa programmers
the actual reports are well worth reading as they contain many useful pieces of
information that aren't included in the summaries.
Back to KVO and Cocoa Bindings
So what does this history lesson about constraint programming have to do with KVO
and Bindings? You probably already figured it out: bindings are one-way
dataflow constraints, specifically with the equation limited to y = x1;
more complex equations can be obtained by using NSValueTransformers. KVO
is more of an implicit invocation
mechanism that is used primarily to build ad-hoc dataflow constraints.
The specific problems of the API and the implementation have been documented
elsewhere, for example by Soroush Khanlou and Mike Ash, who not only suggested and
implemented improvements back in 2008, but even followed up on them in 2012. All
these problems and workarounds
demonstrate that KVO and Bindings are very sophisticated, complex and error prone
technologies for solving what is a simple and straightforward task: keeping
data in sync.
To these implementation problems, I would add performance: even
just adding the willChangeValueForKey: and didChangeValueForKey:
message sends in your setter (these are usually added automagically for you) without triggering any notifications makes that setter 30 times slower (from 5ns to
150ns on my computer) than a simple setter that just sets and retains the object.
Actually having that access trigger a notification takes the penalty to a factor of over 100
(5ns vs. over 540ns), even when there is only a single observer. I am pretty sure
it gets worse when there are lots of observers (there used to be an O(n^3)
algorithm in there, that was fortunately fixed a while ago). While 500ns may
not seem a lot when dealing with UI code, KVO tends to be implemented at
the model layer in such a way that a significant number of model data accesses
incur at least the base penalties. For example KVO notifications were one of the primary
reasons for NSOperationQueue's somewhat anemic performance back when
we measured it for the Leopard release.
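The source of the overhead is easy to see in a sketch: an observed setter has to do bookkeeping and walk an observer list on every store (hypothetical names, far simpler than the real KVO machinery):

```c
typedef void (*ObserverFn)(void *ctx, int oldValue, int newValue);

typedef struct {
    int value;
    ObserverFn observers[8];
    void *contexts[8];
    int observerCount;
} Observed;

/* sample observer: counts notifications via its context pointer */
static void countNotifications(void *ctx, int oldValue, int newValue) {
    (void)oldValue; (void)newValue;
    (*(int *)ctx)++;
}

/* Even with zero observers this setter pays for the bookkeeping;
   with observers it pays a call per observer per store. */
void setValue(Observed *o, int v) {
    int old = o->value;        /* "willChange": capture the old value */
    o->value = v;
    for (int i = 0; i < o->observerCount; i++) {   /* "didChange" */
        o->observers[i](o->contexts[i], old, v);
    }
}
```

Compare that with a plain setter that just stores the value, and the measured factor-of-30-to-100 slowdowns stop being surprising.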
Not only is the constraint graph not available at run time, there is also no
direct representation at coding time. All there is is either code or IB settings
that construct such a graph indirectly, so the programmer has to infer the
graph from what is there and keep it in her head. There are also no formulae; the best
we can do are ValueTransformers and
keyPathsForValuesAffectingValueForKey.
As best as I can tell, the reason for this state of affairs is that there simply
wasn't any awareness of the decades of
research and practical experience with constraint solvers at the time (How
do I know? I asked, the answer was "Huh?").
Anyway, when you add it all up, my conclusion is that while I would really,
really, really like a good constraint solving system (at least for spreadsheet
constraints), KVO and Bindings are not it. They are too simplistic, too
fragile and solve too little of the actual problem to be worth the trouble.
It is easier to just write that damn state maintenance code, and infinitely
easier to debug it.
I think one of the main communication problems between advocates for and
critics of KVO/Bindings is that the advocates are advocating more for
the concept of constraint solving, whereas critics are critical of the
implementation. How can these critics not see that despite a few flaws,
this approach is obviously
The Right Thing™? How can the advocates not see the
obvious flaws?
Functional Reactive Programming
As far as I can tell, Functional Reactive Programming (FRP) in general and Reactive
Cocoa in particular are another way of scratching the same itch.
[..] is an integration of declarative [..] and imperative object-oriented programming. The primary goal of this integration is to use constraints to express relations among objects explicitly -- relations that were implicit in the code in previous languages.
Sounds like FRP, right? Well, the first "[..]" part is actually "Constraint Imperative Programming" and the second is "constraints",
from the abstract of a 1994 paper. Similarly, I've seen it stated that FRP is like a spreadsheet.
The connection between functional programming and constraint programming is also well
known and documented in the literature, for example the experience report above states the
following:
Since constraints are simply functional programming dressed up with syntactic sugar, it should not be surprising that 1) programmers do not think of using constraints for most programming tasks and, 2) programmers require extensive training to overcome their procedural instincts so that they will use constraints.
However, you wouldn't be able to tell that there's a relationship there from reading
the FRP literature, which focuses exclusively on the connection to functional
programming via functional reactive animations and Microsoft's Rx extensions. Explaining and particularly motivating FRP this way has the
fundamental problem that whereas functional programming, which is by definition
static/timeless/non-reactive, really needs something to become interactive,
reactivity is already inherent in OO. In fact, reactivity is the quintessence of
objects: all computation is modeled as objects reacting to messages.
So adding reactivity to an object-oriented language is, at first blush, nonsensical,
and certainly causes confusion when explained this way.
I was certainly confused, because until I found this one
paper on reactive imperative programming,
which adds constraints to C++ in a very cool and general way,
none of the documentation, references or papers made the connection that seemed so
blindingly obvious to me. I was starting to question my own sanity.
Architecture
Additionally, one-way dataflow constraints creating relationships between program variables
can, as far as I can tell, always be replaced by a formulation where the dependent
variable is simply replaced by a method that computes the value on-demand. So
instead of setting up a constraint between point1.x and point2.x,
you implement point2.x as a method that uses point1.x to
compute its value and never stores that value. Although this may evaluate more
often than necessary rather than memoizing the value and computing just once, the
additional cost of managing constraint evaluation is such that the two probably
balance.
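That equivalence can be sketched in a few lines. The sketch below is in Python for brevity (the mechanism is language-independent), and all the names are illustrative: instead of a stored `point2.x` kept in sync by a one-way constraint, the dependent value is just a method (here a property) that computes on demand and stores nothing.

```python
class Point:
    def __init__(self, x):
        self.x = x

# Instead of a one-way constraint "point2.x = point1.x + offset" that must
# be re-established whenever point1.x changes, the dependent value is a
# property that computes on demand and never stores its result.
class DependentPoint:
    def __init__(self, source, offset):
        self.source = source
        self.offset = offset

    @property
    def x(self):
        return self.source.x + self.offset

point1 = Point(3)
point2 = DependentPoint(point1, 10)
assert point2.x == 13
point1.x = 5           # no propagation code needed...
assert point2.x == 15  # ...the dependent value is always current
```

The trade-off is exactly the one described above: this recomputes on every read instead of memoizing, but it needs no constraint-management machinery at all.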
However, such an implementation creates permanent coupling and requires dedicated
classes for each relationship. Constraints thus become more of an architectural
feature, allowing existing, usually stateful components to be used together without
having to adapt each component for each individual ensemble it is a part of.
Panta Rhei
Everything flows, so they say. As far as I can tell, two different
communities, the F(R)P people and the OO people came up with very similar
solutions based on data flow. The FP people wanted to become more reactive/interactive,
and achieved this by modeling time as sequence numbers in streams of values, sort
of like Lucid or other dataflow languages.
The OO people wanted to be able to specify relationships declaratively and have
their system figure out the best way to satisfy those constraints, with
a large and useful subset of those constraints falling into the category of
the one-way dataflow constraints that, at least to my eye, are equivalent
to FRP. In fact, this sort of state maintenance and update-propagation
pops up in lots of different places, for example makefiles or other
build systems, web-server generators, publication workflows etc. ("this
OmniGraffle diagram embedded as a PDF into this LaTeX document that
in turn becomes a PDF document" -> the final PDF should update
automatically when I change the diagram, instead of me having to
save the diagram, export it to PDF and then re-run LaTeX).
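This kind of make-like update propagation can be sketched as a tiny one-way dataflow graph. The sketch below is in Python and entirely illustrative (hypothetical `Node` class, made-up version-stamp scheme): each node memoizes its value and rebuilds only when an input has changed, just as make rebuilds a target whose prerequisites are newer.

```python
class Node:
    """A memoizing node in a one-way dataflow graph: it recomputes its
    value only when one of its inputs has changed, like a make target."""
    def __init__(self, compute, *inputs):
        self.compute = compute
        self.inputs = list(inputs)
        self.version = 0      # bumped whenever this node's value changes
        self._seen = None     # input versions used for the cached value
        self._value = None

    def set(self, value):     # source nodes are set directly
        self._value = value
        self.version += 1

    def get(self):
        if not self.inputs:
            return self._value
        values = tuple(i.get() for i in self.inputs)   # pull inputs first
        current = tuple(i.version for i in self.inputs)
        if current != self._seen:                      # an input changed: rebuild
            self._value = self.compute(*values)
            self._seen = current
            self.version += 1
        return self._value

# "diagram -> PDF -> LaTeX document" as a tiny dataflow graph
diagram = Node(None)
diagram.set("diagram-v1")
pdf = Node(lambda d: "pdf(" + d + ")", diagram)
document = Node(lambda p: "latex(" + p + ")", pdf)
assert document.get() == "latex(pdf(diagram-v1))"

diagram.set("diagram-v2")      # save a new version of the diagram...
assert document.get() == "latex(pdf(diagram-v2))"  # ...downstream updates
```

Pulling values on demand like this is the "lazy" half of the design space; FRP systems and bindings typically add the "push" half, eagerly notifying dependents when a source changes.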
What's kind of funny is that these two groups seem to have converged
in essentially the same space, but they seem to not be aware of
each other, maybe they are phase-shifted with respect to each other?
Part of that phase-shift is, again, communication. The FP guys
couch everything in must-destroy-all-humans, er, state rhetoric,
which doesn't do much to convince OO guys who know that for most
of their programs, state isn't an implementation detail but fundamental
to their applications. Also, practical experience does not support the
idea that the FP approach is obvious:
Unfortunately, given the considerable amount of time required to train students to use constraints in a non-graphical manner, it does not seem reasonable to expect that constraints will ever be widely used for purposes other than graphical layout. In retrospect this result should not have been surprising. Business people readily use constraints in spreadsheets because constraints match their mental model of the world. Similarly, we have found that students readily use constraints for graphical layout since constraints match their mental model of the world, both because they use constraints, such as left align or center, to align objects in drawing editors, and because they use constraints to specify the layout of objects in precision paper sketches, such as blueprints. However, in their everyday lives, students are much more accustomed to accomplishing tasks using an imperative set of actions rather than using a declarative set of actions.
Of course there are other groups hanging out in this convergence zone, for example the
Unix folk with their pipes and filters. That is also not too surprising if
you look at the history:
So, we were all ready. Because it was so easy to compose processes with shell scripts. We were already doing that. But, when you have to decorate or invent the name of intermediate files and every function has to say put your file there. And the next one say get your input from there. The clarity of composition of function, which you perceived in your mind when you wrote the program, is lost in the program. Whereas the piping symbol keeps it. It's the old thing about notations are important.
I think the familiarity with Unix pipes also increases the itch: why can't I have
that sort of thing in my general purpose programming language? Especially when
it can lead to very concise programs, such as the Quartz-like graphics subsystem
Gezira written in
under 400 lines of code using the Nile dataflow language.
Moving Forward
I too have heard the siren sing.
I also think that a more spreadsheet-like programming model would not just make my life
as a developer easier, it might also make software more approachable for end-user adaptation and tinkering,
contributing to a more meaningful version of open source. But how do we get there?
Apart from a reasonable implementation and better debugging support, a new system would need much tighter
language integration. Preferably there would be a direct syntax for expressing constraints
such as that available in constraint imperative programming languages or constraint extensions to existing
languages like
Ruby or JavaScript.
This language support should be unified as much as
possible between different constraint systems, not one mechanism for Autolayout and a
completely different one for Bindings.
Supporting constraint programming has always been one of the goals of my Objective-Smalltalk project, and so far that has informed the
PolymorphicIdentifiers that support a uniform interface for data backed by different types of
stores, including one or more constraint stores supporting cooperating solvers, filesystems or web-sites. More needs
to be done, such as extending the data-flow connector hierarchy to conceptually integrate
constraints. The idea is to create a language that does not actually include constraints
in its core, but rather provides sufficient conceptual, expressive and implementation
flexibility to allow users to add such a facility in a non-ad-hoc way so that it is
fully integrated into the language once added. I am not there yet, but all the results
so far are very promising. The architectural focus of Objective-Smalltalk also ties
in well with the architectural interpretation of constraints.
There is a lot to do, but on the other hand I think the payback is huge, and there is
also a large body of existing theoretical,
practical and empirical groundwork to fall back on, so I think the task is doable.
Your feedback, help and pull requests would be very much appreciated!
After thinking about the id subset and being pointed to WebScript, Brent Simmons imagines a scripting language. I have to admit I have been imagining pretty much the same language...and at some
point decided to stop imagining and start building Objective-Smalltalk:
Peer of Objective-C: objects are Objective-C objects, methods are Objective-C methods,
added to the runtime and indistinguishable from the outside.
"You can subclass UIViewController, or write a category on it."
The example is from the site, it was copied
from an actual program. As you can see, interoperability with the C parts of
Objective-C is still necessary, but not bothersome.
This example was also copied from an actual small educational game that was
ported over from Flash.
You also get Higher Order Messaging, Polymorphic Identifiers etc.
Works with the toolchain: this is a little more tricky, but I've made
some progress...part of that is an llvm based native compiler, part is
tooling that enables some level of integration with Xcode, part is
a separate toolset that has comparable or better capabilities.
While Objective-Smalltalk would not require shipping source code with your applications,
due to the native compiler, it would certainly allow it, and in fact my own
BookLightning imposition program
has been shipping with part of its Objective-Smalltalk source hidden in its Resources
folder for about a decade or so. Go ahead, download it, crack it open and have
a look! I'll wait here while you do.
Did you have a look?
The part that is in Smalltalk is the distilled (but very simple) imposition algorithm
shown here.
What this means is that any user of BookLightning could adapt it to suit their needs,
though I am pretty sure that none have done so to this date. This is partly due to
the fact that this imposition algorithm is too limited to allow for much variation,
and partly due to the fact that the feature is well hidden and completely unexpected.
There are two ideas behind this:
Open Source should be more about being able to tinker with well-made
apps in useful ways, rather than downloading and compiling gargantuan and
incomprehensible tarballs of C/C++ code.
There is no hard distinction between programming and scripting. A
higher level scripting/programming language would not just make developers'
jobs easier, it could also enable the sort of tinkering and adaptation that
Open Source should be about.
I don't think the code samples shown above are quite at the level needed to really
enable tinkering, but maybe they can be a useful contribution to the discussion.
The feedback was, effectively: "This code is incorrect, it is missing a return type". Of course, the code isn't incorrect in the least bit, the return type is id, because that is the default type, and in fact, you will see this style in both Brad Cox's book:
and the early NeXTStep documentation:
Having a default type for objects isn't entirely surprising, because at that time id was not just the default type, it was the only type available for objects, the optional static typing for objects wasn't introduced into Objective-C until later. In addition the template for Objective-C's object system was Smalltalk, which doesn't use static types, you just use variable names.
Cargo-cult typing
So while it is possible (and apparently common) to write -(id)objectAtIndex:(NSUInteger)anIndex, it certainly isn't any more correct. In fact, it's
worse, because it is just syntactic noise [1][2], although it is arguably even worse than what Fowler describes because it isn't actually mandated by
the language, the noise is inflicted needlessly.
And while we could debate as to whether it is better or not to write things that are redundant
syntactic noise, we could also not, as that was settled almost 800 years ago: entia non sunt multiplicanda praeter necessitatem. You could also say KISS or "when in doubt, leave it out", all of which just
say that the burden of proof is on whoever wants to add the redundant pieces.
What's really odd about this phenomenon is that we really don't gain anything from typing
out these explicit types, the code certainly doesn't become more readable. It's as if
we think that by following the ritual of explicitly typing out a type, we made the
proper sacrifice to the gods of type-safety and they will reward us with correctness.
But just like those Pacific islanders that built wooden planes, radios and control
towers, the ritual is empty, because it conveys no information to the type system,
or the reader.
The id subset
Now, I personally don't really care whether you put in a redundant (id)
or not, I certainly have been reading over it (and not even really noticing) for
my last two decades of Objective-C coding. However, the mistaken belief that it
has to be there, rather than being a personal choice you make, does worry me.
I think the problem goes a little deeper than just slightly odd coding styles, because it seems to be part and parcel of a drive towards making Objective-C look like an explicitly statically typed language along the lines of C++ or maybe Java,
with one of the types being id. That's not the case: Objective-C
is an optionally statically typed language. This means that you
may specify type information if you want to, but you generally
don't have to. I also want to emphasize that you can at best get Objective-C
to look like such a language, the holes in the type system are way too big for
this to actually gain much safety.
Properties started this trend, and now the ARC variant of the language needlessly turns what used to be warnings about unknown selectors into hard compiler errors.
Of course, there are some who plausibly argue that this always should have been an error,
or actually, that it always was an error, we just didn't know about it.
That's hogwash, of course. There is a subset of the language, which I'd like
to call the id subset, where all the arguments and returns are object
pointers, and for this it was always safe to not have additional type information,
to the point where the compiler didn't actually have that additional type information.
You could also call it the Smalltalk subset.
Another thing that's odd about this move to rigidify Objective-C in the face of
success of more dynamic languages is that we actually have been moving into the
right direction at the language base-level (disregarding the type-system) in general programming style: with new syntax support
for object literals and subscripting, and SmallInteger-style NSNumbers, modern
Objective-C consists much more of pure objects than was traditionally the case.
And as long as we are dealing with pure objects, we are in the id subset.
A dynamic language
What's great about the id subset is that it makes incremental, explorative
programming very easy and lots of fun, much like other dynamic languages
such as Smalltalk, Python or Ruby.
(Not entirely like them, due to the need to compile to native code, but compilers are fast these
days and there are possible fixes such as Objective-Smalltalk.)
The newly enforced rigidity is starting to make explorative programming in Objective-C much
harder, and a lot less fun. In fact, it feels much more like C++ or Java and much less
like the dynamic language that it is, and in my opinion is the wrong direction: we should
be making our language more dynamic, and of course that's what I've been doing. So while I wouldn't agree with that tradeoff even if
it were true, the fact is that we aren't actually
getting static type safety, we are just getting a wood prop that will not fly.
Discussion on Hacker News.
UPDATE: Inserted a little clarification that I don't care about bike-shedding your code
with regard to (id). The problem is that people's mistaken belief both that and why it has to be there is symptomatic of that deeper trend I wrote about.
Just had a case of codesign telling me my app was fine, just for the same app to be rejected by GateKeeper. The spctl tool fortunately was more truthful, but didn't really say where the problem was.
A little sleuthing determined that although I had signed all my frameworks with the Developer ID, two auxiliary executables were signed with my development certificate.
Lesson learned: don't trust codesign, use spctl to verify your binaries.
Actually: no it isn't, Transact-SQL got the honors. Apart from the obvious question, "Transact-Who?", it really should have been Objective-C, because Tiobe readjusted the index mid-year in a way that resulted in a drop of 0.5% for the popular languages, which is fine, but without readjusting the historical data! Which is...not...especially if you make judgements based on relative performance.
In this case, Transact-SQL beat Objective-C by 0.17%, far less than the roughly 0.5% drop suffered by Objective-C mid-year. So Objective-C would have easily done the
hat-trick, but I guess Tiobe didn't want that and rigged the game to make sure
it doesn't happen.
Not that it matters...
UPDATE: I contacted Tiobe and they confirmed, both the lack of rebaselining and that Objective-C would likely have won an unprecedented third time in a row.
So my girlfriend finally got an iMessage capable phone, but Messages on my phone still insisted on sending SMSes. Even after starting to receive iMessages in the same conversation. Even after a message sent from the Messages app on OS X was duly noted as being an iMessage!
Various attempts to fix this state of affairs had no effect: changing the contact number to iPhone, deleting the conversation(s), twiddling with Messages settings on both phones, including the "Send as SMS" preference.
What did work was performing a reset of the Network Settings.
Now that we have that out of the way, I want to have a look at Drew Crawford's You should use Core Data, which manages to come up with a less nuanced answer in its 2943 words. It's an older article (2012), but recently came to my attention via Drew McCormack (@drewmccormack): "Great post", he wrote, and after reading the article I not just disagreed, but found that Twitter wasn't really adequate for writing up all the things wrong with that article.
Where to start? Maybe at the beginning, right in the first paragraph the following is called out as a "myth":
Among them, Core Data is designed for something–not really sure what–but whatever it is, it’s a lot more complicated than what I need to do in my project.
The first mistake is a categorical one, because unless Drew knows the exact requirements and engineering trade-offs in every iOS application, he can't know whether this is true or false, fact or myth.
The second mistake is that the basic statement "CoreData was designed for something else" is actually true. CoreData's design dates back to NeXT's Enterprise Object Framework or EOF for short, and EOF was designed as an ORM for talking to corporate relational database servers, with a variety of alternate back-ends for non-relational DBs (including the way cool 3270 adapter!).
Obviously the implementation is different and the design has diverged by now, but that is the basic design, and yes, that does do something that is more complicated than what some (many?) developers need.
Details.
Next sentence, next problem:
I just want to save some entities to disk.
I may well be reading too much into this, but using the word entities already bakes so many assumptions into the problem statement. When I actually do want to save entities, databases in general and CoreData in particular are somewhat higher on my list of technologies, but quite often I don't start with ERM, and therefore just have objects, XML or other data. While these could be ER-modeled with varying degrees of difficulty/success, they certainly don't have to be, and it's often not the best choice.
In the enterprise system described in REST: Advanced Research Topics and Practical Applications, we removed the database from the rewrite, because the variety of the data meant converting the DB into a key-value store one way or another, or having a schema that's about an A0 page in small print (a standards body had developed such a schema and I think we even got a poster). We instead ended up converging on an
EventPoster
architecture that kept the original XML feed files around and parsed them into
objects as necessary. No ERM here.
The next couple of paragraphs go off on an ad-hominem straw man tangent making WAG assumptions about the provenance of iOS developers and more WAGs about why that (assumed) provenance causes
said developers to have these misconceptions. Those "misconceptions" that actually
turn out to be true. Although largely irrelevant, it does contain some actual
misinformation. For example the fact that categories don't get linked with static libraries is not an LLVM bug, it's a consequence of the combined semantics of static libraries and Objective-C categories irrespective of compiler/linker versions.
Details.
Joel
Then there's another tangent to Joel Spolsky's article on Things You Should Never Do (the link in Drew's article is dead), such as rewriting legacy code from scratch,
which Joel describes categorically as "single worst strategic mistake" that any
software company can make. In the words of the great Hans Bethe: "Ach, that
isn't even wrong!":
Just because Joel wrote something makes it right or applicable why?
While Joel makes some good points and is right to counter a tendency to
not want to deal with existing code, his claim is most certainly wrong
in its absoluteness, both empirically (the system referenced above was
a rewrite with tremendous benefits, then there's OS X vs. trying to keep fixing Classic Mac OS indefinitely etc.) and logically: it only holds true if
all your accumulated complexity is of the essential kind, and none of it of the accidental kind. That idea seems ludicrous to me,
virtually all software is rushed, has shifting requirements that are only
understood after the fact, has historical limitations (such as having to run
in 128KB of RAM on a 7MHz CPU with no MMU) etc.
Sometimes a rewrite is warranted, though you should obviously be wary and
not undertake such a project lightly.
Of course the biggest non-sequitur in the whole tangent is "How
on earth does the problem of reading source code and rewriting legacy systems
apply to this situation?". Apart from "not at all"? We don't have access to
CoreData source code and there is no question about us
rewriting it. Well, actually, I used to have such access, but even though
my remit would have allowed some rewrites, I don't think I could have done
a better job for the technology constraints chosen. The question is whether to use it or not, including whether the technology constraints are
appropriate.
And no, not every persistence solution ends up as effectively a rewrite of
CoreData.
Details
After that ill-conceived tangent, the article goes right back to the next unsubstantiated ad-hominem:
Here are some things you probably haven’t thought about when architecting your so-called data stack:
Apart from the attitude somewhat unbefitting someone who gets so much wrong, well,
Nein, but let's look at some of these in detail:
Handling future changes to the data schema: this just isn't hard if you're not using a relational database. In fact high variability of the schema was one of
the reasons for ditching the DB in you know...
Bi-directional synchronization with servers...hahhah...!
Bi-directional synchronization with peers...see above
Undo support: gosh, I wish there were some sort of facility that would
manage undo support on my model objects, if I had my way, Apple would
add this and call it NSUndoManager. Without such a useful class, I'll just
have to do complicated things like renaming old versions of my data file
or storing deltas.
Multithreading. Really? Multithreading?? If you think CoreData makes multithreading
easier rather than harder, I have both a nice bridge and some oceanfront
property in Nevada to sell you.
Device/time constraints: the performance ca(na)rd. CoreData is slow and
memory intensive. It makes up for this by adding the ability and the
requirement for the
developer to continuously tune the working set, if that is an option.
If continuously minimizing/tuning the working set is not an option (i.e.
you have some large datasets), you're hosed.
Details
Magic
Then we're back to the familiar ad-hominems/straw-men and non-sequitur analogies for a
couple of paragraphs, nothing to see there. The whole analogy to Cocoa is useless:
yes, Cocoa is good. How does this make CoreData good (it may or may not be, there
simply is no connection) or appropriate for my tasks? Also, Cocoa in
particular and Apple code in general is not "magic". I've been there, I've seen
it and I fixed some of it. A lot of it is good, some excellent, but some not
so much (sleep(5) anyone?).
The section entitled But, there are times not to use CoreData, right? could
have been the saving grace, but alas the author blows it again. Having a
"mark all as read" option in a NewsReader application is a "strange" "corner case"?
Right. Let me turn this section around: there is a very narrow range of
application areas where CoreData is or might be appropriate. Mostly read-only
data, highly regular (table-like) structure, no need for bulk operations ever,
preferably no trees either and no binary data. Fairly loose performance
requirements.
What Apple says
The section on "what Apple says" is also gold, though I have to give credit
to the Apple doc writers for managing to strongly suggest things without actually
explicitly claiming them, strongly enough to fool this particular writer:
Apple’s high-level APIs can be much faster than ‘optimized’ code at lower levels.
This is obviously true, but completely meaningless. My tricycle can be
faster than a Ferrari, for example if the Ferrari is out of gas or has a flat tire,
or just parked.
When we actually measured the performance of applications adopting CoreData at
Apple we invariably got a significant performance regression. Then
a lot of effort would be expended by all the teams involved in order to fix
the regression, optimization effort that hadn't been expended on the original
application, usually making up some of the shortfall.
I find the code reduction claim also specious, at least when stated as an
unquestionable fact like this. In my experience, code size was reduced
when removing CoreData and its predecessor EOF. Had we had exactly the
requirements that CoreData/EOF were designed for (talking to existing
relational enterprise databases), the result would almost certainly
been different, but those were not our requirements, and I doubt that
most iOS apps have those requirements (and in fact, CoreData didn't
even support talking to external SQL databases, at all, a puzzling
development for all the EOF veterans).
For managing small amounts
of data inside an iOS application, CoreData is almost always overkill,
because it effectively simulates an entire client/server enterprise
application stack within your phone or desktop.
The performance and complexity costs to pay for that overkill are
substantial.
So Should You Use Core Data?
As I wrote: it depends.
The article in question at least adds no useful information
to answering that question, only inappropriate analogies, repeated
claims without any supporting evidence and lots of ad-hominems that
are probably meant to be witty. If you believe its last section,
the only reason developers don't use CoreData is because they are
naive, lazy or ignorant.
This is evidently not true and ignores even the most basic concepts
that one would apply to answering the question, such as, say,
fitness for purpose. CoreData may well be the best implementation
of an in-process-ORM simulating a client-server enterprise app there
is, and there is good evidence that this is in fact true (certainly the
people on it are top notch and some of the code is amazing).
However, all that doesn't help if that's not what you need.
Considering the gleefully paraded
ignorance of not just alternatives but various other programming aspects,
I'd take the unconditional advice this article
gives with a pinch of salt. Mountain-sized pinch, that is.
So Google shut down Reader. When it happened all my news feeds went dead. I looked through the settings in my news reader, NetNewsWire 3.3.2, found the checkbox for "sync with Google Reader". Unchecked it.
It started syncing again, I did a "mark all as read" and things were back to normal. Now about that Snowden fellow...
In his Rarely Asked Questions,
Paul Graham once more espouses LISP as the ultimate, untoppable programming
language (and yes, autocomplete wants to appropriately turn that into "unstoppable").
One of the points he makes to support this, is that for any language to
become as good as LISP, it would actually need to become LISP. While it is
a cogent point and well argued, I don't buy it. More precisely, this is what he writes on what it would take to add the equivalent of LISP macros to another language:
But it would be hard to do that without creating a notation for parse trees; and once you do, your language has become a skin on Lisp, in much the same way that in OS X, the Mac OS became a skin on Unix.
Hmmm...I am not sure that this analogy is making his point, because Mac OS X is far, far
superior to raw Unix for most people, and preferred by hackers, as he himself writes in Return of the Mac:
All the best hackers I know are gradually switching to Macs.
Of course, I may be stretching the analogy too far, but it seems to me that
it doesn't support Paul's thesis of LISP superiority, but rather clearly
points to some language Y that delivers LISP's power in a much more useful
and usable form.
Marco Arment notes that those complaining of RSS Inbox overload are not using RSS correctly: it should be used to check on a possibly large number of rarely updated but valuable sites.
He recommends simply deleting feeds that are updated frequently. I have a slightly different approach: two basic feed directories. One is my A-List, which contains feeds that roughly fall into the "rarely updated but valuable" category, and the other is my River of News, which are feeds that I enjoy and which keep me up-to-date, but which I don't mind clearing out by marking all as read.
Incidentally, pretty much the same goes for my E-Mail Inbox: as much of the lower priority stuff as possible is sorted into specific mailboxes automatically. The rest I triage quickly into one of 3 categories: action, read-ponder, will-file.
I used to sort by categories and that didn't work at all. In fact, it frequently managed to invert the actual priorities, leaving me focused on e-mails that weren't actually important, just hard to categorize, whereas most of the important e-mails would have their own categories and disappear from view.
For some reason my Mussel Wind application was running fine without a root view controller, but with the annoying "Applications are expected to have a root view controller at the end of application launch" message. I vaguely remember autorotation being somewhat tricky, but in the end it was working and all was good.
Until I got the request to add zoom to the large version of the web cam view (you get it when you tap the image). Put in the scroll view, set it to zoomable, added the tap gesture recognizer. All good. Except no more autorotation. Put in all the debugging messages for rotation masks, and shouldRotate etc. All called, but only on creation, not when the device is rotated. Nada.
On a whim, decided to fix the missing root view controller message by adding the view controller manually. First, no joy because UIKit was sending a "viewControllers" message that wasn't being handled. This despite the fact that my class is a straight UIViewController subclass and rootViewController is documented as taking just that. Oh well, just implement that as returning an empty NSArray and presto: autorotation!