
Saturday, March 9, 2019

Software-ICs, Binary Compatibility, and Objective-Swift

Swift recently achieved ABI stability, meaning that we can now ship Swift binaries without having to ship the corresponding Swift libraries. It's been a long time coming, and it's great to have finally reached this point. However, it turns out that this does not mean you can reasonably ship binary Swift frameworks, for reasons described very well by Peter Steinberger of PSPDFKit and the good folks at instabug.

To reach this not-quite-there-yet state took almost 5 years, which is pretty much the total time NeXT shipped their hardware, and it mirrors the state with C++, which is still not generally suitable for binary distribution of libraries. Objective-C didn't have these problems, and as it turns out this is not a coincidence.

Software ICs

Objective-C was created specifically to implement the concept of Software-ICs. I briefly referenced the concept in a previous article, and also mentioned its relationship to the scripted components pattern, but the comments indicated that this is no longer a concept people are familiar with.

As the name suggests, the intention was to bring the benefits the hardware world had reaped from the introduction of the Integrated Circuit to the software world.

It is probably hard to overstate the importance of ICs to the development of the computer industry. Instead of assembling computers from discrete components, you could now put entire subsystems onto one component, and then compose these subsystems to form systems. The interfaces are standardised pins, and the relationship between the outside interface and the complexity hidden inside can be staggering. The socket of the CPU I am writing this on is a beast with 1151 pins, yet the chip inside has 2.1 billion transistors. At a ratio of almost two million to one, that's a very deep interface, even if you disregard the fact that the bulk of those pins are actually voltage supply and ground pins.

The important point is that you do not have to, and in fact cannot, look inside the IC. You get the pins, very much a binary interface, and the documentation, a specification sheet. With Software-ICs, the idea was the same: you get a binary, the interface and a specification sheet. Here are two BYTE articles that describe the concepts:


A lot of what they write seems quaint now, for example a MailFolder that inherits from Array(!), but the concepts are very relevant, particularly with a couple of decades worth of perspective and the new circumstances we find ourselves in.

Although the authors pretty much equate Software-ICs with objects and object-oriented programming, it is a slightly different form of object-oriented programming than the one we mostly use today. They do write about object/message programming, similar to Alan Kay's note that 'The big idea is "messaging"'.

With messaging as the interconnect, similar to Unix pipes, our interfaces are sufficiently well-defined and dynamic that we really can deliver our Software-ICs in binary form and be compatible, something our more static languages like C++ and Swift struggle with.

Objective-C is middleware with language features.

Message Oriented Middleware

Virtually all systems based on static languages eventually grow an additional, separate and more dynamic component mechanism. Windows has COM, IBM has SOM, Qt has signals and slots and the meta-object system, Be had BMessages etc.

In fact, the problem of binary compatibility of C++ objects was one of the reasons for creating COM:

Unlike C++, COM provides a stable application binary interface (ABI) that does not change between compiler releases.
COM has been incredibly successful, it enables(-ed?) much of the Windows and Office ecosystems. In fact, there is even a COM implementation on macOS: CFPlugin, part of CoreFoundation.

CFPlugIn provides a standard architecture for application extensions. With CFPlugIn, you can design your application as a host framework that uses a set of executable code modules called plug-ins to provide certain well-defined areas of functionality. This approach allows third-party developers to add features to your application without requiring access to your source code. You can also bundle together plug-ins for multiple platforms and let CFPlugIn transparently load the appropriate plug-in at runtime. You can use CFPlugIn to add plug-in capability to, or write a plug-in for, your application.
That COM implementation is still in use, for example for writing Spotlight importers. However, there are, er, issues:
Creating a new Spotlight importer is tricky because they are based on CFPlugIn, and CFPlugIn is… well, how to say this diplomatically?… super ugly )-: One option here is to use Xcode 9 to create your plug-in based on the old template. Honestly though, I don’t recommend that because the old template… again, diplomatically… well, let’s just say that the old template lets the true nature of CFPlugIn shine through! (-:
Having written both Spotlight importers and even some COM components on Windows (I think it was just for testing), I can confirm that COM's success is not due to the elegance or ease-of-use of the implementation, but due to the fact that having an interoperable, stable binary interface is incredibly enabling for a platform.

That said, all this talk of COM is a bit confusing, because we already have NSBundle.

Apple uses bundles to represent apps, frameworks, plug-ins, and many other specific types of content.
So NSBundle already does everything a CFPlugin does and a lot more, but is really just a tiny wrapper around a directory that may contain a dynamic shared library. All the interfacing, introspection and binary compatibility features come automagically with Objective-C. In fact, NeXT had a Windows product called d'OLE that pretty automagically turned Objective-C libraries into COM-compatible OLE servers (.NET has similar capabilities). Again, this is not a coincidence, the Software-IC concept that Objective-C is based on is predicated on exactly this sort of interoperation scenario.
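Loading such a Software-IC at runtime takes only a few lines; here is a minimal sketch using standard NSBundle API (the framework path is invented, the class name is borrowed from the pdfsh example below):

NSBundle *bundle=[NSBundle bundleWithPath:@"/Library/Frameworks/EGOS_Cocoa.framework"];
if ( [bundle load] ) {
    // the Objective-C runtime now knows the framework's classes by name
    Class pdfClass=NSClassFromString(@"MPWPDFDocument");
    NSLog(@"loaded %@, found class %@",bundle,pdfClass);
}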

Objective-C is middleware with language features.

Frameworks and Microservices

To me, the idea of a Software-IC is actually somewhat higher level than a single object, I tend to see it at the level of a framework, which just so happens to provide all the trappings of a self-contained Software-IC: a binary, headers to define the interface and hopefully some documentation, which could even be provided in the Resources directory of the bundle. In addition, frameworks are instances of NSBundle, so they aren't limited to being linked into an application, they can also be loaded dynamically.

I use that capability in Objective-Smalltalk, particularly together with stsh, the Smalltalk Scripting Shell. By loading frameworks, this shell can easily be transformed into an application-specific scripting language. An example of this is pdfsh, a shell for examining and manipulating PDF files using EGOS, the Extensible Graphical Object System.


#!/usr/local/bin/stsh
#-<void>pdfsh:<ref>file
framework:EGOS_Cocoa load.
pdf := MPWPDFDocument alloc initWithData: file value.
shell runInteractiveLoop

The same binary framework is also used in PdfCompress, PostView and BookLightning. With this framework, my record for creating a drag-and-drop application to do something useful with a PDF file was 5 minutes, and the only reason I was so slow was that I thought I had remembered the PDF dictionary entry...and had not.

Framework-oriented programming is awesome, alas it was very much deprecated by Apple for quite some time, in fact even impossible on iOS until dynamic libraries were allowed. Even now, though, the idea is that you create an app, which consists of all the source-code needed to create it (exception: Apple code!), even if some of that code may be organised into framework units that otherwise don't have much meaning to the build.

Apps, however, are not Software-ICs; they aren't the right packaging technology for reuse (AppleScript notwithstanding). And so iOS and macOS development shops routinely get themselves into big messes, also known as the Big Ball of Mud architectural pattern.

Of course, there are reasons that things didn't quite work out the way we would have liked. Certainly Apple's initial Mac OS X System Architecture book showed a much more flexible arrangement, with groups of applications able to share a set of frameworks, for example. However, DLL hell is a thing, and so we got a much more restricted approach where every app is a little fortress and frameworks in general and binary frameworks in particular are really something for Apple to provide and for the rest to use. However, the fact that we didn't manage to get this right doesn't mean that the need went away.

Swift has been making this worse, by strongly "suggesting" that everything be compiled together and leading to such wonderful oxymorons as "whole module optimisation in debug mode", meaning without optimisation. That and not having a binary modularity story for going on half a decade. The reason for compiling whole modules together is that the modularity mechanism is, practically speaking, very much source-code based, with generics and specialisation etc. (Ironically, Swift also does some pretty crazy things to enable separate compilation, but that hasn't really panned out so far).

On the other hand, Swift's compiler is so slow that teams are rediscovering forms of framework-oriented programming as a self-defense mechanism. In order to get feedback cycles down from ludicrously bad to just plain awful, they split up their projects into independent frameworks that they then compile and run independently during development. So in a somewhat roundabout way, Swift is encouraging good development practices.

I find it somewhat interesting that the industry is rediscovering variants of the Software-IC, in this case on the backend in the form of Microservices. Why do I say that Microservices are a form of Software-IC? Well, they are a binary unit of deployability, fairly loosely coupled and dynamically typed. In fact, Fred George, one of the people who came up with the idea refers to them as Smalltalk objects:

Of course, there are issues with this approach, one being that reliable method calls are replaced with unreliable network calls. Stepping back for a second should make it clear that the purported benefits of Microservices also largely apply to Software-ICs. At least real Software-ICs. Objective-C made the mistake of equating Software-ICs with objects, and while the concepts are similar, with quite a bit of overlap, they are not quite the same. You certainly can use Objective-C to build and connect Software-ICs if you want to do that, and it will help you in this endeavour, but of course you have to know that this is something you want: it doesn't do it automatically, and over time the usage of Objective-C has shifted to that of just a regular old object-oriented language, something it is OK but not that brilliant at.

Interoperability

One of the interesting aspects of Microservices is that they are language-agnostic: it doesn't matter what language a particular service is written in, as long as it can somehow communicate via HTTP(S). This is another similarity to Software-ICs (and other middleware such as COM, SOM, etc.): there is a fairly narrowly defined, simple interface, and as long as you can somehow service that interface, you can play.

Microservices are pretty good at this, Unix filters are probably the most extreme example, and just about every language and every kind of application on Windows can talk to and via COM. NeXT only ever sold 50,000 computers, but within a few years the NeXT community had bridges to just about every language imaginable. There were a number of Objective- languages, including Objective-Fortran. Apple alone has around 140K employees (though probably a large number of those in retail), and there are over 2.8 million iOS developers, yet the only language integration for Swift I know of is the Python support, and that took significant effort, compiler changes, and the original Swift creator, Chris Lattner.

This is not a coincidence. Swift is designed as a programming language, not as middleware with language features. Therefore its modularity features are an add-on to the language, and try to transport the full richness of that programming model. And Swift's programming model is very rich.

Objective-Swift

The middlewares I've talked about use the opposite approach, working from the outside in. For SOM, this is described as follows:
SOM allows classes of objects to be defined in one programming language and used in another, and it allows libraries of such classes to be updated without requiring client code to be recompiled.
So you define interfaces separately from their implementations. I am guessing this is part of the reason we have @interface in Objective-C. Having to write things down twice can be a pain (and I've worked on projects that auto-generated Objective-C headers from implementation files), but having a concrete manifestation of the interface that precedes the implementation is also very valuable. (One of the reasons TDD is so useful is that it also forces you to think about the interface to your object before you implement it).
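In Objective-C, that separation is concrete and minimal (a small sketch):

// Counter.h -- the interface, the component's published surface
#import <Foundation/Foundation.h>

@interface Counter : NSObject
- (void)increment;
- (NSInteger)count;
@end

// Counter.m -- the implementation, free to change behind the interface
@implementation Counter
{
    NSInteger _count;
}
- (void)increment { _count++; }
- (NSInteger)count { return _count; }
@end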

In Swift, a class is a single implementation-focused entity, with its interface at best a second-class and second-order effect. This makes writing components more convenient (no need to auto-generate headers...), but connecting components is more complicated.

Which brings us back to that other complication, the lack of stable binary compatibility for libraries and frameworks. One workaround is to write frameworks exclusively in Objective-C, which was particularly necessary before ABI stability had been reached. The other workaround, if you have Swift code, is to have an Objective-C wrapper as an interface to your framework. The fact that Swift interoperates transparently with the Objective-C runtime makes this fairly straightforward.
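A sketch of that wrapper approach (the class, method, and Swift engine names are invented; the "MyFramework-Swift.h" header stands for the one Xcode generates for @objc-visible Swift classes):

// PDFCompressor.h -- the framework's public, binary-stable Objective-C face
@interface PDFCompressor : NSObject
- (NSData *)compress:(NSData *)pdfData;
@end

// PDFCompressor.m -- forwards to the Swift implementation
#import "MyFramework-Swift.h"

@implementation PDFCompressor
- (NSData *)compress:(NSData *)pdfData
{
    return [[SwiftPDFEngine new] compress:pdfData];   // hypothetical Swift class
}
@end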

Did I mention that Objective-C is middleware with language features?

So maybe this supposed "workaround" is actually the solution? Have Objective-C or Objective- as our message-oriented middleware, the way it was always intended? Maybe with a bit of a tweak so that it loses most of the C legacy and gains support for pipes and filters, REST/Microservices and other architectural patterns?

Just sayin'.

Wednesday, February 6, 2019

Why Objective-Smalltalk is (not) a Smalltalk

One of the main goals of Objective-Smalltalk is to make it possible to express programs in mixtures of diverse architectural styles, and variations of those styles, naturally and comprehensibly. That needs a mechanism that abstracts from any specific style, while at the same time accommodating all, or at least a wide variety of styles.

I don't know how to do that.

Yet.

A Smalltalk

Since I don't know how to do that yet, I can't build what I need, and since I can't build what I need I can't experiment with it and learn how to do what I need to do.

Instead of waiting for divine inspiration, designing something from thin air (that will obviously be perfect) or just throwing my hands up in despair, I need to pick some starting point from which I can start iterating. In terms of styles, there are a few to choose from, for example call-return, pipes-and-filters and REST. Call-return seems like an obvious choice, because it is something that is familiar, has widespread support and is very capable of implementing other styles.

Objective-C would have been a pleasant choice: I am very familiar with it and it is very suitable for building architectural and language adapters. This suitability is not an accident either, connectivity was the primary design goal, and the designers succeeded admirably given the constraints.

However, Objective-C also has significant drawbacks as a starting point: having already merged two languages, Smalltalk and C, it is rather large and unwieldy, with weird overlaps and other oddities. Being a superset of C, it also pretty much demands being compiled, whereas I want at least the option of interpretation.

WebScript would be an alternative, but it is not just dead, but also proprietary. It also very closely mimics the somewhat baroque Objective-C syntax, but without the actual reasons for those syntactic compromises or the benefits that this sacrifice brings in Objective-C.

These and most other existing language choices would mean extending an existing language, grammar and implementation, meaning that the architectural features of the language would almost certainly be 2nd class citizens, tacked on wherever they can fit. That's not a good starting point (see: Objective-C).

So it really had to be a brand new language, one whose syntax and semantics would be under full control; what remained to choose was a syntactic starting point for the procedural part of that language. For this, I don't know of a better choice than the Smalltalk (keyword) message syntax. The syntax itself is tiny, with almost no reserved words or special characters claimed by the language, and almost all "language" features implemented as objects and messages without special syntax.

Having the procedural base as trivial as possible to implement is important, because I don't want to spend too much effort on it. I have (much) bigger fish to fry. Smalltalk also integrates well conceptually and syntactically with Cocoa and Objective-C, my preferred environment, a point not lost on the plethora of Smalltalk-based scripting languages available for Cocoa and GNUstep. And having a rich environment to integrate with is important for a language intended to connect existing pieces, which is what an architectural language does, or should do.

Not a Smalltalk

So given all the arguments for Smalltalk, surely Objective-Smalltalk is based on one of the existing Smalltalks, such as Squeak, enjoying the great development environment, malleable infrastructure etc.

Not so fast.

Although Smalltalk fits very well conceptually and syntactically, it doesn't fit at all well technically. It generally runs in its own world, the image, requires a complex and sophisticated VM, with all the integration headaches that entails (FFI, JNI, etc.), and comes with and requires a GC. Integrating multiple tracing GCs is essentially an unsolved problem, so that's a bit of a downer for a language that wants to be able to glue existing pieces together.

The philosophy of current Smalltalks is the exact opposite: they want to be the entire world. This is understandable: when Smalltalk was created there simply wasn't a "rest of the world" to talk to, and there certainly wasn't room for it on the same machine, an Alto with 128-512KB of RAM and a 2.5MB removable hard disk.

A current Smalltalk has a lot of code, and communities that think it's just the greatest thing on earth, so resistance to change is significant and reasonable, both to smaller changes, because they just aren't worth it in that context, and to larger changes because they make things just too different. However, I want to make both large and small changes and not be hobbled by linguistic backwards compatibility.

[..] when ST hit the larger world, it was pretty much taken as "something just to be learned", as though it were Pascal or Algol. Smalltalk-80 never really was mutated into the next better versions of OOP. Given the current low state of programming in general, I think this is a real mistake.
Alan Kay: prototypes vs classes was: Re: Sun's HotSpot

So Objective-Smalltalk takes cues from Smalltalk where this is helpful, but it is not really a new version of Smalltalk. It's not a "better old thing" (>45 years), but a (probably worse) "new thing", and for that reason it has to strike out on its own.

Sunday, December 30, 2018

UIs Are Not Pure Functions of the Model - React.js and Cocoa Side by Side

When I first saw React.js, I had a quick glance and thought it was cool: they had finally figured out how to do a Cocoa-like MVC UI framework in JavaScript.

Of course, they could have just used Cappuccino, but "details". Some more details were that their version of drawRect: was called render() and returned HTML instead of drawing into a graphics context, but that's just the reality of living inside a browser. And of course there was react-canvas.

Overall, though, the use of a true MVC UI framework seemed like a great step forward compared to directly manipulating the DOM on user input, at least for actual applications.

Imagine my surprise when I learned about React.native! It looked like they took what I had assumed to be an implementation detail as the main feature. Looking a little more closely confirmed my suspicions.

Fortunately, the React.js team was so kind as to put their basic ideas in writing: React - Basic Theoretical Concepts (also discussed on HN). So I had a look and after a bit of reading decided it would be useful to do a side-by-side comparison with equivalents of those concepts in Cocoa, as far as I understand them.



Transformation (React)

The core premise for React is that UIs are simply a projection of data into a different form of data. The same input gives the same output. A simple pure function.

function NameBox(name) {
  return { fontWeight: 'bold', labelContent: name };
}

Transformation (Cocoa)

A core premise of Cocoa, and MVC in general, is that UIs are a projection of data into a different form of data, specifically bits on a screen. The same input gives the same output. A simple method:
-(void)drawRect:(NSRect)aRect {
	...
}
Due to the fact that screens and the bits on them are fairly expensive, we use the screen as an accumulator instead of returning the bits from the method.

We do not make the (unwarranted) assumption that this transformation can or should be expressed as a pure function. While that would be nice, there are many reasons why this is not a good idea, some pretty obvious.

Abstraction (React)

You can't fit a complex UI in a single function though. It is important that UIs can be abstracted into reusable pieces that don't leak their implementation details. Such as calling one function from another.

function FancyUserBox(user) {
  return {
    borderStyle: '1px solid blue',
    childContent: [
      'Name: ',
      NameBox(user.firstName + ' ' + user.lastName)
    ]
  };
}

{ firstName: 'Sebastian', lastName: 'Markbåge' } ->
{
  borderStyle: '1px solid blue',
  childContent: [
    'Name: ',
    { fontWeight: 'bold', labelContent: 'Sebastian Markbåge' }
  ]
};

Abstraction (Cocoa)

Although it is possible to render an entire UI in a single view's drawRect: method, and users of NSOpenGLView tend to do that, it is generally better practice to split complex UIs into reusable pieces that don't leak their implementation details.

Fortunately we have such reusable pieces: they are called objects, and we can group them into classes. Following MVC naming conventions, we call the objects that represent the UI views; in Cocoa they are instances of NSView or its subclasses, in CocoaTouch the common superclass is called UIView.

Composition (React)

To achieve truly reusable features, it is not enough to simply reuse leaves and build new containers for them. You also need to be able to build abstractions from the containers that compose other abstractions. The way I think about "composition" is that they're combining two or more different abstractions into a new one.

function FancyBox(children) {
  return {
    borderStyle: '1px solid blue',
    children: children
  };
}

function UserBox(user) {
  return FancyBox([
    'Name: ',
    NameBox(user.firstName + ' ' + user.lastName)
  ]);
}

Composition (Cocoa)

To achieve truly reusable features, it is not enough to simply reuse leaves and build new containers for them. You also need to be able to build abstractions from the containers that compose other abstractions. The way I think about "composition" is that they're combining two or more different abstractions into a new one.

Examples of this are NSScrollView, which composes the actual scrollers (themselves composed of different parts) and an NSClipView that provides a window onto the user-provided contentView.

Other examples are NSTableViews coordinating their columns, rows, headers and the system- or user-provided Cells.

State (React)

A UI is NOT simply a replication of server / business logic state. There is actually a lot of state that is specific to an exact projection and not others. For example, if you start typing in a text field. That may or may not be replicated to other tabs or to your mobile device. Scroll position is a typical example that you almost never want to replicate across multiple projections.

We tend to prefer our data model to be immutable. We thread functions through that can update state as a single atom at the top.

function FancyNameBox(user, likes, onClick) {
  return FancyBox([
    'Name: ', NameBox(user.firstName + ' ' + user.lastName),
    'Likes: ', LikeBox(likes),
    LikeButton(onClick)
  ]);
}

// Implementation Details

var likes = 0;
function addOneMoreLike() {
  likes++;
  rerender();
}

// Init

FancyNameBox(
  { firstName: 'Sebastian', lastName: 'Markbåge' },
  likes,
  addOneMoreLike
);

NOTE: These examples use side-effects to update state. My actual mental model of this is that they return the next version of state during an "update" pass. It was simpler to explain without that but we'll want to change these examples in the future.

State (Cocoa)

A UI is NOT simply a replication of server / business logic state. There is actually a lot of state that is specific to an exact projection and not others. For example, if you start typing in a text field. That may or may not be replicated to other tabs or to your mobile device. Scroll position is a typical example that you almost never want to replicate across multiple projections.

Fortunately, the view objects we are using already provide exactly this UI-specific state, so yay objects.

Memoization (React)

Calling the same function over and over again is wasteful if we know that the function is pure. We can create a memoized version of a function that keeps track of the last argument and last result. That way we don't have to reexecute it if we keep using the same value.

function memoize(fn) {
  var cachedArg;
  var cachedResult;
  return function(arg) {
    if (cachedArg === arg) {
      return cachedResult;
    }
    cachedArg = arg;
    cachedResult = fn(arg);
    return cachedResult;
  };
}

var MemoizedNameBox = memoize(NameBox);

function NameAndAgeBox(user, currentTime) {
  return FancyBox([
    'Name: ',
    MemoizedNameBox(user.firstName + ' ' + user.lastName),
    'Age in milliseconds: ',
    currentTime - user.dateOfBirth
  ]);
}

Memoization (Cocoa)

Calling the same method over and over again is wasteful.

So we don't do that.

First, we did not start with the obviously incorrect premise that the UI is a simple "pure" function of the model. Except for games, UIs are actually very stable, more stable than the model. You have chrome, viewers, tools etc. What is a (somewhat) pure mapping from the model is the data that is displayed in the UI, but not the entire UI.

So if we don't make the incorrect assumption that UIs are unstable (pure functions of model), then we don't have to expend additional and fragile effort to re-create that necessary stability.

In terms of optimizing output, this is also handled within the model, rather than in opposition to it: views are stable, so we keep a note of which views have collected damage and need to be redrawn. The application event loop coalesces these damage rectangles and initiates an optimized operation: it only redraws views whose bounds intersect the damaged region, and also passes the rectangle(s) to the drawRect: method. (That's why it's called drawRect:.)
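From the view's side this looks as follows, a minimal sketch using the standard AppKit mechanisms (the view itself is invented):

#import <AppKit/AppKit.h>

@interface CounterView : NSView
@property (nonatomic) NSInteger count;
@end

@implementation CounterView
- (void)setCount:(NSInteger)count
{
    _count = count;
    [self setNeedsDisplay:YES];   // record damage; drawing happens later, coalesced
}
- (void)drawRect:(NSRect)dirtyRect
{
    // called by the event loop only if this view intersects the damaged region,
    // with the coalesced dirty rectangle passed in
    [[NSColor whiteColor] set];
    NSRectFill( dirtyRect );
}
@end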

Lists (React)

Most UIs are some form of lists that then produce multiple different values for each item in the list. This creates a natural hierarchy.

To manage the state for each item in a list we can create a Map that holds the state for a particular item.

function UserList(users, likesPerUser, updateUserLikes) {
  return users.map(user => FancyNameBox(
    user,
    likesPerUser.get(user.id),
    () => updateUserLikes(user.id, likesPerUser.get(user.id) + 1)
  ));
}

var likesPerUser = new Map();
function updateUserLikes(id, likeCount) {
  likesPerUser.set(id, likeCount);
  rerender();
}

UserList(data.users, likesPerUser, updateUserLikes);

NOTE: We now have multiple different arguments passed to FancyNameBox. That breaks our memoization because we can only remember one value at a time. More on that below.

Lists (Cocoa)

Most UIs are some form of lists that then produce multiple different values for each item in the list. This creates a natural hierarchy.

Fortunately there is nothing to do here, the basic hierarchical view model already takes care of it. There are specific view classes for lists, but there is nothing special about them in terms of the conceptual or implementation model.
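For illustration, a minimal sketch of the standard NSTableViewDataSource protocol (controller and property names are invented): the stateful table view asks for values as needed, so no per-item state has to be threaded through the code.

#import <AppKit/AppKit.h>

@interface UserListController : NSObject <NSTableViewDataSource>
@property (nonatomic,strong) NSArray *users;
@end

@implementation UserListController
- (NSInteger)numberOfRowsInTableView:(NSTableView *)tableView
{
    return self.users.count;
}
- (id)tableView:(NSTableView *)tableView objectValueForTableColumn:(NSTableColumn *)column row:(NSInteger)row
{
    return [self.users[row] valueForKey:column.identifier];
}
@end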

Continuations (React)

Unfortunately, since there are so many lists of lists all over the place in UIs, it becomes quite a lot of boilerplate to manage that explicitly.

We can move some of this boilerplate out of our critical business logic by deferring execution of a function. For example, by using "currying" (bind in JavaScript). Then we pass the state through from outside our core functions that are now free of boilerplate.

This isn't reducing boilerplate but is at least moving it out of the critical business logic.

function FancyUserList(users) {
  return FancyBox(
    UserList.bind(null, users)
  );
}

const box = FancyUserList(data.users);
const resolvedChildren = box.children(likesPerUser, updateUserLikes);
const resolvedBox = {
  ...box,
  children: resolvedChildren
};

Continuations (Cocoa)

Fortunately, it doesn't matter how many lists of lists there are all over the place in UIs, since our composition mechanism actually works for this use case.

State Map (React)

We know from earlier that once we see repeated patterns we can use composition to avoid reimplementing the same pattern over and over again. We can move the logic of extracting and passing state to a low-level function that we reuse a lot.

function FancyBoxWithState(
  children,
  stateMap,
  updateState
) {
  return FancyBox(
    children.map(child => child.continuation(
      stateMap.get(child.key),
      updateState
    ))
  );
}

function UserList(users) {
  return users.map(user => ({
    continuation: FancyNameBox.bind(null, user),
    key: user.id
  }));
}

function FancyUserList(users) {
  return FancyBoxWithState.bind(null,
    UserList(users)
  );
}

const continuation = FancyUserList(data.users);
continuation(likesPerUser, updateUserLikes);

State Map (Cocoa)

Don't need it.

(Just how many distinct mechanism do we have now for re-introducing state? At what point do we revisit our initial premise that UIs are pure functions of the model??)

Memoization Map (React)

Once we want to memoize multiple items in a list memoization becomes much harder. You have to figure out some complex caching algorithm that balances memory usage with frequency.

Luckily, UIs tend to be fairly stable in the same position. The same position in the tree gets the same value every time. This tree turns out to be a really useful strategy for memoization.

We can use the same trick we used for state and pass a memoization cache through the composable function.

function memoize(fn) {
  return function(arg, memoizationCache) {
    if (memoizationCache.arg === arg) {
      return memoizationCache.result;
    }
    const result = fn(arg);
    memoizationCache.arg = arg;
    memoizationCache.result = result;
    return result;
  };
}

function FancyBoxWithState(
  children,
  stateMap,
  updateState,
  memoizationCache
) {
  return FancyBox(
    children.map(child => child.continuation(
      stateMap.get(child.key),
      updateState,
      memoizationCache.get(child.key)
    ))
  );
}

const MemoizedFancyNameBox = memoize(FancyNameBox);

Memoization Map (Cocoa)

Huh?

Seriously?

Algebraic Effects (React)

It turns out that it is kind of a PITA to pass every little value you might need through several levels of abstractions. It is kind of nice to sometimes have a shortcut to pass things between two abstractions without involving the intermediates. In React we call this "context".

Sometimes the data dependencies don't neatly follow the abstraction tree. For example, in layout algorithms you need to know something about the size of your children before you can completely fulfill their position.

Now, this example is a bit "out there". I'll use Algebraic Effects as proposed for ECMAScript. If you're familiar with functional programming, they're avoiding the intermediate ceremony imposed by monads.

function ThemeBorderColorRequest() { }

function FancyBox(children) {
  const color = raise new ThemeBorderColorRequest();
  return {
    borderWidth: '1px',
    borderColor: color,
    children: children
  };
}

function BlueTheme(children) {
  return try {
    children();
  } catch effect ThemeBorderColorRequest -> [, continuation] {
    continuation('blue');
  }
}

function App(data) {
  return BlueTheme(
    FancyUserList.bind(null, data.users)
  );
}

Algebraic Effects (Cocoa)

It turns out that it is kind of a PITA to pass every little value you might need through several levels of abstractions....

So don't do that.

This is another reason why it's advantageous to have a stable hierarchy of stateful objects representing your UI. If you need more context, you just ask around. Ask your parent, ask your siblings, ask your children, they are all present. Again, no special magic needed.
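A sketch of "just asking around", walking the stable superview chain (the themeBorderColor message is invented, the traversal is plain AppKit):

- (NSColor *)effectiveBorderColor
{
    // ask successively more general ancestors until one has an answer
    for ( NSView *ancestor=self.superview; ancestor != nil; ancestor=ancestor.superview ) {
        if ( [ancestor respondsToSelector:@selector(themeBorderColor)] ) {
            return [ancestor performSelector:@selector(themeBorderColor)];
        }
    }
    return [NSColor blueColor];   // default theme
}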


So that was the comparison. I have to apologise for getting somewhat less detail-oriented near the end, but the level of complexity just started to overwhelm me. Add to that, for React.native, the joy of having to duplicate the entire hierarchy of view classes, just to have the "components" generate not-quite-temporary temporary views that then generate layers that draw the actual UI. Maybe there's one too many layers. Or two.

The idea of UI being a pure function of the model seems so obviously incorrect, and leads to such a plethora of problems, that it is a bit puzzling how one could come up with it in the first place, and certainly how one would stick with it in the face of the avalanche of problems that just keeps coming. A part of this is certainly the current unthinking infatuation with functional programming ideas. These ideas are broadly good, but not nearly as widely or universally applicable as some of their more starry-eyed proponents propose (I hesitate to use the word "think" in this context).

Another factor is the usefulness of immediate-mode graphics compared to retained-state graphics. This is actually an interesting topic by itself, IMHO one of those eternal circles where we move from a fully retained model such as the early GKS or later PHIGS to immediate drawing models such as Postscript, Quartz, and OpenGL, only to then re-invent the retained model (sort of) with things like CoreAnimation, Avalon and, of course, the DOM (and round and round we go). Cocoa's model represents a hybrid approach, where you can mix-and-match object-oriented and immediate-mode drawing as you see fit. But more on that later.

Last but not least, it's probably not entirely coincidental that this idea was hatched for Facebook and Instagram feed applications. Similar to games, these sorts of apps have displays that really are determined mostly by their "model", the stream of data coming from their feed. I am not convinced that the feed application generalizes well to applications in general.

Anyway, for me, this whole exercise has actually motivated me to start using react.js a little. I still think that Cappuccino is probably the better, less confused framework, but it helps to know about what the quasi-mainstream is doing. I also think that despite all the flaws, react.js and react.native are currently eating native development's lunch. And that's certainly interesting. Stay tuned!

UPDATE:
Dan Abramov responds:

I think you’re overfocusing on the “pure” wording and theoretical definitions over practical differences.
To elaborate a bit, React components aren’t always “pure” in strict FP sense of the word. They’re also not always functions (although we’re adding a stateful function API as a preferred alternative to classes soon). Support for local state and side effects is absolutely a core feature of React components and not something we avoid for “purity”.

I added a PR to remove the misleading "pure" from the concepts page.

Sunday, December 23, 2018

A Minimal Test Runner

A long time ago, when I was working on MPWTest, "The simplest Objective-C Unit Test Framework that could possibly work...", I had a brief chat with Kent Beck about it, and one of the things he said was that everyone should build their own unit test "framework".

Why the scare quotes?

If your testing framework is actually a framework, it's probably too big. I recently started porting MPWFoundation and Objective-Smalltalk to GNUstep again, in order to get them running In the Cloud™. To see how it's going, it's probably helpful to run the tests.

Initially, I needed to test some compiler issues with such modern amenities as keyed subscripting of dictionaries:


#import <Foundation/Foundation.h>
#import <MPWFoundation/MPWFoundation.h>   // MPWDictStore comes from MPWFoundation

int main( int argc, char *argv[] ) {
  MPWDictStore *a=[MPWDictStore store];
  a[@"hi"]=@"there";
  NSLog(@"hi: %@",a[@"hi"]);
  return 0;
}

Once that was resolved with the help of Alex Denisov, I wanted to minimally run some tests, but the idea of first getting all of MPWTest to run wasn't very appealing. So instead I just did the simplest thing that could possibly work:
static void runTests()
{
  int tests=0;
  int success=0;
  int failure=0;
  NSArray *classes=@[
    @"MPWDictStore",
    @"MPWReferenceTests",
  ];

  for (NSString *className in classes ) {
    id testClass=NSClassFromString( className );
    NSArray *testNames=[testClass testSelectors];
    for ( NSString *testName in testNames ) {
      SEL testSel=NSSelectorFromString( testName );
      @try {
        tests++;
        [testClass performSelector:testSel];
        NSLog(@"%@:%@ -- success",className,testName);
        success++;
      } @catch (id error)  {
        NSLog(@"%@:%@ == failure: %@",className,testName,error);
        failure++;
      }
    }

  }
  printf("\033[91;3%dmtests: %d total, %d successes %d failures\033[0m\n",
         failure>0 ? 1:2,tests,success,failure);
}

That's it, my minimal test runner. With a hard-coded list of classes to test. In a sense, that is the entire "test framework", the rest just being conventions followed by classes that wish to be tested.
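For example, a class following those conventions might look like this (a hypothetical sketch; class and test names are invented). Tests are class methods, their names are returned from +testSelectors, and failure is signalled by raising:

@interface MyClassTests : NSObject
@end

@implementation MyClassTests

+ (NSArray *)testSelectors
{
    return @[ @"testArrayIndexing" ];
}

+ (void)testArrayIndexing
{
    id result = @[ @1, @2 ][1];
    if ( ![result isEqual:@2] ) {
        [NSException raise:@"TestFailure" format:@"expected 2, got %@",result];
    }
}

@end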

And of course MPWTest's slogan was a bit...optimistic.

Sunday, November 11, 2018

Refactoring Towards Language

When implementing MVC in iOS and macOS, one common pattern is to use NSNotification to implement the notification that the model has changed and the view should redraw itself. This is a good pattern, as that is how MVC is intended to work, but it does lead to quite a bit of boilerplate in this incarnation.

Specifically, you typically have some version of the following code in one of the initialisation methods of your ViewControllers, which play the MVC role of the View in Cocoa and Cocoa Touch.


- (void)viewDidLoad
{
	...
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(modelDidChange:)
                                                 name:MPWAppNotificationModelDidChange
                                               object:nil];
	...
}

Refactor I

That's all good, except that you tend to repeat that code in every single ViewController, and there are usually quite a few of them. So one way of avoiding all that duplication is to refactor by extracting the duplicated functionality into a method.
- (void)subscribeToModelDidChangeNotification
{
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(modelDidChange:)
                                                 name:MPWAppNotificationModelDidChange
                                               object:nil];
}


- (void)unsubscribeFromModelDidChangeNotification
{
    [[NSNotificationCenter defaultCenter] removeObserver:self
                                                    name:MPWAppNotificationModelDidChange
                                                  object:nil];
}

- (void)subscribeToModelWasDeletedNotification
{
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(modelDidChange:)
                                                 name:MPWAppNotificationModelWasDeleted
                                               object:nil];
}


- (void)unsubscribeFromModelWasDeletedNotification
{
    [[NSNotificationCenter defaultCenter] removeObserver:self
                                                    name:MPWAppNotificationModelWasDeleted
                                                  object:nil];
}
...

Refactor II

The interesting thing that happens when you remove duplication is that it will often reveal more duplication. In particular, if you look at more of these, you will notice that the method bodies are largely identical. You always send a message to the defaultCenter of the NSNotificationCenter class, you pretty much always add yourself as the Observer, and I've rarely seen the object: parameter used. The only things that vary are the selector to send and the name of the notification.
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector: <some selector>
                                                 name: <some notification>
                                               object:nil];


So yet another classic case for refactoring by extracting a helper method that encapsulates the things that do not change and takes the things that do vary as arguments.
-(void)subscribeToNotification:(NSString *)notificationName usingSelector:(SEL)selectorName
{
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:selectorName
                                                 name:notificationName
                                               object:nil];
}

We can also apply the same refactoring to unsubscribing.
-(void)unsubscribeFromNotification:(NSString *)notificationName
{
    [[NSNotificationCenter defaultCenter] removeObserver:self
                                                    name:notificationName
                                                  object:nil];
}

With the parameters extracted, our helper methods simplify to the following.
- (void)subscribeToModelDidChangeNotification
{
    [self subscribeToNotification: MPWAppNotificationModelDidChange usingSelector:@selector(modelDidChange:)];
}

- (void)unsubscribeFromModelDidChangeNotification
{
    [self unsubscribeFromNotification: MPWAppNotificationModelDidChange];
}

- (void)subscribeToModelWasDeletedNotification
{
    [self subscribeToNotification: MPWAppNotificationModelWasDeleted usingSelector:@selector(modelDidChange:)];
}

- (void)unsubscribeFromModelWasDeletedNotification
{
    [self unsubscribeFromNotification: MPWAppNotificationModelWasDeleted];
}

Refactor III

As happens very often when you extract code and remove duplication, we discover more duplication that was previously somewhat hidden, for example because code that was too far apart to really show the duplication is now close enough together to make it obvious. Specifically: we always define the same pair of methods, subscribeTo<notification> and unsubscribeFrom<notification>. We also repeat the notification name in each of the method names, and again in each of the method bodies (with different prefixes/suffixes).
- (void)subscribeTo<notification-name>Notification
{
    [self subscribeToNotification: MPWAppNotification<notification-name> usingSelector:@selector(<notification-selector>:)];
}

- (void)unsubscribeFrom<notification-name>Notification
{
    [self unsubscribeFromNotification: MPWAppNotification<notification-name>];
}
- (void)<notification-selector>:(NSNotification *)notification
{
}

Up to this point, we managed to achieve our goals using plain refactoring techniques. Often, that is enough, and it’s generally a good idea to stop there, because beyond lies metaprogramming, which tends to make the code significantly more obscure. In Objective-C, we generally have runtime tricks and macro-programming using the C pre-processor (well: and code generation). In this case, I decided to give the C preprocessor a try, because I wanted things done at compile-time and the problem was parametrising parts of identifiers, i.e. string processing.

Doing some pattern matching, we have two actual “parameters”: the “base” of the notification name (for example ‘ModelDidChange’ or ‘ModelWasDeleted’) and the name of the method to call. So we create a Macro that takes these parameters and applies them to a template of the methods we need:


#define SUBSCRIBE_UNSUBSCRIBE_WITHMETHODNAME( name , notification, theMessage ) \
- (void)subscribeTo##name \
{\
    [self subscribeToNotification:notification usingSelector:@selector(theMessage:)];\
}\
\
- (void)unsubscribeFrom##name \
{\
    [self unsubscribeFromNotification:notification];\
}\
-(void)theMessage:(NSNotification *)notification\
{\
}\

#define SUBSCRIBE_UNSUBSCRIBE( commonName, theMessage ) SUBSCRIBE_UNSUBSCRIBE_WITHMETHODNAME( commonName##Notification, MPWAppNotification##commonName , theMessage )

The first macro has three arguments, a version of the notification name suitable for adding to the method names, the full notification name and the message name. The ‘##’ is the token pasting operator, which allows us to create new lexical tokens out of existing ones. We “need” it in this case because we have commonality we want to express that is on a sub-token, sub-identifier level.
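To see the two macros in action, SUBSCRIBE_UNSUBSCRIBE( ModelDidChange, modelDidChange ) expands to the following (reformatted here for readability), which is exactly the boilerplate from above, plus an empty default implementation of the handler:

- (void)subscribeToModelDidChangeNotification
{
    [self subscribeToNotification:MPWAppNotificationModelDidChange usingSelector:@selector(modelDidChange:)];
}

- (void)unsubscribeFromModelDidChangeNotification
{
    [self unsubscribeFromNotification:MPWAppNotificationModelDidChange];
}

-(void)modelDidChange:(NSNotification *)notification
{
}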

There are a bunch of issues with this approach. First, it is pretty unreadable. You need to terminate every line with backslashes because Macro definitions can only be on a single logical line, the backslashes continue the logical line over physical lines. Dealing with compiler errors and warnings is rather tricky, because the error will be with the result of the macro expansion, which the error message will usually not contain. (You can tell the compiler to run just the pre-processor and give you the result, in case you need to debug and can’t do the macro expansion in your head…). Finally, you cannot search for the tokens that get generated by the pre-processor. So for example no command-clicking, and you have to be careful if you ever change one of the notification names.

The advantage is that you get a representation that succinctly states what you want to accomplish, without any duplicated boilerplate. No duplication also means no chance to get those duplicates wrong: the first few code samples in this post actually contain an error. (MPWAppNotificationModelWasDeleted also sends the modelDidChange: message instead of its own modelWasDeleted:) This error, almost certainly a copy-and-paste bug, was actually in the original code and had gone unnoticed for months until I tried to remove that duplication.

I don't know about you, but my eyes just glaze over when scanning large swathes of mostly duplicated code.

Anyway, with the Macros, our helper methods are now defined as follows:


SUBSCRIBE_UNSUBSCRIBE( ModelDidChange, modelDidChange )
SUBSCRIBE_UNSUBSCRIBE( ModelWasDeleted, modelWasDeleted )

SUBSCRIBE_UNSUBSCRIBE( ActivityCountDidChange, activityCountDidChange )
SUBSCRIBE_UNSUBSCRIBE( ObjectIdDidChange, objectIdDidChange )
SUBSCRIBE_UNSUBSCRIBE( UserDidLogin, userDidLogin )

Note that previously we showed just two notifications, now we are showing all five, which would have been quite unwieldy before.

Refactor IV

As is typical when you remove duplication, you notice more duplication, because stuff is closer together: the two parameters are almost identical, except for capitalisation. This doesn't have to be the case, but it is a good convention to have. Alas, the pre-processor can’t change the capitalisation of strings so we are stuck.

To recap, this was a process of detecting duplication, removing that duplication using available mechanisms and then detecting more duplication, over several iterations. What then usually happens is that you either manage to remove all the duplication, or you notice that you cannot reduce duplication any further. Not being able to remove duplication typically means that you have reached limitations of your language, with metaprogramming facilities allowing you to push those limitations at least a little, and sometimes quite a bit.

This particular exploration relied on a somewhat formulaic use of NSNotificationCenter, one that always matches a notification with the same message. Since messages are already late-bound, this doesn't really present much of a restriction and seems a useful simplification, and it is a simplification that I've seen used widely in practice. The other pattern is that specific classes typically observe notifications for essentially their entire lifetime, meaning that the ability to dynamically turn notifications on and off is often not needed.

If we imagine language support for notifications (the implicit invocation architectural style), we would probably like to be able to declare notifications, declare that a class listens to a specific notification, and check conformance. This would make usage even more convenient than the macros we refactored to, while at the same time addressing the problems of the macro solution.

And of course that is what Notification Protocols became: a slight misappropriation of Objective-C and Swift Protocols to get something that is extremely close to actual language support. I can now actually declare my notifications as almost a programming-language thing (at least better than a string), and also specify the relationship between a message and that notification, which otherwise is purely lost in convention:


@protocol ModelDidChange <MPWNotificationProtocol>

-(void)modelDidChange:(NSNotification*)notification;

@end

I can also easily and declaratively specify that a particular class listens to a notification:
@interface NotifiedView:NSView <ModelDidChange>

@end

Not only does this remove the problems with the Macro approach, unreadability and untraceability, it is actually quite a bit better than the approach without Macros, all while being at least as compact. Last but not least, notification sending is not just compact, but also obvious and at least somewhat compiler-checked:
[@protocol(ModelDidChange) notify];

This is very, very close to native language support, due to some lucky coincidences and the wise decision to make Protocols first class objects in Objective-C. It also does not match the flexibility of the NSNotificationCenter APIs, but I doubt whether that additional flexibility is actually used/useful.

Had I not experimented with removing duplication, and iterated on removing duplication, I never would have come to the point where notification protocols became an obvious solution. And now that I have notification protocols, I also have a good idea what actual language support should look like, because I've been using something pretty close.

So for me, plain old refactoring, different kinds of metaprogramming and language support all live on a continuum of improving expressiveness, and all are, or at least should be part of our feedback loops. How can we improve our language so we don't need metaprogramming for common use cases? How can we improve our metaprogramming facilities so they are sufficient and we don't feel a need for replacing them with actual language support? How can we do both, turn things that currently require metaprogramming look more like plain base-level programming while making it integrate better than metaprogramming solutions?

That, Detective, is the right question. Program terminated

Wednesday, October 31, 2018

Even Easier Notification Sending with Notification Protocols in Objective-C

Having revisited Notification Protocols in order to ensure they work with Swift, I had another look at how to make them even more natural in Objective-C.

Fortunately, I found a way:


[@protocol(ModelDidChange) notify:@"Payload"];
This will send the ModelDidChange notification with the object parameter @"Payload". Note that the compiler will check that there is a Protocol called ModelDidChange and the runtime will verify that it is an actual notification protocol, raising an exception if that is not true. You can also omit the parameter:
[@protocol(ModelDidChange) notify];
In both cases, the amount of boilerplate or semantic noise is minimised, whereas the core of what is happening is put at the forefront. Compare this to the traditional way:
[[NSNotificationCenter defaultCenter] postNotificationName:@"ModelDidChange" object:@"Payload"]
Here, almost the entire expression is noise, with the relevant pieces being buried near the end of the expression as parameters. No checking is (or can be) performed to ensure that the argument string actually refers to a notification name.

Of course, you can replace that literal string with a string constant, but that constant is also not checked, and since it lives in a global namespace with all other identifiers, it needs quite a bit of prefixing to disambiguate:


[[NSNotificationCenter defaultCenter] postNotificationName:WLCoreNotificationModelDidChange object:@"Payload"]
Would it be easy to spot that this was supposed to be WLCoreNotificationModelWasDeleted?

The Macro PROTOCOL_NOTIFY() is removed, whereas the sendProtocolNotification() function is retained for Swift compatibility.

Notification Protocols from Swift

When I introduced Notification Protocols, I mentioned that they should be Usable from Swift.

This is code from a sample Swift Playground that shows how to do this. The Playground needs to have access to MPWFoundation, for example by being inside a Xcode workspace that includes it.


import Foundation
import MPWFoundation

@objc protocol ModelDidChange:MPWNotificationProtocol {
    func modelDidChange( payload:NSNotification );
}

class MyView : NSObject,ModelDidChange {
    override public init() {
        super.init()
        self.installProtocolNotifications()
    }
    func modelDidChange( payload:NSNotification ) {
        print("I was notified, self: \(self) payload: \"\(payload.object!)\"")
    }
}

let target1 = MyView()
let target2 = MyView()

sendProtocolNotification( ModelDidChange.self , "The Payload")

A brief walkthrough:
  1. We declare a ModelDidChange notification protocol.
  2. We indicate that it is a notification protocol by adopting MPWNotificationProtocol.
  3. The notification protocol has the message modelDidChange
  4. We declare that MyView conforms to ModelDidChange. This means we declaratively indicate that we receive ModelDidChange notifications, which will result in MyView instance being sent modelDidChange() messages.
  5. It also means that we have to implement modelDidChange(), which will be checked by the compiler.
  6. We need to call installProtocolNotifications() in order to activate the declared relationships.
  7. We use sendProtocolNotification() with the Protocol object as the argument and a payload.
  8. The fact that we need a protocol object instead of any old String gives us additional checking.
Enjoy!

Saturday, September 26, 2015

Very Simple Dataflow Constraints with Objective-Smalltalk

Early last year, I wrote a lengthy piece on the connection between Apple technologies such as Key Value Observing (KVO) and Bindings and general Computer Science concepts such as constraint solving, particularly dataflow constraints (aka. Spreadsheet Constraints).

I also wrote that I was working on something, and despite being somewhat distracted with becoming a father, joining a startup and being acquired, I now have working code.

The sample application contains two examples, one a classic temperature converter that I will cover later, the other an implementation of the ReactiveCocoa password validation example. The basic idea is super-simple, we want to enable a login button when the password field and the confirmation field contain the same value, expressed as follows in Objective-C:


loginButton.enabled = [password.stringValue isEqual:passwordConfirm.stringValue];

Again, this is super simple to write down, but it's not the entire story, because we want to evaluate this statement continuously as the password field and the passwordConfirm field change. The mess of callbacks required to keep those states in sync vastly exceeds the one-time evaluation, as explained in a very good article on Reactive Cocoa by NSHipster. That article uses a slightly more elaborate example; the one in the ReactiveCocoa documentation is the following:


RAC(self, createEnabled) = [RACSignal 
    combineLatest:@[ RACObserve(self, password), RACObserve(self, passwordConfirmation) ] 
    reduce:^(NSString *password, NSString *passwordConfirm) {
        return @([passwordConfirm isEqualToString:password]);
    }];

What's noticeable, apart from the macros that are necessary, is the semantic noise apparently inherent in this solution. Instead of focusing on what we want to accomplish (hidden inside the last return), the focus is on generic RAC classes such as RACSignal and methods such as combineLatest: and reduce:. I didn't really want to combine and reduce, I just wanted to keep some different states in sync, and with Objective-Smalltalk, I can do just that.

Let's first recast the original Objective-C expression into Objective-Smalltalk. Since Smalltalk is not burdened by the syntactic legacy of C, we can lose the square brackets. Because we have binary selectors (a bit like operators) and use ':=' for assignment, we can use '=' to check for equality instead of having to write out 'isEqual:'. The dots get replaced by slashes because Polymorphic Identifiers use URI syntax, and finally we use periods instead of semicolons at the end of sentences, er, statements.


loginButton/enabled := password/stringValue = passwordConfirm/stringValue.

Again, this is semantically the same statement as the original Objective-C: it assigns the right-hand side to the left-hand side. This can be viewed as a one-way constraint: the left-hand side should be the same as the right-hand side. The constraint has a fundamental flaw, though, because it is only maintained instantaneously, as the line of code is executed. After that, left-hand side and right-hand side can diverge again. What we want is for that constraint to be maintained indefinitely: whenever the right-hand side changes, the left-hand side should be updated. In Objective-Smalltalk, you can now express this by replacing the ":=" assignment operator (technically: connector) with the "|=" constraint connector:
loginButton/enabled |= password/stringValue = passwordConfirm/stringValue.

A single character change, so no syntactic and no semantic noise.

Thursday, September 11, 2014

iPhone 6 Plus and The End of Pixels

It's been a long time coming. NeXTStep in 1989 featured Display PostScript, and therefore a device-independent imaging model, meaning you did not specify graphics in pixels, but rather in physical units. The default was a variant of the printer's point at 1/72nd of an inch, which happened to be close to the typical pixel resolution of displays at the time. However, 1 point never meant 1 pixel, it meant 1/72nd of an inch, and the combination of floating point coordinates and transformation matrices meant you could use pretty much any unit you wanted. When NeXT bought Apple, it brought this imaging model with it, although with some modifications due to Adobe intransigence about licensing and the addition of anti-aliasing.

However, despite the device-independent APIs, we still have pixel-based content and "pixel-accurate" graphics. This has made less and less sense over time, with retina displays making pixel-accuracy moot (no more screen fonts!), scaled modes making it impossible, and both iOS 7 and OS X 10.10 going for a more geometric look. Still, the design community has resisted, talking about @3x pixel art etc.

No more.

The iPhone 6 Plus has a 1920x1080 panel, but the simulator renders at 3x, which for the nominal 736x414-point screen works out to 2208x1242. These two resolutions don't match, so the pixels will need to be downsampled, by a factor of roughly 0.87, to the display resolution. Whether that is accomplished by downsampling pixel art (which happens automagically with Quartz and the proper device transform set) or as a separate step that downsamples the entire rendered framebuffer doesn't matter (much). Either way, there are no more "pixel perfect" pre-rendered designs.

Device-independent graphics, here we come at last. We're only a quarter century late.

Update: "Its 401 PPI display is the first display I’ve ever used on which, no matter how close I hold it to my eyes, I can’t perceive the pixels. " - John Gruber (emphasis mine)

Wednesday, September 10, 2014

collect is what for does

I recently stumbled on Rob Napier's explanation of the map function in Swift. So I am reading along yadda yadda when suddenly I wake up and my eyes do a double take:
After years of begging for a map function in Cocoa [...]
Huh? I rub my eyes, probably just a slip up, but no, he continues:
In a generic language like Swift, “pattern” means there’s probably a function hiding in there, so let’s pull out the part that doesn’t change and call it map:
Not sure what he means by a "generic language", but here's how we would implement a map function in Objective-C.
#import <Foundation/Foundation.h>

typedef id (*mappingfun)( id arg );

static id makeurl( NSString *domain ) {
  return [[[NSURL alloc] initWithScheme:@"http" host:domain path:@"/"] autorelease];
}

NSArray *map( NSArray *array, mappingfun theFun )
{
  NSMutableArray *result=[NSMutableArray array];
  for ( id object in array ) {
    id objresult=theFun( object );
    if ( objresult ) {
       [result addObject:objresult];
    }
  }
  return result;
}

int main(int argc, char *argv[]) {
  NSArray *source=@[ @"apple.com", @"objective.st", @"metaobject.com" ];
  NSLog(@"%@",map(source, makeurl ));
}

This is less than 7 non-empty lines of code for the mapping function, and took me less than 10 minutes to write in its entirety, including a trip to the kitchen for an extra cookie, recompiling 3 times and looking at the qsort(3) manpage because I just can't remember C function pointer declaration syntax (though it took me less time than usual, maybe I am learning?). So really, years of "begging" for something any mildly competent coder could whip up between bathroom breaks or during a lull in their twitter feed?

Or maybe we want a version with blocks instead? Another 2 minutes, because I am a klutz:


#import <Foundation/Foundation.h>

typedef id (^mappingblock)( id arg );

NSArray *map( NSArray *array, mappingblock theBlock )
{
  NSMutableArray *result=[NSMutableArray array];
  for ( id object in array ) {
    id objresult=theBlock( object );
    if ( objresult ) {
       [result addObject:objresult];
    }
  }
  return result;
}

int main(int argc, char *argv[]) {
  NSArray *source=@[ @"apple.com", @"objective.st", @"metaobject.com" ];
  NSLog(@"%@",map(source, ^id ( id domain ) {
    return [[[NSURL alloc] initWithScheme:@"http" host:domain path:@"/"] autorelease];
        }));
}

Of course, we've also had collect for a good decade or so, which turns the client code into the following, much more readable version (Objective-Smalltalk syntax):
NSURL collect URLWithScheme:'http' host:#('objective.st' 'metaobject.com') each path:'/'.
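
For those who haven't seen Higher Order Messaging before: the receiver-side variant of collect is itself only a screenful of NSProxy code. The sketch below is illustrative only; the MPWFoundation collect, with its argument-side each markers as used above, is more general:

#import <Foundation/Foundation.h>

// Sketch of receiver-side HOM: [array collect] returns a trampoline that
// replays whatever message it receives on every element of the array.
// (Empty arrays and non-object return types are not handled in this sketch.)
@interface CollectTrampoline : NSProxy
{
  NSArray *_target;
}
- (id)initWithTarget:(NSArray *)target;
@end

@implementation CollectTrampoline

- (id)initWithTarget:(NSArray *)target
{
  _target = [target retain];
  return self;
}

- (void)dealloc
{
  [_target release];
  [super dealloc];
}

- (NSMethodSignature *)methodSignatureForSelector:(SEL)sel
{
  return [[_target objectAtIndex:0] methodSignatureForSelector:sel];
}

// Replay the captured message on each element, collect the non-nil
// results, and return the array to the original sender.
- (void)forwardInvocation:(NSInvocation *)invocation
{
  NSMutableArray *results = [NSMutableArray array];
  for ( id obj in _target ) {
    [invocation invokeWithTarget:obj];
    id result = nil;
    [invocation getReturnValue:&result];
    if ( result ) {
      [results addObject:result];
    }
  }
  [invocation setReturnValue:&results];
}

@end

@implementation NSArray (CollectSketch)
- (id)collect
{
  return [[[CollectTrampoline alloc] initWithTarget:self] autorelease];
}
@end

// Usage:  NSArray *hosts = [[urls collect] host];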

As I wrote in my previous post, we seem to be regressing to a mindset about computer languages that harkens back to the days of BASIC, where everything was baked into the language, and things not baked into the language or provided by the language vendor do not exist.

Rob goes on to write "The mapping could be performed in parallel [..]", for example like parcollect? And then "This is the heart of good functional programming." No. This is the heart of good programming.
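
For what it's worth, parallelising the blocks version of map above takes only a few more minutes with libdispatch. A sketch, not the actual parcollect implementation:

#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

typedef id (^mappingblock)( id arg );

// Sketch of a parallel map: apply the block to all elements concurrently,
// then gather the non-nil results in their original order.
NSArray *parmap( NSArray *array, mappingblock theBlock )
{
  NSUInteger count=[array count];
  id *partials=calloc( count, sizeof(id) );
  dispatch_apply( count, dispatch_get_global_queue( DISPATCH_QUEUE_PRIORITY_DEFAULT, 0 ), ^(size_t i){
    partials[i]=theBlock( array[i] );
  });
  NSMutableArray *result=[NSMutableArray arrayWithCapacity:count];
  for ( NSUInteger i=0; i<count; i++ ) {
    if ( partials[i] ) {
      [result addObject:partials[i]];
    }
  }
  free( partials );
  return result;
}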

Having processed that shock, I fly over a discussion of filter (select) and stumble over the next whopper:

It’s all about the types

Again...huh?? Our map implementation certainly didn't need (static) types for the list, and all the Smalltalkers and LISPers that have been gleefully using higher order techniques for 40-50 years without static types must also not have gotten the memo.

We [..] started to think about the power of functions to separate intent from implementation. [..] Soon we’ll explore some more of these transforming functions and see what they can do for us. Until then, stop mutating. Evolve.
All modern programming separates intent from implementation. Functions are a fairly limited and primitive way of doing so. Limiting power in this fashion can be useful, but please don't confuse the power of higher order programming with the limitations of functional programming, they are quite distinct.

Tuesday, September 9, 2014

No Virginia, Swift is not 10x faster than Objective-C

About a month ago, Jesse Squires published a post titled Apples to Apples, documenting benchmark results that he claims show Swift now with a roughly 10x performance advantage over Objective-C. Although completely bogus, the post was retweeted by Chris Lattner (who should know better, and was supposedly mostly interested in highlighting the improvements in the Swift optimizer, rather than the bogus comparison) and has now been referenced a number of times as background knowledge as to the state of Swift. More importantly, though the actual mistake Jesse makes is pretty basic and not that interesting, it does point to some deeper misunderstandings about performance and language that I at least do find interesting.

So what's the mistake? Ironically, given the post's title, it is that he is comparing apples to oranges, so to speak. The following table, which shows the time in milliseconds to sort an array of 10000 numbers 10 times, illustrates the problem:

               NSNumber    native integer
Objective-C    6.04 (*)    0.97
Swift          11.92       0.8 (*)
Jesse compared the two versions marked (*): Swift with native integers against Objective-C with NSNumber object wrappers. All times are for binaries with optimization enabled; the machine was a 13" MBR with a 2.9 GHz Intel Core i7 and 8GB of RAM. The integer sort was done using a C integer array and the system qsort() function. When you compare apples to apples, Objective-C has a roughly 2x edge with NSNumbers and is around 18% slower for native integers (0.97 vs. 0.8 ms), at least when using qsort().

Why the 18% disadvantage? The qsort() function is made generically applicable to different types of arrays using a function pointer parameter for the comparison function, which is itself parametrized with pointers to the elements to be compared. This means an overhead of one function call and two pointer dereferences per comparison. That overhead overwhelms the actual comparison operation, which is a single machine instruction on most processors.
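
The comparison function passed to qsort() presumably looked something like the following reconstruction (not Jesse's actual code):

#include <stdlib.h>

// Called through a function pointer for every comparison; both element
// pointers must be dereferenced for what is a single-instruction compare.
static int compare_ints( const void *a, const void *b )
{
  int x = *(const int *)a;
  int y = *(const int *)b;
  return (x > y) - (x < y);
}

// ... elsewhere:
// qsort( numbers, count, sizeof(int), compare_ints );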

Swift, on the other hand, appears to produce a version of the sort function that is specialized to the integer type, with the comparison function inlined into the generated function, so there is no function call or pointer dereference overhead. That is very clever and a Good Thing™ for performance. Sort of. The drawback is that this breaks separate compilation, because the functions actually have to be combined during the compile/link process every time the sort is used (I assume there is caching going on so we only get one instance per type combination).

Apart from making the compiler/linker slower, possibly significantly so (like C++ headers, though I presume they use LLVM bitcode to optimize the process), it also likely bloats the executable, causing cache and memory pressure. So it's a tradeoff, as usual, and while I think having the ability to specialize at compile-time is good, not being able to control it is not.

Objective-C doesn't have this ability to automagically adapt a function or method to parameters; if you want inlining, the relationship has to be known at the point of definition, not at the point of use. However, if the benefit of inlining is only 21% for the most primitive type, a machine integer, then it is clear that the set of types for which compile-time specialization is beneficial at all is small.

Cocoa of course already provides specialized collection classes for the byte and unichar types, NSData and NSString respectively. I never quite understood why this wasn't extended to the other primitive types, particularly integer and float/double. On the other hand, the omission never bothered me much, I just implemented those classes myself in MPWFoundation. MPWRealArray even has support for DisplayPostscript binary object sequences, it's that old!

Both MPWRealArray and the corresponding MPWIntArray classes are small and fairly trivial to implement, and once I have them, using a specialized integer or real array is at least as convenient as using an NSArray, just a lot faster. They could also be quite a bit smaller than they are, sharing code either via subclassing or poor-man's generic programming via include files. Once I have a nice OO interface, I can swap out the implementation for something really quick like a dual-pivot integer sort I found in Java-land and adapted to C. (It is surprising just how similar they are at that level). With that sort, the test time drops to 0.56 ms, so 42% faster than the Swift version and almost twice as fast as the system qsort() function.
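
The include-file variant of poor-man's generics is the old C trick of parameterising a source file via the preprocessor. A sketch of the idea, not the actual MPWFoundation source:

/* sort_body.h: "generic" over ELEMENT, textually instantiated per type */
static void NAME(insertion_sort)( ELEMENT *a, size_t n )
{
  for ( size_t i = 1; i < n; i++ ) {
    ELEMENT key = a[i];
    size_t j = i;
    for ( ; j > 0 && a[j-1] > key; j-- ) {
      a[j] = a[j-1];
    }
    a[j] = key;
  }
}

/* client file: one instantiation per element type */
#define ELEMENT int
#define NAME(x) x##_int
#include "sort_body.h"
#undef ELEMENT
#undef NAME

#define ELEMENT double
#define NAME(x) x##_double
#include "sort_body.h"
#undef ELEMENT
#undef NAME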

So the takeaway is that if you are using NSNumber objects in performance-sensitive code: stop. This is always a mistake. The native number types for Objective-C are int, float, double and friends, not NSNumber. After all, how do you perform arithmetic? Either directly on a primitive or by unboxing the NSNumber and then performing arithmetic on the primitive and then reboxing. Use primitive scalar types as much as possible where they make sense.
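
To make the boxing overhead concrete, here is the same addition both ways (illustrative):

NSNumber *a = @3, *b = @4;
// Boxed: two message sends to unbox, an allocation to re-box --
// all for one machine-level add.
NSNumber *boxedSum = @([a intValue] + [b intValue]);

int x = 3, y = 4;
// Primitive: a single machine instruction.
int sum = x + y;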

A second takeaway is that the question "which language is faster" doesn't really make sense, a more relevant and interesting question is "does this language make it hard/possible/easy to write fast code". Objective-C lets you write really fast code, if you want to, because it has the low-level chops and an understandable performance model. Swift so far can achieve reasonable performance at times, ludicrously bad at other times (especially with the optimizer turned off, which hardly fazes Objective-C), with as far as I can tell fairly little predictability or control. Having 10% faster (or slower) performance for code I don't particularly care about is not worth nearly as much as the knowledge that I can get the 1-5% of code that I do care about in shape no matter what. Swift is definitely not there yet, and given the direction it is taking I am not sure whether it will allow that kind of control, at least in comprehensible ways.

A third point is something more general about language. The whole argument that NSNumber and NSArray are "built in" somehow and int is not displays a lack of understanding of Objective-C that to me seems staggering. Even more so, the whole idea that you must only use what comes provided with Cocoa and are not allowed to build your own flies in the face of modern language design, throwing us back to the times of BASIC (Arthur Luehrmann, in the comments):

I had added graphics primitives to Dartmouth Basic around 1976 and developed an X-Y pen-plotter to carry out graphics commands mixed in with the text being sent to Teletype terminals.
The idea is that a language is a bundle of features; or, to put it linguistically, a language is a list of words to be used as-is.

Both C and Pascal introduced me to a new notion: that languages are not lists of words, but means of constructing your own words. For example, C did/does not have I/O as a language feature. I/O was just another set of functions placed in a library that you included just like any of your own functions/libraries. And there were two sets of them, the stdio package and the raw Unix I/O.

At around the same time I was introduced to both top-down and bottom-up programming. Both assume there is a recursive de-composition of the problem at hand (assuming the problem is sufficiently complex to warrant it).

In bottom-up programming, you build up the vocabulary (the procedures and functions) that are necessary to succinctly describe your top-level problem, and then you describe your program in terms of that vocabulary you created. In top-down programming, you start at the other end and write your top-level program in terms of the vocabulary you wish you had to optimally describe the problem. Then you fill in the blanks.

In both, you define your own language to fit the problem, then you solve the problem using the language you defined. You would not add plotting commands to the language, you would either add plotting commands as a library or, if that were not possible, a way of adding plotting commands as a library. You would not look at whether plotting comes with the "standard library" or not. To quote Guy Steele in Growing a Language:

This is the nub of what I want to say. A language design can no longer be a thing. It must be a pattern—a pattern for growth—a pattern for growing the pattern for defining the patterns that programmers can use for their real work and their main goal.
So build your own libraries, your own abstractions. It's easy, fun and useful. It's the heart of Domain Driven Design, probably the most productive and effective software construction technique we as an industry have come up with to date. See what abstractions you can build easily and which ones are hard. Analyze the latter and you have started on the road to modern language design.

CORRECTION (June 4th 2015): I misattributed the Dartmouth BASIC quote to Cathy Doser, when the comment line on the Macintosh folklore entry clearly said Arthur Luehrmann. (Cathy's comment was a bit earlier).

Sunday, May 4, 2014

Satisfying the hunger for type safety?

Tom Adriaenssen riffs on the id subset in show me some id:
Let me explain: even though you might assume that all those objects are actually going to be DataPoint objects, there’s no actual guarantee that they will actually be DataPoint objects at runtime. Casting them only satisfies your hunger for type safety, but nothing else really.
More importantly, it only seems to satisfy your hunger for type safety, it doesn't actually provide any. It's less nutritious than sugar water in that respect, not even calories, never mind the protein, fiber, vitamins and other goodness. More like a pacifier, really, or the product of a cargo cult.

Tuesday, October 15, 2013

Should you use CoreData?

The answer, of course, is "it depends".

Now that we have that out of the way, I want to have a look at Drew Crawford's You should use Core Data, which manages to come up with a less nuanced answer in its 2943 words. It's an older article (2012), but it recently came to my attention via Drew McCormack (@drewmccormack): "Great post", he wrote, and after reading the article I not only disagreed, but found that Twitter wasn't really adequate for writing up all the things wrong with it.

Where to start? Maybe at the beginning, right in the first paragraph the following is called out as a "myth":

Among them, Core Data is designed for something–not really sure what–but whatever it is, it’s a lot more complicated than what I need to do in my project.
First, there is a categorical mistake here: unless Drew knows the exact requirements and engineering trade-offs of every iOS application, he can't know whether this is true or false, fact or myth.

The second mistake is that the basic statement "CoreData was designed for something else" is actually true. CoreData's design dates back to NeXT's Enterprise Object Framework or EOF for short, and EOF was designed as an ORM for talking to corporate relational database servers, with a variety of alternate back-ends for non-relational DBs (including the way cool 3270 adapter!).

Obviously the implementation is different and the design has diverged by now, but that is the basic design, and yes, that does do something that is more complicated than what some (many?) developers need.

Details.

Next sentence, next problem:

I just want to save some entities to disk.
I may well be reading too much into this, but using the word entities already bakes so many assumptions into the problem statement. When I actually do want to save entities, databases in general and CoreData in particular are somewhat higher on my list of technologies, but quite often I don't start with ERM, and therefore just have objects, XML or other data. While these could be ER-modeled with varying degrees of difficulty/success, they certainly don't have to be, and it's often not the best choice.

In the enterprise system described in REST: Advanced Research Topics and Practical Applications, we removed the database from the rewrite, because the variety of the data meant converting the DB into a key-value store one way or another, or having a schema that's about an A0 page in small print (a standards body had developed such a schema and I think we even got a poster). We instead ended up converging on an EventPoster architecture that kept the original XML feed files around and parsed them into objects as necessary. No ERM here.

The next couple of paragraphs go off on an ad-hominem straw-man tangent, making WAG assumptions about the provenance of iOS developers and more WAGs about why that (assumed) provenance causes said developers to have these misconceptions. Those "misconceptions" actually turn out to be true. Although largely irrelevant, the tangent does contain some actual misinformation. For example, the fact that categories don't get linked with static libraries is not an LLVM bug, it's a consequence of the combined semantics of static libraries and Objective-C categories, irrespective of compiler/linker versions.

Details.

Joel

Then there's another tangent to Joel Spolsky's article on Things You Should Never Do (the link in Drew's article is dead), such as rewriting legacy code from scratch, which Joel categorically describes as the "single worst strategic mistake" that any software company can make. In the words of the great Hans Bethe: "Ach, that isn't even wrong!":

  1. Just because Joel wrote something makes it right or applicable why?
  2. While Joel makes some good points and is right to counter a tendency to not want to deal with existing code, his claim is most certainly wrong in its absoluteness, both empirically (the system referenced above was a rewrite with tremendous benefits, then there's OS X vs. trying to keep fixing Classic Mac OS indefinitely etc.) and logically: it only holds true if all your accumulated complexity is of the essential kind, and none of it of the accidental kind. That idea seems ludicrous to me; virtually all software is rushed, has shifting requirements that are only understood after the fact, has historical limitations (such as having to run in 128KB of RAM on a 7MHz CPU with no MMU) etc.

    Sometimes a rewrite is warranted, though you should obviously be wary and not undertake such a project lightly.

  3. Of course the biggest non-sequitur in the whole tangent is "How on earth does the problem of reading source code and rewriting legacy systems apply to this situation?". Apart from "not at all"? We don't have access to CoreData source code and there is no question about us rewriting it. Well, actually, I used to have such access, but even though my remit would have allowed some rewrites, I don't think I could have done a better job for the technology constraints chosen. The question is whether to use it or not, including whether the technology constraints are appropriate.
  4. And no, not every persistence solution ends up as effectively a rewrite of CoreData.

Details

After that ill-conceived tangent, the article goes right back to the next unsubstantiated ad-hominem:
Here are some things you probably haven’t thought about when architecting your so-called data stack:
Apart from the attitude somewhat unbefitting someone who gets so much wrong, well, Nein, but let's look at some of these in detail:
  • Handling future changes to the data schema: this just isn't hard if you're not using a relational database. In fact high variability of the schema was one of the reasons for ditching the DB in you know...
  • Bi-directional synchronization with servers...hah hah...!
  • Bi-directional synchronization with peers...see above
  • Undo support: gosh, I wish there were some sort of facility that would manage undo support on my model objects, if I had my way, Apple would add this and call it NSUndoManager. Without such a useful class, I'll just have to do complicated things like renaming old versions of my data file or storing deltas.
  • Multithreading. Really? Multithreading?? If you think CoreData makes multithreading easier rather than harder, I have both a nice bridge and some oceanfront property in Nevada to sell you.
  • Device/time constraints: the performance ca(na)rd. CoreData is slow and memory intensive. It makes up for this by adding the ability and the requirement for the developer to continuously tune the working set, if that is an option. If continuously minimizing/tuning the working set is not an option (i.e. you have some large datasets), you're hosed.

Details

Magic

Then we're back to the familiar ad-hominems/straw-men and non-sequitur analogies for a couple of paragraphs, nothing to see there. The whole analogy to Cocoa is useless: yes, Cocoa is good. How does this make CoreData good (it may or may not be, there simply is no connection) or appropriate for my tasks? Also, Cocoa in particular and Apple code in general is not "magic". I've been there, I've seen it and I fixed some of it. A lot of it is good, some excellent, but some not so much (sleep(5) anyone?).

The section entitled But, there are times not to use CoreData, right? could have been the saving grace, but alas the author blows it again. Having a "mark all as read" option in a NewsReader application is a "strange" "corner case"? Right. Let me turn this section around: there is a very narrow range of application areas where CoreData is or might be appropriate. Mostly read-only data, highly regular (table-like) structure, no need for bulk operations ever, preferably no trees either and no binary data. Fairly loose performance requirements.

What Apple says

The section on "what Apple says" is also gold, though I have to give credit to the Apple doc writers for managing to strongly suggest things without actually explicitly claiming them, strongly enough to fool this particular writer:
Apple’s high-level APIs can be much faster than ‘optimized’ code at lower levels.
This is obviously true, but completely meaningless. My tricycle can be faster than a Ferrari, for example if the Ferrari is out of gas, has a flat tire, or is just parked. When we actually measured the performance of applications adopting CoreData at Apple, we invariably got a significant performance regression. Then a lot of effort would be expended by all the teams involved in order to fix the regression, optimization effort that hadn't been expended on the original application, usually making up some of the shortfall.

I find the code reduction claim also specious, at least when stated as an unquestionable fact like this. In my experience, code size was reduced when removing CoreData and its predecessor EOF. Had we had exactly the requirements that CoreData/EOF were designed for (talking to existing relational enterprise databases), the result would almost certainly have been different, but those were not our requirements, and I doubt that most iOS apps have those requirements (and in fact, CoreData didn't even support talking to external SQL databases, at all, a puzzling development for all the EOF veterans).

For managing small amounts of data inside an iOS application, CoreData is almost always overkill, because it effectively simulates an entire client/server enterprise application stack within your phone or desktop. The performance and complexity costs to pay for that overkill are substantial.

So Should You Use Core Data?

As I wrote: it depends.

The article in question at least adds no useful information to answering that question, only inappropriate analogies, repeated claims without any supporting evidence and lots of ad-hominems that are probably meant to be witty. If you believe its last section, the only reason developers don't use CoreData is because they are naive, lazy or ignorant.

This is evidently not true and ignores even the most basic concepts that one would apply to answering the question, such as, say, fitness for purpose. CoreData may well be the best implementation of an in-process-ORM simulating a client-server enterprise app there is, and there is good evidence that this is in fact true (certainly the people on it are top notch and some of the code is amazing). However, all that doesn't help if that's not what you need.

Considering the gleefully paraded ignorance of not just alternatives but various other programming aspects, I'd take the unconditional advice this article gives with a pinch of salt. Mountain-sized pinch, that is.