Thursday, May 20, 2021

A far too simple (hardcoded) tasks backend in Objective-S

Recently there was a question as to what one should use to create a backend for an iOS/macOS app these days. I couldn't resist mentioning Objective-S, and just to check for myself whether that's feasible, I quickly jotted down the following tiny backend that returns a hardcoded list of tasks via HTTP:
#!env stsh
framework:ObjectiveHTTPD load.

class Task {
   var <bool> done.
   var title.
   -description { "Task: {this:title} done: {this:done}". }
}

taskList ← #( #Task{ #title: 'Clean my room', #done: false }, #Task{ #title: 'Check twitter feed', #done: true } ).

scheme todo {
   var taskList.
   /tasks { 
      |= { 
         this:taskList.
      }
   }
}.

todo := #todo{ #taskList: taskList }.
server := #MPWSchemeHttpServer{ #scheme: todo, #port: 8082 }.
server start.
shell runInteractiveLoop.

After loading the HTTP framework, we define a Task and a list of two example tasks. Then we define a scheme with a single path, just /tasks, which returns said task list. We then instantiate the scheme and serve it via HTTP on port 8082. Since this is a shell script and starting the server does not block, we finally start up the REPL.
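
As a quick client-side smoke test, something like the following should work against the server above (a minimal sketch using plain Foundation; it assumes only the port and path from the script):


#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // fetch the hardcoded task list from the server defined above
        NSURL *url = [NSURL URLWithString:@"http://localhost:8082/tasks"];
        NSURLSessionDataTask *fetch = [[NSURLSession sharedSession]
            dataTaskWithURL:url
          completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
              if (data) {
                  NSLog(@"%@", [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding]);
              } else {
                  NSLog(@"error: %@", error);
              }
              exit(0);
          }];
        [fetch resume];
        [[NSRunLoop currentRunLoop] run];   // keep the tool alive until the reply arrives
    }
    return 0;
}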

Details such as coding the tasks as JSON, accessing a single task and modifying tasks are left as exercises for the reader.

Sunday, May 9, 2021

Talking to pins

The last few weeks, I spent a little time getting Objective-S working well on the Raspberry Pi, specifically my Pi400. It's a really wonderful little machine, and the form factor and price remind me very much of the early personal computers.

What's missing, IMHO, is an experience akin to the early BASICs. And I really mean "akin", not a nostalgia project, but recovering a real quality that has been lost: not really "simplicity", more "straightforwardness".

Of course, one of the really cool things about the Pi is its GPIO interface that lets you do all sorts of electronics experiments, and I hear that the equivalent of "Hello World" for the Raspi is making an LED blink.


import RPi.GPIO as GPIO 
from time import sleep 
GPIO.setwarnings(False) 
 
GPIO.setmode(GPIO.BCM) 
GPIO.setup(17, GPIO.OUT) 
 
while True: 
    GPIO.output(17, True) 
    sleep(1) 
    GPIO.output(17, False) 
    sleep(1) 

Hmm. That's a lot of semantic noise for something so conceptually simple. All we want to do is set the value of a pin. As soon as I saw this, I knew it would be ideal for Polymorphic Identifiers, because a pin is the ultimate state, and PIs and their stores are made for abstracting over state.

Of course, I first had to get Objective-S running on the Pi, which meant getting GNUstep to run. While there is a wonderful set of install scripts, the one for the Raspi only worked with an ancient clang version and libobjc 1.9. Alas, that version has some bugs on the Raspi, for example with the imp_implementationWithBlock() runtime function that Objective-S uses to define methods.

Long story short, after learning about GNUstep installs and waiting for the wonderful David Chisnall to remove some obsolete 32 bit exception-version detection code from libobjc, we now have a script that installs current GNUstep with a reasonably current clang: https://github.com/plaurent/gnustep-build/tree/master/raspbian-10-clang-9.0-runtime-2.1-ARM. With that in hand, and after a few bug fixes in MPWFoundation and Objective-S, I could add a really rudimentary Store that manages talking to the pins. And this allows me to write the following in an interactive shell to drive the customary GPIO pin 17 that I connected to the LED via a resistor:


gpio:17 ← 1.

Now that's what I am talking about!

Of course, we're supposed to make it blink, not just turn it on. We could use the same looping approach as the Python script, or convenience methods like the ones provided, but the breadboard and pins make me think of wanting to connect components to do the job instead.

So let's connect some components, software architecture style! The following script creates an instance of a Blinker object (using an object literal) that emits alternating ones and zeros, and connects it to the pin.


blinker ← #Blinker{ #seconds: 1 }. 
blinker → ref:gpio:17. 
blinker run.
gpio:17 ← 0.

Once connected, the script tells the blinker to start running, which creates an NSTimer, adds it to the current run loop, and then runs the run loop. That run is interruptible, so Ctrl-C breaks and runs the cleanup code.

What about setting up the pin for output? Happens automatically when you first output to it, but I will add code so you can do it manually.
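
For contrast, here is roughly the kind of plumbing the store hides, sketched against the Linux sysfs GPIO interface (one plausible backing for such a store, and an assumption on my part, not necessarily how the Objective-S store is implemented):


#import <Foundation/Foundation.h>

// Minimal sysfs GPIO plumbing: export the pin, set its direction, write a value.
// Paths per the (legacy) Linux sysfs GPIO ABI; error handling omitted.
static void writeToPin(NSString *value, NSString *path) {
    [value writeToFile:path atomically:NO encoding:NSUTF8StringEncoding error:NULL];
}

int main(void) {
    @autoreleasepool {
        writeToPin(@"17", @"/sys/class/gpio/export");            // make pin 17 available
        writeToPin(@"out", @"/sys/class/gpio/gpio17/direction"); // configure for output
        writeToPin(@"1", @"/sys/class/gpio/gpio17/value");       // gpio:17 ← 1.
    }
    return 0;
}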

Where does the Blinker come from? That's actually an object-template based on an MPWFixedValueSource.


object Blinker : #MPWFixedValueSource{ #values: #(0,1) }

You can, of course, hook up a fixed-value source to any kind of stream.

While getting here took a lot of work, and resulted in me (re-)learning a lot about GNUstep, the result, even this intermediate one, is completely worth it and makes me very happy. This stuff really works even better than I thought it would.

Friday, November 13, 2020

M1 Memory and Performance

The M1 Macs are out now, and not only does Apple claim they're absolutely smokin', early benchmarks seem to confirm those claims. I don't find this surprising, Apple has been highly focused on performance ever since Tiger, and as far as I can tell hasn't let up since.

One maybe somewhat surprising aspect of the M1s is the limitation to "only" 16 Gigabytes of memory. As someone who bought a 16 Kilobyte language card to run the Merlin 6502 assembler on his Apple ][+ and expanded his NeXT cube, which isn't that different from a modern Mac, to a whopping 16 Megabytes, this doesn't actually seem that much of a limitation, but it did cause a bit of consternation.

I have a bit of a theory as to how this "limitation" might tie in to how Apple's outside-the-box approach to memory and performance has contributed to the remarkable achievement that is the M1.

The M1 is apparently a multi-die package that contains both the actual processor die and the DRAM. As such, it has a very high-speed interface between the DRAM and the processors. This high-speed interface, in addition to the absolutely humongous caches, is key to keeping the various functional units fed. Memory bandwidth and latency are probably the determining factors for many of today's workloads, with a single access to main memory taking easily hundreds of clock cycles and the CPU capable of doing a good number of operations in each of these clock cycles. As Andrew Black wrote: "[..] computation is essentially free, because it happens 'in the cracks' between data fetch and data store; ..".
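
To put rough, purely illustrative numbers on that: at 3.2GHz, a 100ns round trip to DRAM costs 320 clock cycles, and a core that sustains several instructions per cycle could have executed on the order of a thousand instructions in the time of that single miss.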

The tradeoff is that you can only fit so much DRAM in that package for now, but if it fits, it's going to be super fast.

So how do we make sure it all fits? Well, where Apple might have been "focused" on performance for the last 15 years or so, they have been completely anal about memory consumption. When I was there, we were fixing 32 byte memory leaks. Leaks that happened once. So not an ongoing consumption of 32 bytes again and again, but a one-time leak of 32 bytes.

That dedication verging on the obsessive is one of the reasons iPhones have been besting top-of-the-line Android phones that have twice the memory. And not by a little, either.

Another reason is the iOS team's steadfast refusal to adopt tracing garbage collection as most of the rest of the industry did, and macOS's later abandonment of that technology in favor of the reference counting (RC) they've been using since NeXTStep 4.0. With increased automation of those reference counting operations and the addition of weak references, the convenience level for developers is essentially indistinguishable from a tracing GC now.

The benefit of sticking to RC is much-reduced memory consumption. It turns out that for a tracing GC to achieve performance comparable with manual allocation, it needs several times the memory (different studies find different overheads, but at least 4x is a conservative lower bound). While I haven't seen a study comparing RC, my personal experience is that the overhead is much lower, much more predictable, and can usually be driven down with little additional effort if needed.

So Apple can afford to live with more "limited" total memory because they need much less memory for the system to be fast. And so they can do a system design that imposes this limitation, but allows them to make that memory wicked fast. Nice.

Another "well-known" limitation of RC that has made it the second choice compared to tracing GC is the fact that updating those reference counts all the time is expensive, particularly in a multi-threaded environment where those updates need to be atomic. Well...

How? Problem solved. I guess it helps if you can make your own Silicon ;-)

So Apple's focus on keeping memory consumption under control, which includes but is not limited to going all-in on reference counting where pretty much the rest of the industry has adopted tracing garbage collection, is now paying off in a major way ("bigly"? Too soon?). They can get away with putting less memory in the system, which makes it possible to make that memory really fast. And that locks in an advantage that'll be hard to duplicate.

It also means that native development will have a bigger advantage compared to web technologies, because native apps benefit from the speed and don't have a problem with the memory limitations, whereas web-/electron apps will fill up that memory much more quickly.

Tuesday, September 15, 2020

Pointers are Easy, Optimization is Complicated

Just recently came across Ralf Jung's 2018 post titled Pointers are Complicated. The central thesis is that the model that most C (and assembly language) programmers have that a pointer is just an integer that happens to be a machine address is wrong, in fact the author flat out states: "Pointers are definitely not integers."

That's a strong statement. I like strong statements, because they make a discussion possible. So let's respond in kind: the claim that pointers are definitely not integers is wrong.

The example

The example the author uses to show that pointers are definitely not integers is the following:
int test() {
    auto x = new int[8];
    auto y = new int[8];
    y[0] = 42;
    int i = /* some side-effect-free computation */;
    auto x_ptr = &x[i];
    *x_ptr = 23;
    return y[0];
}

And this is the crux of the reasoning:
It would be beneficial to be able to optimize the final read of y[0] to just return 42. The justification for this optimization is that writing to x_ptr, which points into x, cannot change y.
So pointers are "hard" and "not integers" because they conflict with this optimization that "would be beneficial".

I find this fascinating: a "nice to have" optimzation is so obviously more important than a simple and obvious pointer model that it doesn't even need to be explained as a possible tradeoff, never mind justified as to why the tradeoff is resolved in favor of the nice-to-have optimization.

I prefer the simple and obvious pointer model. Vastly.

This way of placing the optimizer's concerns far ahead of the programmer's is not unique: if you check out Chris Lattner's What Every C Programmer Should Know About Undefined Behavior, you will note the frequent occurrence of the phrase "enables ... optimizations". It's pretty much the only justification ever given.

I call this now industry-dominating style of programming Compiler Optimizer Creator Oriented Programming (COCOP). It was thoroughly critiqued in What every compiler writer should know about programmers or “Optimization” based on undefined behaviour hurts performance (pdf).

Pointers as Integers

There are certainly machines where pointers are not integers, the most prominent being 8086/80286 16 bit segmented mode, where a (far) pointer consists of a segment and an offset. On 8086, the segment is simply shifted left 4 bits and added to the offset, on 80286 the segment can be located anywhere in memory or not be resident, implementing a segmented virtual memory. AFAIK, these modes are simplified variants of the iAPX 432 object memory model.
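
In real mode, that translation is simple enough to state in a line of C (illustrative):


#include <stdint.h>

/* 8086 real mode: a 16 bit segment and a 16 bit offset combine into a
   20 bit physical address by shifting the segment left 4 bits and adding.
   (A real 8086 wraps the result to 20 bits, hence the mask.) */
uint32_t physical_address(uint16_t segment, uint16_t offset) {
    return (((uint32_t)segment << 4) + offset) & 0xFFFFF;
}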

What's important to note in this context is that the iAPX 432 and its memory model failed horribly, and industry actively and happily moved away from the x86 segmented model to what is called a "flat address space", common on other architectures and finally also adopted by Intel with the 386.

The salient feature of a "flat address space" is that a pointer is an integer, and in fact this eqivalence is also rather influential on CPU architecture, with address-space almost universally tied to the CPU's integer size. So although the 68K was billed as a 16 bit CPU (or 16/32), its registers were actually 32 bits, and IIRC its address ALUs were fully 32 bit, so if you wanted to do some kinds of 32 bit aritmetic, the LEA (Load Effective Address) instruction was your friend. The reason for the segmented architecturee on the 8086 was that it was a true 16 bit machine, with 16 bit registers, but Intel wanted to have a 20 bit address space.

So not only was and is there an equivalence of pointers and integers, this state of affairs was one that was actively sought and joyously received once we achieved it again. Giving it up for nice-to-have optimizations seems at best debatable, but at the very least it is something that should be discussed/debated, rather than simply assumed away.

Sunday, June 21, 2020

Beyond Faster JSON Support for iOS/macOS, Part 9: CSV and SQLite

When looking at the MPWPlistStreaming protocol that I've been using for my JSON parsing series, one thing that was probably noticeable is that it isn't particularly JSON-focused. In fact, it wasn't even initially designed for parsing, but for generating.

So could we use this for other de-serialization tasks? Glad you asked!

CSV parsing

One of the examples in my performance book involves parsing Comma Separated Values quickly, within the context of getting the time to convert a 139MB GTFS file to something usable on the phone down from 20 minutes using CoreData/SQLite to slightly less than a second using custom in-memory data structures that are also several orders of magnitude faster to query on-device.

The original project's CSV parser took around 18 seconds, which wasn't a significant part of the 20 minutes, but when the rest only took a couple of hundred milliseconds, it was time to make that part faster as well. The result, slightly generalized, is MPWDelimitedTable ( .h .m ).

The basic interface is block-based, with the block being called for every row in the table, called with a dictionary composed of the header row as keys and the contents of the row as values.


-(void)do:(void(^)(NSDictionary* theDict, int anIndex))block;
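
In use that looks something like this (the initializer here is hypothetical, shorthand for whatever the linked header actually declares):


// hypothetical setup; see MPWDelimitedTable.h for the real initializer
MPWDelimitedTable *table = [[MPWDelimitedTable alloc] initWithData:csvData delimiter:@","];
[table do:^(NSDictionary *theDict, int anIndex){
    NSLog(@"row %d: %@", anIndex, theDict);
}];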

Adapting this to the MPWPlistStreaming protocol is straightforward:
-(void)writeOnBuilder:(id <MPWPlistStreaming>)builder
{
    [builder beginArray];
    [self do:^(NSDictionary* theDict, int anIndex){
        [builder beginDictionary];
        for (NSString *key in self.headerKeys) {
            [builder writeObject:theDict[key] forKey:key];
        }
        [builder endDictionary];
    }];
    [builder endArray];
}

This is a quick-and-dirty implementation based on the existing API that is clearly sub-optimal: the API we call first constructs a dictionary from the row and the header keys and then we iterate over it. However, it works with our existing set of builders and doesn't build an in-memory representation of the entire CSV.

It will also be relatively straightforward to invert this API usage, modifying the low-level API to use MPWPlistStreaming and then creating a higher-level block- and dictionary-based API on top of that, in a way that will also work with other MPWPlistStreaming clients.
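
As a sketch of that inversion, here is a small builder that implements the MPWPlistStreaming messages used in this post and recreates the dictionary-per-row block API on top of the streaming one (ARC assumed; the real MPWFoundation builders are more general):


#import <Foundation/Foundation.h>

// Collects writeObject:forKey: calls into one dictionary per
// beginDictionary/endDictionary pair and hands each completed row to a block.
@interface DictBlockBuilder : NSObject
@property (nonatomic, copy) void (^rowBlock)(NSDictionary *row);
@end

@implementation DictBlockBuilder
{
    NSMutableDictionary *currentRow;
}
-(void)beginArray {}
-(void)endArray {}
-(void)beginDictionary
{
    currentRow = [NSMutableDictionary dictionary];
}
-(void)writeObject:(id)anObject forKey:(NSString*)aKey
{
    currentRow[aKey] = anObject;
}
-(void)endDictionary
{
    if (self.rowBlock) { self.rowBlock([currentRow copy]); }
    currentRow = nil;
}
@end

Pointing a streaming parser at such a builder gives back the convenient block interface without materializing the whole table.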

SQLite

Another tabular data format is SQL databases. On macOS/iOS, one very common database is SQLite, usually accessed via CoreData or the excellent and much more lightweight fmdb.

Having used fmdb myself before, and being quite delighted with it, my first impulse was to write an MPWPlistStreaming adapter for it, but after looking at the code a bit more closely, it seemed that it was doing quite a bit that I would not need for MPWPlistStreaming.

I also think I saw the same trade-off between a convenient but slow API based on NSDictionary and a much more complex but potentially faster API based on pulling individual typed values.

So instead I decided to try and do something ultra simple that sits directly on top of the SQLite C API, and the implementation is really quite simple and compact:


@interface MPWStreamQLite()

@property (nonatomic, strong) NSString *databasePath;
// the builder the results are streamed onto; conforms to MPWPlistStreaming
// (declared in the public header in the original, repeated here so the snippet is self-contained)
@property (nonatomic, strong) id builder;

@end

@implementation MPWStreamQLite
{
    sqlite3 *db;
}

-(instancetype)initWithPath:(NSString*)newpath
{
    self=[super init];
    self.databasePath = newpath;
    return self;
}

-(int)exec:(NSString*)sql
{
    sqlite3_stmt *res;
    int rc = sqlite3_prepare_v2(db, [sql UTF8String], -1, &res, 0);
    if ( rc != SQLITE_OK ) {
        return rc;   // bail out rather than stepping a NULL statement
    }
    @autoreleasepool {
        [self.builder beginArray];
        int step;
        int numCols=sqlite3_column_count(res);
        NSString* keys[numCols];
        for (int i=0;i < numCols; i++) {
            keys[i]=@(sqlite3_column_name(res, i));
        }
        while ( SQLITE_ROW == (step = sqlite3_step(res))) {
            @autoreleasepool {
                [self.builder beginDictionary];
                for (int i=0; i < numCols; i++) {
                    const char *text=(const char*)sqlite3_column_text(res, i);
                    if (text) {
                        [self.builder writeObject:@(text) forKey:keys[i]];
                    }
                }
                [self.builder endDictionary];
            }
        }
        sqlite3_finalize(res);
        [self.builder endArray];
    }
    return rc;
}

-(int)open
{
    return sqlite3_open([self.databasePath UTF8String], &db);
}

-(void)close
{
    if (db) {
        sqlite3_close(db);
        db=NULL;
    }
}

// used by the examples below; sqlite3_errmsg() is the obvious implementation
// for the [db error] call there (assumption: not shown in the original post)
-(const char*)error
{
    return sqlite3_errmsg(db);
}

@end


Of course, this doesn't do a lot: chiefly, it only reads; no updates, inserts or deletes. However, the code is striking in its brevity and simplicity, while at the same time being both convenient and fast, though still with some room for improvement.

In my experience, you tend to not get all three of these properties at the same time: code that is simple and convenient tends to be slow, code that is convenient and fast tends to be rather tricky and code that's simple and fast tends to be inconvenient to use.

How easy to use is it? The following code turns a table into an array of dictionaries:


#import <MPWFoundation/MPWFoundation.h>

int main(int argc, char* argv[]) {
   MPWStreamQLite *db=[[MPWStreamQLite alloc] initWithPath:@"chinook.db"];
   db.builder = [MPWPListBuilder new];
   if( [db open] == 0 ) {
       [db exec:@"select * from artists;"];
       NSLog(@"results: %@",[db.builder result]);
       [db close];
   } else {
       NSLog(@"Can't open database: %s\n", [db error]);
   }
   return(0);
}

This is pretty good, but probably roughly par for the course for returning a generic data structure such as array of dictionaries, which is not going to be particularly efficient. (One of my first clues that CoreData's predecessor EOF wasn't particularly fast was when I read that fetching raw dictionaries was an optimization, much faster than fetching objects.)

What if we want to get objects instead? Easy, just replace the MPWPListBuilder with an MPWObjectBuilder, parametrized with the class to create. Well, and define the class, but presumably you already have that if the task is to convert to objects of that class. And it could obviously also be automated.



#import <MPWFoundation/MPWFoundation.h>

@interface Artist : NSObject { }

@property (assign) long ArtistId;
@property (nonatomic,strong) NSString *Name;

@end

@implementation Artist

-(NSString*)description
{
	return [NSString stringWithFormat:@"<%@:%p id: %ld name: %@>",[self class],self,self.ArtistId,self.Name];
}

@end

int main(int argc, char* argv[]) {
   MPWStreamQLite *db=[[MPWStreamQLite alloc] initWithPath:@"chinook.db"];
   db.builder = [[MPWObjectBuilder alloc] initWithClass:[Artist class]];
   if( [db open] == 0) {
       [db exec:@"select * from artists"];
       NSLog(@"results: %@",[db.builder result]);
       [db close];
   } else {
       NSLog(@"Can't open database: %s\n", [db error]);
   }
   return(0);
}

Note that this does not generate a plist representation as an intermediate step, it goes straight from database result sets to objects. The generic intermediate "format" is the MPWPlistStreaming protocol, which is a dematerialized representation, both plist and objects are peers.

TOC

Somewhat Less Lethargic JSON Support for iOS/macOS, Part 1: The Status Quo
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 2: Analysis
Somewhat Less Lethargic JSON Support for iOS/macOS, Part 3: Dematerialization
Equally Lethargic JSON Support for iOS/macOS, Part 4: Our Keys are Small but Legion
Less Lethargic JSON Support for iOS/macOS, Part 5: Cutting out the Middleman
Somewhat Faster JSON Support for iOS/macOS, Part 6: Cutting KVC out of the Loop
Faster JSON Support for iOS/macOS, Part 7: Polishing the Parser
Faster JSON Support for iOS/macOS, Part 8: Dematerialize All the Things!
Beyond Faster JSON Support for iOS/macOS, Part 9: CSV and SQLite

Sunday, June 14, 2020

The Curious Case of Swift's Adoption of Smalltalk Keyword Syntax

I was really surprised to learn that Swift recently adopted Smalltalk keyword syntax: [Accepted] SE-0279: Multiple Trailing Closures. That is: a keyword terminated by a colon, followed by an argument and without any surrounding braces.

The mind boggles.

A little.

Of course, Swift wouldn't be Swift if this weren't a special case of a special case, specifically the case of multiple trailing closures, which is a special case of trailing closures, which are weird and special-casey enough by themselves. Below is an example:


UIView.animate(withDuration: 0.3) {
  self.view.alpha = 0
} completion: { _ in
  self.view.removeFromSuperview()
}

Note how the arguments to animate() would seem to terminate at the closing parenthesis, but that's actually not the case. The curly braces after the closing paren start a closure that is actually also an argument to the method, a so-called trailing closure. I have a little bit of sympathy for this construct, because closures inside of the parentheses look really, really awkward. (Of course, all params apart from a sole x inside f(x) look awkward, but let's not quibble. For now.)

Another thing this enables is methods that reasonably resemble control structures, which I heard is a really great idea.

The problem is that sometimes you have more than one closure argument, and then just stacking them up behind what appears to be the end of the function/method call gets really, really awkward, and you can't tell which block is which argument, because the trailing closure doesn't get a keyword.

Well, now it does. And we now have 4 different method syntaxes in one!

  1. Traditional C/Pascal/C++/Java function call syntax x.f()
  2. The already weird-ish addition of Smalltalk/Objective-C keywords inside the f(x) syntax: f(arg:x)
  3. Original trailing-closure syntax, which is just its own thing, for the first closure
  4. Smalltalk non-bracketed keyword syntax for the 2nd and subsequent closures.
That is impressive, in a scary kind of way.
Swift is a crescendo of special cases stopping just short of the general; the result is complexity in the semantics, complexity in the behaviour (i.e. bugs), and complexity in use (i.e. workarounds).
I understand that this proposal was quite controversial, with heated discussion between opponents and proponents. I understand and sympathize with both sides. On the one hand, this is markedly better than the alternatives. On the other hand, it is a special case of a special case that is difficult to justify on top of all that is already there.

Special cases beget special cases beget special cases.

Of course the answer was always there: Smalltalk keyword syntax is not just the only reasonable solution in this case, it also solves all the other cases. It is the general solution. Here's how this could look in Objective-Smalltalk (which uses curly braces for closures instead of Smalltalk-80's square brackets):


UIView animate:{ self.view.alpha ← 0. } withDuration:0.3 completion:{ self view removeFromSuperview. }.

No special cases, every argument is labeled, no syntax mush of brackets inside parentheses etc. And yes, this also handles user-defined control structures, to:do: is just a method on NSNumber:


1 to:10 do:{:i | stdout println:"I will not introduce {i} special cases willy nilly.".}.

And since keywords naturally go between their arguments, there is no need for "operators" as a very different and special syntax form. You just allow some "binary" keywords to look a little different, so instead of 2 multiply:3 you can write 2 * 3. And when you have 2 raisedTo:3 instead of pow(2,3) (with the signature: func pow(_ x: Decimal, _ y: Int) -> Decimal), do you really need to go to the trouble of defining an "operator"?

Or Swift's a as b, another special kind of syntax. How about a as:b? (Yes I know there are details, but those are ... details.) And so on and so forth.

But of course, it's too late now. When I chose Smalltalk as the base syntax for the language that has turned into Objective-Smalltalk, it wasn't just because I like it or have gotten used to it via Objective-C. Smalltalk's syntax is surprisingly flexible and general, and Smalltalk APIs look a lot like DSLs, without any of the tooling or other overheads.

And that's the frustrating part: this stuff was and is available and well-known. At least if you bother to look and/or ask. But instead, we just choose these things willy-nilly and everybody has to suffer the consequences.

UPDATE:

I guess what I am trying to get at is that if you'd thought things through just a little bit, you could have had almost the entire syntax of your language for the cost (complexity, implementation size and brittleness, cognitive load, etc.) of this one special case of a special case. And it would have been overall better to boot.

Monday, June 1, 2020

MPWTest Only Tests Frameworks

It should be noted, if it wasn't obvious, that MPWTest is opinionated software, meaning it achieves some of its smoothness by gleefully embracing constraints that some might view as potentially crippling limitations.

Maybe the biggest of these constraints, mentioned in the previous post, is that MPWTest only tests frameworks. This means that the common workflow of running the tests by building and launching the app is not supported out of the box.

The point being that this is a workflow I not just somewhat indifferently do not want, but rather emphatically and actively want to avoid. Tests that are run (only?) when launching the app are application tests. My perspective is that unit tests are an integral part of the class. This may seem a subtle distinction, but subtle differences in something you do constantly can have huge impacts. "Steter Tropfen höhlt den Stein."

Another aspect is that launching the app for testing as a permanent and fixed part of your build process seems highly annoying at best. Linker finishes, app pops up, runs for a couple of seconds, shuts down again. I don't see that as viable. For testing to be integral and pervasive, it has to be invisible when the tests succeed.

The testing pyramid is helpful here: my contention is that you want to be at the bottom of that pyramid, ideally all of the time. Realistically, you're probably not going to get there, but you should push really, really hard, even making sacrifices that appear to be unreasonable to achieve that goal.

Framework-oriented programming

Only testing frameworks begs the question as to how to test those parts of the application not in frameworks. For me the answer is simple: there isn't any production code outside of frameworks.

None. Not the UI, not the application delegate. Only the auto-generated main().

The benefits of this approach are plentiful, the effort minimal. And if you think this is an, er, eccentric position to take, the program you almost certainly use to create apps for iOS/macOS etc. takes the same eccentric position: Xcode's main executable is 45K in size and only contains a main() function and some Swift boilerplate.

If all your code is in frameworks, only testing frameworks is not a problem. That may seem like a somewhat extreme case of sour grapes, with the arbitrary limitations of a one-off unit testing framework driving major architectural decisions, but the causality is the other way around: I embraced framework-oriented programming before and independently of MPWTest.

iOS

Another issue is iOS. Running a command-line tool that dynamically loads and tests frameworks is at least tricky and may be impossible, so that approach currently does not work. My current approach is that I view on-device and on-simulator tests as higher-up in the testing hierarchy: they are more costly, less numerous and run less frequently.

The vast majority of code lives in cross-platform frameworks (see: Ports and Adapters) and is developed and tested primarily on macOS. I have found this to be much faster than using the simulator or a device in day-to-day programming, and have used this "mac-first" technique even on projects where we were using XCTest.

Although not testing on the target platform may be seen as a problem, I have found discrepancies to be between exceedingly rare and non-existent, with "normal" code trending towards the latter. One of the few exceptions in the not-quite-so-normal code that I sometimes create was the change of calling conventions on arm64, which meant that plain method pointers (IMPs) no longer worked, but had to be cast to the "correct" pointer type, only on device. Neither macOS nor the simulator would show the problem.

For that purpose, I hacked together a small iOS app that runs the tests specified in a plist in the app bundle. There is almost certainly a better way to handle this, but I haven't had the cycles or motivation to look into it.

How to approximate

So you can't or don't want to adopt MPWTest. That doesn't mean you can't get at least some of the benefits of the approach. As a start, instead of using Cmd-B in Xcode to build, just use Cmd-U instead. That's what I did when working on Wunderlist, where we used XCTest.

Second, adopt framework-oriented programming and the Ports and Adapters style as much as possible. Put all your code in frameworks, and as much as possible in cross-platform frameworks that you can test/run on macOS, and even if you are developing exclusively for iOS, create a macOS target for that framework. This makes using Cmd-U to build much less painful.

Third, adhere to a strict 1:1 mapping between production classes and test classes, and place your test classes in the same file as the class they are testing.
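
With XCTest, the colocation part of that advice looks like this (a sketch; Foo is a stand-in name):


// Foo.m -- production class and its test class in the same file
#import <Foundation/Foundation.h>
#import <XCTest/XCTest.h>

@interface Foo : NSObject
-(int)answer;
@end

@implementation Foo
-(int)answer { return 42; }
@end

@interface FooTests : XCTestCase
@end

@implementation FooTests
-(void)testAnswer
{
    XCTAssertEqual([[Foo new] answer], 42);
}
@end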

My practical experience with both JUnit and XCTest on medium-sized projects does not square with the assertion that the difference is not that big: you still have to create these additional classes, they have to communicate with the class under test (self in MPWTest), you have to track changes etc. And of course, you have to know to configure and use the framework differently from the way it was built, intended and documented. And what I've seen of OCUnit use was that the tests were not co-located with the class, but in a separate part of the project.

A final note is that the trick of interchangeably using the class as the test fixture is only really possible in a language like Objective-C where classes are first class objects. It simply wouldn't be possible in Java. This is how the class can test itself, and the tests become an integral part of the class, rather than something that's added somewhere else.