Wednesday, June 9, 2021

Glue: the Dark Matter of Software

"Software seems 'large' and 'complicated' for what it does". I keep coming back to this quote by Alan Kay.

The same feeling has been nagging me pretty much ever since I started writing software. On the one hand, there is the magic, almost literally: we write some text (spells) and the machine does things in the real world. On the other hand, it seems just way too much work to make the machine do anything more complex than:

10 PRINT "Hello"
20 GOTO 10
Almost like threading a needle with boxing gloves. And that's even if we are careful, if we avoid unnecessary complexity.

And the numbers appear to back that up: Alan Kay mentions Microsoft Office at several hundred million lines of code. From my personal experience, the Wunderlist iOS client was not quite 200 KLOC. For the latter, I can attest to the attention the team paid to not introducing unnecessary bloat, and even to actively reducing it. (For example, we cut our core code by around 30 KLOC thanks to architectural mechanisms such as Storage Combinators.) I am fairly sure I am not the only one with this experience.

So why so much code? After all, Wunderlist was just a to-do list, albeit a really nice one. I can't really say much about Office; I don't think anyone can, because 400 MLOC is just way too much code to comprehend. I think the answer is:

Glue Code.

It's the unglamorous, invisible code that connects two pieces of software and makes sure that data in location A reaches location B unscathed (from the database to the UI, from the UI to the model, from the model to the backend, and so on). And like Dark Matter, it is invisible and massive.
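
To make "glue" concrete, here is a minimal, hypothetical Python sketch (the names are invented for illustration, not taken from any of the systems mentioned) of the kind of per-field shuffling that accumulates between a database, a model, and a UI or backend:

# Hypothetical glue: the same task has to be shuffled, field by field,
# from a database row to a model object to a payload for the UI/backend.
from dataclasses import dataclass

@dataclass
class Task:
    title: str
    done: bool

def task_from_row(row: dict) -> Task:
    # database row -> model: rename and convert fields by hand
    return Task(title=row["task_title"], done=bool(row["is_done"]))

def task_to_payload(task: Task) -> dict:
    # model -> wire format: the same shuffling, in the other direction
    return {"title": task.title, "done": task.done}

print(task_to_payload(task_from_row({"task_title": "Clean my room", "is_done": 0})))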

Why do I say it is "invisible"? After all, the code is right there, isn't it? As far as I can tell, there are several related reasons:

  1. Glue code is deemed not important. It's just a couple of lines here, and another couple of lines over there ... and soon enough you're talking real MLOCs!
  2. We cannot directly express glue code. Most of our languages are what I call "DSLs for Algorithms" (see ALGOL, the ALGOrithmic Language), so glue cannot be expressed intentionally, only by describing algorithms that implement the glue.
That's why it is invisible, and also partly why it is massive: not being able to express it directly means we cannot abstract and encapsulate it, so we keep repeating slight variations of that glue. There is another reason why it's massive:
  3. Glue is quadratic. If you have N features that interact with each other, you have O(N²) pieces of glue to get them to talk to each other.

This last point was illustrated quite nicely by Kevin Greer in a video comparing Multics and Unix development, with the crucial insight being that you need to "program the perimeter, not the area".

For him, the key difference is that Unix had the pipe, and I would agree. The pipe is one-character glue: "|". This is absolutely crucial.

If you have to write even a little custom code every time you connect two modules, you will be in quadratic complexity, meaning that as your features grow your glue code will overwhelm the core functionality. And you will only notice this when it's far too late to do anything about it, because the initial growth rate will be low.
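
A rough way to see the difference, sketched in Python with invented component names (this is an illustration, not code from any of the systems discussed): with bespoke glue, every pair of components needs its own adapter, while a pipe-style connector only requires each component to speak one shared interface.

# Bespoke glue: one hand-written adapter per pair of components -> O(N^2).
def db_to_ui(rows): ...
def db_to_sync(rows): ...
def ui_to_sync(items): ...
# ... every new component adds adapters for all the existing ones.

# Pipe-style glue: each stage maps input to output, and one generic
# connector composes any of them -> the glue stays O(N).
def connect(*stages):
    def pipeline(value):
        for stage in stages:
            value = stage(value)
        return value
    return pipeline

pipeline = connect(str.strip, str.upper, lambda s: s + "!")
print(pipeline("  hello "))   # prints "HELLO!"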

So what can we do about it? I think we need to make glue first class so we can actually write down the glue itself, and not the algorithms that implement the glue. Once we have that, we can and hopefully will create better kinds of glue, ones like the Unix pipe in that they can connect components generically, without requiring custom glue per component pair.

UPDATE

There were some questions as to what to do about this. Well, I am working on it, with Objective-S, and I write fairly frequently on this blog (and occasionally submit my writing to scientific conferences). One post that is immediately relevant is Why Architecture Oriented Programming Matters.

I also don't see Unix Pipes and Filters as The Answer™, they just demonstrate the concept of minimized and constant glue. Expanding on this, and as I wrote in Why Architecture Oriented Programming Matters, I also don't see any one single connector as "the" solution. We need different kinds of connectors, and we need to write them down, to abstract over them and use them natively. Not simulate everything by calling procedures, methods or functions. See also Foxes vs. Hedgehogs.

Tuesday, June 1, 2021

Towards a ToDoMVC Backend in Objective-S

A couple of weeks ago, I showed a little http backend. Well, tiny is probably a more apt description, and also aptly describes its functionality, which is almost non-existent. All it does is define a simplistic Task class, create an array with two sample instances, and then serve that array of tasks over http. And it serves the -description of those tasks rather than anything useful like a JSON encoding.

For reference, this is the original code, hacked up in maybe 15 minutes:


#!env stsh
framework:ObjectiveHTTPD load.

class Task {
   var <bool> done.
   var title.
   -description { "Task: {this:title} done: {this:done}". }
}

taskList ← #( #Task{ #title: 'Clean my room', #done: false }, #Task{ #title: 'Check twitter feed', #done: true } ).

scheme todo {
   var taskList.
   /tasks { 
      |= { 
         this:taskList.
      }
   }
}.

todo := #todo{ #taskList: taskList }.
server := #MPWSchemeHttpServer{ #scheme: todo, #port: 8082 }.
server start.
shell runInteractiveLoop.

What would it take to make this borderline useful? First, we would probably need to encode the result as JSON, rather than serving a description. This is where Storage Combinators come in. We (now) have an MPWJSONConverterStore that's a mapping store: it passes its "REST" requests through while performing certain transformations on the data and/or the references. In this case the transformation is serializing or deserializing objects from/to JSON, depending on which way the request is going and which way the converter is pointing.

In this case, the converter is pointing "up", that is, it serializes objects read from its source to JSON and deserializes data written to its source from JSON to objects. We also tell it that it is dealing with Task objects. Once we have the converter, we connect it to our todo scheme and tell the HTTP server to talk to the JSON converter (which talks to our todo scheme):


todo := #todo{ #taskList: taskList, #store: persistence }.
json := #MPWJSONConverterStore{  #up: true, #class: class:Task }.
json → todo.
server := #MPWSchemeHttpServer{ #scheme: json, #port: 8082 }.

Second, we also want to be able to interact with individual tasks. No problem, just add a /task/:id property path to our store/scheme handler, along with GET ("|=") and PUT ("=|") handlers. I am not fully sold yet on the "|=" syntax for this, but I would like to avoid names for this sort of structural component. Maybe arrows?
	/task/:id {
		|= {
			this:taskDict at:id .
		}
		=| {
			this:taskDict at:id put:newValue.
		}
	}

In order to facilitate this, the taskList was changed to a dictionary. Once we make changes to our data, we probably also want to persist it. One easy way to do this is to store the tasks as JSON on disk. This allows us to reuse the JSON converter from above, but this time pointing "down". We connect this converter to the filesystem at the directory /tmp/tasks and to the store:
json → todo → #MPWJSONConverterStore{  #class: class:Task } → ref:file:/tmp/tasks/ asScheme.

In addition, we need to trigger saving in the PUT handler:
		=| {
			this:taskDict at:id put:newValue.
			self persist.
		}
	}
	-persist {
		source:tasks := this:taskDict allValues.
	}
}

This will (synchronously) write the entire task list on every PUT. The full code is here:
#!env stsh
framework:ObjectiveHTTPD load.

class Task {
	var id.
	var  done.
	var title.
	-description { "Task: {this:title} done: {this:done} id: {this:id}". }
	-writeOnJSONStream:aStream {
		aStream writeDictionaryLikeObject:self withContentBlock:{ :writer |
			writer writeInteger: this:id forKey:'id'.
			writer writeString: this:title forKey:'title'.
			writer writeInteger: this:done forKey:'done'.
		}.
	}
}

taskList ← #( #Task{ #id: '1', #title: 'Clean Room', #done: false }, #Task{ #id: '2', #title: 'Check Twitter', #done: true } ).

scheme todo : MPWMappingStore {
	var taskDict.
	-setTaskList:aList {
		this:taskDict := NSMutableDictionary dictionaryWithObjects: aList forKeys: aList collect id.
	}
	/tasks { 
		|= { 
			this:taskDict allValues.
		}
	}
	/task/:id {
		|= {
			this:taskDict at:id .
		}
		=| {
			this:taskDict at:id put:newValue.
			self persist.
		}
	}
	-persist {
		source:tasks := this:taskDict allValues.
	}
}.

todo := #todo{ #taskList: taskList }.
json := #MPWJSONConverterStore{  #up: true, #class: class:Task }.
json → todo → #MPWJSONConverterStore{  #class: class:Task } → ref:file:/tmp/tasks/ asScheme.
server := #MPWSchemeHttpServer{ #scheme: json, #port: 8082 }.
server start.
shell runInteractiveLoop.

The writeOnJSONStream: method is currently still needed by the serializer to encode the task object as JSON. The parser doesn't need any such support; it can figure things out by itself for simple mappings. Yes, this makes no sense, as serializing is easier than parsing, but I haven't gotten around to automating the serializing side yet.

Analysis

So there you have it, an almost functional Todo backend, in refreshingly little code, and with refreshingly little magic. What I find particularly pleasing is that this conciseness can be achieved while keeping the architecture fully visible and maintaining a hexagonal/ports-and-adapters style.

What is the architecture of this app? It says so right at the end: the server is parametrized by its scheme, and that scheme is a JSON serializer hooked up to my todo scheme handler, hooked up to another JSON serializer hooked up to the directory /tmp/tasks.

Although a Rails app contains comparably little code, this code is scattered over different classes and is only comprehensible as a plugin to Rails. All the architecture is hidden inside Rails, it is not at all visible in the code and simply cannot be divined from looking at the code. Although there are many reasons for this, one fundamental one is that Ruby is a call/return language, and Rails does its best to translate from the REST architectural style to something that is more natural in the call/return style. And it does an admirable job at it.

I do think that this example gives us a little glimpse into what I believe to be the power of Architecture Oriented Programming: the power and succinctness of frameworks, but with the simplicity, straightforwardness and reusability of more library-oriented styles.

Performance

I obviously couldn't resist benchmarking this, and to my great joy found that wrk now works on the M1. Since the interpreter isn't thread safe, I had to restrict it to a single connection and thread. My expectation was that requests/s would be in the double to low triple digits; my fear was that it would be single digits. (The reason for that fear is the writeOnJSONStream: method, which is called for every object serialized and is in interpreted Objective-S, probably one of the slowest language implementations currently in existence.) To say I was surprised is an understatement. Stunned is more like it:
wrk -c 1 -t 1 http://localhost:8082/task/1 
Running 10s test @ http://localhost:8082/task/1
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   133.62us   14.45us   0.97ms   98.52%
    Req/Sec     7.50k   311.09     7.62k    99.01%
  75326 requests in 10.10s, 12.28MB read
Requests/sec:   7458.60
Transfer/sec:      1.22MB

More than 7K requests per second! Those M1 Macs really are fast. I wonder what it will be once I remove the need for the manually written writeOnJSONStream: method.

(NOTE: previous version said >12K requests/s, which is even more insane, but was with an incorrect URL that had the server returning 404s)

Friday, May 21, 2021

Why are there no return statements in Objective-S?

My previous example raised a question: why no return statements? I am assuming this was about this part of the example:

   -description { "Task: {this:title} done: {this:done}". }


The answer is that I would like to do without return statements, if and as much as I can. We will see how much that is. In general, I am in favor of expression-orientation in programming languages. A simple example is if-statements vs. conditional expressions. In most languages today, like C, Objective-C and Swift, if is a statement. That means I have to write something like the following:
if ( condition ) {
   do something if true
} else {
   do something different if false
}

This seems obvious and is general, but very often you don't want to do arbitrary stuff, you just want to have some variable have some value in one case and a different value in another case.
int foo;
if ( condition ) {
   foo = 1;
} else {
   foo = 42;
}

In that case, it is annoying that the if is defined to be a statement and not an expression, because you can't just write the following:
int foo;
foo = if ( condition ) { 1; } else { 42; }

In addition, as hinted at in the previous examples, you can't use a statement to initialize a variable; that definitely has to be an expression. Which is why C and many derived languages have the "ternary" operator (?:), which is really just an if/else in expression form.
int foo=condition ? 1 : 42;

That solves the problem, but now you have two conditional constructs in the language. Why not have just one? LISP and most of the FP languages, as well as Smalltalk and Objective-S, have an if that returns a value.
a := condition ifTrue:{ 1. } ifFalse:{ 42. }.

So that's why expression-orientation is useful in general. What about methods? The same general idea applies. Whereas in Java, for example, a read accessor is called getX(), indicating an action that is performed ("get the value of x"), in Objective-C, Smalltalk and Objective-S it is just called x ("the value of x").

The same idea applies to dropping return statements where possible. It's not "get me the description of this object", it is "the description of this object is...". And inside the method, it's not "this statement now returns the following string as the description", but, again, "the description is...".

Describing things that are, rather than actions to perform, is at the heart of Objective-S, as discussed in Can Programmers Escape the Gentle Tyranny of Call/Return.

As Guy Steele put it:

Another weakness of procedural and functional programming is that their viewpoint assumes a process by which "inputs" are transformed into "outputs"; there is equal concern for correctness and for termination (and proofs thereof). But as we have connected millions of computers to form the Internet and the World Wide Web, as we have caused large independent sets of state to interact–I am speaking of databases, automated sensors, mobile devices, and (most of all) people–in this highly interactive, distributed setting, the procedural and functional models have failed, another reason why objects have become the dominant model. Ongoing behavior, not completion, is now of primary interest. Indeed, object-oriented programming had its origins in efforts to simulate the ongoing behavior of interacting real-world entities–thus the programming language SIMULA was born.
So wherever possible, Objective-S tries to push towards expressing things as statically as possible, pushing away from action-orientation. For example, hooking up a timed source to a pin:
#Blinker{ #seconds: 1, #active: true} → ref:gpio:17. 

instead of executing a loop:
while True: 
    GPIO.output(17, True) 
    sleep(1) 
    GPIO.output(17, False) 
    sleep(1) 

The same goes for many other relationships: instead of writing procedural code that initiates and/or maintains the relationship, with the actual relationship remaining implicit, describe the actual relationship, make that explicit, and keep the procedural code that maintains it as a hidden implementation detail.

If the return statement comes back, and it very well might, I am hoping it will be in a slightly more general form. I recall Smalltalk's "^" being described as "send back". I've already taken that and generalised it to mean "send result", using it in filter definitions, where "^" means "send a result to the next filter in the pipeline". It is needed there because filters are not limited to sending a single result; they can send zero or many.

With those more general semantics, "^" might also be used to send back results to the sender of an asynchronous message, which is obviously quite different from a "return".
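
For readers unfamiliar with that style, here is a rough analogy in Python (not Objective-S; the names are made up): a generator-based filter can pass zero, one, or many results to the next stage per input, which is exactly what a single-valued return cannot express.

# A filter that may "send" many results per input...
def words(lines):
    for line in lines:
        for word in line.split():
            yield word            # analogous to "^": send one result onward

# ...and one that may send no result at all for a given input.
def shorter_than(n, items):
    for item in items:
        if len(item) < n:
            yield item

for w in shorter_than(5, words(["the quick brown fox", "", "jumps over"])):
    print(w)                      # prints: the, fox, over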

And of course it would be useful for early returns, which are currently not possible.

What about void methods?

Objective-S does have void methods, after all its procedural part is essentially identical to Objective-C, which also has them. However, I agree with the FP folk that functions (procedures, methods) should be as (side-)effect free as possible, and void methods by definition are effectful (or no-ops).

So where do the effects go? Two places:

  1. The left hand side of the "←".

    In most current programming languages, assignment is severely crippled, and therefore not really useful for generalised effects. With Polymorphic Identifiers and Storage Combinators, there is enough expressive power and ability to abstract that we should need far fewer void methods.

  2. Connecting via "→"

    Much of the need for effectful methods in OO is for constructing and connecting objects. In Objective-S, you don't need to call methods that establish a connection as a side effect of munging on some state; you define connections between objects directly using "→".

    Well, and you define objects using object literals such as  #Blinker{ #seconds: 1, #active: true}  instead of setting instance variables procedurally.

That's the plan, anyway, although a lot of that plan is coming true at the moment. Exciting times! (And one of the reasons I haven't been blogging all that much.)

Thursday, May 20, 2021

A far too simple (hardcoded) tasks backend in Objective-S

Recently there was a question as to what one should use to create a backend for an iOS/macOS app these days. I couldn't resist mentioning Objective-S, and just to check for myself whether that's feasible, I quickly jotted down the following tiny backend that returns a hardcoded list of tasks via HTTP:
#!env stsh
framework:ObjectiveHTTPD load.

class Task {
   var <bool> done.
   var title.
   -description { "Task: {this:title} done: {this:done}". }
}

taskList ← #( #Task{ #title: 'Clean my room', #done: false }, #Task{ #title: 'Check twitter feed', #done: true } ).

scheme todo {
   var taskList.
   /tasks { 
      |= { 
         this:taskList.
      }
   }
}.

todo := #todo{ #taskList: taskList }.
server := #MPWSchemeHttpServer{ #scheme: todo, #port: 8082 }.
server start.
shell runInteractiveLoop.

After loading the HTTP framework, we define a Task and a list of two example tasks. Then we define a scheme with a single path, just /tasks, which returns said tasks list. We then instantiate the scheme and serve it via HTTP on port 8082. Since this is a shell script and starting the server does not block, we finally start up the REPL.

Details such as coding the tasks as JSON, accessing a single task and modifying tasks are left as exercises for the reader.

Sunday, May 9, 2021

Talking to pins

The last few weeks, I spent a little time getting Objective-S working well on the Raspberry Pi, specifically my Pi400. It's a really wonderful little machine, and the form factor and price remind me very much of the early personal computers.

What's missing, IMHO, is an experience akin to the early BASICs. And I really mean "akin", not a nostalgia project, but recovering a real quality that has been lost: not really "simplicity", more "straightforwardness".

Of course, one of the really cool things about the Pi is its GPIO interface that lets you do all sorts of electronics experiments, and I hear that the equivalent of "Hello World" for the Raspi is making an LED blink.


import RPi.GPIO as GPIO 
from time import sleep 
GPIO.setwarnings(False) 
 
GPIO.setmode(GPIO.BCM) 
GPIO.setup(17, GPIO.OUT) 
 
while True: 
    GPIO.output(17, True) 
    sleep(1) 
    GPIO.output(17, False) 
    sleep(1) 

Hmm. That's a lot of semantic noise for something so conceptually simple. All we want to do is set the value of a pin. As soon as I saw this, I knew it would be ideal for Polymorphic Identifiers, because a pin is the ultimate state, and PIs and their stores are made for abstracting over state.

Of course, I first had to get Objective-S running on the Pi, which meant getting GNUstep to run. While there is a wonderful set of install scripts, the one for the Raspi only worked with an ancient clang version and libobjc 1.9. Alas, that version has some bugs on the Raspi, for example with the imp_implementationWithBlock() runtime function that Objective-S uses to define methods.

Long story short, after learning about GNUstep installs and waiting for the wonderful David Chisnall to remove some obsolete 32-bit exception-version detection code from libobjc, we now have a script that installs current GNUstep with a reasonably current clang: https://github.com/plaurent/gnustep-build/tree/master/raspbian-10-clang-9.0-runtime-2.1-ARM. With that in hand, and a few bug fixes in MPWFoundation and Objective-S, I could add a really rudimentary Store that manages talking to the pins. And this allows me to write the following in an interactive shell to drive the customary GPIO pin 17 that I connected to the LED via a resistor:


gpio:17 ← 1.

Now that's what I am talking about!

Of course, we're supposed to make it blink, not just turn it on. We could use the same looping approach as the Python script, or convenience methods like the ones provided, but the breadboard and pins make me think of wanting to connect components to do the job instead.

So let's connect some components, software architecture style! The following script creates an instance of a Blinker object (using an object literal), which emits alternating ones and zeros and connects it to the pin.


blinker ← #Blinker{ #seconds: 1 }. 
blinker → ref:gpio:17. 
blinker run.
gpio:17 ← 0.

Once connected, the script tells the blinker to start running, which creates an NSTimer, adds it to the current run loop and then runs the run loop. That run is interruptible, so Ctrl-C breaks out and runs the cleanup code.

What about setting up the pin for output? Happens automatically when you first output to it, but I will add code so you can do it manually.

Where does the Blinker come from? That's actually an object-template based on an MPWFixedValueSource.


object Blinker : #MPWFixedValueSource{ #values: #(0,1) }

You can, of course, hook up a fixed-value source to any kind of stream.

While getting here took a lot of work, and resulted in me (re-)learning a lot about GNUstep, the result, even this intermediate one, is completely worth it and makes me very happy. This stuff really works even better than I thought it would.

Friday, November 13, 2020

M1 Memory and Performance

The M1 Macs are out now, and not only does Apple claim they're absolutely smokin', early benchmarks seem to confirm those claims. I don't find this surprising: Apple has been highly focused on performance ever since Tiger and, as far as I can tell, hasn't let up since.

One maybe somewhat surprising aspect of the M1s is the limitation to "only" 16 Gigabytes of memory. As someone who bought a 16 Kilobyte language card to run the Merlin 6502 assembler on his Apple ][+ and expanded his NeXT cube, which isn't that different from a modern Mac, to a whopping 16 Megabytes, this doesn't actually seem that much of a limitation, but it did cause a bit of consternation.

I have a bit of a theory as to how this "limitation" might tie in to how Apple's outside-the-box approach to memory and performance has contributed to the remarkable achievement that is the M1.

The M1 is apparently a multi-die package that contains both the actual processor die and the DRAM. As such, it has a very high-speed interface between the DRAM and the processors. This high-speed interface, in addition to the absolutely humongous caches, is key to keeping the various functional units fed. Memory bandwidth and latency are probably the determining factors for many of today's workloads, with a single access to main memory taking easily hundreds of clock cycles and the CPU capable of doing a good number of operations in each of these clock cycles. As Andrew Black wrote: "[..] computation is essentially free, because it happens 'in the cracks' between data fetch and data store; ..".

The tradeoff is that you can only fit so much DRAM in that package for now, but if it fits, it's going to be super fast.

So how do we make sure it all fits? Well, where Apple might have been "focused" on performance for the last 15 years or so, they have been completely anal about memory consumption. When I was there, we were fixing 32 byte memory leaks. Leaks that happened once. So not an ongoing consumption of 32 bytes again and again, but a one-time leak of 32 bytes.

That dedication, verging on the obsessive, is one of the reasons iPhones have been besting top-of-the-line Android phones that have twice the memory. And not by a little, either.

Another reason is the iOS team's steadfast refusal to adopt tracing garbage collection as most of the rest of the industry did, and macOS's later abandonment of that technology in favor of the reference counting (RC) they've been using since NeXTStep 4.0. With increased automation of those reference counting operations and the addition of weak references, the convenience level for developers is essentially indistinguishable from a tracing GC now.

The benefit of sticking to RC is much-reduced memory consumption. It turns out that for a tracing GC to achieve performance comparable with manual allocation, it needs several times the memory (different studies find different overheads, but at least 4x is a conservative lower bound). While I haven't seen a study comparing RC, my personal experience is that the overhead is much lower, much more predictable, and can usually be driven down with little additional effort if needed.

So Apple can afford to live with more "limited" total memory because they need much less memory for the system to be fast. And so they can do a system design that imposes this limitation, but allows them to make that memory wicked fast. Nice.

Another "well-known" limitation of RC that has made it the second choice compared to tracing GC is the fact that updating those reference counts all the time is expensive, particularly in a multi-threaded environment where those updates need to be atomic. Well...

How? By making those atomic operations cheap in hardware, apparently. Problem solved. I guess it helps if you can make your own Silicon ;-)

So Apple's focus on keeping memory consumption under control, which includes but is not limited to going all-in on reference counting where pretty much the rest of the industry has adopted tracing garbage collection, is now paying off in a major way ("bigly"? Too soon?). They can get away with putting less memory in the system, which makes it possible to make that memory really fast. And that locks in an advantage that'll be hard to duplicate.

It also means that native development will have a bigger advantage compared to web technologies, because native apps benefit from the speed and don't have a problem with the memory limitations, whereas web-/electron apps will fill up that memory much more quickly.

Tuesday, September 15, 2020

Pointers are Easy, Optimization is Complicated

I just recently came across Ralf Jung's 2018 post titled Pointers are Complicated. The central thesis is that the model most C (and assembly language) programmers have, that a pointer is just an integer that happens to be a machine address, is wrong; in fact, the author flat out states: "Pointers are definitely not integers."

That's a strong statement. I like strong statements, because they make a discussion possible. So let's respond in kind: the claim that pointers are definitely not integers is wrong.

The example

The example the author uses to show that pointers are definitely not integers is the following:
int test() {
    auto x = new int[8];
    auto y = new int[8];
    y[0] = 42;
    int i = /* some side-effect-free computation */;
    auto x_ptr = &x[i];
    *x_ptr = 23;
    return y[0];
}

And this is the crux of the reasoning:
It would be beneficial to be able to optimize the final read of y[0] to just return 42. The justification for this optimization is that writing to x_ptr, which points into x, cannot change y.
So pointers are "hard" and "not integers" because they conflict with this optimization that "would be beneficial".

I find this fascinating: a "nice to have" optimization is so obviously more important than a simple and obvious pointer model that it doesn't even need to be explained as a possible tradeoff, never mind justified as to why the tradeoff is resolved in favor of the nice-to-have optimization.

I prefer the simple and obvious pointer model. Vastly.

This way of placing the optimizer's concerns far ahead of the programmer's is not unique; if you check out Chris Lattner's What Every C Programmer Should Know About Undefined Behavior, you will note the frequent occurrence of the phrase "enables ... optimizations". It's pretty much the only justification ever given.

I call this now industry-dominating style of programming Compiler Optimizer Creator Oriented Programming (COCOP). It was thoroughly critiqued in What every compiler writer should know about programmers or “Optimization” based on undefined behaviour hurts performance (pdf).

Pointers as Integers

There are certainly machines where pointers are not integers, the most prominent being 8086/80286 16 bit segmented mode, where a (far) pointer consists of a segment and an offset. On 8086, the segment is simply shifted left 4 bits and added to the offset, on 80286 the segment can be located anywhere in memory or not be resident, implementing a segmented virtual memory. AFAIK, these modes are simplified variants of the iAPX 432 object memory model.

What's important to note in this context is that the iAPX 432 and its memory model failed horribly, and the industry actively and happily moved away from the x86 segmented model to what is called a "flat address space", common on other architectures and finally also adopted by Intel with the 386.

The salient feature of a "flat address space" is that a pointer is an integer, and in fact this equivalence has also been rather influential on CPU architecture, with the address space almost universally tied to the CPU's integer size. So although the 68K was billed as a 16 bit CPU (or 16/32), its registers were actually 32 bits, and IIRC its address ALUs were fully 32 bit, so if you wanted to do some kinds of 32 bit arithmetic, the LEA (Load Effective Address) instruction was your friend. The reason for the segmented architecture on the 8086 was that it was a true 16 bit machine, with 16 bit registers, but Intel wanted to have a 20 bit address space.
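
To make the contrast concrete, here is a small illustrative Python sketch of real-mode 8086 addressing, where a far pointer is a (segment, offset) pair of 16-bit values rather than a single integer:

# Real-mode 8086: physical address = segment * 16 + offset, wrapped to 20 bits.
def linear(segment, offset):
    return ((segment << 4) + offset) & 0xFFFFF

# Different segment:offset pairs can name the same byte ...
print(hex(linear(0xB800, 0x0010)))   # 0xb8010
print(hex(linear(0xB801, 0x0000)))   # 0xb8010
# ... whereas in a flat address space the pointer simply *is* the address.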

So not only was and is there an equivalence of pointers and integers, this state of affairs was one that was actively sought and joyously received once we achieved it again. Giving it up for nice-to-have optimizations seems at best debatable, but at the very least it is something that should be discussed/debated, rather than simply assumed away.