Saturday, January 13, 2018

Meltdown patch reduces mkfile(8) throughput to less than 1/3 on macOS

In a previous post, I noted that mkfile was severely syscall limited on OSX due to too small transaction sizes, managing only around 250MB/s on the built-in SSD hardware that could achieve 2GB/s with larger buffer size.

I just retested the stock mkfile after the meltdown patch, and I/O rates are now down to a measly 61.5MB/s, measured both by wall-clock (well, time) and iostat. That's actually 1/4 the throughtput measured before, but the new timing is also with APFS enabled. Using large buffers to minimize the number of sys calls and presumably effectively eliminates the meltdown penalty shows the maximum throughput with APFS to be reduced by about 20% compared to before, to 1.6GBs. Just to put that graphically, here's disk throughput pre-APFS/Meltdown and post, using either very large block sizes (1MB) or the 512 byte block sizes used by mkfile:

So more than ever: use (really) large batches for I/O!

Monday, August 7, 2017

The Science Behind the "Google Manifesto"

The "Google Diversity Manifesto" has created quite a bit of controversy. One thing that hasn't helped is that (at least that's what I read), Gizmodo stripped the links to the scientific evidence supporting the basic premises. For me, being at least roughly aware of that research, the whole thing seemed patently unremarkable, to others apparently not so much:

Now I don't think everyone has to agree with what was written, but it would help to at least get to a common understanding. I didn't find anything in the text that said or even hinted that women were "inferior", but apart from the chance that I just missed it, it also seems that some of the ideas and concepts presented might at least "feel" that way when stripped of their context.

Ideally, we would get the original with citations and figures, but as a less-then-ideal stopgap, here are some references to the science that I found.

UPDATE: The original document has been published.

Biases

The text starts with a list of biases that the author says are prelavent in the political left and the political right. This seems to be taken directly from Jonathan Haidt.
Text, Slides

Article in the New York Times: Forget the money follow the sacredness

Possible non-bias causes of the "gender gap"

Second, after acknowledging that biases hold people back, the author goes into possible causes of a gender gap in tech that are not bias, and may even be biological in nature. There he primarily goes into the gender differences in the Big Five personality traits.

As far as I can tell, the empirical solidity of the Big Five and findings around them are largely undisputed, the criticism listed in the Wikipedia page is mostly about it not being enough, being "just empirical". One thing to note is that terms like "neuroticism" in this context appear to be different from their everyday use. So someone with a higher "neuroticism" score is not necessarily less healthy than one with a lower score. Seeing these terms without that context appears to have stoked a significant part of the anger aimed at the paper and the author.

Jordan Peterson has a video on the same topic, and here are some papers that show cross-cultural (hinting at biological causes) and straight biologically caused gender differences in these personality traits:

So yes, there are statistical gender differences. None of these say anything about an individual, just like most physical differences: yes, men are statisticially taller than women. Yet, there are a lot of women that are taller than a lot of men. Same with the psychological traits, where the overlap is great and there is also no simple goodness scale attached to the traits.

As a matter of fact, it appears to be that one reason women choose tech less than men is that women who have high math ability also tend to have high verbal ability, whereas men with high math ability tend to have just the high math ability. So women have more options, and apparently people of either gender with options tend to avoid tech: Why Brilliant Girls Tend to Favor Non-STEM Careers

Of course, the whole idea that there are no biological reasons for cognitive differences is The Blank Slate hypothesis, which was pretty thoroughly debunked by Steven Pinker in his book of the same title: The Blank Slate. What's interesting is that he documents the same sort of witch hunt we've seen here. This is not a new phenomenom.

Even more topical, there was also the Pinker/Spelke debate "...on the research on mind, brain, and behavior that may be relevant to gender disparities in the sciences, including the studies of bias, discrimination and innate and acquired difference between the sexes."

This covers a lot of the ground alluded to in the "manifesto", with Pinker providing tons and tons of interlocking evidence for there being gender-specific traits and preferences that explain the gaps we see. Almost more interestingly, he makes a very good case that the opposite thesis makes incorrect predictions.

There is lots and lots more to this. One of my favorite accessible (and funny!) intros is the Norwegian Documentary The Gender Equality Paradox. The documentary examines why in Norway, which is consistently at the top of world-wide country rankings for gender equality, professions are much more segregated than in less egalitarian countries, not less.

Empathy

I was surprised to find this, but what he writes about is exactly the thesis of Paul Bloom's recent book Against Empathy. (amazon, goodreads, New York Times).
Brilliantly argued, urgent and humane, AGAINST EMPATHY shows us that, when it comes to both major policy decisions and the choices we make in our everyday lives, limiting our impulse toward empathy is often the most compassionate choice we can make.
One small example he gives is that empathy tends to make us give much weight to an individual being harmed than many people being harmed, which is a somewhat absurd outcome when you think about it. There's a lot more, it's a fascinating read that forces you to think and question some sacred beliefs.

Microagressions

Microaggressions: Strong Claims, Inadequate Evidence:
I argue that the microaggression research program (MRP) rests on five core premises, namely, that microaggressions (1) are operationalized with sufficient clarity and consensus to afford rigorous scientific investigation; (2) are interpreted negatively by most or all minority group members; (3) reflect implicitly prejudicial and implicitly aggressive motives; (4) can be validly assessed using only respondents’ subjective reports; and (5) exert an adverse impact on recipients’ mental health. A review of the literature reveals negligible support for all five suppositions.

The Science of Microaggressions: It’s Complicated:

Subtle bigotry can be harmful, but research on the concept so far raises more questions than answers.
[..]
Still, the microaggression concept is so nebulously defined that virtually any statement or action that might offend someone could fall within its capacious borders.
[..]
The science aside, it is crucial to ask whether conceptualizing the interpersonal world in terms of microaggressions does more good than harm. The answer is “We don’t know.” Still, there are reasons for concern. Encouraging individuals to be on the lookout for subtle, in some cases barely discernible, signs of prejudice in others puts just about everyone on the defensive. Minority individuals are likely to become chronically vigilant to minor indications of potential psychological harm whereas majority individuals are likely to feel a need to walk on eggshells, closely monitoring their every word and action to avoid offending others. As a consequence, microaggression training may merely ramp up already simmering racial tensions.

Conclusion

I hope this gives a bit of background and that I haven't mis-represented the author's intent.

Saturday, March 4, 2017

So I wrote a book about performance...

...specifically iOS and macOS Performance Tuning: Cocoa, Cocoa Touch, Objective-C, and Swift. Despite or maybe because this truly being a labor of love (and immense time, the first time Addison-Wesley approached me about this was almost ten years ago), I truly expected it to remain just that: a labor of love sitting in its little niche. So imagine my surprise to see the little badge today:

Ios macos number1

Wow! Number #1 new release in Apple Programming (My understanding is that this link will change over time). And yes I checked, it wasn't the only release in Apple books for the period, there were a good dozen. In fact, iOS and macOS Programming took both the #1 and the #4 spots: Apple releases Oh, and it's also taken #13 overall in Apple programming books.

So a big THANK YOU to everyone that helped make this happen, the people I was allowed to learn from, Chuck who instigated the project and Trina who saw it through despite me.

Anyway, now that the book is wrapped up, I can publish more performance related information on this blog. Oh, the source code for the book is on GitHub.

UPDATE (March 5th, 2017): Now taking both the #1 and #2 spots in Apple new releases and the print edition is in the top 10 for Apple, with the Kindle edition in the top 20. Second update: now at #5 and #21 in overall Apple and #1/#3 in new releases. Getting more amazing all the time. I should probably take a break...

Concept Shadowing and Capture in MVC and its Successors

In a previous post, I noted that Apple's definition of MVC does not actually match the original definition, that it is more like Taligent's Model View Presenter (MVP) or what I like to to call Model Widget Controller. Matt Gallagher's look at Model-View-Controller in Cocoa makes a very similar point.

So who cares? After all, a rose by any other name is just as thorny, and the question of how the 500 pound gorilla gets to name things is also moot: however it damn well pleases.

The problem with using the same name is shadowing: since the names are the same, accessing the original definition is now hard. Again, this wouldn't really be a problem if it weren't for the fact that the old MVC solved exactly the kinds of problems that plague the new MVC. .

However, having to say "the problems of MVC are solved by MVC" is less than ideal, because, well, you sound a bit like a lunatic. And that is a problem, because it means that MVC is not considered when trying to solve the problems of MVP/MVC. And that, in turns, is a shame because it solves them quite nicely, IMHO much nicer than a lot of the other suggested patterns.

It turns out that MVC is, just like Algol an improvement on most of its successors.

Sunday, February 12, 2017

mkfile(8) is severely syscall limited on OS X

When I got my brand-new MacBook Pro (late 2016), I was interested in testing out the phenomenal SSD performance I had been reading about, reportedly around 2GB/s. Unix geek that I am, I opened a Terminal window and tapped out the following:
mkfile 8G /tmp/testfile

To my great surprise and consternation, both the time command and an iostat 1 running in another window showed a measly 250MB/s throughput. That's not much faster than a spinning rust disk, and certainly much much slower than previous SSDs, never mind the speed demon that is supposed the MBP 2016's SSD.

So what was going on? Were the other reports false? At first, my suspicions fell on FileVault, which I was now using for the first time. It didn't make any sense, because what I had heard was that FileVault had only a minimal performance impact, whereas this was roughly a factor 8 slowdown.

Alas, turning FileVault off and waiting for the disk to be decrypted did not change anything. Still 250MB/s. Had I purchased a lemon? Could I return the machine because the SSD didn't perform as well as advertised? Except, of course, that the speed wasn't actually advertised anywhere.

It never occurred to me that the problem could be with mkfile(8). Of course, that's exactly where the problem was. If you check the mkfile source code, you will see that it writes to disk in 512 byte chunks. That doesn't actually affect the I/O path, which will coalesce those writes. However, you are spending one syscall per 512 bytes, and that turns out to be the limiting factor. Upping the buffer size increases throughput until we hit 2GB/s at a 512KB buffer size. After that throughput stays flat.

Mkfile ssd throughput X-Axis is buffer size in KB. The original 512 byte size isn't on there because it would be 0.5KB or the entire axis would need to be bytes, which would also be awkward at the larger sizes. Also note that the X-Axis is logarithmic.

Radar filed: 30482489. I did not check on other operating systems, but my guess is that the results would be similar.

UPDATE: In the HN discussion, a number of people interpreted this as saying that syscall speed is slow on OS X. AFAIK that is no longer the case, and in case not the point. The point is that the hardware has changed so dramatically that even seemingly extremely safe and uncontroversial assumptions no longer hold. Heck, 250MB/s would be perfectly fine if we still had spinning rust, but SSDs in general and particularly the scorchingly fast ones Apple has put in these laptops just changed the equation so that something that used to just not factor into the equation at all, such as syscall performance, can now be the deciding factor.

In the I/O tests I did for my iOS/macOS performance book (see sidebar), I found that CPU nowadays generally dominates over actual hardware I/O performance. This was a case where I just wouldn't have expected it to be the case, and it took me over day to find the culprit, because the CPU should be doing virtually nothing. But once again, assumptions just got trampled by hardware advancements.

So check your assumptions.

Saturday, May 21, 2016

What's Missing in the Discussion about Dynamic Swift

There have been some great posts recently on the need for dynamic features in Swift.

I think Gus Muller really nails it with his description of adding an "Add Layer Mask" menu item to Acorn that directly talks to the addLayerMask: implementation:

With that in place, I can add a new menu item named "Add Layer Mask" with an action of addLayerMask: and a target of nil. Then in the layer class (or subclass if it's only specific to a bitmap or shape layer) I add a single method named addLayerMask:, and I'm pretty much done after writing the code to do the real work.

[..]

What I didn't add was a giant switch statement like we did in the bad old days of classic Mac OS programming. What I didn't add was glue code in various locations setting up targets and actions at runtime that would have to be massaged and updated whenever I changed something. I'm not checking a list of selectors and casting classes around to make the right calls.

The last part is the important bit, I think: no need to add boiler-plate glue code. IMHO, glue code is what is currently killing us, productivity-wise. It is sort of like software's dark matter, largely invisible but making up 90% of the mass of our software universe.

After giving some great  examples, Brent Simmons spells it out:

In case it’s not clear: with recent and future articles I’m documenting problems that Mac and iOS developers solve using the dynamic features of the Objective-C runtime. My point isn’t to argue that Swift itself should have similar features — I think it should, but that’s not the point. The point is that these problems will need solving in a possible future world without the Objective-C runtime. These kinds of problems should be considered as that world is being designed. The answers don’t have to be the same answers as Objective-C — but they need to be good answers, better than (for instance) writing a script to generate some code.
Again, that's a really important point: it's not that us old Objective-C hands are saying "we must have dynamic features", it's that there are problems that need solving that are solved really, really well with dynamic features, and really, really poorly in languages lacking such dynamic features.

However, many of these dynamic features are definitely hacks, with various issues, some of which I talk about in The Siren Call of KVO and (Cocoa) Bindings. I think everyone would like to have these sorts of features implemented in a way that is not hacky, and that the compiler can help us with, somehow.

I am not aware of any technology or technique that makes this possible, i.e. that gives us the power of a dynamic runtime when it comes to building these types of generic architectural adapters in a statically type-safe way. And the direction that Objective-C has been going, and that Swift follows with a vengeance is to remove those dynamic features in deference to static features.

So that's a big worry. However, an even bigger worry, at least for me, is that Apple will take Brent's concern extremely literally, and provide static solutions for exactly the specific problems outlined (and maybe a few others they can think of). There are some early indicators that this will be the case, for example that you can use CoreData from Swift, but you couldn't actually build it.

And that would be missing the point of dynamic languages almost completely.

The truly amazing thing about KVC, CoreData, bindings, HOM, NSUndoManager and so on is that none of them were known when Objective-C was designed, and none of them needed specific language/compiler support to implement.

Instead, the language was and is sufficiently malleable that its users can think up these things and then go to implement them. So instead of being an afterthought, a legacy concern or a feature grudgingly and minimally implemented, the metasystem should be the primary focus of new language development. (And unsurprisingly, that's the case in Objective-Smalltalk).

To quote Alan Kay:

If you focus on just messaging - and realize that a good metasystem can late bind the various 2nd level architectures used in objects - then much of the language-, UI-, and OS based discussions on this thread are really quite moot.

[..]

I think I recall also pointing out that it is vitally important not just to have a complete metasystem, but to have fences that help guard the crossing of metaboundaries. [..assignment..] I would say that a system that allowed other metathings to be done in the ordinary course of programming (like changing what inheritance means, or what is an instance) is a bad design. (I believe that systems should allow these things, but the design should be such that there are clear fences that have to be crossed when serious extensions are made.)

[..]

I would suggest that more progress could be made if the smart and talented Squeak list would think more about what the next step in metaprogramming should be - how can we get great power, parsimony, AND security of meaning?

Note that Objective-C's metasystem does allow changing meta-things in the normal course of programming, and it is rather ad-hoc about which things it allows us to change and which it does not. It is a bad design, as far as metasystems are concerned. However, even a bad metasystem is better than no metasystem. Or to quote Guy Steele (again):
This is the nub of what I want to say. A language design can no longer be a thing. It must be a pattern—a pattern for growth—a pattern for growing the pattern for defining the patterns that programmers can use for their real work and their main goal.
The solution to bad metasystems is not to ban metasystems, it is to design better metasystems that allow these things in a more disciplined and more flexible way.

As usual, discussion here or on HN.

Wednesday, March 23, 2016

Feindenpeinlichkeit

Sam Harris asks:

I think the word should be Feindenpeinlichkeit.