Monday, August 7, 2017

The Science Behind the "Google Manifesto"

The "Google Diversity Manifesto" has created quite a bit of controversy. One thing that hasn't helped is that (at least according to what I read) Gizmodo stripped the links to the scientific evidence supporting the basic premises. For me, being at least roughly aware of that research, the whole thing seemed patently unremarkable; to others, apparently not so much.

Now I don't think everyone has to agree with what was written, but it would help to at least get to a common understanding. I didn't find anything in the text that said or even hinted that women were "inferior", but apart from the chance that I just missed it, it also seems that some of the ideas and concepts presented might at least "feel" that way when stripped of their context.

Ideally, we would get the original with citations and figures, but as a less-than-ideal stopgap, here are some references to the science that I found.

UPDATE: The original document has been published.

Biases

The text starts with a list of biases that the author says are prevalent in the political left and the political right. This seems to be taken directly from Jonathan Haidt.
Text, Slides

Article in the New York Times: Forget the Money, Follow the Sacredness

Possible non-bias causes of the "gender gap"

Second, after acknowledging that biases hold people back, the author goes into possible causes of a gender gap in tech that are not bias, and may even be biological in nature. There he primarily goes into the gender differences in the Big Five personality traits.

As far as I can tell, the empirical solidity of the Big Five and the findings around them are largely undisputed; the criticism listed on the Wikipedia page is mostly about it not being enough, being "just empirical". One thing to note is that terms like "neuroticism" in this context appear to mean something different from their everyday use. So someone with a higher "neuroticism" score is not necessarily less healthy than someone with a lower score. Seeing these terms without that context appears to have stoked a significant part of the anger aimed at the paper and the author.

Jordan Peterson has a video on the same topic, and here are some papers that show cross-cultural (hinting at biological causes) and straight biologically caused gender differences in these personality traits:

So yes, there are statistical gender differences. None of these say anything about an individual, just like most physical differences: yes, men are statistically taller than women. Yet there are a lot of women who are taller than a lot of men. The same goes for the psychological traits, where the overlap is great and there is also no simple goodness scale attached to the traits.

As a matter of fact, it appears that one reason women choose tech less often than men is that women who have high math ability also tend to have high verbal ability, whereas men with high math ability tend to have just the high math ability. So women have more options, and apparently people of either gender with options tend to avoid tech: Why Brilliant Girls Tend to Favor Non-STEM Careers

Of course, the whole idea that there are no biological reasons for cognitive differences is The Blank Slate hypothesis, which was pretty thoroughly debunked by Steven Pinker in his book of the same title: The Blank Slate. What's interesting is that he documents the same sort of witch hunt we've seen here. This is not a new phenomenon.

Even more topical, there was also the Pinker/Spelke debate "...on the research on mind, brain, and behavior that may be relevant to gender disparities in the sciences, including the studies of bias, discrimination and innate and acquired difference between the sexes."

This covers a lot of the ground alluded to in the "manifesto", with Pinker providing tons and tons of interlocking evidence for there being gender-specific traits and preferences that explain the gaps we see. Almost more interestingly, he makes a very good case that the opposite thesis makes incorrect predictions.

There is lots and lots more to this. One of my favorite accessible (and funny!) intros is the Norwegian Documentary The Gender Equality Paradox. The documentary examines why in Norway, which is consistently at the top of world-wide country rankings for gender equality, professions are much more segregated than in less egalitarian countries, not less.

Empathy

I was surprised to find this, but what he writes about is exactly the thesis of Paul Bloom's recent book Against Empathy. (amazon, goodreads, New York Times).
Brilliantly argued, urgent and humane, AGAINST EMPATHY shows us that, when it comes to both major policy decisions and the choices we make in our everyday lives, limiting our impulse toward empathy is often the most compassionate choice we can make.
One small example he gives is that empathy tends to make us give much more weight to one individual being harmed than to many people being harmed, which is a somewhat absurd outcome when you think about it. There's a lot more; it's a fascinating read that forces you to think and to question some sacred beliefs.

Microaggressions

Microaggressions: Strong Claims, Inadequate Evidence:
I argue that the microaggression research program (MRP) rests on five core premises, namely, that microaggressions (1) are operationalized with sufficient clarity and consensus to afford rigorous scientific investigation; (2) are interpreted negatively by most or all minority group members; (3) reflect implicitly prejudicial and implicitly aggressive motives; (4) can be validly assessed using only respondents’ subjective reports; and (5) exert an adverse impact on recipients’ mental health. A review of the literature reveals negligible support for all five suppositions.

The Science of Microaggressions: It’s Complicated:

Subtle bigotry can be harmful, but research on the concept so far raises more questions than answers.
[..]
Still, the microaggression concept is so nebulously defined that virtually any statement or action that might offend someone could fall within its capacious borders.
[..]
The science aside, it is crucial to ask whether conceptualizing the interpersonal world in terms of microaggressions does more good than harm. The answer is “We don’t know.” Still, there are reasons for concern. Encouraging individuals to be on the lookout for subtle, in some cases barely discernible, signs of prejudice in others puts just about everyone on the defensive. Minority individuals are likely to become chronically vigilant to minor indications of potential psychological harm whereas majority individuals are likely to feel a need to walk on eggshells, closely monitoring their every word and action to avoid offending others. As a consequence, microaggression training may merely ramp up already simmering racial tensions.

Conclusion

I hope this gives a bit of background and that I haven't mis-represented the author's intent.

Saturday, March 4, 2017

So I wrote a book about performance...

...specifically iOS and macOS Performance Tuning: Cocoa, Cocoa Touch, Objective-C, and Swift. Despite, or maybe because of, this truly being a labor of love (and of immense time: the first time Addison-Wesley approached me about this was almost ten years ago), I fully expected it to remain just that: a labor of love sitting in its little niche. So imagine my surprise to see the little badge today:

[Image: iOS/macOS book listing with the #1 badge]

Wow! The #1 new release in Apple Programming (my understanding is that this link will change over time). And yes, I checked: it wasn't the only release in Apple books for the period, there were a good dozen. In fact, iOS and macOS Performance Tuning took both the #1 and the #4 spots (Apple releases). Oh, and it has also taken #13 overall in Apple programming books.

So a big THANK YOU to everyone who helped make this happen: the people I was allowed to learn from, Chuck, who instigated the project, and Trina, who saw it through despite me.

Anyway, now that the book is wrapped up, I can publish more performance-related information on this blog. Oh, the source code for the book is on GitHub.

UPDATE (March 5th, 2017): Now taking both the #1 and #2 spots in Apple new releases and the print edition is in the top 10 for Apple, with the Kindle edition in the top 20. Second update: now at #5 and #21 in overall Apple and #1/#3 in new releases. Getting more amazing all the time. I should probably take a break...

Concept Shadowing and Capture in MVC and its Successors

In a previous post, I noted that Apple's definition of MVC does not actually match the original definition, and that it is more like Taligent's Model View Presenter (MVP) or what I like to call Model Widget Controller. Matt Gallagher's look at Model-View-Controller in Cocoa makes a very similar point.

So who cares? After all, a rose by any other name is just as thorny, and the question of how the 500 pound gorilla gets to name things is also moot: however it damn well pleases.

The problem with using the same name is shadowing: since the names are the same, accessing the original definition is now hard. Again, this wouldn't really be a problem if it weren't for the fact that the old MVC solved exactly the kinds of problems that plague the new MVC.

However, having to say "the problems of MVC are solved by MVC" is less than ideal, because, well, you sound a bit like a lunatic. And that is a problem, because it means that the original MVC is not considered when trying to solve the problems of MVP/MVC. And that, in turn, is a shame, because it solves them quite nicely, IMHO much more nicely than a lot of the other suggested patterns.

It turns out that MVC is, just like Algol, an improvement on most of its successors.

Sunday, February 12, 2017

mkfile(8) is severely syscall limited on OS X

When I got my brand-new MacBook Pro (late 2016), I was interested in testing out the phenomenal SSD performance I had been reading about, reportedly around 2GB/s. Unix geek that I am, I opened a Terminal window and tapped out the following:
mkfile 8G /tmp/testfile

To my great surprise and consternation, both the time command and an iostat 1 running in another window showed a measly 250MB/s throughput. That's not much faster than a spinning rust disk, and certainly much, much slower than previous SSDs, never mind the speed demon that the MBP 2016's SSD is supposed to be.

So what was going on? Were the other reports false? At first, my suspicions fell on FileVault, which I was now using for the first time. It didn't make any sense, because what I had heard was that FileVault had only a minimal performance impact, whereas this was roughly a factor of 8 slowdown.

Alas, turning FileVault off and waiting for the disk to be decrypted did not change anything. Still 250MB/s. Had I purchased a lemon? Could I return the machine because the SSD didn't perform as well as advertised? Except, of course, that the speed wasn't actually advertised anywhere.

It never occurred to me that the problem could be with mkfile(8). Of course, that's exactly where the problem was. If you check the mkfile source code, you will see that it writes to disk in 512-byte chunks. That doesn't actually affect the I/O path, which will coalesce those writes. However, you are spending one syscall per 512 bytes, and that turns out to be the limiting factor. Upping the buffer size increases throughput until we hit 2GB/s at a 512KB buffer size. After that, throughput stays flat.

[Figure: mkfile SSD throughput by buffer size]

The X-axis is buffer size in KB. The original 512-byte size isn't on there because it would be 0.5KB, or the entire axis would need to be in bytes, which would also be awkward at the larger sizes. Also note that the X-axis is logarithmic.
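The buffer-size experiment can be sketched in a few lines of Python. This is not the mkfile source and not the harness I used for the graph, just a minimal illustration of the idea: write the same number of bytes with one write() call per buffer and see how the call count and throughput change. The function name write_file, the buffer sizes, and the 32 MiB total are all assumptions made for this sketch; for numbers that mean anything you would want a total far larger than what the buffer cache absorbs.

```python
import os
import tempfile
import time


def write_file(path, total_bytes, buffer_size):
    """Write total_bytes of zeros to path, one write() call per buffer.

    Returns (elapsed_seconds, number_of_write_calls).
    """
    buf = bytes(buffer_size)  # zero-filled buffer, reused for every call
    remaining = total_bytes
    calls = 0
    # O_TRUNC so that repeated runs start from an empty file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        while remaining > 0:
            chunk = buf if remaining >= buffer_size else buf[:remaining]
            remaining -= os.write(fd, chunk)  # os.write returns bytes written
            calls += 1
        os.fsync(fd)  # ask the OS to flush before the clock stops
    finally:
        os.close(fd)
    return time.perf_counter() - start, calls


if __name__ == "__main__":
    total = 32 * 1024 * 1024  # deliberately small; increase for real measurements
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "testfile")
        for size in (512, 4096, 65536, 512 * 1024):
            elapsed, calls = write_file(path, total, size)
            rate = total / elapsed / 1e6
            print(f"{size:>7} B buffer: {calls:>6} calls, {rate:8.1f} MB/s")
```

As a rough sanity check on the original observation: 250MB/s in 512-byte writes works out to roughly half a million write calls per second, so if the syscall is the only cost that matters, each call has about 2 microseconds to complete.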

Radar filed: 30482489. I did not check on other operating systems, but my guess is that the results would be similar.

UPDATE: In the HN discussion, a number of people interpreted this as saying that syscall speed is slow on OS X. AFAIK that is no longer the case, and in any case not the point. The point is that the hardware has changed so dramatically that even seemingly extremely safe and uncontroversial assumptions no longer hold. Heck, 250MB/s would be perfectly fine if we still had spinning rust, but SSDs in general, and particularly the scorchingly fast ones Apple has put in these laptops, have changed the equation so that something that used to not factor in at all, such as syscall performance, can now be the deciding factor.

In the I/O tests I did for my iOS/macOS performance book (see sidebar), I found that CPU nowadays generally dominates over actual hardware I/O performance. This was a case where I just wouldn't have expected that, and it took me over a day to find the culprit, because the CPU should be doing virtually nothing. But once again, assumptions just got trampled by hardware advancements.

So check your assumptions.