On flywheels and double selection
Multidirectional correlations are most of what we see
Epistemic status: Describes a strong prior about which correlations you should expect to rise to your attention, though not everything that does will fit this pattern.
Flywheels
There’s a common argument where we notice that two things are correlated, but then start arguing about the direction. For example:
Good schools tend to have smart students. But is it because good schools are good at teaching their students to be smart, or because smart students are more likely to go to the good schools (either because they have some kind of acceptance filter, or because smart students are more likely to have parents who act proactively about school quality)?
People who exercise tend to be healthier and more energetic. But is that because exercise makes you healthy, or because healthy energetic people have an easier time working out?
Political radicalization tends to correlate with government dysfunction. But is that because radical politicians tend to govern poorly, or because poor governance causes the people to support more angry radical outsiders1?
Or consider trust: in high-trust societies, people trust each other more and are more trustworthy. But maybe they’re only trustworthy because trustworthiness is such a strong social expectation?
With all these questions, we ask which is the cause and which is the effect. I say: why not both? And this brings us to flywheels.
Consider an unobserved universe with a bunch of background mechanisms that lead to a bunch of observables2. What we see is correlations between the observables. But what can cause this?
The observables might be correlated by accident due to random noise, but such correlations are unlikely, become less likely the stronger the correlation is, and are unlikely to replicate if they do happen. Cause-and-effect correlations (where one background mechanism affects another) will also show up in the observables, but unless the effect is very strong, they may not noticeably rise above the level of random noise.
On the other hand, some of the mechanisms may have flywheel relations, where each mechanism positively affects the other. Since these go in a loop, they’ll usually produce much stronger correlations that rise above the noise. We should expect to basically always notice these.
To actually do the math of how many of each type of correlation rises to the level where we notice it would require making way too many assumptions about base rates. But broadly speaking, we have a pattern where we don’t notice most unidirectional correlations and do notice most flywheels. Unless flywheel correlations have incredibly low base rates (which they shouldn’t - if mechanism A affects B, it’s at least plausible that B also affects A), we should expect most (but not all) observed correlations to be caused by flywheel relations.
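To make this concrete, here’s a minimal simulation sketch (my addition, not part of the original argument) of a linear toy model: with the same coupling strength, a two-way feedback loop produces a markedly stronger observable correlation than a one-way effect. The coupling value c = 0.3 and the linear form are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # number of simulated independent "worlds"
c = 0.3      # coupling strength between mechanisms (illustrative)

x = rng.standard_normal(n)  # exogenous noise behind mechanism A
y = rng.standard_normal(n)  # exogenous noise behind mechanism B

# One-way cause and effect: A affects B, but B does not affect A.
a_one, b_one = x, y + c * x

# Flywheel: each mechanism feeds the other. The fixed point of
# a = x + c*b, b = y + c*a has the closed form below (valid for |c| < 1).
a_fly = (x + c * y) / (1 - c**2)
b_fly = (y + c * x) / (1 - c**2)

print("one-way corr: ", np.corrcoef(a_one, b_one)[0, 1])  # ~0.29
print("flywheel corr:", np.corrcoef(a_fly, b_fly)[0, 1])  # ~0.55
```

With c = 0.3 the one-way correlation comes out around 0.29 while the flywheel version comes out around 0.55 - closing the loop roughly doubles what rises above the noise, and the gap widens as c grows.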
As an empirical observation of base rates, I believe that four of the five specific examples described above (counting the one in the footnotes) are flywheels and only one is a unidirectional correlation3. So as a very rough ballpark, an observed correlation has about an 80% chance of being a flywheel.
Double Selection
Consider an elite selection process, like an application to a top college or an interview day for a hedge fund. To a first approximation, a candidate’s performance is affected by a mix of talent (which affects how well he can do on questions and tests in general) and luck4 (a candidate can have an off day, or get an interview question he happened to run across talking to a friend the other day, or chance on an interviewer with whom he has unusually good or bad personal vibes).
Assume talent and luck are given by independent normal random variables X and Y (each with mean zero and its own variance). If a candidate is accepted only if he clears a certain bar A, then the talent distribution of accepted candidates will be X | X+Y > A, and the luck distribution will similarly be Y | X+Y > A.
What do these look like? They’re not exactly normal distributions, but they come close for large A. As A grows, the conditional mean of X approaches A*Var(X)/Var(X+Y), and the conditional variance approaches Var(X)Var(Y)/Var(X+Y). The conditional distribution of luck approaches the same variance and the analogous mean, A*Var(Y)/Var(X+Y) (swap X and Y in the equations above)5.
In other words, the conditional variance of X and Y converges to being equal, and X and Y each individually converge to the same quantile on average (both luck and talent want to contribute their fair share, but no more, since extra points are expensive). The higher A is, the more you need both luck and talent to get through selection. Which is why the halls of any sufficiently elite institution are full of smart people who happened to wake up on the right side of the bed the day they interviewed6.
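Before moving on: footnote 5 leaves the proof as an exercise, but the formulas are easy to sanity-check numerically. Here’s a quick Monte Carlo sketch (my addition; the specific variances are illustrative assumptions). At moderate A the empirical conditional mean and variance sit a bit above the asymptotic predictions, and they close in as A grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000_000
sd_x, sd_y = 1.0, 0.7            # illustrative spreads for talent X and luck Y
var_x, var_y = sd_x**2, sd_y**2
var_s = var_x + var_y            # Var(X+Y), since X and Y are independent

x = rng.normal(0.0, sd_x, n)
y = rng.normal(0.0, sd_y, n)
s = x + y

for A in (1.0, 2.0, 3.0, 4.0):
    acc = x[s > A]                       # talent of accepted candidates
    pred_mean = A * var_x / var_s        # asymptotic conditional mean
    pred_var = var_x * var_y / var_s     # asymptotic conditional variance
    print(f"A={A}: mean {acc.mean():.3f} (pred {pred_mean:.3f}), "
          f"var {acc.var():.3f} (pred {pred_var:.3f})")
```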
The same double selection is going to show up in any isolated disaster7: a disaster is a rare event in the far tail, so a lot of independent factors have to go wrong at once to produce one. Chernobyl wasn’t just a badly designed plant, or one where everyone got fantastically unlucky, or a plant without sufficient safety fallbacks - it was all of these. It had to be, because nuclear power is generally very safe, and it took a lot of different things going wrong to make a big disaster of it.
Or consider a political controversy, like a controversial police shooting8. You’ll have some people pointing at how the officer involved was angry or reckless or trigger happy and acted stupidly in a bunch of ways. And on the other side, you’ll have people pointing out how the victim acted stupidly or provocatively in ways that escalated the situation, didn’t give the officer time to think, or made him feel legitimately threatened. In general, both sides will be able to point to facts that support their case (and conclude that therefore their own side is blameless).
But in general, police shootings (especially controversial ones) are just incredibly rare9. For someone to end up getting shot by a police officer a lot of things have to go wrong on all sides. Both the officer and the victim usually have to act pretty stupidly, and there also has to be a lot of bad luck involved10.
This isn’t specific to police shootings. Any controversy that becomes news has to be unusual by definition (deadly car crashes or heart attacks don’t make national news), which means it has to be a rare event with a whole lot of selection going on.
For a final example, consider the New York Subway. For it to be such an expensive boondoggle, there’s no one thing that had to go wrong - pretty much everything (soft costs, station size, procurement processes, interagency cooperation, corruption…) had to go wrong. You don’t get multibillion-dollar-per-mile subway lines with only one mistake.
But to end this on a hopeful note - this means it’s easy to improve. For a disaster to happen, a lot of different things had to go wrong, so fix any one of them and you’ve already avoided the disaster happening again. Fix all of them, and it’s all smooth sailing from here on out11.
A specific example of this: Strong rent control correlates with dysfunctional housing markets. But is that because rent control makes the housing market dysfunctional, or because dysfunctional housing markets make people support rent control to avoid evictions?
I’m modeling here that each background mechanism creates its own observables and may also affect some of the other mechanisms, thus affecting their observables indirectly.
Which one is left as an exercise to the reader.
This analysis applies to any mix of factors - for example, a college that grades candidates on a mix of GPA and personal essays will demand stronger performance on both the more selective it is.
The proof of this, and the rate of convergence, is left as an exercise to the reader.
There’s a subtle point here. Suppose the interview process is well designed to measure talent more than luck - a famously hard problem, but let’s assume our institution does a good job of it. Then having more talent than the minimum bar helps a candidate get in more than a bit of extra luck would: more talent sensitivity means X has the wider variance, so a point of extra talent is worth more quantiles of Y than it is quantiles of X, and any candidate who’s notably more talented than the minimum requirement is much more likely to get in. But, and this is important, the higher the threshold is, the larger the ratio of people who barely meet the minimum bar to people who cleanly exceed it. So as your acceptance threshold rises, your interview process becomes exponentially more difficult to design well.
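To put a number on that last ratio, here’s a tiny sketch (my addition; the standard normal score and the one-point margin are illustrative choices). The ratio of candidates who clear the bar A to candidates who clear it by a full extra point grows roughly like e^(A + 1/2):

```python
from math import erfc, sqrt

def tail(a: float) -> float:
    """P(Z > a) for a standard normal Z."""
    return 0.5 * erfc(a / sqrt(2))

# How many candidates clear the bar A for each one who clears A + 1?
for A in range(6):
    print(f"A={A}: {tail(A) / tail(A + 1):.0f}x")  # ~3x at A=0, ~290x at A=5
```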
Most successes are replicable - a great athlete or scientist has to have a lot of different successes to be famous, and a great engineer or businessman has to do a lot of things consistently well to be successful. Since fame-making success often requires consistency, most headline-making cases of multiple selection will be failures.
This part was mostly written a few weeks ago and isn’t meant as a specific comment on any recent events.
There are around ten million arrests per year in the US and about a thousand fatal police shootings, many of which are noncontroversial (e.g. police returning fire on someone who opened fire first). A thousand out of ten million is 0.01% of arrests, so the controversial cases are somewhere around 0.005%. And the average officer only makes an arrest about once a month.
What we can conclude from this about moral culpability is harder. In terms of the specific shooting, both the officer and the victim are usually (but not always) in the bottom quantile of competence for handling the encounter, so if we measure individual virtue against the baseline of the average member of their class, both are, on average, about equally bad.
If, on the other hand, we ask what societal reforms we could enact to make negative outcomes less likely, it becomes an object-level question of which curve is easier to shift. This is in general a hard question (it requires asking both what we can do and what the baseline of things we’re already doing is), and I have no special insight into it right now.
Until you hit another, unrelated disaster. But that’s a whole different problem!


