In 2007, author and statistician Howard Wainer published an article in American Scientist titled *The Most Dangerous Equation*. What a provocative title! So, what is the most dangerous equation?

There are two types of equations that can be dangerous:

- those that are dangerous if you know them (like *E = mc²*, which opened the door to the atom bomb), and
- those that are dangerous if you don’t know them.

According to Dr. Wainer, the most dangerous equation lies in the second category. Over time, ignorance of this equation has caused a great deal of confusion, massive misinvestment of time and money, and a great deal of hardship. What’s the equation? It’s de Moivre’s equation for the standard error of the mean. The equation says that when you measure a sample rather than the full population, the sample mean will likely differ from the true mean. The typical size of that error grows as the sample shrinks, in inverse proportion to the square root of the sample size: cutting the sample to a quarter of its size doubles the error. Or, to flip it on its head, as the sample size grows, the sample means cluster more and more tightly around the true population mean. Here’s the equation:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

where $\sigma_{\bar{x}}$ is the standard error of the mean, $\sigma$ is the standard deviation of the population, and $n$ is the sample size.
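To get a feel for the square-root relationship, here is a minimal sketch that evaluates the standard error (population standard deviation divided by the square root of the sample size) for a few sample sizes. The function name `standard_error` and the value `sigma = 10.0` are illustrative assumptions, not figures from Dr. Wainer’s article.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean for a sample of n observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


sigma = 10.0  # assumed population standard deviation (illustrative only)

# Each step quadruples the sample size, which halves the standard error.
for n in (25, 100, 400, 1600):
    print(f"n = {n:>4}: standard error = {standard_error(sigma, n):.2f}")
```

Quadrupling the sample only halves the error, which is why small samples are so much noisier than intuition suggests.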

## Rural Counties and Kidney Cancer

Here’s how this works in real life and why ignorance of the concept of this equation is dangerous. Check out this map from Dr. Wainer’s paper:

The map above shows counties in the continental U.S. that have unusual kidney-cancer rates. The teal-colored counties are in the lowest decile of kidney cancer and the orange counties are in the top decile. What do these teal and orange counties have in common? They tend to be rural counties. If you were just looking at the lowest-decile counties, you might think it made sense for them to have low cancer rates due to the clean living of rural life. If you were just looking at the orange counties, you might just as easily infer that rural living leads to more cancer due to poor diet and too much alcohol and tobacco. But how can we explain that rural counties have both the highest and lowest incidence of kidney cancer?

The answer lies in de Moivre’s formula, which says that the smaller the sample size, the higher the variation. Counties with small populations show very large variation around the mean rate, while densely populated urban counties show much less. In a county with few residents, one case more or less moves the observed rate a long way, so small counties end up on both the best and the worst outcomes lists.
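The effect is easy to reproduce with simulated data. The sketch below is not Dr. Wainer’s analysis: it assumes every county has exactly the same underlying risk (`true_rate`) and draws hypothetical county populations between 1,000 and 1,000,000. Any difference in observed rates is therefore pure chance.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

true_rate = 1e-4     # assumed: identical underlying risk in every county
n_counties = 3000    # hypothetical number of counties

# Hypothetical populations, spread log-uniformly from 1,000 to 1,000,000.
populations = np.round(10 ** rng.uniform(3, 6, size=n_counties)).astype(int)

# Case counts differ only by chance, since the true rate is the same everywhere.
cases = rng.binomial(populations, true_rate)
observed_rate = cases / populations

# Pick out the counties in the bottom and top decile of observed rate.
order = np.argsort(observed_rate, kind="stable")
decile = n_counties // 10
lowest, highest = order[:decile], order[-decile:]

print("median population, all counties:  ", int(np.median(populations)))
print("median population, lowest decile: ", int(np.median(populations[lowest])))
print("median population, highest decile:", int(np.median(populations[highest])))
```

Under these assumptions, both extreme deciles should be dominated by counties far smaller than the typical county, even though no county is actually healthier or sicker than any other.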

## Insurance Claims by Car Brand in The Netherlands

In an interesting post on LinkedIn, it was noted that drivers of Mazdas had the highest incidence of claims in the Netherlands and that drivers of Citroens had the lowest. Based on just knowing that data we can imagine that Mazda drivers are enjoying their *Zoom Zoom* while Citroen drivers are more careful. Plausible, but the real answer lies in the sample size. Here’s the proportion of car brands owned in the Netherlands:

As the brand with the smallest sample size, Mazda would be expected to show the greatest variation in claims history. In a future year, Mazda could just as easily be among the best. In fact, that was the case with Skoda, a small-sample brand, which went from the best claims record to one of the worst in back-to-back years.
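A small simulation shows how much a brand’s claim rate can swing from year to year on sample size alone. The fleet sizes and the 5% claim rate below are hypothetical, chosen only for illustration, and are not the Dutch figures from the LinkedIn post. Both brands are given identical drivers.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

true_claim_rate = 0.05  # assumed: the same for every brand
fleet_sizes = {"small brand": 2_000, "large brand": 200_000}  # hypothetical
n_years = 1_000  # number of simulated years per brand

spread = {}
for brand, n_cars in fleet_sizes.items():
    # Observed claim rate in each simulated year.
    yearly_rate = rng.binomial(n_cars, true_claim_rate, size=n_years) / n_cars
    spread[brand] = yearly_rate.std()
    print(
        f"{brand}: yearly claim rate ranges from "
        f"{yearly_rate.min():.3f} to {yearly_rate.max():.3f}"
    )
```

With a fleet 100 times smaller, the small brand’s yearly rate should wander roughly 10 times as far from 5%, which is enough to put it at the top of the table one year and the bottom the next.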

## The Small Schools Movement

Even geniuses can be fooled when they don’t know about de Moivre’s equation. In the 1990s, the data showed that the best-performing schools in the country were often small schools. As a result, the Bill and Melinda Gates Foundation, the Annenberg Foundation, the Pew Charitable Trusts, and many other foundations provided grants totaling in the billions to promote small schools. Of course, digging deeper into the data showed that small schools were over-represented among both the highest-performing and the lowest-performing schools. Small schools showed up among the top performers because of de Moivre’s formula, not because of some critical aspect of being small. Once this became clear, the small schools movement was abandoned.

## How Good Is Your Hospital?

This study found that in terms of mortality both the highest and lowest performing hospitals were small hospitals. The study reminded the reader that “a close examination of the information reveals a pattern which is consistent with a statistical phenomenon, discovered by the French mathematician de Moivre nearly 300 years ago, described in every introductory statistics textbook: namely that variation in performance indicators is expected to be greater in small [hospitals] and smaller in large [hospitals]. From a statistical viewpoint, the number of deaths in a hospital is not in proportion to the size of the hospital, but is proportional to the square root of its size.”
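One common way to account for this is to compare each hospital against limits that widen as the hospital gets smaller, as in a funnel plot. The sketch below is an assumption-laden illustration, not the method of the study quoted above: it uses a normal approximation to the binomial, and the 4% overall mortality rate and the name `funnel_limits` are made up for the example.

```python
import math


def funnel_limits(overall_rate: float, n_patients: int, z: float = 1.96):
    """Approximate range of observed rates expected from chance alone.

    Uses the normal approximation to the binomial, which is rough
    when a hospital has very few patients.
    """
    se = math.sqrt(overall_rate * (1 - overall_rate) / n_patients)
    lower = max(0.0, overall_rate - z * se)
    upper = min(1.0, overall_rate + z * se)
    return lower, upper


overall_rate = 0.04  # assumed average mortality rate (illustrative only)

for n in (100, 1_000, 10_000):
    lower, upper = funnel_limits(overall_rate, n)
    print(f"{n:>6} patients: {lower:.3f} to {upper:.3f}")
```

A hospital with 100 patients gets a band ten times wider than one with 10,000 patients, so a small hospital’s extreme rate is weak evidence of unusually good or bad care.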

## The Takeaway

Small samples produce noisy results. Because the standard error is inversely proportional to the square root of the sample size, results from a small sample vary more than we usually expect.

Please tell me pollsters are using larger sample sizes this election cycle when they are predicting an 8 point Biden lead.

Predicting the outcome of the election comes to mind. All of the news organizations four years ago were predicting Hillary would beat Donald by a good margin. What happened? I suggest it might have had something to do with Dr. Wainer’s equation.

Alas, not the same thing. De Moivre’s equation describes variability, not bias. The wrong results from pre-election polls stem directly from those polls having a 90% (or more!) nonresponse rate. Most people, when they get a telephone poll, pretty much hang up immediately. The ones who stay on to give their opinion usually have pretty strong feelings about it. This is termed (technically) non-ignorable nonresponse, and typically such data are biased. The amount of error is determined by the amount of nonresponse. If all you have is 10% of the predetermined sample, the missing data can shift the results massively. (This response also relates to “Jim’s” comment: it isn’t just ‘n’ but also which ‘n’.)

Omg. A comment by “the” Howard Wainer? Holy cow.

That is profound. Thx for sharing. Luv2Nap