# The Misapplication of Statistics in Medical Research: Insights from the Polio and Ice Cream Error
In the 1940s, before the polio vaccine was developed, the disease terrified parents of young children. To reduce the risk, some public health officials advised avoiding ice cream. The reasoning? A study had linked ice cream consumption to polio cases. The association turned out to be spurious: both ice cream consumption and polio cases peaked during the summer months, creating a false impression of causality. The researchers had committed a fundamental statistical error: mistaking **correlation for causation**.
This case illustrates a core problem in medical research. Scientists routinely comb through data to identify environmental factors that contribute to disease, but the way statistical results are interpreted and presented can lead to wrong conclusions. As researcher John Ioannidis provocatively argued in 2005, **“most published research findings are false.”** The claim is dramatic, but it points to real weaknesses in how statistics are used in scientific research.
Recognizing common statistical traps helps readers critically evaluate health studies and the media coverage they generate. Here are **six prevalent ways statistics can be misapplied**, along with how to spot each mistake.
## 1) **Assuming Correlation Signifies Causation**
One of the most common statistical mistakes is assuming that because two things occur together, one must cause the other.
Consider a few absurd but genuine examples from **[tylervigen.com](http://www.tylervigen.com/)**:
– There is a **strong correlation** between the number of Nicolas Cage films released each year and the number of annual swimming pool drownings. Clearly, Cage movies do not cause drowning deaths.
– A study might find a correlation between autism prevalence and organic food sales, but that does not mean organic food causes autism.
In reality, a **third hidden variable** (commonly called a **confounding factor**) is often responsible for both trends. In the case of ice cream and polio, **the summer season** was the shared underlying cause. Always ask: **Could another variable be driving the trend?**
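To make the idea concrete, here is a minimal simulation sketch in Python using NumPy. The numbers are entirely invented and are not from any real study; the point is only that a single “summer” variable can drive two quantities that have no direct link, producing a strong pooled correlation that largely vanishes once you look within each season.

```python
import numpy as np

rng = np.random.default_rng(0)

n_weeks = 200
summer = rng.integers(0, 2, size=n_weeks)          # 1 = summer week, 0 = not

# Both quantities depend on the season, but not on each other.
ice_cream_sales = 100 + 50 * summer + rng.normal(0, 10, size=n_weeks)
polio_cases = 5 + 8 * summer + rng.normal(0, 2, size=n_weeks)

# Pooled across the whole year, the correlation looks impressive...
print("overall r:", np.corrcoef(ice_cream_sales, polio_cases)[0, 1])

# ...but within each season the association largely disappears.
for season in (0, 1):
    mask = summer == season
    r = np.corrcoef(ice_cream_sales[mask], polio_cases[mask])[0, 1]
    print("season", season, "r:", r)
```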
## 2) **Data Dredging (P-Hacking)**
Medical researchers sometimes participate in a practice known as **data dredging**—sifting through a large dataset to find multiple potential correlations and only publishing the most compelling one.
Envision a hypothetical survey:
– Researchers pose two questions to 1,000 individuals:
1. Have you viewed a Nicolas Cage film during the past year?
2. On a scale from 1 to 20, how strong is your desire to drown in a swimming pool?
– If the average drowning-desire score was **12** for Cage fans and **10** for non-fans, a researcher might report that Cage fans score 20% higher on desire to drown.
However, the issue lies here: if enough statistical tests are run on random data, **some results will appear “statistically significant” purely by chance.** The conventional significance threshold in medical research is **p < 0.05 (5%)**, which means that roughly one in twenty tests of a nonexistent effect will come out positive by chance alone. Without adjustments for multiple comparisons, misleading findings can quickly accumulate.
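A minimal sketch of the problem, assuming simulated data and Python with NumPy and SciPy (none of this is tied to any real study): testing 100 random “exposures” against a purely random outcome still yields a handful of “significant” results, roughly the 5% expected by chance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_people = 1_000
outcome = rng.normal(size=n_people)               # pure noise, no real effects

n_tests = 100
false_positives = 0
for _ in range(n_tests):
    exposure = rng.integers(0, 2, size=n_people)  # a random yes/no "exposure"
    _, p = stats.ttest_ind(outcome[exposure == 1], outcome[exposure == 0])
    if p < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_tests} random comparisons were 'significant'")
```

If a team runs dozens of such comparisons and publishes only the one that “worked,” the reader sees a striking result with no hint of the failed tests behind it.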
When assessing a study, inquire: **Did the researchers explore multiple hypotheses but only showcase the one that “worked”?**
## 3) **Small Sample Sizes Can Lead to Misleading Conclusions**
The size of the sample plays a vital role in drawing credible conclusions. Suppose only **six individuals** take part in the Nicolas Cage drowning-desire study. If half of the Cage-watchers report a higher desire to drown, this may look meaningful, but it could easily be coincidence.
The larger the sample, the **more trustworthy the findings**. A study based on only **50 patients** may not generalize to the broader population.
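Here is a small simulation sketch, again with hypothetical numbers and Python/NumPy, showing why tiny samples are unreliable: even when two groups come from exactly the same distribution, the observed difference in means swings wildly at small sample sizes and settles near zero as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(2)

# Both groups are drawn from the *same* distribution, so the true
# difference in means is zero.
for n in (6, 50, 5_000):
    diffs = []
    for _ in range(1_000):
        group_a = rng.normal(10, 3, size=n)
        group_b = rng.normal(10, 3, size=n)
        diffs.append(group_a.mean() - group_b.mean())
    print(f"n = {n:5d}: observed differences span "
          f"[{min(diffs):+.2f}, {max(diffs):+.2f}]")
```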
Ask: **What was the size of the study sample? Were the participants chosen at random?**
## 4) **Misreading a Small P-Value**
A **p-value** measures the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis (**no real effect**) is true. A **low p-value** means the observed data would be unlikely to arise from random variability alone. However, it **DOES NOT**:
– Confirm that a hypothesis is accurate.
– Reflect the size or practical importance of the effect.
A small p-value means little if the effect itself is negligible. Researchers and the media often hype results simply because they are **statistically significant**, even when the effect is **minor or trivial**.
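A minimal sketch of “significant but trivial,” using invented numbers and Python with NumPy and SciPy: with a large enough sample, a difference of 0.05 points on a 20-point scale still produces a very small p-value, even though the effect is far too small to matter in practice.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# A tiny effect (0.05 points on a 20-point scale), but a huge sample.
n = 200_000
control = rng.normal(10.00, 2, size=n)
treated = rng.normal(10.05, 2, size=n)

t_stat, p = stats.ttest_ind(treated, control)
diff = treated.mean() - control.mean()
print(f"p = {p:.2g}, mean difference = {diff:.3f} points")
```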
Ask: **Does the study place excessive reliance on p-values without addressing real-world relevance?**
## 5) **Overemphasizing Minor Effect Sizes**
Even if a result is statistically