Probability & Statistics Exercises
The math of uncertainty — applied to the decisions you actually make.
Probability and statistics are the formal language of uncertainty, and the exercises in this category translate that formal language into the everyday reasoning you actually need. Most people think probabilistically without realizing it — when you hear that a friend tested positive for a rare disease, that a candidate is leading in the polls, or that a stock has "dropped 30% from its peak," you are doing probability reasoning, often badly. The exercises drill the moves that separate accurate judgment under uncertainty from intuitive guesswork: base rates, conditional probability, sample size, regression to the mean, and Bayesian updating.
The single most important concept in this category is the base rate — the underlying frequency of an event before you see any specific evidence. Studies of probabilistic reasoning (Kahneman & Tversky 1973, Bar-Hillel 1980) consistently show that humans neglect base rates in favor of vivid case-specific evidence, leading to systematic errors of judgment. The exercises explicitly train the base-rate-first habit. Beginner exercises focus on simple probability calculations and reading common statistical formats (averages, percentages, percentiles). Intermediate exercises introduce conditional probability and the famous tests-with-low-base-rates puzzle. Advanced exercises cover Bayesian reasoning, regression to the mean, and the structural causes of statistical illusions.
If formal math intimidates you, do not worry — the exercises emphasize conceptual reasoning over calculation. Most of the value comes from internalizing four or five qualitative principles that you can apply without arithmetic.
Why this skill matters
Probability literacy has measurable effects on health, financial, and policy decisions. Studies of risk communication (Gigerenzer, Galesic) show that doctors and patients who reason in natural-frequency terms (10 out of 1,000) make systematically better decisions than those who reason in percentages or probabilities, because the frequency framing makes base rates impossible to ignore. People who have practiced these exercises reach the natural-frequency framing automatically and avoid the worst statistical errors in everyday reasoning.
The skill also pays off in professional contexts. Anyone who interprets data — analysts, product managers, engineers running experiments, clinicians, traders, journalists — is constantly making probability judgments. The difference between someone who is calibrated about uncertainty and someone who is not is enormous over a career. A small number of base-rate errors in high-stakes decisions can dwarf the cumulative cost of any other reasoning weakness, which is why this category receives disproportionate attention in research on expert judgment (Tetlock, Galef).
Common pitfalls
The reasoning errors these exercises specifically train against.
Base-rate neglect
Given specific evidence about an individual case, people anchor on the case and ignore the underlying frequency. The most famous example: a 95% accurate test for a disease that affects 1 in 10,000 people will produce mostly false positives, but most people guess that a positive result means the patient probably has the disease.
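The claim above can be checked with a short natural-frequency sketch. Assuming "95% accurate" means both sensitivity and specificity are 95% (an assumption; the original does not specify), counting outcomes in a population of one million makes the false-positive flood visible:

```python
# Natural-frequency sketch of base-rate neglect.
# Assumption: "95% accurate" = sensitivity and specificity both 0.95.
population = 1_000_000
prevalence = 1 / 10_000
sensitivity = 0.95   # P(positive | disease)
specificity = 0.95   # P(negative | no disease)

sick = population * prevalence                  # 100 people have the disease
healthy = population - sick                     # 999,900 do not
true_positives = sick * sensitivity             # 95 correct alarms
false_positives = healthy * (1 - specificity)   # 49,995 false alarms

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(f"P(disease | positive) = {p_disease_given_positive:.2%}")  # ≈ 0.19%
```

Despite the "95% accurate" label, a positive result leaves under a 1% chance of disease, because false positives from the huge healthy population swamp the true positives.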
Confusing P(A|B) with P(B|A)
The probability of cancer given a positive test is not the same as the probability of a positive test given cancer. Mistaking these two is the technical core of base-rate neglect. The exercises explicitly drill the difference until the framing becomes automatic.
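One way to see that the two conditionals differ is to count cases in a small joint table. The counts below are hypothetical, chosen only to make the asymmetry obvious:

```python
# Hypothetical counts in a population of 1,000, illustrating P(A|B) != P(B|A).
cancer_and_positive = 8
cancer_and_negative = 2
healthy_and_positive = 90
healthy_and_negative = 900

total_positive = cancer_and_positive + healthy_and_positive   # 98 positives
total_cancer = cancer_and_positive + cancer_and_negative      # 10 cancer cases

p_pos_given_cancer = cancer_and_positive / total_cancer       # 0.80
p_cancer_given_pos = cancer_and_positive / total_positive     # ≈ 0.082
```

Same numerator, different denominators: the test catches 80% of cancers, yet only about 8% of positives are cancer.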
Ignoring sample size
A 70% success rate based on 10 trials is much weaker evidence than a 60% success rate based on 1,000 trials. People routinely treat percentages as comparable regardless of the sample they were drawn from, which leads to massive misjudgments of evidence strength.
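A rough normal-approximation interval makes the asymmetry concrete. This is a back-of-envelope sketch, not a rigorous test, but it shows how much wider the uncertainty is around the small sample:

```python
import math

def approx_interval(p_hat, n):
    """Rough 95% interval for a proportion (normal approximation)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - 1.96 * se, p_hat + 1.96 * se)

small = approx_interval(0.70, 10)     # roughly (0.42, 0.98)
large = approx_interval(0.60, 1000)   # roughly (0.57, 0.63)
```

The 70%-of-10 interval spans most of the probability scale and is consistent with a true rate well below 50%; the 60%-of-1,000 interval pins the rate down to a few points.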
Missing regression to the mean
Extreme observations tend to be followed by less extreme ones, even with no causal change. Misattributing this regression to interventions is the source of many false beliefs about coaching, medicine, and personal habits.
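A small simulation demonstrates the effect with no intervention at all: each score is a fixed underlying skill plus fresh noise, and the worst first-round performers improve on average purely because their bad luck does not repeat.

```python
import random

random.seed(0)
n = 10_000
# True skill is identical in both rounds; each observation adds fresh noise.
skill = [random.gauss(0, 1) for _ in range(n)]
round1 = [s + random.gauss(0, 1) for s in skill]
round2 = [s + random.gauss(0, 1) for s in skill]

# Select the worst 5% of round-1 performers. Nothing changes between rounds.
cutoff = sorted(round1)[int(0.05 * n)]
worst = [i for i in range(n) if round1[i] <= cutoff]
mean1 = sum(round1[i] for i in worst) / len(worst)
mean2 = sum(round2[i] for i in worst) / len(worst)
# mean2 sits much closer to the population mean of 0 than mean1 does.
```

If a coach had "intervened" between the rounds, the improvement would look like a treatment effect; the simulation shows it appears with no treatment at all.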
How the exercises are structured
Each exercise presents a probabilistic scenario — a medical test, a polling result, an A/B test outcome, an investment claim — and asks for the right interpretation. Wrong answers reflect the canonical errors: base-rate neglect, sample-size insensitivity, misreading conditional statements. The explanations rephrase the problem in natural-frequency terms (which makes the right answer obvious) and show how the original framing produced the misjudgment.
Most exercises can be reasoned about qualitatively, with rough estimates rather than precise math. The goal is calibrated intuition — the ability to look at a probabilistic claim and immediately see whether it is plausible — not numerical fluency. Numerical fluency comes from practice; qualitative calibration comes from the conceptual moves these exercises train.
Where this skill applies
- Medical decision-making. Whether you are evaluating a screening recommendation, weighing a treatment option, or deciding when to seek a second opinion, base-rate-first reasoning produces systematically better choices than the intuitive alternative.
- Reading polls and forecasts. Election coverage, weather forecasts, and economic predictions are all probabilistic claims that most consumers misread. Practiced statistical thinking lets you extract calibrated information from coverage that is otherwise misleading.
- Better experimentation at work. Anyone running A/B tests, pilots, or controlled trials benefits from understanding sample size, variance, and the difference between underpowered and informative experiments.
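The experimentation point can be sketched with a null A/B test: both variants share the same true conversion rate, yet small samples routinely report "lifts" that are pure noise. The rates and sizes below are illustrative assumptions:

```python
import random

random.seed(1)

def observed_lift(n, p=0.10):
    """Both variants have the same true rate p; return B's observed lift over A."""
    a = sum(random.random() < p for _ in range(n)) / n
    b = sum(random.random() < p for _ in range(n)) / n
    return b - a

small = [observed_lift(100) for _ in range(1000)]      # underpowered experiment
large = [observed_lift(10_000) for _ in range(1000)]   # well-powered experiment

spread_small = max(small) - min(small)
spread_large = max(large) - min(large)
# The n=100 experiment regularly shows multi-point "wins" from noise alone.
```

The spread of phantom lifts shrinks by roughly a factor of ten as the sample grows a hundredfold, which is the square-root law that makes underpowered experiments so misleading.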
Frequently asked questions
Do I need to remember formulas?
No. The exercises are conceptual, not computational. Almost everything can be reasoned about in natural-frequency terms (10 out of 1,000), which avoids the formula-heavy version of probability. If you ever take a formal statistics course, the conceptual fluency these exercises build will make the formulas much easier to remember.
What is Bayesian reasoning, in plain English?
Updating your belief in a claim by combining your prior probability (what you thought before) with the strength of the new evidence (how much more likely the evidence is if the claim is true versus if it is false). The exercises walk through this process in everyday scenarios so the formal math becomes optional.
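The prior-times-evidence process described above is simplest in odds form: posterior odds equal prior odds times the likelihood ratio. A minimal sketch, using illustrative numbers (a 1% prior and evidence ten times more likely if the claim is true):

```python
# Odds-form Bayesian update: posterior odds = prior odds * likelihood ratio.
def update(prior_p, likelihood_ratio):
    """Return the posterior probability after seeing the evidence."""
    prior_odds = prior_p / (1 - prior_p)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

p = update(0.01, 10)   # ≈ 0.092
```

Even evidence ten times more likely under the claim only moves a 1% prior to about 9%, which is the quantitative heart of base-rate-first reasoning.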
How is this category different from scientific reasoning?
Scientific reasoning is about the design and interpretation of empirical claims — what is the hypothesis, what controls were used, what alternative explanations exist. Probability and statistics is about the math of uncertainty itself. They are complementary: scientific reasoning produces the question, statistics provides the framework for the answer.
Are these exercises useful for jobs that involve data?
Yes — they target the conceptual mistakes that data-handling roles most commonly make. Even people with formal statistics training often fall into base-rate and sample-size errors in unfamiliar contexts. The exercises specifically drill these patterns until catching them becomes automatic.
Further reading
Primary sources and reputable references for the concepts covered above.
- Reckoning with Risk: Learning to Live with Uncertainty, Gerd Gigerenzer — Penguin
On natural-frequency framing and how to reason clearly about probabilistic risks.
- Stanford Encyclopedia of Philosophy: Bayesian Epistemology, Stanford University
Scholarly treatment of Bayesian reasoning as a theory of belief revision.
- How to Lie with Statistics, Darrell Huff — W.W. Norton
The classic introduction to common statistical deceptions; still directly relevant decades after publication.
- Superforecasting: The Art and Science of Prediction, Philip Tetlock & Dan Gardner — Crown
On calibrated probabilistic thinking applied to forecasting, drawing on the largest study of prediction accuracy ever conducted.