
Argument Analysis

Evaluating Evidence Quality

Sharpen your ability to weigh evidence like a scientist, distinguishing gold-standard research from persuasive-sounding but unreliable support. You will practice evaluating evidence from medical claims, business reports, and media stories, building the judgment to know when data actually supports a conclusion.

Intermediate · 15 min · Argument Analysis

Context

Why this exercise

An argument is only as strong as the evidence behind it, and most evidence in everyday discourse is weaker than it sounds. A statistic without context, a testimonial without controls, a chart without a baseline — these are not lies, but they fail to support the conclusions people draw from them. This exercise trains the judgment to know when data actually licenses a claim and when it is being asked to do more work than it can bear, using scenarios drawn from medical research, news reporting, business consulting, and everyday wellness culture.

Before you start

Evidence evaluation became a formal discipline in the 20th century with the development of the randomized controlled trial (Austin Bradford Hill's 1948 trial of streptomycin for tuberculosis) and Hill's later criteria (1965) for inferring causation from epidemiological data: strength of association, consistency, specificity, temporality, biological gradient, plausibility, coherence, experimental evidence, and analogy. These criteria distinguish a correlation worth taking seriously from one that is probably spurious. The most consequential question underlying the framework is the counterfactual one: what would have happened in the absence of the proposed cause? Without a comparison group, such as a placebo arm, a control city, or a baseline rate, a percentage figure cannot tell you whether the intervention worked. Sixty-five percent of antidepressant patients improving means little if forty percent on placebo would have improved anyway: in that case the drug accounts for at most 25 percentage points of the improvement, not 65.
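To make the comparison concrete, here is a minimal Python sketch of the arithmetic. It uses the 65% drug figure and the 40% placebo figure from the antidepressant example; the placebo rate is the hypothetical one the example supposes, not a result from any actual trial, and the function name `compare_to_control` is made up for illustration.

```python
def compare_to_control(treated_rate, control_rate):
    """Compare an improvement rate against a control (counterfactual) rate.

    Both arguments are proportions between 0 and 1. The control rate stands
    in for what would have happened without the treatment.
    """
    # Share of patients whose improvement can be credited to the treatment.
    absolute_difference = treated_rate - control_rate
    # How many times more likely improvement is with treatment than without.
    relative_rate = treated_rate / control_rate if control_rate > 0 else float("inf")
    # Number needed to treat: patients treated per one extra improvement.
    nnt = 1 / absolute_difference if absolute_difference > 0 else float("inf")
    return absolute_difference, relative_rate, nnt


# Hypothetical figures from the example: 65% improve on the drug, 40% on placebo.
diff, ratio, nnt = compare_to_control(0.65, 0.40)
print(f"Improvement credited to the drug: {diff:.0%}")
print(f"Improvement rate relative to placebo: about {ratio:.1f} times")
print(f"Number needed to treat: about {nnt:.0f}")
```

Under these assumed figures, only 25 percentage points of the 65% can be credited to the drug, which works out to roughly one additional improved patient for every four treated. With no control rate at all, none of these quantities can be computed.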

Common evidence failures cluster around a few recurring patterns. Selection bias arises when the people in the data are not a random sample of the population the claim is about: the famous 1936 Literary Digest poll predicted Roosevelt's defeat from about 2.4 million returned ballots, mailed to a list drawn largely from telephone directories and automobile registrations, which skewed toward wealthier voters and missed the broader electorate. Confounding occurs when a third variable drives both the supposed cause and the supposed effect, so that a correlation is mistaken for a causal relationship: children who play chess score higher on math tests, but both may stem from a shared trait rather than one causing the other. Reverse causation flips the direction: successful people wake early, but probably because their schedules require it, not because waking early made them successful. And anecdote-as-evidence treats a single vivid case as representative when it cannot rule out coincidence, placebo, or simultaneous unrelated changes.
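Confounding in particular is easy to see in a small simulation. The Python sketch below invents a population in which a hidden trait (an assumption standing in for whatever chess players and strong math students might share) raises both the chance of playing chess and the math score, while chess itself has no effect on the score. Every number in it is made up for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the run is repeatable

# Each simulated child is a tuple: (has_trait, plays_chess, math_score).
children = []
for _ in range(20_000):
    has_trait = rng.random() < 0.5
    # The trait makes playing chess more likely...
    plays_chess = rng.random() < (0.6 if has_trait else 0.2)
    # ...and raises the math score. Chess does not appear in this formula.
    math_score = 60 + (15 if has_trait else 0) + rng.gauss(0, 10)
    children.append((has_trait, plays_chess, math_score))


def chess_gap(group):
    """Mean math score of chess players minus that of non-players."""
    players = [score for _, chess, score in group if chess]
    others = [score for _, chess, score in group if not chess]
    return mean(players) - mean(others)


overall_gap = chess_gap(children)
gap_with_trait = chess_gap([c for c in children if c[0]])
gap_without_trait = chess_gap([c for c in children if not c[0]])

print(f"Gap across all children:        {overall_gap:.1f} points")
print(f"Gap among children with trait:  {gap_with_trait:.1f} points")
print(f"Gap among children without it:  {gap_without_trait:.1f} points")
```

Across all children, chess players score noticeably higher, yet within each group the gap shrinks to roughly zero. Comparing like with like removes the apparent effect, which is the signature of a confounder.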

As you read each scenario, force the comparison question: 'compared to what?' That single move catches a large fraction of weak arguments. Then ask whether the data could be confounded, whether the sample reflects the relevant population, whether sample size and representativeness are being conflated (the 50,000-participant question in this exercise is built around exactly that confusion), and whether the observed pattern could plausibly run in the opposite causal direction. The explanations walk through each move on real examples. For deeper treatment, see Scientific Thinking, which covers experimental design, controls, and the inferential logic of evidence.
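The difference between sample size and representativeness can also be checked with a short simulation. The Python sketch below assumes a hypothetical population in which 40% of people have some trait, and a recruitment process that reaches people with the trait three times as often as people without it. The rates, sample sizes, and function names are all illustrative assumptions.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

TRUE_RATE = 0.40  # hypothetical share of the population with the trait


def draw_person():
    """Return True if a randomly drawn person has the trait."""
    return rng.random() < TRUE_RATE


def random_sample(n):
    """Everyone is equally likely to be included."""
    return [draw_person() for _ in range(n)]


def biased_sample(n, reach_if_trait=0.6, reach_if_not=0.2):
    """People with the trait are three times as likely to be included."""
    sample = []
    while len(sample) < n:
        person = draw_person()
        reach = reach_if_trait if person else reach_if_not
        if rng.random() < reach:
            sample.append(person)
    return sample


def estimate(sample):
    """Share of the sample that has the trait."""
    return sum(sample) / len(sample)


small_random = estimate(random_sample(500))
large_biased = estimate(biased_sample(50_000))

print(f"True rate:                     {TRUE_RATE:.1%}")
print(f"500-person random sample:      {small_random:.1%}")
print(f"50,000-person biased sample:   {large_biased:.1%}")
```

The small random sample lands within a few points of the true rate, while the biased sample of 50,000 settles confidently on a badly wrong answer. Adding participants shrinks random error but does nothing to correct a skewed selection process.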

Question 1 of 5 · 20% Complete

A pharmaceutical company promotes a new antidepressant, citing a study where 65% of patients improved after 8 weeks on the drug. A critical thinker should be MOST concerned about which missing piece of information?