The Normal Distribution
The normal (Gaussian) distribution is bell-shaped and symmetric around the mean μ. Many real-world measurements follow it approximately, which the Central Limit Theorem explains.
PDF and Standardisation
f(x) = 1/(σ√(2π)) × e^(−(x−μ)² / (2σ²))
To find probabilities, standardise to z:
z = (x - μ) / σ
Then look up Φ(z) (standard normal CDF)
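As a minimal sketch of the two steps above, the density and the standard normal CDF can be written with Python's standard library alone. The function names `normal_pdf` and `phi` are illustrative choices, and the inputs (x = 185, μ = 170, σ = 10) are picked only as an example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density f(x) of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def phi(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

# Illustrative values only: x = 185, mu = 170, sigma = 10
z = (185 - 170) / 10
print(z, round(phi(z), 4))  # prints: 1.5 0.9332
```

Using `math.erf` replaces the printed z-table lookup; the identity Φ(z) = ½(1 + erf(z/√2)) is standard.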
Probability Calculations
P(X < a) = Φ((a-μ)/σ)
P(X > a) = 1 - Φ((a-μ)/σ)
P(a < X < b) = Φ((b-μ)/σ) - Φ((a-μ)/σ)
Example (heights): μ = 170 cm, σ = 10 cm
P(height > 185 cm) = 1 − Φ(1.5) = 1 − 0.9332 = 0.0668 ≈ 6.7%
P(160
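The three probability formulas can be sketched with `statistics.NormalDist` from the Python standard library. The helper names (`p_less`, `p_greater`, `p_between`) are hypothetical, and the interval bounds of 165 cm and 175 cm are chosen purely for illustration.

```python
from statistics import NormalDist

# Height model from the example: mu = 170 cm, sigma = 10 cm
heights = NormalDist(mu=170, sigma=10)

def p_less(dist, a):
    """P(X < a)."""
    return dist.cdf(a)

def p_greater(dist, a):
    """P(X > a)."""
    return 1.0 - dist.cdf(a)

def p_between(dist, a, b):
    """P(a < X < b)."""
    return dist.cdf(b) - dist.cdf(a)

p_tall = p_greater(heights, 185)
print(f"P(height > 185 cm) = {p_tall:.4f}")  # prints 0.0668

# Bounds below are illustrative, not taken from the text
p_mid = p_between(heights, 165, 175)
print(f"P(165 cm < height < 175 cm) = {p_mid:.4f}")
```

`NormalDist.cdf` standardises internally, so no explicit z conversion is needed here.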
Empirical Rule (68-95-99.7)
- μ ± 1σ: 68.3% of data
- μ ± 2σ: 95.4% of data
- μ ± 3σ: 99.7% of data
- μ ± 1.96σ: 95.0% of data (to the precision shown; the exact multiplier is about 1.95996)
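These percentages follow from the identity P(|X − μ| < kσ) = erf(k/√2), which holds for any normal X. A short sketch that checks them (the function name `coverage` is an illustrative choice):

```python
import math

def coverage(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3, 1.96):
    print(f"mu ± {k}σ: {100 * coverage(k):.2f}%")
# prints 68.27%, 95.45%, 99.73% and 95.00% in that order
```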
Normal Distribution Quick-Reference Table
| Interval | Area (probability) | Common use |
|---|---|---|
| μ ± σ | 68.27% | One-sigma rule |
| μ ± 1.645σ | 90.00% | 90% CI / one-tail 5% |
| μ ± 1.96σ | 95.00% | Standard 95% CI |
| μ ± 2σ | 95.45% | Two-sigma rule |
| μ ± 2.576σ | 99.00% | 99% CI |
| μ ± 3σ | 99.73% | Process control limits |
| μ ± 4σ | 99.9937% | Four sigma (about 63 per million outside) |
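Going the other way, the multipliers in the table can be recovered from the inverse of the standard normal CDF. A minimal sketch using `statistics.NormalDist.inv_cdf`; the helper name `two_sided_z` is hypothetical.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal: mu = 0, sigma = 1

def two_sided_z(coverage):
    """Multiplier z such that mu ± z*sigma contains the given central area."""
    return std.inv_cdf(0.5 + coverage / 2)

for c in (0.90, 0.95, 0.99):
    print(f"{c:.0%} -> z = {two_sided_z(c):.3f}")
# prints 1.645, 1.960 and 2.576
```

A central area c leaves (1 − c)/2 in each tail, which is why the lookup is done at 0.5 + c/2 rather than at c.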
How the Normal Distribution Works
The normal (Gaussian) distribution N(μ, σ²) has PDF f(x) = 1/(σ√(2π)) × e^(−(x−μ)² / (2σ²)). It is completely described by its mean μ and standard deviation σ. The standard normal Z ~ N(0, 1) is used for probability lookups; any normal variate is standardised via z = (x − μ)/σ. Areas under the curve represent probabilities.
The Central Limit Theorem makes the normal distribution ubiquitous: the standardised sum (or mean) of many independent random variables with finite variance converges to normal, whatever the shape of the underlying distribution. This is why measurement errors, manufacturing tolerances, human heights, test scores, and financial returns over medium timescales all approximate the normal distribution: each is the aggregate effect of many small independent influences.
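A small simulation can make this concrete. The sketch below averages draws from an exponential distribution, which is strongly right-skewed, and compares the spread of the sample means with the CLT prediction. The sample size, number of repetitions and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so reruns give the same numbers

def sample_mean(n):
    """Mean of n draws from Exp(1), a strongly right-skewed distribution."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(5000)]

# Exp(1) has mean 1 and variance 1, so the CLT predicts means ~ N(1, 1/n).
predicted_sd = 1 / n ** 0.5
print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
print("CLT-predicted sd:    ", round(predicted_sd, 3))

# For a normal distribution about 68% of values lie within one sd of the mean.
within_1sd = sum(abs(m - 1) < predicted_sd for m in means) / len(means)
print("share within one predicted sd:", round(within_1sd, 3))
```

Even though no single draw is remotely normal, the share of sample means within one predicted standard deviation should land near the 68% of the empirical rule.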
Common Mistakes
- Assuming all data is normal: Financial returns are leptokurtic (fat tails — more extreme events than normal predicts). Income distributions are right-skewed. Always test normality before applying normal-distribution methods.
- Confusing one-sided and two-sided areas: The area between μ and μ+σ is 34.13%, not 68.27%. The 68% rule covers the symmetric interval μ−σ to μ+σ.
- Using the normal for small samples with unknown σ: For n < 30 (or even n < 100) with σ estimated from data, use the t-distribution, which has heavier tails to account for additional uncertainty in the σ estimate.
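One quick way to spot non-normal data is to look at sample skewness and excess kurtosis, both of which are close to 0 for normal data. The sketch below is a crude descriptive check under that assumption, not a formal normality test; the function name and the simulated data sets are illustrative.

```python
import random
import statistics

def skew_and_excess_kurtosis(data):
    """Moment-based sample skewness and excess kurtosis (both ~0 for normal data)."""
    n = len(data)
    m = statistics.fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

random.seed(1)  # arbitrary seed for repeatability
normal_like = [random.gauss(0, 1) for _ in range(10000)]
right_skewed = [random.lognormvariate(0, 1) for _ in range(10000)]

print("normal sample:      ", skew_and_excess_kurtosis(normal_like))
print("right-skewed sample:", skew_and_excess_kurtosis(right_skewed))
```

Large positive skewness points to a right-skewed distribution such as income, and large positive excess kurtosis points to fat tails such as those in financial returns.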
Frequently Asked Questions
The Central Limit Theorem explains why so many quantities look normal: when a quantity is the result of many small, additive, independent influences, the sum tends to normal almost regardless of the shape of each individual contribution. Human height is the cumulative genetic and environmental effect of hundreds of gene variants and growth factors, each small and roughly additive, so height is approximately normally distributed.
If log(X) is normally distributed, X is lognormal. Lognormal distributions are always positive and right-skewed. They arise when quantities are the result of multiplicative (not additive) processes: stock prices (returns compound multiplicatively), income (percentage raises multiply), particle sizes in grinding, and biological cell growth. In financial options pricing, the Black-Scholes model assumes that the underlying asset price is lognormally distributed.
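The positivity and right skew can be seen by exponentiating normal draws. In this sketch the parameters of log(X) (μ = 0, σ = 0.5), the sample size and the seed are all hypothetical; the theoretical median exp(μ) and mean exp(μ + σ²/2) are standard lognormal results.

```python
import math
import random
import statistics

random.seed(2)          # arbitrary seed for repeatability
mu, sigma = 0.0, 0.5    # hypothetical parameters of log(X)

# X = exp(Y) with Y ~ N(mu, sigma^2) is lognormal by definition.
xs = [math.exp(random.gauss(mu, sigma)) for _ in range(20000)]

theory_median = math.exp(mu)
theory_mean = math.exp(mu + sigma ** 2 / 2)

print("all positive:", min(xs) > 0)
print("median:", round(statistics.median(xs), 3), "theory:", round(theory_median, 3))
print("mean:  ", round(statistics.fmean(xs), 3), "theory:", round(theory_mean, 3))
```

The mean sitting above the median is the signature of the right skew: a few large values pull the average up.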
The normal and t-distributions are both symmetric and bell-shaped. The t-distribution has heavier tails, controlled by its degrees of freedom (df = n − 1 for a one-sample mean). As df → ∞, t → standard normal. For df = 30, the two-sided 95% critical value is 2.042 (vs. 1.96 for the normal), a small but meaningful difference. Use t when σ is estimated from the sample; use the normal when σ is known (rare in practice outside physics).
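The convergence of t critical values towards 1.96 can be checked numerically. Python's standard library has no t-distribution, so the sketch below integrates the Student's t density with Simpson's rule and finds the critical value by bisection. This is an illustrative approach with hypothetical function names; a statistics library would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t with df degrees of freedom (lgamma avoids overflow)."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """CDF for t >= 0: Simpson's rule on [0, t] plus 0.5 from symmetry about 0."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, coverage=0.95):
    """Two-sided critical value, found by bisection on the CDF."""
    target = 0.5 + coverage / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (5, 30, 1000):
    print(df, round(t_critical(df), 3))
```

For df = 30 this reproduces the 2.042 quoted above, and for large df the result approaches the normal value of 1.96.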