Standard Deviation
Standard deviation measures how spread out data is around the mean. Low SD means values cluster tightly; high SD means they're spread widely.
Formulas
Mean: x̄ = Σxᵢ / n
Population SD: σ = √[Σ(xᵢ − μ)² / N], where μ is the population mean
Sample SD: s = √[Σ(xᵢ - x̄)² / (n-1)]
Use population (σ) when you have ALL data.
Use sample (s) when data is a sample from a larger group.
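As a rough illustration of the two formulas, Python's standard `statistics` module provides both: `pstdev` divides by N and `stdev` divides by n − 1. The data values below are made up for this sketch.

```python
import statistics

# Hypothetical data set used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population SD: treats `data` as the entire population (divides by N).
sigma = statistics.pstdev(data)

# Sample SD: treats `data` as a sample from a larger group (divides by n - 1).
s = statistics.stdev(data)

print(f"population SD = {sigma:.4f}")
print(f"sample SD     = {s:.4f}")
```

For the same data the sample SD is always slightly larger than the population SD, because it divides the same sum of squares by a smaller number.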
Step-by-Step Example
Data: {4, 7, 13, 2, 1} n=5
Mean = (4+7+13+2+1)/5 = 27/5 = 5.4
Squared deviations: (4−5.4)² = 1.96, (7−5.4)² = 2.56,
(13−5.4)² = 57.76, (2−5.4)² = 11.56, (1−5.4)² = 19.36
Sample variance = (1.96+2.56+57.76+11.56+19.36)/(5−1) = 93.2/4 = 23.3
Sample SD (s) = √23.3 ≈ 4.83
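The same steps can be sketched in a few lines of Python. This is a minimal illustration that follows the hand calculation; the variable names are chosen for this example only.

```python
import math

data = [4, 7, 13, 2, 1]
n = len(data)

# Step 1: mean
mean = sum(data) / n

# Step 2: squared deviation of each value from the mean
squared_devs = [(x - mean) ** 2 for x in data]

# Step 3: sample variance divides the sum of squares by n - 1
sample_variance = sum(squared_devs) / (n - 1)

# Step 4: sample SD is the square root of the variance
sample_sd = math.sqrt(sample_variance)

print(round(mean, 2), round(sample_variance, 2), round(sample_sd, 2))
```

Rounded to two decimals, the results should match the hand-calculated mean, variance and SD.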
Interpreting SD
- Normal distribution (68-95-99.7 rule): about 68% of data within ±1σ, 95% within ±2σ, 99.7% within ±3σ
- Coefficient of variation: CV = (s/x̄) × 100% (relative spread)
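To make the coefficient of variation concrete, here is a small sketch. The helper name `coefficient_of_variation` and both data sets are hypothetical and chosen only to show two very different scales.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample CV (s / mean) as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100

# Hypothetical measurements on very different scales.
small_scale = [4, 5, 6]          # mean 5, s = 1
large_scale = [990, 1000, 1010]  # mean 1000, s = 10

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)
print(f"{cv_small:.1f}% vs {cv_large:.1f}%")
```

The second data set has the larger SD in absolute terms, yet its relative spread is far smaller.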
Standard Deviation Quick-Reference Table
| Rule | % of data within range | Range (μ ± kσ) |
|---|---|---|
| 1σ rule | 68.27% | μ − σ to μ + σ |
| 2σ rule | 95.45% | μ − 2σ to μ + 2σ |
| 3σ rule | 99.73% | μ − 3σ to μ + 3σ |
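| ±1.96σ | 95.00% | μ − 1.96σ to μ + 1.96σ (used for 95% confidence intervals) |
| ±2.576σ | 99.00% | μ − 2.576σ to μ + 2.576σ (used for 99% confidence intervals) |
| 6σ (Six Sigma) | 99.99966% (assumes the conventional 1.5σ mean shift) | μ − 6σ to μ + 6σ (manufacturing defect limit, 3.4 defects per million) |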
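The percentages in the first five rows follow directly from the normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `coverage` is an assumption for this example):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Fraction of a normal distribution lying within mean ± k·sigma."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3, 1.96, 2.576):
    print(f"±{k}σ: {coverage(k) * 100:.2f}%")
```

These figures hold only for normally distributed data; other distributions give different coverage for the same multiples of σ.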
How Standard Deviation Works
Standard deviation (σ for population, s for sample) measures how spread out data is around the mean. For a sample: s = √[Σ(xᵢ − x̄)² / (n−1)]. The (n−1) denominator (Bessel's correction) makes the sample variance s² an unbiased estimator of the population variance σ²; s itself still slightly underestimates σ on average, but the bias shrinks as n grows. Standard deviation has the same units as the data; variance has squared units.
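The effect of Bessel's correction can be checked exhaustively on a tiny example. The sketch below assumes a made-up population of four values and lists every possible sample of size 2 drawn with replacement, then averages the variance computed with each divisor.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Tiny hypothetical population, small enough to enumerate every sample.
population = [1, 2, 3, 4]
true_variance = pvariance(population)   # divides by N

# Every ordered sample of size 2 drawn with replacement (16 in total).
samples = list(product(population, repeat=2))

# Average sample variance with the n - 1 divisor (Bessel's correction).
avg_with_n_minus_1 = mean(variance(s) for s in samples)

# Average sample variance with the plain n divisor.
avg_with_n = mean(pvariance(s) for s in samples)

print(true_variance, avg_with_n_minus_1, avg_with_n)
```

Averaged over all samples, the n − 1 version recovers the population variance, while the n version comes out too small (here by a factor of (n − 1)/n = 1/2).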
In a normal distribution, the 68-95-99.7 rule (empirical rule) describes what percentage of data falls within 1, 2, or 3 standard deviations of the mean. Standard deviation is used in finance (volatility of returns), quality control (Six Sigma), clinical trials (statistical significance), polling (margin of error), and educational testing (normalising scores).
Common Mistakes
- Population vs. sample formula: Use N in the denominator for a full population; n−1 (Bessel's correction) for a sample used to estimate the population σ. Spreadsheets: STDEV or STDEV.S (sample) vs STDEVP or STDEV.P (population).
- Interpreting SD without context: An SD of 10 is large if the mean is 5, but tiny if the mean is 1,000. Use the Coefficient of Variation (CV = σ/μ × 100%) to compare dispersion across different scales.
- Assuming normality: The 68-95-99.7 rule only applies to normally distributed data. Skewed or heavy-tailed distributions (like financial returns) can have far more extreme values beyond ±2σ than the normal rule suggests.
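To see how much the tails can differ, the sketch below compares the normal distribution with a Laplace distribution, used here purely as an illustrative heavy-tailed example (it is an assumption of this sketch, not a model of any particular data set). For a Laplace distribution with scale b, σ = b√2 and P(|X − μ| > t) = exp(−t/b).

```python
import math
from statistics import NormalDist

def normal_tail(k):
    """P(|X - mean| > k·sigma) for a normal distribution."""
    return 2 * (1 - NormalDist().cdf(k))

def laplace_tail(k):
    """P(|X - mean| > k·sigma) for a Laplace distribution.

    With scale b, sigma = b·sqrt(2), so the tail beyond k·sigma
    is exp(-k·sqrt(2)) regardless of b.
    """
    return math.exp(-k * math.sqrt(2))

for k in (2, 3):
    print(f"beyond ±{k}σ: normal {normal_tail(k):.4f}, Laplace {laplace_tail(k):.4f}")
```

With the same standard deviation, the heavier-tailed distribution puts noticeably more probability beyond ±2σ and several times more beyond ±3σ.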
Frequently Asked Questions
What is the difference between standard deviation and standard error?
Standard deviation describes spread in the original data. Standard error of the mean (SEM = s/√n) describes how much the sample mean is likely to vary from the true population mean. As n increases, SEM decreases (larger samples give more precise mean estimates) while SD stays approximately constant. Error bars in research graphs should be labelled to show whether they represent SD (data spread) or SEM (mean precision).
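A short sketch of the SD versus SEM distinction follows. The helper name `standard_error` and the data are hypothetical; the larger sample is just the smaller one repeated four times, an artificial way to grow n while keeping the spread similar.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical sample, then the same values repeated to quadruple n.
sample = [4, 7, 13, 2, 1]
bigger_sample = sample * 4

for values in (sample, bigger_sample):
    sd = statistics.stdev(values)
    sem = standard_error(values)
    print(f"n={len(values)}: SD={sd:.2f}, SEM={sem:.2f}")
```

The SD changes only a little between the two runs, while the SEM drops to roughly half, as expected when n is multiplied by four.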
How is standard deviation used in finance?
Volatility is measured as the standard deviation of monthly (or daily) returns, annualised by multiplying by √12 (or by √252 for daily returns over trading days). If returns were normally distributed, a stock with 20% annual volatility would have returns within ±20% of its mean in about 68% of years and within ±40% in about 95% of years. Higher σ means more risk; portfolio theory uses σ to quantify diversification benefits.
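The annualisation step can be sketched as below. The function name `annualised_volatility` and the monthly returns are made up for illustration, and the square-root scaling assumes returns in different periods are uncorrelated.

```python
import math
import statistics

def annualised_volatility(returns, periods_per_year):
    """Sample SD of periodic returns scaled by sqrt(periods per year)."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)

# Made-up monthly returns expressed as fractions (0.02 means 2%).
monthly_returns = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04,
                   0.00, 0.01, -0.03, 0.02, 0.01, -0.01]

vol = annualised_volatility(monthly_returns, periods_per_year=12)
print(f"annualised volatility ≈ {vol:.1%}")
```

For daily returns the same function would be called with `periods_per_year=252`.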
What is Six Sigma?
Six Sigma targets a defect rate of 3.4 per million opportunities. That figure corresponds to specification limits set ±6σ from the target, with an assumed 1.5σ drift of the process mean over time (so the nearer limit ends up 4.5σ away). This is far more stringent than ±3σ limits, which allow about 2,700 defects per million even for a perfectly centred process. Achieving Six Sigma requires reducing process variation until the specification limits are 12 standard deviations apart, a fundamental goal of Lean Six Sigma quality programmes.
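Both defect rates can be reproduced from the normal distribution. In this sketch the function name `defects_per_million` and its `mean_shift` parameter are assumptions made for the example, and the process output is assumed to be normally distributed.

```python
from statistics import NormalDist

def defects_per_million(sigma_level, mean_shift=0.0):
    """Expected defects per million for spec limits at ±sigma_level·σ.

    `mean_shift` moves the process mean toward the upper limit,
    measured in units of σ.
    """
    z = NormalDist()
    below_lower = z.cdf(-sigma_level - mean_shift)
    above_upper = 1 - z.cdf(sigma_level - mean_shift)
    return (below_lower + above_upper) * 1_000_000

# Centred process with ±3σ limits.
print(round(defects_per_million(3)))

# ±6σ limits with the conventional 1.5σ mean shift.
print(round(defects_per_million(6, mean_shift=1.5), 1))
```

Without the shift, ±6σ limits on a perfectly centred normal process would give only about 0.002 defects per million, which is why the 1.5σ convention matters when quoting the 3.4 figure.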