Z-Scores and the Normal Distribution
A z-score tells you how many standard deviations a value is above or below the mean. Converting to z-scores lets you compare values from different distributions and find probabilities.
Z-Score Formula
z = (x - μ) / σ
x = individual value
μ = population mean
σ = population standard deviation
Worked Example
Exam scores: μ=65, σ=12
Student scored 83:
z = (83-65)/12 = 18/12 = 1.5
z = 1.5 → 93.3rd percentile
(93.3% of students scored below this student)
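The worked example above can be reproduced in a few lines. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `z_score` and `percentile_from_z` are illustrative choices, not part of any library, and the percentile step assumes the exam scores are approximately normal.

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def percentile_from_z(z: float) -> float:
    """Percentage of a normal population that falls below the given z-score."""
    return 100 * NormalDist().cdf(z)  # NormalDist() defaults to mean 0, SD 1


z = z_score(83, mu=65, sigma=12)
pct = percentile_from_z(z)
print(f"z = {z:.2f}, percentile = {pct:.1f}")  # z = 1.50, percentile = 93.3
```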
Key Z-Score Percentiles
z = -2.00 → 2.28th percentile
z = -1.00 → 15.87th percentile
z = 0 → 50th percentile (median)
z = +1.00 → 84.13th percentile
z = +1.645 → 95th percentile
z = +1.96 → 97.5th percentile (95% CI boundary)
z = +2.33 → 99th percentile
z = +3.00 → 99.87th percentile
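Going the other way, from a percentile to its z-score, uses the inverse of the standard normal CDF. The sketch below relies on `NormalDist.inv_cdf` from the standard library; `z_for_percentile` is a hypothetical helper name introduced for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def z_for_percentile(percentile: float) -> float:
    """z-score below which the given percentage of a normal population falls."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must be strictly between 0 and 100")
    return std_normal.inv_cdf(percentile / 100)


# Print the z-score for a few of the percentiles listed above.
for p in (2.28, 50, 95, 97.5, 99):
    print(f"{p}th percentile -> z = {z_for_percentile(p):+.3f}")
```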
Applications
- Test standardisation (SAT, IQ scores)
- Quality control (Six Sigma: 3.4 defects per million, which is the tail beyond z = 4.5 once the conventional 1.5σ long-term shift is subtracted from 6σ)
- Finance: z-score credit model (Altman Z-score)
- Hypothesis testing: compare observed z to critical value
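For the quality-control item, the 3.4 defects per million figure quoted for Six Sigma conventionally assumes the process mean drifts by 1.5σ over the long term, so it corresponds to the one-sided tail beyond z = 4.5 rather than z = 6. A small sketch, assuming a normally distributed process and a single specification limit (`defects_per_million` is an illustrative name):

```python
from statistics import NormalDist


def defects_per_million(z: float) -> float:
    """One-sided defect rate, per million, for a spec limit z SDs from the mean."""
    return (1 - NormalDist().cdf(z)) * 1_000_000


# Six Sigma with the conventional 1.5-sigma shift: effective z = 6 - 1.5 = 4.5
print(f"z = 4.5: {defects_per_million(4.5):.1f} defects per million")

# A limit a full 6 SDs from the mean, with no shift, is far rarer.
print(f"z = 6.0: {defects_per_million(6.0):.4f} defects per million")
```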
Z-Score Quick-Reference Table
| Z-score | % below (left tail) | % above (right tail) | Interpretation |
|---|---|---|---|
| −3.0 | 0.13% | 99.87% | Extremely low |
| −2.0 | 2.28% | 97.72% | Well below average |
| −1.0 | 15.87% | 84.13% | Below average |
| 0.0 | 50.00% | 50.00% | Average (mean) |
| +1.0 | 84.13% | 15.87% | Above average |
| +1.645 | 95.00% | 5.00% | One-tail 95% critical value |
| +1.96 | 97.50% | 2.50% | Two-tail 95% critical value |
| +3.0 | 99.87% | 0.13% | Extremely high |
How Z-Scores Work
A z-score (z = (x − μ) / σ) standardises a value by expressing it as the number of standard deviations above or below the mean. Z-scores from different distributions are directly comparable. They are the foundation of hypothesis testing: the test statistic for a large-sample proportion test or known-σ mean test is a z-score, which is looked up in the standard normal table to find the p-value.
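As an illustration of the known-σ mean test, the sketch below computes the test statistic and a two-tailed p-value. The sample figures (36 students averaging 69 against a hypothesised mean of 65 with σ = 12) are made up for the example, and `z_test_mean` is a hypothetical helper rather than a library function.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean: float, mu0: float, sigma: float, n: int):
    """Return (z, two-tailed p-value) for H0: population mean == mu0, sigma known."""
    standard_error = sigma / sqrt(n)  # SD of the sample mean
    z = (sample_mean - mu0) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Hypothetical data: n = 36, sample mean 69, H0 mean 65, known sigma 12
z, p = z_test_mean(sample_mean=69, mu0=65, sigma=12, n=36)
print(f"z = {z:.2f}, p = {p:.4f}")  # z = 2.00, p = 0.0455
```

Note that the denominator is the standard error σ/√n, not σ itself, because the quantity being standardised is a sample mean rather than a single observation.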
Uses include: comparing test scores across different exams (SAT vs. ACT), identifying outliers in datasets (|z| > 3 is a common threshold), standardising features before machine learning (z-score normalisation), quality control (process capability indices Cp and Cpk are z-score based), and medical diagnostic reference ranges (paediatric growth charts express height and weight as z-scores).
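The |z| > 3 outlier rule mentioned above could be sketched as follows. The data are invented for illustration, and `find_outliers` is a hypothetical helper; it uses the population standard deviation (`pstdev`) of the data itself, which is one of several reasonable choices.

```python
from statistics import fmean, pstdev


def find_outliers(values, threshold=3.0):
    """Return the values whose |z-score| exceeds the threshold."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:  # all values identical, so nothing can be an outlier
        return []
    return [x for x in values if abs((x - mu) / sigma) > threshold]


# Invented measurements: nineteen values near 50 and one suspicious reading
data = [50, 52, 49, 51, 50, 48, 53, 50, 51, 49,
        52, 50, 47, 51, 50, 49, 52, 50, 51, 120]
print(find_outliers(data))  # [120]
```

One caveat with this approach: an extreme value inflates the mean and standard deviation it is measured against, so in very small samples an outlier may never reach |z| > 3.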
Common Mistakes
- Using sample statistics for a population z-score: The textbook z = (x − μ) / σ uses population parameters. When σ is unknown and estimated from a sample, a test on the mean should use the t-distribution instead of the standard normal, especially for small samples.
- Confusing one-tail and two-tail p-values: A z-score of 1.96 gives a one-tail p-value of 2.5%, not 5%. For a two-tailed test at α = 0.05, you need |z| ≥ 1.96.
- Applying z-scores to non-normal distributions: Z-scores only map to probabilities via the standard normal table when the underlying distribution is approximately normal. For skewed data, the z-score location is still valid but the tail probability interpretation is not.
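The one-tail versus two-tail distinction is easy to check numerically. A minimal sketch, assuming the test statistic follows the standard normal distribution (`p_values` is an illustrative name):

```python
from statistics import NormalDist


def p_values(z: float) -> dict:
    """One-tailed and two-tailed p-values for a standard normal test statistic."""
    tail = 1 - NormalDist().cdf(abs(z))  # area beyond |z| in a single tail
    return {"one_tailed": tail, "two_tailed": 2 * tail}


p = p_values(1.96)
print(f"one-tailed: {p['one_tailed']:.4f}")  # one-tailed: 0.0250
print(f"two-tailed: {p['two_tailed']:.4f}")  # two-tailed: 0.0500
```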
Frequently Asked Questions
How are z-scores used to compare standardised test scores?
Standardised test scores can be put on a common scale by converting them to z-scores. Taking the SAT as having mean ≈ 1050 and σ ≈ 200, a score of 1250 gives z = (1250−1050)/200 = +1.0, the 84th percentile (roughly the top 16%). Taking the ACT as having mean ≈ 21 and σ ≈ 5, a score of 26 gives z = (26−21)/5 = +1.0, also the 84th percentile. The z-score directly reveals relative performance across tests.
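The comparison can be sketched with two `NormalDist` objects, using the approximate means and standard deviations quoted above. It assumes both score distributions are roughly normal; the variable names are illustrative.

```python
from statistics import NormalDist

# Approximate parameters as quoted in the text
sat = NormalDist(mu=1050, sigma=200)
act = NormalDist(mu=21, sigma=5)

sat_score, act_score = 1250, 26

# Standardise each score against its own distribution
sat_z = (sat_score - sat.mean) / sat.stdev
act_z = (act_score - act.mean) / act.stdev

print(f"SAT {sat_score}: z = {sat_z:+.2f}, percentile = {100 * sat.cdf(sat_score):.1f}")
print(f"ACT {act_score}: z = {act_z:+.2f}, percentile = {100 * act.cdf(act_score):.1f}")
```

Both lines report z = +1.00 and a percentile of about 84.1, so the two scores represent the same relative standing.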
Why is z-score normalisation used in machine learning?
Many ML algorithms (k-NN, SVM, linear regression, neural networks) are sensitive to feature scales. Z-score normalisation (subtracting the mean, then dividing by the SD) brings all features to the same scale: mean 0, SD 1. Without it, a feature measured in thousands (income) would dominate one measured in units (number of children) in distance-based models.
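A minimal, dependency-free sketch of z-score normalisation for the income and number-of-children example follows. The data are invented, and `standardise` is a hypothetical helper; in practice a library scaler would usually be fitted on training data only and then reused on new data.

```python
from statistics import fmean, pstdev


def standardise(column):
    """Rescale a list of numbers to mean 0 and (population) SD 1."""
    mu = fmean(column)
    sigma = pstdev(column)
    if sigma == 0:  # constant feature: carries no information, map to zeros
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Invented features on very different scales
income = [32_000, 45_000, 58_000, 71_000, 94_000]
children = [0, 2, 1, 3, 1]

income_z = standardise(income)
children_z = standardise(children)

print([round(v, 2) for v in income_z])
print([round(v, 2) for v in children_z])
```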
What is the Altman Z-score?
The Altman Z-score is a financial model (not a statistical z-score) that predicts corporate bankruptcy risk using five financial ratios. Z > 2.99 = safe zone; 1.81–2.99 = grey zone; < 1.81 = distress zone. Developed in 1968 by Edward Altman, it has a reported accuracy of roughly 72–80% in predicting bankruptcy two years in advance and is widely used by credit analysts and investment banks.
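For illustration, the sketch below combines five ratios into a score and maps it to the zones listed above. The text does not give the weights; the coefficients used here are those of Altman's original 1968 model for publicly traded manufacturers, and later variants use different weights and cut-offs. The company figures and function names are hypothetical.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Altman Z-score using the original 1968 coefficients (an assumption here)."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def altman_zone(z):
    """Map a score to the zones quoted in the text."""
    if z > 2.99:
        return "safe"
    if z >= 1.81:
        return "grey"
    return "distress"


# Hypothetical company, all figures in the same currency unit
z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_value_equity=800, sales=1200,
             total_assets=1000, total_liabilities=500)
print(f"Z = {z:.3f} -> {altman_zone(z)} zone")
```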