Correlation Coefficient Calculator: Pearson r and Interpretation

Calculate Pearson's correlation coefficient r between two variables. Interpret r values from +1 to -1, test for significance, and understand the difference between correlation and causation.

Pearson Correlation Coefficient

Pearson's r measures the strength and direction of the linear relationship between two continuous variables, ranging from -1 (perfect negative) to +1 (perfect positive), with 0 meaning no linear relationship.

Formula

r = [nΣxy - ΣxΣy] / √{[nΣx² - (Σx)²][nΣy² - (Σy)²]}

Or equivalently:
r = Σ[(xᵢ-x̄)(yᵢ-ȳ)] / √[Σ(xᵢ-x̄)² × Σ(yᵢ-ȳ)²]
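Both forms can be computed directly. The following is a minimal Python sketch, not a reference implementation; the function names `pearson_r` and `pearson_r_raw_sums` and the sample data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the deviation-score formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)

def pearson_r_raw_sums(xs, ys):
    """The same quantity via the raw-sums (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator

# Made-up illustrative data (not from a real study)
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

print(round(pearson_r(hours, score), 4))           # 0.7746
print(round(pearson_r_raw_sums(hours, score), 4))  # 0.7746
```

The deviation form is usually the safer choice numerically, because the raw-sums form subtracts large, nearly equal numbers when the values sit far from zero.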

Interpretation Scale

|r| = 0.90–1.00: Very strong
|r| = 0.70–0.89: Strong
|r| = 0.50–0.69: Moderate
|r| = 0.30–0.49: Weak
|r| = 0.00–0.29: Negligible
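As a rough sketch, the scale above can be turned into a small lookup. The function name `describe_r` is hypothetical, and the cutoffs are simply the lower edges of the bands listed above.

```python
def describe_r(r):
    """Label a correlation using the bands listed above (applied to |r|)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size >= 0.90:
        strength = "very strong"
    elif size >= 0.70:
        strength = "strong"
    elif size >= 0.50:
        strength = "moderate"
    elif size >= 0.30:
        strength = "weak"
    else:
        strength = "negligible"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction

print(describe_r(0.8))    # ('strong', 'positive')
print(describe_r(-0.35))  # ('weak', 'negative')
```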

R² — Coefficient of Determination

R² = r²
r = 0.8 → R² = 0.64
Interpretation: 64% of the variation in Y is explained by the linear relationship with X
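To see why r² can be read as "variance explained", the sketch below fits a least-squares line and compares 1 − SS_res/SS_tot with r². It is an illustration under assumptions: the helper names and the data are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the deviation-score formula."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def regression_r_squared(xs, ys):
    """R² = 1 - SS_res / SS_tot for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(round(pearson_r(xs, ys) ** 2, 4))        # 0.6
print(round(regression_r_squared(xs, ys), 4))  # 0.6
```

The two numbers agree for simple linear regression with one predictor; with several predictors, R² is no longer the square of a single pairwise r.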

Critical Warnings

  • Correlation ≠ causation (spurious correlations exist)
  • Sensitive to outliers — one point can change r dramatically
  • Only measures linear relationships (misses curves)
  • Test significance: t = r√(n-2)/√(1-r²), df = n-2 (tests H₀: ρ = 0)
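The outlier warning is easy to demonstrate. In this sketch (made-up data, hypothetical helper name `pearson_r`), adding a single extreme point flips a fairly strong positive correlation into a negative one.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the deviation-score formula."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Five made-up points with a clear upward trend
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_without = pearson_r(xs, ys)

# The same data plus one extreme point at (10, 0)
r_with = pearson_r(xs + [10], ys + [0])

print(round(r_without, 2))  # 0.77
print(round(r_with, 2))     # -0.55
```

A scatter plot would show the extra point immediately, which is one reason to plot the data before trusting r.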

Calculate correlation: Free Correlation Coefficient Calculator

Correlation Coefficient Quick-Reference Table

r value | Strength | Direction | Example
+0.90 to +1.00 | Very strong | Positive | Height vs. arm span
+0.70 to +0.89 | Strong | Positive | Study hours vs. grade
+0.40 to +0.69 | Moderate | Positive | Exercise vs. fitness
+0.10 to +0.39 | Weak | Positive | Income vs. happiness
−0.10 to +0.10 | Negligible | None | Shoe size vs. IQ
−0.40 to −0.69 | Moderate | Negative | Stress vs. sleep quality
−0.90 to −1.00 | Very strong | Negative | Altitude vs. oxygen level

These bands differ slightly from the Interpretation Scale above; cutoffs are conventions and vary by field.

How the Pearson Correlation Coefficient Works

Pearson's r = Σ[(xᵢ−x̄)(yᵢ−ȳ)] / [√Σ(xᵢ−x̄)² × √Σ(yᵢ−ȳ)²]. It measures the strength and direction of the linear relationship between two continuous variables, ranging from −1 (perfect negative linear) through 0 (no linear relationship) to +1 (perfect positive linear). r² (coefficient of determination) is the proportion of variance in y explained by x in simple linear regression.

Spearman's rank correlation (ρ) is a non-parametric alternative that uses ranks instead of raw values. It is appropriate for ordinal data, for monotonic relationships that are not linear, or when outliers would distort Pearson's r. Point-biserial correlation handles one dichotomous and one continuous variable. For two categorical variables, use Cramér's V or the phi coefficient (phi applies to 2×2 tables).

Common Mistakes

  • Assuming causation from correlation: Countries with more TVs per capita have higher life expectancy — but TVs don't cause longevity. Both are driven by wealth. Always consider confounding variables before inferring causation.
  • Using Pearson r for non-linear relationships: r measures only linear association. Two perfectly related variables following a U-curve may give r = 0. Always plot a scatter graph first.
  • Ignoring statistical significance: A correlation of r = 0.8 is not necessarily significant if n = 3 (df = 1). Test H₀: ρ = 0 using t = r√(n−2)/√(1−r²) with df = n−2. With n = 10, r = 0.8 gives t = 3.77 (p < 0.01).
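The last two mistakes can be checked numerically. This is a sketch under assumptions: the helper names are made up, the U-shaped data are synthetic, and the two critical values are the two-tailed 5% entries from a standard t table.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the deviation-score formula."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2."""
    if n < 3:
        raise ValueError("need at least 3 points (df = n - 2 must be positive)")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# A perfect U-shaped relationship (y = x^2) that Pearson's r misses entirely
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
print(pearson_r(xs, ys))  # 0.0

# Two-tailed 5% critical values from a standard t table, keyed by df
T_CRIT_05 = {1: 12.706, 8: 2.306}

for n in (3, 10):
    t = t_statistic(0.8, n)
    significant = abs(t) > T_CRIT_05[n - 2]
    print(n, round(t, 2), significant)
# 3 1.33 False
# 10 3.77 True
```

The same r = 0.8 fails the test at n = 3 and passes it at n = 10, which is the point of checking significance rather than reading r alone.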

Frequently Asked Questions

Q: How many data points do I need for a reliable correlation?

A minimum of n = 30 is a commonly cited rule of thumb for Pearson's r, but the sample size you actually need depends on the effect you want to detect. To detect r = 0.3 (a medium effect by Cohen's convention) with 80% power at two-sided α = 0.05, you need n ≈ 84. Small samples produce unstable r estimates with very wide confidence intervals: using the Fisher z-transformation, r = 0.5 from n = 10 has a 95% CI of roughly [−0.19, 0.86], which spans zero and is almost useless for interpretation.
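One common way to get a confidence interval and a rough sample-size estimate is the Fisher z-transformation. The sketch below uses only the Python standard library; the function names `fisher_ci` and `approx_sample_size` are hypothetical, and both rely on a normal approximation, so the sample size can differ by a point or so from figures produced by exact power software.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for r via the Fisher z-transformation."""
    if n < 4:
        raise ValueError("need n >= 4 for the Fisher z standard error")
    z = math.atanh(r)                 # transform r to the z scale
    se = 1 / math.sqrt(n - 3)         # standard error on the z scale
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

def approx_sample_size(r, alpha=0.05, power=0.80):
    """Approximate n to detect a correlation r (two-sided test of rho = 0)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)

low, high = fisher_ci(0.5, 10)
print(round(low, 2), round(high, 2))

print(approx_sample_size(0.3))
```

Because the interval is built on the z scale and transformed back, it is asymmetric around r and can never leave the range −1 to +1.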

Q: What is the difference between Pearson and Spearman correlation?

Pearson r measures linear association and requires interval or ratio data; the usual significance test additionally assumes approximate bivariate normality. Spearman ρ converts the data to ranks and measures monotonic (not necessarily linear) association, so it is valid for ordinal data and much less sensitive to outliers. Use Spearman when data are ordinal (Likert scales, rankings), when outliers are present, or when the relationship appears curved but monotonic on a scatter plot.
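A minimal sketch of the rank-based calculation, assuming the usual convention of giving tied values their average rank. The helper names are made up, and the data are a synthetic curved-but-monotonic example (y = x³).

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the deviation-score formula."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # curved but strictly increasing

print(round(pearson_r(xs, ys), 3))  # 0.943
print(spearman_rho(xs, ys))         # 1.0
print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Spearman reports a perfect monotonic relationship, while Pearson is pulled below 1 because the points do not lie on a straight line.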

Q: How are correlations used in portfolio management?

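Portfolio variance = w₁²σ₁² + w₂²σ₂² + 2w₁w₂σ₁σ₂ρ₁₂, where w₁ and w₂ are the portfolio weights, σ₁ and σ₂ are the asset volatilities, and ρ₁₂ is the correlation between the two assets' returns. When ρ₁₂ < 1, portfolio volatility is lower than the weighted average of the individual volatilities, which is the core of diversification theory. Perfect negative correlation (ρ₁₂ = −1) would theoretically allow all risk to be eliminated by choosing w₁ = σ₂/(σ₁ + σ₂). In practice, asset correlations tend to increase during market crises (for example, in a flight to safety), reducing diversification benefits precisely when they are most needed.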
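The effect of ρ on a two-asset portfolio can be sketched in a few lines. The function name `portfolio_volatility` and the volatilities used (20% and 10%) are made-up illustrations, not market data.

```python
import math

def portfolio_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weights w1 and 1 - w1."""
    w2 = 1 - w1
    variance = (w1 ** 2 * sigma1 ** 2
                + w2 ** 2 * sigma2 ** 2
                + 2 * w1 * w2 * sigma1 * sigma2 * rho)
    # Guard against a tiny negative value caused by floating-point rounding
    return math.sqrt(max(variance, 0.0))

sigma1, sigma2 = 0.20, 0.10  # hypothetical annual volatilities

# Equal weights: volatility falls as the correlation falls
for rho in (1.0, 0.5, 0.0, -1.0):
    print(rho, round(portfolio_volatility(0.5, sigma1, sigma2, rho), 4))
# 1.0 0.15
# 0.5 0.1323
# 0.0 0.1118
# -1.0 0.05

# With rho = -1, the weight w1 = sigma2 / (sigma1 + sigma2) removes all risk
w_hedge = sigma2 / (sigma1 + sigma2)
print(round(portfolio_volatility(w_hedge, sigma1, sigma2, -1.0), 6))  # 0.0
```

At ρ = 1 the result equals the weighted average of the two volatilities (0.15), so any lower correlation is where the diversification benefit comes from.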