Linear Regression Calculator: Slope, Intercept, and R²

Calculate the linear regression line (y = mx + b), its slope and y-intercept, and the R-squared coefficient of determination. Understand how well the line fits your data.

Linear Regression: Fitting a Line to Data

Linear regression finds the best-fit line through data points by minimising the sum of squared residuals (least squares method).

Slope and Intercept Formulas

m = [nΣ(xy) - ΣxΣy] / [nΣ(x²) - (Σx)²]
b = (Σy - mΣx) / n
Line: ŷ = mx + b
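
The formulas above translate almost line for line into code. The following is a minimal sketch in plain Python; the function name fit_line and the sample points are illustrative assumptions, not part of any calculator described here.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line, using the summation formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - m * sum_x) / n
    return m, b


# Hypothetical points lying exactly on y = 2x + 1
m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)  # 2.0 1.0
```

The guard on the denominator matters in practice: if every x is the same, the line is vertical and no slope exists.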

Worked Example

Data: (1,2),(2,4),(3,5),(4,4),(5,6)
n=5, Σx=15, Σy=21, Σxy=71, Σx²=55

m = (5×71 - 15×21) / (5×55 - 15²)
  = (355-315) / (275-225) = 40/50 = 0.8
b = (21 - 0.8×15)/5 = (21-12)/5 = 1.8

Line: ŷ = 0.8x + 1.8

R² (Coefficient of Determination)

R² = 1 - SS_res / SS_tot
SS_res = Σ(yᵢ - ŷᵢ)²  (sum of squared residuals)
SS_tot = Σ(yᵢ - ȳ)²   (total variation)

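R² = 0: the line explains none of the variance in y (no better than always predicting ȳ)
R² = 1: perfect fit (every point lies on the line)
R² = 0.8: the line explains 80% of the variance in y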
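
As a rough sketch, R² can be computed directly from the two sums of squares defined above. The function name r_squared and the sample values are assumptions made for illustration.

```python
def r_squared(ys, y_hats):
    """R² = 1 - SS_res / SS_tot for observed ys and predicted y_hats."""
    if len(ys) != len(y_hats) or not ys:
        raise ValueError("ys and y_hats must be non-empty and the same length")
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R² is undefined")
    return 1 - ss_res / ss_tot


ys = [1.0, 3.0, 5.0, 7.0]
print(r_squared(ys, ys))         # predictions match observations: 1.0
print(r_squared(ys, [4.0] * 4))  # predicting the mean everywhere: 0.0
```

The two calls show the endpoints of the scale: a model that reproduces every point scores 1, and a model that only predicts ȳ scores 0.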

Calculate linear regression: Free Linear Regression Calculator

Linear Regression Quick-Reference Table

R² value  | Interpretation             | Typical context
0.00–0.19 | Very weak fit              | Social sciences, noisy data
0.20–0.49 | Weak to moderate fit       | Economic forecasting
0.50–0.74 | Moderate fit               | Many business models
0.75–0.89 | Good fit                   | Controlled experiments
0.90–0.99 | Strong fit                 | Physical sciences, engineering
1.00      | Perfect fit (suspect data) | Overfitting or duplicate data

How Linear Regression Works

Simple linear regression fits the line ŷ = b₀ + b₁x that minimises the sum of squared residuals (ordinary least squares, OLS). Here b₁ is the slope m and b₀ is the intercept b from the formulas above, written in deviation form: the slope b₁ = Σ[(xᵢ − x̄)(yᵢ − ȳ)] / Σ[(xᵢ − x̄)²]; the intercept b₀ = ȳ − b₁x̄. R² (coefficient of determination) measures the proportion of variance in y explained by x; R² = 1 − SS_residual/SS_total.
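
For readers who want to see why the deviation form of the slope and the summation form given earlier (m = [nΣ(xy) - ΣxΣy] / [nΣ(x²) - (Σx)²]) agree, here is a short derivation sketch. It uses only x̄ = Σx/n and ȳ = Σy/n.

```latex
\begin{aligned}
\sum_i (x_i-\bar{x})(y_i-\bar{y})
  &= \sum_i x_i y_i - \bar{y}\sum_i x_i - \bar{x}\sum_i y_i + n\bar{x}\bar{y}
   = \sum_i x_i y_i - \frac{1}{n}\sum_i x_i \sum_i y_i \\
\sum_i (x_i-\bar{x})^2
  &= \sum_i x_i^2 - \frac{1}{n}\Bigl(\sum_i x_i\Bigr)^2 \\
b_1
  &= \frac{\sum_i x_i y_i - \frac{1}{n}\sum_i x_i \sum_i y_i}
          {\sum_i x_i^2 - \frac{1}{n}\bigl(\sum_i x_i\bigr)^2}
   = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}
          {n\sum_i x_i^2 - \bigl(\sum_i x_i\bigr)^2} = m \\
b_0
  &= \bar{y} - b_1\bar{x} = \frac{\sum_i y_i - b_1\sum_i x_i}{n} = b
\end{aligned}
```

The last step for b₁ multiplies numerator and denominator by n, so the two forms always give the same line.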

Linear regression assumptions: linearity (correct functional form), independence of errors, homoscedasticity (constant variance), and normality of residuals for inference. Violations — such as heteroscedasticity (variance increasing with x) or autocorrelation (time-series data) — require corrections like weighted least squares or robust standard errors.
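
As one illustration of such a correction, weighted least squares replaces each sum in the slope and intercept formulas with a weighted sum. The sketch below assumes the weights are already known (a common choice is wᵢ = 1/varianceᵢ); the function name fit_line_weighted, the data, and the weights are hypothetical.

```python
def fit_line_weighted(xs, ys, ws):
    """Weighted least-squares line: minimises sum of w_i * (y_i - (m*x_i + b))**2."""
    if not (len(xs) == len(ys) == len(ws)) or len(xs) < 2:
        raise ValueError("xs, ys and ws must have the same length (at least 2)")
    if any(w <= 0 for w in ws):
        raise ValueError("weights must be positive")
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    swx2 = sum(w * x * x for w, x in zip(ws, xs))
    denom = sw * swx2 - swx ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = (sw * swxy - swx * swy) / denom
    b = (swy - m * swx) / sw
    return m, b


# Hypothetical data whose scatter grows with x; weights assume variance ∝ x²
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.8, 6.5, 7.4, 11.0]
ws = [1 / x ** 2 for x in xs]
print(fit_line_weighted(xs, ys, ws))
```

With all weights equal, the formulas reduce to ordinary least squares, which is a quick sanity check for any implementation.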

Common Mistakes

  • Confusing correlation with causation: High R² shows that x predicts y in your sample, not that x causes y. Ice cream sales and drowning rates are both correlated with temperature — neither causes the other.
  • Extrapolating beyond the data range: A regression line fit to data over x ∈ [10, 50] may predict poorly outside that range. The relationship may be non-linear beyond your observed region.
  • Ignoring outliers and influential points: A single outlier with high leverage can dramatically change the slope. Always plot your data and check residuals before trusting regression coefficients.
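
The leverage effect described in the last point is easy to reproduce. The following sketch uses made-up points: four lie exactly on y = x, and a fifth sits far to the right with a low y value. The helper fit_line is an illustrative name.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept, using the deviation form."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: four points on y = x
clean_x, clean_y = [1, 2, 3, 4], [1, 2, 3, 4]
slope, intercept = fit_line(clean_x, clean_y)
print(f"without outlier: slope={slope:.2f}, intercept={intercept:.2f}")
# without outlier: slope=1.00, intercept=0.00

# Add one high-leverage point at (10, 0)
slope, intercept = fit_line(clean_x + [10], clean_y + [0])
print(f"with outlier:    slope={slope:.2f}, intercept={intercept:.2f}")
# with outlier:    slope=-0.20, intercept=2.80
```

A single point far from the other x values flips the sign of the slope here, which is why plotting the data first is worth the effort.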

Frequently Asked Questions

Q: What is multiple linear regression?

Multiple regression fits ŷ = b₀ + b₁x₁ + b₂x₂ + … + bₖxₖ, allowing prediction from several independent variables simultaneously. Adjusted R² (penalises adding useless variables) should be used instead of R² for model comparison. Multicollinearity — when predictors are highly correlated with each other — inflates standard errors and makes individual coefficients unstable.
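
The penalty applied by adjusted R² can be sketched with the standard formula 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of predictors. The function name and the R² values below are hypothetical, chosen only to show the effect.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R² for n observations and k predictors (intercept not counted)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical comparison on n = 30 observations:
# model A has 2 predictors, model B adds 3 more for a small gain in R².
print(f"{adjusted_r_squared(0.80, n=30, k=2):.3f}")  # 0.785
print(f"{adjusted_r_squared(0.81, n=30, k=5):.3f}")  # 0.770
```

In this made-up case the larger model has the higher R² but the lower adjusted R², so the extra predictors are not earning their keep.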

Q: How do I test whether the slope is statistically significant?

Calculate the t-statistic: t = b₁ / SE(b₁), with df = n−2. If |t| > t_critical for a two-sided test at α=0.05 (about 2.04 at df = 30, approaching 1.96 as n grows), the slope is statistically significant, i.e. the data are inconsistent with a true slope of zero at that level. The p-value for the slope is the probability of observing a t-statistic at least this extreme if the true slope were zero.
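
A minimal sketch of the calculation, assuming the usual simple-regression standard error SE(b₁) = √[ (SS_res / (n − 2)) / Σ(xᵢ − x̄)² ]. The function name slope_t_statistic and the data are hypothetical, and the critical value is taken from a standard t table rather than computed.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b1, se_b1, t, df) for the slope of a simple linear regression."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    df = n - 2
    se_b1 = math.sqrt((ss_res / df) / sxx)  # assumes the fit is not perfect
    return b1, se_b1, b1 / se_b1, df


# Hypothetical measurements
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_b1, t, df = slope_t_statistic(xs, ys)

T_CRITICAL_DF3 = 3.182  # two-sided, alpha = 0.05, df = 3 (from a t table)
print(f"b1={b1:.3f}, SE={se_b1:.4f}, t={t:.1f}, df={df}")
print("significant at 0.05:", abs(t) > T_CRITICAL_DF3)
```

With only five points the critical value is well above 1.96, which is why the degrees of freedom need to be looked up rather than assumed.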

Q: When should I use regression vs. correlation?

Use correlation (Pearson r) to measure the strength and direction of linear association without prediction. Use regression when you want to predict y from x or estimate the magnitude of the x–y relationship (slope in interpretable units). Pearson r = ±√R² for simple regression; the sign matches the sign of the slope b₁.
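
The link between r and R² can be checked numerically. The sketch below computes both from scratch on made-up data with a downward trend; the function names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def slope_and_r_squared(xs, ys):
    """Fit the least-squares line and return (slope, R²)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return b1, 1 - ss_res / ss_tot


# Hypothetical data with a negative relationship
xs = [1, 2, 3, 4, 5]
ys = [9, 7, 6, 4, 1]
r = pearson_r(xs, ys)
slope, r2 = slope_and_r_squared(xs, ys)

print(math.isclose(r ** 2, r2))   # True: r squared equals R²
print(r < 0 and slope < 0)        # True: r carries the sign of the slope
```

This identity holds only for simple regression with one predictor; with several predictors, R² is no longer the square of any single pairwise correlation.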