SAT Topic 6

Scatter plots & best-fit lines

Read trends, residuals, and use the line of best fit to predict.

Concept

A scatter plot displays paired (x, y) data. Look for three things:

The line of best fit minimises the sum of squared residuals, where a residual is the vertical distance from a point to the line. Residual = observed y − predicted y. A positive residual means the point lies above the line; a negative residual means it lies below.
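To make the definition concrete, here is a minimal Python sketch of a least-squares fit using the standard slope and intercept formulas. The data values and the helper names fit_line and residuals are made up for illustration; on the SAT the line is given, so this only shows where such a line comes from.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Residual = observed y minus predicted y, one value per point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data, chosen only to keep the arithmetic simple.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)

print(f"best-fit line: y = {slope:.1f}x + {intercept:.1f}")
print("residuals:", [round(r, 1) for r in res])
```

For a least-squares line with an intercept, the residuals always sum to zero (up to rounding), so points above the line balance points below it.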

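The correlation coefficient r ranges from −1 to +1. Its sign gives the direction of the trend (positive r: y tends to rise as x rises; negative r: y tends to fall). |r| close to 1 means a strong linear relationship; |r| close to 0 means little linear pattern, although the data could still follow a non-linear pattern. Correlation does not imply causation.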
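The point that r measures only linear association can be checked with a short Python sketch. The function pearson_r and both data sets are hypothetical, chosen so the results are easy to verify by hand: one set lies exactly on a line, the other exactly on the curve y = x².

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Points exactly on the line y = 2x + 1: perfect positive linear relationship.
linear_r = pearson_r([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])

# Points exactly on y = x^2, symmetric about x = 0: a clear pattern, but not linear.
curved_r = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(f"linear data: r = {linear_r:.2f}")
print(f"curved data: r = {curved_r:.2f}")
```

The curved data follow a perfect rule yet give r = 0, which is why a value of r near 0 rules out a linear trend but not a relationship altogether.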

Worked example 1

A best-fit line is y = 1.5x + 4. A data point sits at (6, 14). Find the residual.

Solution
Predicted. ŷ = 1.5(6) + 4 = 13
Residual. 14 − 13 = 1
Residual +1. The point sits 1 unit above the line.
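The same calculation can be written as a few lines of Python, as a quick way to check the arithmetic. The function name predict and the variable names are illustrative only; the numbers come from the example.

```python
def predict(x, slope=1.5, intercept=4):
    """Predicted y on the best-fit line y = 1.5x + 4."""
    return slope * x + intercept


observed_x, observed_y = 6, 14

predicted_y = predict(observed_x)      # 1.5 * 6 + 4
residual = observed_y - predicted_y    # observed minus predicted

# The sign of the residual tells us where the point sits relative to the line.
if residual > 0:
    position = "above"
elif residual < 0:
    position = "below"
else:
    position = "on"

print(f"predicted = {predicted_y}, residual = {residual}, point is {position} the line")
```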

Worked example 2

A scatter plot of "hours studied" vs "test score" has best-fit line y = 8x + 50. Interpret the slope and y-intercept.

Solution
Slope. Each additional hour of studying is associated with a predicted increase of 8 points in the test score.
Intercept. The predicted score for a student who studies 0 hours is 50. (Beware of extrapolation: if the data included no 0-hour students, this prediction lies outside the observed range.)
Slope = +8 points/hour; intercept = 50.
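A small Python sketch shows how the slope and intercept are used to predict, and how an extrapolation check might look. The observed range of 1 to 6 hours is an assumption made up for this illustration; the example does not state the actual range of the data.

```python
def predict_score(hours, slope=8, intercept=50):
    """Predicted test score on the best-fit line y = 8x + 50."""
    return slope * hours + intercept


def is_extrapolation(hours, min_hours, max_hours):
    """True when hours falls outside the range covered by the data."""
    return hours < min_hours or hours > max_hours


# Hypothetical range of study hours in the data set (not given in the example).
MIN_HOURS, MAX_HOURS = 1, 6

for hours in (0, 3, 4):
    note = " (extrapolation)" if is_extrapolation(hours, MIN_HOURS, MAX_HOURS) else ""
    print(f"{hours} hours -> predicted score {predict_score(hours)}{note}")
```

Going from 3 to 4 hours raises the prediction from 74 to 82, which is the slope of 8 points per hour. The 0-hour prediction of 50 is the intercept and, under the assumed range, would be flagged as an extrapolation.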

Practice test

8 questions on reading scatter plots, computing residuals, and interpreting best-fit lines.
