Line & Scatter
(What Would You Use: Part 2)
Teacher Background
Line Graphs and Scatter Plots
Line graphs provide an excellent way to map independent and dependent variables that are both quantitative. When both variables are quantitative, the line segment that connects two points on the graph expresses a slope, which can be interpreted visually relative to the slope of other lines or expressed as a precise mathematical formula. Line graphs are like scatter plots in that they record individual data values as marks on the graph. The difference is that a line connects each data point to the next, which makes the local change from point to point visible. This is done when it is important to see the local change between any two points. An overall trend can still be seen, but it is joined by the local trend between individual points or small groups of points. Unlike scatter plots, line graphs allow the independent variable to be either scalar or ordinal.
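As a minimal sketch of the slope idea described above, the Python snippet below computes the slope of each segment joining consecutive points on a line graph. The data values and the names `hours`, `temps`, and `segment_slopes` are hypothetical and chosen only for illustration.

```python
# Hypothetical readings: hour of measurement (x) and temperature (y).
hours = [0, 1, 2, 3, 4]
temps = [10, 12, 15, 15, 11]

def segment_slopes(xs, ys):
    """Return the slope (rise over run) of each segment joining consecutive points."""
    points = list(zip(xs, ys))
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]

slopes = segment_slopes(hours, temps)
print(slopes)  # [2.0, 3.0, 0.0, -4.0]
```

Each number is the local change between two neighboring points: a positive slope means the line rises, zero means it is flat, and a negative slope means it falls.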
Scatter plots are similar to line graphs in that they start by mapping quantitative data points. In a scatter plot, a mark, usually a dot or small circle, represents a single data point. The difference is that in a scatter plot the individual points are deliberately not connected with a line. As the marks are plotted, a visual distribution of the data emerges, which may express a trend or correlation between the variables. This trend can be seen directly in the distribution of points or made explicit by adding a regression line.
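The text does not say how a regression line is obtained. One common choice, assumed here, is the least-squares line. The sketch below fits such a line to a small set of hypothetical points; the data and the name `fit_line` are illustrative only.

```python
# Hypothetical paired observations (x, y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")  # y = 0.60x + 2.20
```

Drawn over the scatter plot, this line summarizes the overall trend without connecting the individual points.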
Source:
Types of correlations between variables that can be demonstrated on a scatter plot
Positive correlation between variables
If the points cluster around a line that runs from the lower left to the upper right of the graph area, then there is a positive correlation between the two variables. An increase in the value of x is more likely to be associated with an increase in the value of y. The closer the points are to the line, the stronger the relationship.
Negative correlation between variables
If the points cluster around a line that runs from the upper left to the lower right of the graph area, then there is a negative correlation between the two variables. An increase in the value of x is more likely to be associated with a decrease in the value of y. The closer the points are to the line, the stronger the relationship.
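The direction of a correlation can also be checked numerically. The sketch below assumes the Pearson correlation coefficient as the measure (the text refers only to "the correlation coefficient") and applies it to two hypothetical data sets, one rising and one falling. The function name `pearson_r` and all values are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: the same x values paired with rising and falling y values.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]    # points run from lower left to upper right
falling = [5, 4, 5, 4, 2]   # points run from upper left to lower right

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
print(round(r_rising, 2), round(r_falling, 2))  # 0.77 -0.77
```

The sign gives the direction of the relationship, and a value closer to 1 or -1 corresponds to points lying closer to the line.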
Very low correlation between variables
Very low or zero correlation may result from a non-linear relationship between two variables. If the relationship is, in fact, non-linear (i.e., the points cluster around a curve rather than a straight line), the correlation coefficient will not be a good measure of the strength of the relationship.
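To illustrate, the sketch below uses hypothetical data in which y is completely determined by x (y equals x squared), yet the Pearson correlation coefficient, assumed here as the measure, comes out as zero because the points follow a curve rather than a line.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect U-shaped (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
# r is zero here even though y depends entirely on x.
print(f"r = {r:.2f}")
```

A scatter plot of these points would show the curve immediately, which is why plotting the data is worth doing before relying on the coefficient alone.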
Scattered data points; No correlation between variables
If the data points are randomly scattered, then there is no relationship between the two variables; the correlation between them is low or zero.
Spread of data; No correlation between variables
A scatterplot will also show whether the data are widely spread or concentrated within a smaller area.
Outliers
Besides portraying a non-linear relationship between the two variables, a scatterplot can also show whether there are any outliers in the data. In a set of data, an outlier is a value so far removed from the other values in the distribution that its presence cannot be attributed to the random combination of chance causes.
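The definition above does not give a numeric cutoff. One widely used convention, assumed here and not taken from the source, flags values that lie more than 1.5 times the interquartile range beyond the first or third quartile. The sketch below applies that rule to hypothetical values; the name `iqr_outliers` is illustrative.

```python
import statistics

# Hypothetical measurements with one value far from the rest.
values = [2, 3, 3, 4, 4, 5, 5, 6, 30]

def iqr_outliers(data, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in data if v < low or v > high]

outliers = iqr_outliers(values)
print(outliers)  # [30]
```

On a scatter plot, the flagged value would appear as a single mark far from the main cluster of points.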
Adapted from: