Class 2 – Variability

Measures of Variability

We have already discussed the most frequently used measures of central tendency. Measures of central tendency allow us to select one number to represent a distribution of scores. In discussing them, we focused on the type of measurement scale being employed (Nominal, Ordinal, Interval, and Ratio). Measures of Variability are a second form of descriptive statistic, which we use to describe how spread out the scores in our distribution are. We will begin by defining the most common ways of measuring variability. We will then discuss how different types of distributions might affect our choice of measure of central tendency. Finally, we will talk about the importance of looking at variability when interpreting results. Since the logic underlying inferential statistics depends on a good understanding of variability, it is important that you understand these concepts.

When dealing with nominal scales we have the same limitations we had with measures of central tendency. Numbers assigned to nominal scales are not truly numbers – they are name labels. We therefore cannot calculate a number that would describe the variability of the responses. In fact, we cannot even meaningfully say that the scores range from category 1 to category 7, because the order of the categories is arbitrary. We can only summarize by listing the categories and their frequency of occurrence. If there are only a few categories, you might simply summarize the distribution in text form (e.g., 38% of respondents were male and 62% were female). When there are around 4 to 7 categories, bar charts or pie graphs may be appropriate, whereas larger numbers of categories might best be summarized in tables. This is not a hard and fast rule. It depends on the variable. We tend to reserve figures for more important variables.
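If you were summarizing such a variable on a computer, counting category frequencies is all there is to it. Here is a minimal Python sketch; the responses listed are invented purely for illustration:

from collections import Counter

# Hypothetical nominal responses: the labels only name categories, they are not numbers.
responses = ["female", "male", "female", "female", "male",
             "female", "female", "male", "female", "female"]

counts = Counter(responses)
n = len(responses)
for category, count in counts.items():
    print(f"{category}: {count} ({100 * count / n:.0f}%)")
# female: 7 (70%), male: 3 (30%)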

With ordinal scales, we can define the Range of the categories. We might say our sample included respondents ranging in education level from no high school diploma through to a PhD. This defines the extremes, or end points, of our ordered categories. Since the intervals between our categories are not equal, we cannot use numbers to define the average difference between scores.

With continuous/scale variables (Interval and Ratio), we can use numbers to describe the variability of the distribution. Looking at an example, let’s compare final exam scores from three sections of a Gen Psych course.

          Section 1   Section 2   Section 3
             160         102         200
             130         101          78
             100         100          77
              70          99          75
              40          98          70
Total        500         500         500
Mean         100         100         100

All three classes have the same mean but the variability of the grades differs greatly. There are several measures of variability that can be used to describe these differences.

Range - the simplest measure of variability. It is defined as the highest score minus the lowest score.

Range Section 1 = 160 - 40 = 120

Range Section 2 = 102 - 98 = 4

Range Section 3 = 200 - 70 = 130

The higher the range, the more variable the scores. However, the range may not be a very representative measure of a distribution of scores. You might notice that the range is defined by only two numbers: the highest and the lowest.

In Section 1, there is a relatively large range and the scores are fairly evenly distributed within it. The range of Section 2 is small, but once again the distribution of scores is fairly even. Section 3 has the largest range; however, this is due to one extremely high score. All other scores in Section 3 are very close to each other. Range can be strongly affected by the occurrence of extreme scores. Therefore, range is often not the best way to represent the variability of scores.
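If you want to check these range calculations yourself, here is a minimal Python sketch using the three sections' scores from the table above:

# Compute the range (highest minus lowest score) for each section's exam scores.
sections = {
    "Section 1": [160, 130, 100, 70, 40],
    "Section 2": [102, 101, 100, 99, 98],
    "Section 3": [200, 78, 77, 75, 70],
}

for name, scores in sections.items():
    score_range = max(scores) - min(scores)
    print(f"{name}: range = {score_range}")
# Section 1: range = 120, Section 2: range = 4, Section 3: range = 130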

Deviation Scores. One way to represent the distribution of scores is to look at the average amount that scores deviate (differ) from the mean. We could take each score, subtract the mean from it, and then sum these values. The problem here is that the mean is the arithmetic center. By definition, the sum of the deviations of scores that fall above the mean (positive deviations) is equal to the sum of the deviations of scores that fall below the mean (negative deviations). When we sum deviation scores, we will always get zero. The mean deviation score will always be zero, so it clearly will not help us to describe the distribution of scores. We could get around this, however, if we used absolute deviation scores (ignoring the positive or negative sign). In essence, that is what we do when we calculate Variance.

Variance - Instead of ignoring the sign, we use a little mathematical trick to convert all the deviation scores to positive numbers. We square them. You might recall from algebra that all squared values are positive.

E.g., 2² = 4 and (-2)² = 4 (a negative multiplied by a negative is a positive).

If we square all the deviation scores and sum them, we will get a positive number. If we then divide by the number of scores, we obtain the average squared distance that scores deviate from the mean. Variance is one of the most commonly used measures of variability. Recall that it is at the heart of the analysis we call Analysis of Variance (ANOVA). We are also going to learn that the assumption of homogeneity of variance (the requirement that the variances of the samples we are comparing do not differ from each other) will be something you will need to test in order to be able to use ANOVAs. For the moment, the important thing to realize about variance is that it is a measure of variability in terms of the average squared distance between the individual scores in the distribution and the mean of that distribution.
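To see the arithmetic spelled out, here is a minimal Python sketch of this definition (dividing the sum of squared deviations by the number of scores), reusing Section 1's exam scores as the example:

# Variance as the average squared deviation of each score from the mean.
scores = [160, 130, 100, 70, 40]  # Section 1 exam scores

mean = sum(scores) / len(scores)                        # 100
squared_deviations = [(x - mean) ** 2 for x in scores]  # 3600, 900, 0, 900, 3600
variance = sum(squared_deviations) / len(scores)        # 9000 / 5 = 1800.0

print(mean, variance)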

In this course you will not be asked to calculate the variance of a distribution by hand, nor with a calculator. SPSS will do these calculations for you. There is, however, something you should be aware of about the manner in which SPSS calculates variance: it sums the squared deviation scores and divides by N-1 (the total number of scores minus one). Why does it do that? To understand this we have to take a short detour and discuss the difference between Samples and Populations.

A Population is the entire group of people or scores that you want to apply your sample statistics to. If I wanted to know the average height of students attending Platteville, I could go out and measure them all and then determine the exact average height. When we obtain measurements from an entire population, we refer to the descriptive values (mean, range, variability) as parameters. They are exact. When we do research, more often than not, we measure a sample of the population. The descriptive statistics from the sample are used as estimates of the population parameters. The word statistic implies that we are estimating: a statistic is an estimate of a parameter.

When we use statistics we are taking a subset of scores (the sample) and generalizing to the population. One way that statistics can be misleading is that the sample might not be an unbiased subset of the population. We discussed this a great deal when talking about sample selection and external validity. Statisticians have also done a great deal of work examining the degree to which statistics are unbiased estimates of their parameters. The easiest way to understand this is to look at a technique they use called Monte Carlo studies. Statisticians generate a large distribution of numbers with a known mean and variability and then repeatedly (thousands of times) draw random samples of a given size from this population. Generally, they use computers to do these studies. Monte Carlo studies have provided us with two important findings.

1) Larger samples give more precise estimates of the population parameters. This should make sense: the larger the sample, the more representative it is of the population. Extreme scores at one end of the distribution are more likely to be counteracted by extreme scores at the other end, and thus the estimate is more accurate.

2) No matter how large the sample, some statistics are still biased estimates of the population parameters. The mean is an unbiased estimate: any given sample mean is as likely to be an overestimate of the population mean as it is to be an underestimate, so if you average the means from several samples, you will get a good estimate of the population mean. Variance, however, is not an unbiased estimate of the population variance. The smaller the sample, the more it underestimates the variance. There are mathematical explanations for this, but they are well beyond what you need to know. The important thing to know about variance is that this bias is very easily corrected: statisticians have found that if you divide the sum of the squared deviation scores by N-1, you get an unbiased estimate of the population variance. Notice that the larger the sample, the smaller this correction is; dividing by 99 instead of 100 is a much smaller adjustment than dividing by 9 instead of 10.
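As a rough sketch of the kind of Monte Carlo demonstration described above (the population values, sample size, and number of trials below are arbitrary choices of mine, not from any published study), the following Python code repeatedly draws small samples and compares the average variance obtained with an N divisor to the average obtained with an N-1 divisor:

import random

# Hypothetical population: 10,000 normally distributed scores (mean 100, SD 15).
random.seed(1)
population = [random.gauss(100, 15) for _ in range(10_000)]
pop_mean = sum(population) / len(population)
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

def variance(sample, divisor):
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

n = 10          # small sample size, where the bias is most visible
trials = 5_000  # number of random samples drawn
biased, corrected = [], []
for _ in range(trials):
    sample = random.sample(population, n)
    biased.append(variance(sample, n))         # divide by N
    corrected.append(variance(sample, n - 1))  # divide by N-1

print("Population variance:     ", round(pop_var, 1))
print("Average variance (N):    ", round(sum(biased) / trials, 1))     # underestimates
print("Average variance (N-1):  ", round(sum(corrected) / trials, 1))  # close to the population value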

When you have SPSS calculate the variance of a distribution of scores, it assumes you are working with a sample and divides the sum of the squared deviation scores by N-1. If you are really working with a population, you should correct this by multiplying the variance by N-1 and then dividing by N.
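If you ever need to make that correction, it is a one-line calculation. A small sketch (the function name is mine), using Section 1's scores from earlier as the example:

# Convert a sample variance (N-1 divisor) into a population variance (N divisor).
def to_population_variance(sample_variance, n):
    return sample_variance * (n - 1) / n

# Section 1 exam scores: sample variance is 9000 / 4 = 2250 with N = 5.
print(to_population_variance(2250, 5))  # 1800.0, the population variance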

One of the major advantages of variance is that it is easy for a computer program to work with. The major limitation is that, unlike computers, people have difficulty thinking about squared values. If you look at the two distributions below, you can see that the variability of scores in Distribution B is twice that of Distribution A, but the variance of B is four times as large as A’s. We can, however, convert variances to values that are easier to think about simply by taking their square roots. These are called standard deviations. The standard deviation of Distribution A is 1.58 and the standard deviation of Distribution B is 3.16, so the variability of Distribution A is half that of Distribution B and the magnitudes of their standard deviations reflect this directly. In other words, it is not easy for us to compare distributions using variance, but it is easy to do so with standard deviations.

Distribution A   Deviation Scores   Squared Deviations     Distribution B   Deviation Scores   Squared Deviations
      4                -2                   4                     2                -4                  16
      5                -1                   1                     4                -2                   4
      6                 0                   0                     6                 0                   0
      7                 1                   1                     8                 2                   4
      8                 2                   4                    10                 4                  16

Distribution A: Mean = 6, s² = 2.5, s = 1.58
Distribution B: Mean = 6, s² = 10, s = 3.16
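The values in this table are easy to reproduce in Python; the statistics module's variance and stdev functions use the N-1 divisor, so they match the s² and s values above:

import statistics

dist_a = [4, 5, 6, 7, 8]
dist_b = [2, 4, 6, 8, 10]

for name, scores in (("Distribution A", dist_a), ("Distribution B", dist_b)):
    print(name,
          "mean =", statistics.mean(scores),
          "variance =", statistics.variance(scores),   # N-1 divisor: 2.5 and 10
          "sd =", round(statistics.stdev(scores), 2))  # 1.58 and 3.16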

Standard Deviations. The easiest way to think about a standard deviation is as an approximation of the average amount that the scores in the distribution deviate from the mean. Yes, I know that the average deviation of scores in Distribution A is 1.5, not 1.58, and the average deviation of scores in Distribution B is 3, not 3.16, but the standard deviation is a very close estimate. The difference exists, once again, so that the statistic estimates the population parameter. The important thing to remember is that variances and standard deviations allow us to use a number to describe the amount by which scores in the distribution differ from each other.

Properties of Variance and Standard Deviations. Standard deviations are more useful for describing a distribution of scores in a way most people can understand, and they allow us to compare the average variability of scores between distributions. However, standard deviations cannot be meaningfully added or averaged. For example, if I wanted to calculate the average standard deviation of two distributions, I could not simply add them together and divide by 2. Instead, I would need to go back to the variances, find their average, and then convert back to a standard deviation. The main point is that you cannot add, subtract, divide, or multiply standard deviations and obtain a meaningful answer. These mathematical manipulations can be done with variances, and that makes variance much more useful when computing inferential statistics. Although there is debate about which statistic, the variance or the standard deviation, should be reported in the results section, I suggest you use the one that is most easily understood, the standard deviation. Whenever you report a mean, you should report the standard deviation as well.
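Here is a brief sketch of that averaging rule, using Distributions A and B from above and assuming the two distributions have equal numbers of scores (so a simple average of the two variances is appropriate):

import math

sd_a, sd_b = 1.58, 3.16  # standard deviations of Distributions A and B

# Wrong: averaging the standard deviations directly.
naive_average = (sd_a + sd_b) / 2               # 2.37

# Better: average the variances, then convert back to a standard deviation.
average_variance = (sd_a ** 2 + sd_b ** 2) / 2  # about 6.24 (6.25 with unrounded values)
average_sd = math.sqrt(average_variance)        # about 2.50

print(naive_average, average_sd)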

Variation is only one aspect of a distribution that may be important to look at, and perhaps to include in your write-up. The shape of the distribution can also be important. Below I have included some examples of distributions and the terms used to describe them.

One way we can describe a distribution is as symmetric or skewed. A distribution curve is symmetric if, when folded in half, the two sides match up. If a curve is not symmetrical, it is skewed. When a curve is positively skewed, most of the scores occur at the lower values of the horizontal axis and the curve tails off towards the higher end. When a curve is negatively skewed, most of the scores occur at the higher values and the curve tails off towards the lower end of the horizontal axis. SPSS reports skewness as a number. A perfectly symmetrical curve has a skewness value of 0. Positively skewed curves have positive skewness values, whereas negatively skewed curves have negative values.
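As a rough illustration, the sketch below computes a simple moment-based skewness coefficient for two made-up sets of scores; note that SPSS uses a formula with an additional small-sample adjustment, so its reported value would differ slightly:

def skewness(scores):
    # Moment-based skewness: average cubed standardized deviation from the mean.
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5
    return sum(((x - mean) / sd) ** 3 for x in scores) / n

right_skewed = [1, 2, 2, 3, 10]  # most scores low, long tail to the right
left_skewed = [10, 9, 9, 8, 1]   # most scores high, long tail to the left

print(skewness(right_skewed))  # positive value
print(skewness(left_skewed))   # negative value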

If a distribution is a bell-shaped, symmetrical curve, the mean, median, and mode of the distribution will all be the same value. When the distribution is skewed, the mean and median will not be equal. Since the mean is most affected by extreme scores, it will have a value closer to the extreme scores than will the median.

For example, consider a country so small that its entire population consists of a queen (Queen Cori) and four subjects. Their annual incomes are