SUPPLEMENTAL 2.1 CLASS HANDOUT

Cumulative Relative Frequency Graphs:

Alternate Example: State Median Household Incomes – see pages 86–89

Here is a table showing the distribution of median household incomes for the 50 states and the District of Columbia.

Median Income ($1000s) / Frequency / Relative Frequency / Cumulative Frequency / Cumulative Relative Frequency
35 to < 40 / 1 / 1/51 = 0.020 / 1 / 1/51 = 0.020
40 to < 45 / 10 / 10/51 = 0.196 / 11 / 11/51 = 0.216
45 to < 50 / 14 / 14/51 = 0.275 / 25 / 25/51 = 0.490
50 to < 55 / 12 / 12/51 = 0.235 / 37 / 37/51 = 0.725
55 to < 60 / 5 / 5/51 = 0.098 / 42 / 42/51 = 0.824
60 to < 65 / 6 / 6/51 = 0.118 / 48 / 48/51 = 0.941
65 to < 70 / 3 / 3/51 = 0.059 / 51 / 51/51 = 1.000
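
The cumulative columns in the table can be checked with a short script. This is a minimal Python sketch, not part of the textbook example; the variable names are illustrative, and the class frequencies are copied from the table above.

```python
from itertools import accumulate

# Class intervals (in $1000s) and frequencies copied from the table above.
classes = ["35 to < 40", "40 to < 45", "45 to < 50", "50 to < 55",
           "55 to < 60", "60 to < 65", "65 to < 70"]
frequencies = [1, 10, 14, 12, 5, 6, 3]

total = sum(frequencies)                     # 50 states + DC = 51
cumulative = list(accumulate(frequencies))   # running total of the frequencies
relative = [f / total for f in frequencies]
cumulative_relative = [c / total for c in cumulative]

# Print one row per class in the same column order as the table.
for label, f, r, c, cr in zip(classes, frequencies, relative,
                              cumulative, cumulative_relative):
    print(f"{label} / {f} / {r:.3f} / {c} / {cr:.3f}")
```

When graphing, each cumulative relative frequency is plotted above the upper boundary of its class, so the 45 to < 50 class contributes the point (50, 0.490), and the graph starts at (35, 0).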

1.  Create a cumulative relative frequency graph using the data above on the graph below.

a.  What does the point at (50, 0.49) mean?
b.  What does the steepness of the graph tell you about the distribution?

2.  At about what percentile is California, with a median income of $57,445?

3.  Estimate and interpret the first quartile of the distribution.
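
Estimates read from the graph in questions 2 and 3 can be checked by interpolating along the straight segments between plotted points. The Python sketch below is one way to do that; it assumes values are spread evenly within each class (which is what the straight segments imply), and the function names `proportion_below` and `income_at` are made up for this illustration.

```python
# Points on the cumulative relative frequency graph:
# upper class boundary (in $1000s) and cumulative count below it.
boundaries = [35, 40, 45, 50, 55, 60, 65, 70]
cum_counts = [0, 1, 11, 25, 37, 42, 48, 51]
cum_props = [c / 51 for c in cum_counts]


def proportion_below(income):
    """Estimate the cumulative proportion at a given income (in $1000s)."""
    for i in range(len(boundaries) - 1):
        x0, x1 = boundaries[i], boundaries[i + 1]
        if x0 <= income <= x1:
            y0, y1 = cum_props[i], cum_props[i + 1]
            return y0 + (income - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("income is outside the range of the table")


def income_at(proportion):
    """Estimate the income (in $1000s) at a given cumulative proportion."""
    for i in range(len(boundaries) - 1):
        y0, y1 = cum_props[i], cum_props[i + 1]
        if y0 <= proportion <= y1:
            x0, x1 = boundaries[i], boundaries[i + 1]
            return x0 + (proportion - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("proportion must be between 0 and 1")


california = proportion_below(57.445)   # question 2
first_quartile = income_at(0.25)        # question 3

print(f"California: about the {100 * california:.0f}th percentile")
print(f"First quartile: about ${1000 * first_quartile:,.0f}")
```

These are estimates from grouped data, so an answer read by eye from the graph only needs to land close to them.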

4.  Challenge: Can you sketch a cumulative relative frequency graph for each of the following shapes? i. skewed right, ii. skewed left, iii. uniform, and iv. bimodal

5.  Discuss the answers of “Check your Understanding” – pg. 89 from your reading guide #6

p. 90-91: STANDARDIZED VALUE (Z-SCORE) - Tells us how many standard deviations a value is above or below (depending on the sign) the mean.

It is used to describe where a value falls relative to the other values in its distribution.

Alternate Example – p. 90 – interpreting z-scores

1.  In 2009, the mean number of wins for Major League Baseball teams was 81, with a standard deviation of 11.4 wins.

Problem: Find and interpret the z-scores for the following teams. – parts a and b

(a)  The New York Yankees, with 103 wins.

(b)  The New York Mets, with 70 wins.

(c)  The San Diego Padres performed at 0.53 standard deviations below the mean. How many wins did they have that season?

(d)  Check your understanding, p. 91 part c: Brent, a member of his school basketball team, is 74 inches tall. The mean height of his team is 76 inches. Brent’s height translates to a z-score of -0.85 in the team’s height distribution. What is the standard deviation of the team members’ heights?
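
Parts (a) through (d) all use the same relationship, z = (value - mean) / SD, solved for a different unknown each time. The Python sketch below shows the three rearrangements as a way to check hand calculations; the function names are illustrative, not textbook notation.

```python
def z_score(value, mean, sd):
    """Standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


def value_from_z(z, mean, sd):
    """Rearranged formula: value = mean + z * sd."""
    return mean + z * sd


def sd_from_z(value, mean, z):
    """Rearranged formula: sd = (value - mean) / z, valid when z is not 0."""
    return (value - mean) / z


MEAN_WINS, SD_WINS = 81, 11.4   # 2009 figures from the example above

yankees_z = z_score(103, MEAN_WINS, SD_WINS)            # part (a)
mets_z = z_score(70, MEAN_WINS, SD_WINS)                # part (b)
padres_wins = value_from_z(-0.53, MEAN_WINS, SD_WINS)   # part (c): "below" means z is negative
team_height_sd = sd_from_z(74, 76, -0.85)               # part (d)

print(f"Yankees z = {yankees_z:.2f}, Mets z = {mets_z:.2f}")
print(f"Padres wins = {padres_wins:.1f}, height SD = {team_height_sd:.2f} inches")
```

The code gives only the numbers; an interpretation still needs a sentence in context, such as how many standard deviations above or below the mean number of wins a team finished.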

Alternate Example – p. 91

The single-season home run record for Major League Baseball has been set just three times since Babe Ruth hit 60 home runs in 1927. Roger Maris hit 61 in 1961, Mark McGwire hit 70 in 1998, and Barry Bonds hit 73 in 2001. In an absolute sense, Barry Bonds had the best performance of these four players, since he hit the most home runs in a single season. However, in a relative sense this may not be true. Baseball historians suggest that hitting a home run has been easier in some eras than others. This is due to many factors, including quality of batters, quality of pitchers, hardness of the baseball, dimensions of ballparks, and possible use of performance-enhancing drugs. To make a fair comparison, we should see how these performances rate relative to other hitters during the same year.

Problem: Compute the standardized scores for each performance. Which player had the most outstanding performance relative to his peers?

Year / Player / HR / Mean / SD
1927 / Babe Ruth / 60 / 7.2 / 9.7
1961 / Roger Maris / 61 / 18.8 / 13.4
1998 / Mark McGwire / 70 / 20.7 / 12.7
2001 / Barry Bonds / 73 / 21.4 / 13.2
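
The comparison can be checked by standardizing each home run total against the mean and SD for its own season. This is a minimal Python sketch using the rows of the table above; the variable names are illustrative.

```python
# (year, player, home runs, mean, SD) copied from the table above.
seasons = [
    (1927, "Babe Ruth", 60, 7.2, 9.7),
    (1961, "Roger Maris", 61, 18.8, 13.4),
    (1998, "Mark McGwire", 70, 20.7, 12.7),
    (2001, "Barry Bonds", 73, 21.4, 13.2),
]

# z = (home runs - mean for that year) / SD for that year
z_scores = {player: (hr - mean) / sd for _, player, hr, mean, sd in seasons}

for player, z in z_scores.items():
    print(f"{player}: z = {z:.2f}")

# The largest z-score marks the most outstanding season relative to peers.
best = max(z_scores, key=z_scores.get)
print("Most outstanding relative to peers:", best)
```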

Transforming Data – p. 92 – 98

Alternate Example: Test Scores

Here are a graph and table of summary statistics for a sample of 30 test scores. The maximum possible score on the test was 50 points.

Variable / n / Mean / SD / Min / Q1 / M / Q3 / Max / IQR / Range
Score / 30 / 35.8 / 8.17 / 12 / 32 / 37 / 41 / 48 / 9 / 36

Suppose that the teacher was nice and added 5 points to each test score.

Here are graphs and summary statistics for the original scores and the +5 scores. Fill in the missing values in the “Score + 5” row.

Variable / n / Mean / SD / Min / Q1 / M / Q3 / Max / IQR / Range
Score / 30 / 35.8 / 8.17 / 12 / 32 / 37 / 41 / 48 / 9 / 36
Score + 5 / 30

How does this change the shape, center, and spread of the distribution?

Suppose that the teacher in the previous alternate example wanted to convert the original test scores to percents. Since the test was out of 50 points, she should multiply each score by 2 to make them out of 100. Here are graphs and summary statistics for the original scores and the doubled scores. Fill in the missing values in the “Score x 2” row.

Variable / n / Mean / SD / Min / Q1 / M / Q3 / Max / IQR / Range
Score / 30 / 35.8 / 8.17 / 12 / 32 / 37 / 41 / 48 / 9 / 36
Score x 2

How does this change the shape, center, and spread of the distribution?
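
Once the "Score + 5" and "Score x 2" rows are filled in by hand, they can be checked with a short script. The Python sketch below applies a transformation of the form a*x + b to the summary statistics; it assumes a is positive, and the function name `transform_summary` and the dictionary keys are made up for this illustration.

```python
# Measures of center and location shift AND scale; measures of spread only scale.
LOCATION = ("mean", "min", "q1", "median", "q3", "max")
SPREAD = ("sd", "iqr", "range")

# Summary statistics for the original scores, copied from the table above.
original = {"n": 30, "mean": 35.8, "sd": 8.17, "min": 12, "q1": 32,
            "median": 37, "q3": 41, "max": 48, "iqr": 9, "range": 36}


def transform_summary(stats, a=1, b=0):
    """Summary statistics of a*x + b, assuming a > 0."""
    if a <= 0:
        raise ValueError("this sketch assumes a positive multiplier")
    new = {"n": stats["n"]}               # the number of scores never changes
    for key in LOCATION:
        new[key] = a * stats[key] + b     # multiplied by a, then shifted by b
    for key in SPREAD:
        new[key] = a * stats[key]         # multiplied by a, unaffected by b
    return new


plus_five = transform_summary(original, b=5)   # "Score + 5" row
doubled = transform_summary(original, a=2)     # "Score x 2" row

print(plus_five)
print(doubled)
```

The script covers center and spread only. Shape has to be judged from the graphs.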

EXAMPLE: Analyzing the effects of transformations: p. 96

Here are 30 temperature readings (in degrees Celsius) from a thermostat. Enter these data into L1 in your stat menu.

3, 5, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 12, 14

To convert these to degrees Fahrenheit, use the formula F = (9/5)C + 32

In L2, scroll up until L2 is highlighted. Type in the following formula and press enter:

(9/5)*L1 + 32

Fill in the following table from your one-variable stats for each list.

Variable / n / Mean / SD / Min / Q1 / M / Q3 / Max / IQR / Range
Temp. °C
Temp. °F

a.  Create parallel box plots in your calculator for both sets of data. Do not hand draw. Describe how the graph representing degrees Fahrenheit changed. In what way is it the same?

b.  Show how the mean temperature in °F was calculated from the mean temperature in °C.

c.  Show how the standard deviation of temperatures in °F was calculated from the standard deviation of temperatures in °C.

d.  The 90th percentile of the temperature readings was 11 °C. What is the 90th percentile in degrees Fahrenheit?
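
For anyone working away from the calculator, the same list transformation can be done in a few lines of Python. This sketch is an alternative to the L1/L2 steps, not a replacement for them; it uses the sample standard deviation (`statistics.stdev`), which corresponds to Sx in one-variable stats.

```python
from statistics import mean, stdev

# The 30 thermostat readings in degrees Celsius, as listed above.
celsius = [3, 5, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 9,
           10, 10, 10, 10, 10, 11, 11, 11, 12, 14]

# Same formula as (9/5)*L1 + 32, applied to every reading.
fahrenheit = [(9 / 5) * c + 32 for c in celsius]

mean_c, sd_c = mean(celsius), stdev(celsius)
mean_f, sd_f = mean(fahrenheit), stdev(fahrenheit)

# Part b: the mean is multiplied by 9/5 and then shifted by 32.
print(f"mean: {mean_c:.2f} C -> {mean_f:.2f} F")
# Part c: the standard deviation is multiplied by 9/5 only.
print(f"SD:   {sd_c:.2f} C -> {sd_f:.2f} F")

# Part d: a percentile is a location, so it transforms like a single reading.
p90_f = (9 / 5) * 11 + 32
print(f"90th percentile: 11 C -> {p90_f:.1f} F")
```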

TO SUMMARIZE: What one characteristic of a distribution NEVER changes when each data point is multiplied (or divided) by the same positive constant, or when the same constant is added to (or subtracted from) each data point?

What additional characteristic NEVER changes with addition (or subtraction) of a constant, but DOES change with multiplication by a constant?