Math 251, 10 October 2003, Exam I
Name: ANSWERS .
Instructions: Complete each of the following eight questions, and please explain and justify all appropriate details in your solutions in order to obtain maximal credit for your answers.
1. (6 pts) Classify the type of sampling used in the following examples.
(a) To maintain quality control, a tire manufacturer tests every 100th tire that comes off of the assembly line in its plant.
ANS: This is a systematic sample. Every 100th tire in the production sequence is tested.
(b) To conduct a poll, the Join Arnold team randomly chose 8 different prefixes in California (the first 3 digits of the telephone number) and called all households from those prefixes.
ANS: This is a cluster sample. The population was divided into groups, some groups were randomly selected and every member of the selected groups was surveyed.
(c) To determine student attitudes toward worship requirements at La Sierra, President Geraty gave questionnaires to ten randomly selected students from each of the following groups: Freshmen, Sophomores, Juniors, Seniors and Graduate Students.
ANS: This is a stratified sample. The population was divided into groups, and a random sample from each group was selected.
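The three sampling schemes in Question 1 can be contrasted with a minimal Python sketch. The function names, the serial numbers, and the tiny `groups` dictionary are hypothetical illustrations, not part of the exam.

```python
import random

def systematic_sample(items, step):
    """Take every `step`-th item, starting with the `step`-th one."""
    return items[step - 1::step]

def cluster_sample(groups, n_clusters, rng):
    """Randomly pick whole groups and keep every member of each picked group."""
    chosen = rng.sample(sorted(groups), n_clusters)
    return [member for name in chosen for member in groups[name]]

def stratified_sample(groups, per_group, rng):
    """Randomly pick `per_group` members from every group."""
    return {name: rng.sample(members, per_group) for name, members in groups.items()}

# Hypothetical data: tire serial numbers 1..500 and three small groups.
tires = list(range(1, 501))
groups = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}
rng = random.Random(0)  # fixed seed so the sketch is repeatable

tested = systematic_sample(tires, 100)
print(tested)                             # [100, 200, 300, 400, 500]
print(cluster_sample(groups, 2, rng))     # all members of 2 randomly chosen groups
print(stratified_sample(groups, 2, rng))  # 2 random members from every group
```

The difference to notice: the cluster sample keeps everyone from a few groups, while the stratified sample keeps a few members from every group.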
2. (6 pts) Categorize the following data according to level: nominal, ordinal, interval, or ratio.
(a) The quality of a restaurant’s food: poor, average, good.
ANS: Ordinal—the responses can be ranked, but differences between the ranks do not make sense.
(b) The outdoor temperature in degrees Fahrenheit.
ANS: Interval: differences in temperature make sense, but ratios do not.
(c) The length of time of the drive home.
ANS: Ratio—differences and ratios are meaningful.
3. (2 pts) In a set of data with more than two values, how does decreasing the smallest number affect the mean? How would it affect the median?
ANS: The mean is decreased, but the median remains unchanged.
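A quick check of this claim, as a sketch on a made-up data set (the five values below are an assumption chosen only for illustration):

```python
import statistics

before = [3, 7, 8, 10, 12]  # hypothetical data with more than two values
after = [1] + before[1:]    # same data with the smallest value decreased

mean_before, mean_after = statistics.mean(before), statistics.mean(after)
median_before, median_after = statistics.median(before), statistics.median(after)

print(mean_before, mean_after)      # the mean drops
print(median_before, median_after)  # the middle value is untouched
```

Only the smallest value moved, so the middle position of the ordered data is the same as before, while the sum (and hence the mean) is smaller.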
4. (6 pts) How will the mean, standard deviation, and coefficient of variation in Population 1 below compare with those in Population 2 below? Explain why, but do not compute the means, standard deviations, or coefficients of variation. Notice that the data values in Population 2 are 5 times the data values in Population 1.
Pop1: 5 10 15 20 25 30 40 45 50 75 80 95
Pop2: 25 50 75 100 125 150 200 225 250 375 400 475
ANS: The mean and standard deviation of the second population are 5 times the mean and standard deviation of the first, because both the average and each data value's distance from the mean in the second population are exactly 5 times those in the first. However, the coefficients of variation of the two populations are equal, since the factors of 5 in the mean and standard deviation cancel when computing the ratio for the coefficient of variation.
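Although the question asks for reasoning rather than computation, the scaling argument can be confirmed numerically. The sketch below uses Python's standard `statistics` module; the helper name `summarize` is an assumption made for illustration.

```python
import math
import statistics

pop1 = [5, 10, 15, 20, 25, 30, 40, 45, 50, 75, 80, 95]
pop2 = [5 * x for x in pop1]  # every value in Population 2 is 5 times Population 1

def summarize(data):
    """Return (mean, population standard deviation, coefficient of variation in %)."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    return mean, sd, 100 * sd / mean

mean1, sd1, cv1 = summarize(pop1)
mean2, sd2, cv2 = summarize(pop2)

print(mean2 / mean1, sd2 / sd1)  # both ratios are 5, up to rounding
print(math.isclose(cv1, cv2))    # the factor of 5 cancels in sd / mean
```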
5. Consider the data (which are systolic blood pressures of 25 subjects):
95 98 102 106 108 110 112 118 118 120
124 126 126 128 130 130 132 134 136 138
142 152 166 168 184
(a) (2 pts) What class width should be chosen if you would like to have 8 classes?
ANS: First, the range divided by the number of classes is (184 – 95)/8 = 11.125. Now go up to the next whole number to make sure all data are covered in the 8 classes, so we choose a class width of 12.
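The class-width rule can be sketched in a few lines of Python. Using `math.ceil` for "go up to the next whole number" is an assumption: textbooks differ on what to do when the quotient is already a whole number, a case that does not arise here.

```python
import math

data = [95, 98, 102, 106, 108, 110, 112, 118, 118, 120,
        124, 126, 126, 128, 130, 130, 132, 134, 136, 138,
        142, 152, 166, 168, 184]
n_classes = 8

raw_width = (max(data) - min(data)) / n_classes  # (184 - 95) / 8
class_width = math.ceil(raw_width)               # round up to a whole number

print(raw_width, class_width)  # 11.125 12
```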
(b) (8 pts) Complete the following table for this data given that the first class has limits 95—109
Lower Limit / Upper Limit / Lower Boundary / Upper Boundary / Midpoint / Frequency / Cumulative Frequency / Relative Frequency
95 / 109 / 94.5 / 109.5 / 102 / 5 / 5 / .20
110 / 124 / 109.5 / 124.5 / 117 / 6 / 11 / .24
125 / 139 / 124.5 / 139.5 / 132 / 9 / 20 / .36
140 / 154 / 139.5 / 154.5 / 147 / 2 / 22 / .08
155 / 169 / 154.5 / 169.5 / 162 / 2 / 24 / .08
170 / 184 / 169.5 / 184.5 / 177 / 1 / 25 / .04
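The table can be rebuilt from the raw data as a check. This is a minimal sketch that assumes integer-valued data (so boundaries sit 0.5 beyond the limits) and uses the class limits from the table, which imply a width of 15 and six classes; the dictionary keys are illustrative names.

```python
data = [95, 98, 102, 106, 108, 110, 112, 118, 118, 120,
        124, 126, 126, 128, 130, 130, 132, 134, 136, 138,
        142, 152, 166, 168, 184]
first_lower, width, n_classes = 95, 15, 6  # taken from the table's class limits

rows = []
cumulative = 0
for i in range(n_classes):
    lower = first_lower + i * width
    upper = lower + width - 1
    freq = sum(lower <= x <= upper for x in data)  # count values inside the limits
    cumulative += freq
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),
        "midpoint": (lower + upper) / 2,
        "frequency": freq,
        "cumulative": cumulative,
        "relative": freq / len(data),
    })

for row in rows:
    print(row)
```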
(c) (10 pts) Find the median, Q1, and Q3 for the above data. Draw a box and whisker plot for the data. You may draw it horizontally if you prefer.
ANS: The median is in the (25 + 1)/2th place, i.e. the 13th place, so the median is 126.
The first quartile is the median of the lowest 12 numbers, which is 111 (the average of the 6th and 7th data values).
The third quartile is the median of the highest 12 numbers, which is 137 (the average of the 19th and 20th data values).
High: 184
Third Quartile: 137
Median: 126
First Quartile: 111
Low: 95
See text for method of constructing box and whisker plot. The lower whisker starts at 95 and goes to 111, the low edge of the box is at 111, the upper edge is at 137, and the line in the box is at 126. The upper whisker starts at 137 and goes up to 184.
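The five-number summary can be checked with a short sketch of the median-of-halves method used in the answer (the overall median is excluded from both halves when the count is odd). The function name is hypothetical, and library quantile routines such as `statistics.quantiles` may use a different convention and return slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (low, Q1, median, Q3, high) using the median-of-halves method."""
    ordered = sorted(values)
    half = len(ordered) // 2      # size of each half; the middle value is left out if n is odd
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return (ordered[0],
            statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half),
            ordered[-1])

data = [95, 98, 102, 106, 108, 110, 112, 118, 118, 120,
        124, 126, 126, 128, 130, 130, 132, 134, 136, 138,
        142, 152, 166, 168, 184]

summary = five_number_summary(data)
print(summary)  # (95, 111.0, 126, 137.0, 184)
```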
(d) (5 pts) Construct a relative frequency histogram for the data using the table in (b).
6. (6 pts) At a large university, 3000 students wrote a mathematics placement test one day. Given that $\sum x = 86,250$ and $\sum x^2 = 2,521,875$ for these test scores, find the mean, the population standard deviation, and the coefficient of variation.
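ANS: The mean is $\mu = \sum x / N = 86,250/3000 = 28.75$.
$SS_x = \sum x^2 - (\sum x)^2/N = 2,521,875 - (86,250)^2/3000 = 42,187.5$
The population standard deviation is $\sigma = \sqrt{SS_x/N} = \sqrt{42,187.5/3000} = 3.75$.
The coefficient of variation is $CV = (\sigma/\mu) \cdot 100\% = (3.75/28.75) \cdot 100\% \approx 13.04\%$.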
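The arithmetic in this answer can be reproduced from the two given sums alone. A minimal sketch, with variable names chosen for illustration:

```python
import math

N = 3000
sum_x = 86_250
sum_x2 = 2_521_875

mean = sum_x / N                # 28.75
ss_x = sum_x2 - sum_x**2 / N    # sum of squared deviations from the mean
sigma = math.sqrt(ss_x / N)     # population standard deviation (divide by N)
cv = 100 * sigma / mean         # coefficient of variation, in percent

print(mean, ss_x, sigma, round(cv, 2))  # 28.75 42187.5 3.75 13.04
```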
7. (4 pts) A population is known to have a mean of 80 and standard deviation of 5. Use Chebyshev’s theorem to find the interval in which you would expect to find at least 8/9 of the data.
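ANS: Chebyshev’s theorem says that at least $1 - 1/k^2$ of the data lies within $k$ standard deviations of the mean. Setting $1 - 1/k^2 = 8/9$ gives $k = 3$, so at least 8/9 of the data lies within 3 standard deviations of the mean.
Therefore, we compute the interval $\mu \pm 3\sigma = 80 \pm 3(5)$, which is (65, 95). So at least 8/9 of all data in the population should be in the interval (65, 95).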
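The same steps can be sketched in Python: solve $1 - 1/k^2 = p$ for $k$, then form the interval. The function name `chebyshev_interval` is a hypothetical helper, and the results are floating-point values, so they are compared approximately rather than exactly.

```python
import math

def chebyshev_interval(mean, sd, proportion):
    """Return (k, (low, high)) such that at least `proportion` of the data lies in (low, high)."""
    k = 1 / math.sqrt(1 - proportion)  # from 1 - 1/k^2 = proportion
    return k, (mean - k * sd, mean + k * sd)

k, (low, high) = chebyshev_interval(80, 5, 8 / 9)
print(round(k, 6), round(low, 6), round(high, 6))  # approximately 3, 65, 95
```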
8. (5 pts) Professor Henry Wiggins decided to study the ages of the students attending the classes he taught. He constructed the following frequency distribution for ages in years.
x / 18-21 / 22-25 / 26-29
f / 65 / 25 / 10
Please help Professor Wiggins by estimating the mean and sample standard deviation for the ages of students in his classes.
ANS: Use the formulas $\sum x \approx \sum xf$ and $\sum x^2 \approx \sum x^2 f$, where on the right-hand side we use the class midpoints and frequencies. Then
$\sum x \approx 19.5 \cdot 65 + 23.5 \cdot 25 + 27.5 \cdot 10 = 2130$
$\sum x^2 \approx 19.5^2 \cdot 65 + 23.5^2 \cdot 25 + 27.5^2 \cdot 10 = 46,085$
and
$SS_x \approx 46,085 - (2130)^2/100 = 716$
Therefore, the mean is approximately 2130/100 = 21.3, and the standard deviation is approximately
$s = \sqrt{SS_x/(n-1)} = \sqrt{716/99} \approx 2.689$
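The grouped-data estimate can be reproduced with a short sketch that derives the midpoints from the class limits and then applies the same formulas. Variable names are illustrative.

```python
import math

class_limits = [(18, 21), (22, 25), (26, 29)]
freqs = [65, 25, 10]

midpoints = [(lo + hi) / 2 for lo, hi in class_limits]  # 19.5, 23.5, 27.5
n = sum(freqs)

sum_xf = sum(x * f for x, f in zip(midpoints, freqs))      # estimates the sum of x
sum_x2f = sum(x**2 * f for x, f in zip(midpoints, freqs))  # estimates the sum of x^2

ss_x = sum_x2f - sum_xf**2 / n
mean = sum_xf / n
s = math.sqrt(ss_x / (n - 1))  # sample standard deviation (divide by n - 1)

print(sum_xf, sum_x2f, ss_x, mean, round(s, 3))  # 2130.0 46085.0 716.0 21.3 2.689
```

These are estimates because every student in a class is treated as if their age equals the class midpoint.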