251y0452 10/20/04
ECO251 QBA1
FIRST HOUR EXAM
October 6, 2004
Name:____Key______
Student Number: I can’t count that high
Class Hour: When I’m still asleep
Remember – Neatness, or at least legibility, counts. In most non-multiple-choice questions an answer needs a calculation or short explanation to count.
Part I. (7 points)
Use the eleven numbers that you used in the second problem in the take-home exam. Add 2 to the first number. (If you don’t have them, take the digits of your student number plus the numbers 3, 9, 9, 12, 21. Example: Seymour Butz’s student number is 876509, so he gets 8, 7, 6, 5, 0, 9, 3, 9, 9, 12, 21. Of course, he has read “Things That You Should Never Do on an Exam or Anywhere Else” and knows that he can’t use them this way.)
Compute the following:
a) The Median (1)
b) The Standard Deviation (3)
c) The 2nd Quintile (2)
d) The Coefficient of variation (1)
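Seymour used the eleven numbers 3, 3, 12, 15, 22, 7, 7, 5, 2, 10, 1. The numbers in order are
1, 2, 3, 3, 5, 7, 7, 10, 12, 15, 22.
 / $x$ / $x^2$
 / 1 / 1
 / 2 / 4
 / 3 / 9
 / 3 / 9
 / 5 / 25
 / 7 / 49
 / 7 / 49
 / 10 / 100
 / 12 / 144
 / 15 / 225
 / 22 / 484
Total / 87 / 1099
a) The middle number is 7.
b) $\bar{x} = \frac{\sum x}{n} = \frac{87}{11} = 7.9091$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{1099 - 11(7.9091)^2}{10} = 41.091$. So $s = \sqrt{41.091} = 6.410$.
c) position $= p(n+1) = .4(12) = 4.8$. So $a = 4$ and $.b = .8$.
The 2nd quintile is $x_a + .b(x_{a+1} - x_a) = x_4 + .8(x_5 - x_4) = 3 + .8(5 - 3)$, so it is 4.6.
d) $C = \frac{s}{\bar{x}} = \frac{6.410}{7.9091} = .8105$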
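The four answers above can be checked with a short script. This is only a sketch: the helper name `quantile_position` is made up for illustration, and it assumes the position rule $p(n+1)$ with linear interpolation between the neighboring ordered values, which is the rule the 4.8 in part c) suggests.

```python
import statistics

# Seymour's eleven numbers (first number already increased by 2)
data = [3, 3, 12, 15, 22, 7, 7, 5, 2, 10, 1]

def quantile_position(sorted_x, p):
    """Value at position p(n+1), counting from 1, with linear interpolation.

    Hypothetical helper: assumes the p(n+1) position rule.
    """
    n = len(sorted_x)
    pos = p * (n + 1)
    a = int(pos)       # whole part of the position
    b = pos - a        # fractional part of the position
    if a < 1:
        return sorted_x[0]
    if a >= n:
        return sorted_x[-1]
    return sorted_x[a - 1] + b * (sorted_x[a] - sorted_x[a - 1])

x = sorted(data)
median = statistics.median(x)
s = statistics.stdev(x)                    # sample standard deviation, n - 1 divisor
second_quintile = quantile_position(x, 0.4)
cv = s / statistics.mean(x)                # coefficient of variation

print(median, round(s, 4), round(second_quintile, 2), round(cv, 4))
```

Any answer outside the range 1 to 22 for the median or the quintile would signal an arithmetic slip.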
251y0451 10/20/04
If you enjoy wasting time, you might want to use the definitional formula. I don’t think that there is any point in reworking the problem on the previous page, so I copied this from Version 1 of the exam.
 / $x$ / $x - \bar{x}$ / $(x - \bar{x})^2$
 / 1 / -6.7273 / 45.256
 / 1 / -6.7273 / 45.256
 / 2 / -5.7273 / 32.802
 / 3 / -4.7273 / 22.347
 / 5 / -2.7273 / 7.438
 / 7 / -0.7273 / 0.529
 / 7 / -0.7273 / 0.529
 / 10 / 2.2727 / 5.165
 / 12 / 4.2727 / 18.256
 / 15 / 7.2727 / 52.893
 / 22 / 14.2727 / 203.711
Total / 85 / -0.0003 / 434.182
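$\bar{x} = \frac{\sum x}{n} = \frac{85}{11} = 7.7273$, $s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{434.182}{10} = 43.4182$. The vast majority of people who thought that they were using the definitional formula used , which, I believe, should have given them . Doing a little bit of homework should have prevented this error.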
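As a quick check that the definitional and computational formulas agree, here is a minimal sketch using the Version 1 numbers as read from the table above (the list `x` is an assumption recovered from that table, with the value 1 appearing twice).

```python
# Version 1 data, as read from the definitional table (sum = 85)
x = [1, 1, 2, 3, 5, 7, 7, 10, 12, 15, 22]
n = len(x)
mean = sum(x) / n

# Definitional formula: squared deviations from the mean over n - 1
ss_def = sum((xi - mean) ** 2 for xi in x)
var_def = ss_def / (n - 1)

# Computational formula: (sum of x^2 - n * mean^2) over n - 1
var_comp = (sum(xi ** 2 for xi in x) - n * mean ** 2) / (n - 1)

# The unsquared deviations always sum to zero (up to rounding),
# so they cannot be used as a measure of spread.
dev_sum = sum(xi - mean for xi in x)

print(round(ss_def, 3), round(var_def, 4), round(var_comp, 4), round(var_def ** 0.5, 4))
```

Both routes give the same variance; the definitional one just takes more arithmetic.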
Part II.
- The problem in the textbook that gives the data used in the take-home also gives the braking distance for a sample of domestic-made cars. It is presented below. Cumulative frequency (in red) is needed to get the median and was not given.
Distance (feet)   Frequency   Cumulative frequency
210 – 220         1           1
220 – 230         1           2
230 – 240         1           3
240 – 250         1           4
250 – 260         4           8
260 – 270         3           11
270 – 280         6           17
280 – 290         4           21
290 – 300         3           24
300 – 310         1           25
310 – 320         0           25
Sum               25
Minitab was used to calculate statistics from these data. It claims the following: (Note!!!!!) You will not be able to use any of these numbers in parts b) or c) without some manipulation. Answers below are not acceptable unless you give some evidence from the sample statistics.
a) Do American cars have a shorter braking distance? Compare all 3 measures of central tendency. (2)
b) Are American cars more consistent in braking distance than foreign cars? Use a dimension-free measurement of variability. (2)
c) Compare the direction and degree of skewness in the two distributions. Use one dimension-free measure of skewness. (2)
d) Write a 5-number summary of the results from the first take-home problem. (2) 15
Solution: a) Seymour had given us, for the foreign-made cars, $\bar{x} = 260.647$ and a median of 255.681, and the mode is 255.
For the median for the domestic cars, position $= pN' = .5(26) = 13$. Since 13 is above 11 and below 17, the median is in 270-280, which has a frequency of 6. $x_{.5} = 270 + \left[\frac{.5(25) - 11}{6}\right]10 = 272.5$. The mode is the midpoint of the largest group, which is 275.
         Domestic   Foreign
Mean     268.6      260.647
Median   272.5      255.681
Mode     275        255
According to all measures, American cars have a longer braking distance.
b) Seymour says for the foreign cars $s = 23.8271$. If we compute the coefficient of variation, $C = \frac{s}{\bar{x}}$,
Domestic $\frac{22.338}{268.6} = .0832$, Foreign $\frac{23.8271}{260.647} = .0914$.
American cars are more consistent.
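c) You can use $g_1 = \frac{k_3}{s^3}$ or Pearson's measure $SK = \frac{\bar{x} - \text{mode}}{s}$.
         Domestic   Foreign
Mean     268.6      260.647
Mode     275        255
$k_3$    -7946.70   8689.93
$s$      22.338     23.8271
$g_1$    -.7129     .6424
$SK$     -.2865     .2370
Both $g_1$ and $SK$ make Domestic look more skewed. However, Domestic is skewed to the left and Foreign to the right.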
d) Lower Limit      210
First Quartile      243.5
Median              255.681
Third Quartile      275.357
Upper Limit         320
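The dimension-free comparisons in b) and c) are simple ratios, so they are easy to verify. The sketch below is illustrative only: the dictionary layout and function names are invented here, the summary statistics are the ones quoted in this key, and the foreign-car $k_3$ is taken as 8689.93, the value the computer reports in the take-home solution.

```python
# Summary statistics quoted in the key (k3 for Foreign from the take-home solution)
stats = {
    "Domestic": {"mean": 268.6, "s": 22.338, "k3": -7946.70},
    "Foreign": {"mean": 260.647, "s": 23.8271, "k3": 8689.93},
}

def coefficient_of_variation(mean, s):
    """Standard deviation relative to the mean (no units)."""
    return s / mean

def relative_skewness(k3, s):
    """g1 = k3 / s^3; the sign gives the direction of skewness."""
    return k3 / s ** 3

cv = {name: coefficient_of_variation(v["mean"], v["s"]) for name, v in stats.items()}
g1 = {name: relative_skewness(v["k3"], v["s"]) for name, v in stats.items()}

for name in stats:
    print(name, round(cv[name], 4), round(g1[name], 4))
```

The smaller coefficient of variation marks the more consistent group; a negative $g_1$ means skewness to the left and a positive one skewness to the right.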
- The following numbers refer to miles-per-gallon of a sample of vehicles (Bowerman and O’Connell).
Class (mpg) / Frequency / Relative frequency / Cumulative frequency / Cumulative relative frequency
29.8 – 30.3 / ____ / ____ / ____ / .0612
30.4 – 30.9 / ____ / ____ / ____ / .2449
31.0 – 31.5 / ____ / ____ / 24 / ____
31.6 – 32.1 / ____ / .2653 / 35 / .7551
32.2 – 32.7 / 11 / .2245 / 46 / .9388
32.8 – 33.3 / 3 / .0612 / 49 / 1.000
Fill in the missing numbers. (5) 20
Even with corrections made above, this had some errors, but I still could check easily to see if you knew what you were doing. The completely corrected results were.
Class (mpg) / Frequency / Relative frequency / Cumulative frequency / Cumulative relative frequency
29.8 – 30.3 / 3 / .0612 / 3 / .0612
30.4 – 30.9 / 9 / .1837 / 12 / .2449
31.0 – 31.5 / 12 / .2449 / 24 / .4898
31.6 – 32.1 / 13 / .2653 / 37 / .7551
32.2 – 32.7 / 9 / .1837 / 46 / .9388
32.8 – 33.3 / 3 / .0612 / 49 / 1.000
Total / 49 / 1.0000
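Every column of the corrected table follows from the frequency column alone. A minimal sketch, assuming the six corrected frequencies above (variable names are illustrative):

```python
from itertools import accumulate

# Corrected class frequencies for the six mpg classes
freqs = [3, 9, 12, 13, 9, 3]
n = sum(freqs)

cum = list(accumulate(freqs))                 # cumulative frequency
rel = [round(f / n, 4) for f in freqs]        # relative frequency
cum_rel = [round(c / n, 4) for c in cum]      # cumulative relative frequency

for row in zip(freqs, rel, cum, cum_rel):
    print(row)
```

The last cumulative frequency must equal $n$ and the last cumulative relative frequency must equal 1, which is a fast way to spot a table that cannot be right.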
Part III. (At least 22 points – 2 points each unless marked)
- Mark the variables below as qualitative (A) or quantitative (B)
a) Number of days a patient stays at a spa   B
b) Per cent change in population between censuses   B
c) Preferences for 10 beers on a 1st to 10th scale   A
d) Method of contraception   A
- Which of the following is an example of continuous ratio data?
a) Number of days a patient stays at a spa
b) *Per cent change in population between censuses
c) Preferences for beers on a 1 to 10 scale
d) Method of contraception
e) None of the above.   4
- A summary measure that is computed to describe a characteristic of a population is called
a) a census.
b) a statistic.
c) *a parameter.
d) an inference.
e) None of the above.   6
- In general, what are the two types of descriptive statistic most frequently reported?
a) Measures of skewness and measures of central tendency
b) Measures of dispersion and measures of skewness
c) *Measures of dispersion and measures of central tendency
d) Measures of kurtosis and measures of dispersion
e) Measures of kurtosis and measures of skewness
f) Measures of kurtosis and measures of central tendency
g) None of the above.   8
Mark the following formulas (2 each). Circle a, b or c. b) must be filled in if you have circled it.
- (Sample mean)
a) This cannot be negative.
b) If this is negative it means the distribution is ______
c) *This can be negative, but it has no special meaning.
- Coefficient of Excess or
a) This cannot be negative.
b) *If this is negative it means the distribution is platykurtic (flat-topped).
c) This can be negative, but it has no special meaning.
- (Skewness)
a) This cannot be negative.
b) *If this is negative it means the distribution is skewed to the left.
c) This can be negative, but it has no special meaning.
- (Variance)
a) *This cannot be negative.
b) If this is negative it means the distribution is ______
c) This can be negative, but it has no special meaning.   12
Does it really mean anything to tell me that if one of these statistics is negative, the distribution is negative?
Exhibit 1: The following is taken from Problem 3.22 in the text. The data below represent sales tax receipts submitted to a township government by 50 businesses in one quarter.
Sales Taxes ($000)
10.3 11.1 9.6 9.0 14.5 13.0 6.7 11.0 8.4 10.3
13.0 11.2 7.3 5.3 12.5 8.0 11.8 8.7 10.6 9.5
11.1 10.2 11.1 9.9 9.8 11.6 15.1 12.5 6.5 7.5
10.0 12.9 9.2 10.0 12.8 12.5 9.3 10.4 12.7 10.5
9.3 11.5 10.7 11.6 7.8 10.5 7.6 10.1 8.9 8.6
The text solution manual offers the following results.
(a) Stem-and-leaf display of Quarterly Sales Tax Receipts
 5 | 3
 6 | 57
 7 | 3568
 8 | 04679
 9 | 02335689
10 | 00123345567
11 | 011125668
12 | 555789
13 | 00
14 | 5
15 | 1
(b) $\bar{x}$ = 10.28
(c) $s^2$ = 4.1820, $s$ = 2.045
(d) 64% of the receipts are within $\pm 1$ standard deviations of the mean.
(e) 94% of the receipts are within $\pm 2$ standard deviations of the mean.
(f) 100% of the receipts are within $\pm 3$ standard deviations of the mean.
- According to the stem and leaf display, what percent of the receipts were below $8000? (1)
7/50 = 14%
- If the researcher was directed to present the data in 5 classes, what should the class interval be? Show your calculations. 15
Lowest is 5.3. Highest is 15.1. $\frac{15.1 - 5.3}{5} = \frac{9.8}{5} = 1.96$. Let’s try 2.
- Show the actual intervals you might use. 17
Notice that if we start at 5 an interval of 2 doesn’t cover, so you have to either start above 5 or use a number above 2.
Class / From / to
A / 5.2 / Under 7.2
B / 7.2 / Under 9.2
C / 9.2 / Under 11.2
D / 11.2 / Under 13.2
E / 13.2 / Under 15.2
I think that using 2.5 and starting at 4 works better.
Class / From / to
A / 4.0 / Under 6.5
B / 6.5 / Under 9.0
C / 9.0 / Under 11.5
D / 11.5 / Under 14.0
E / 14.0 / Under 16.5
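The reasoning about class width and starting point can be sketched in a few lines. The helper `covers` is hypothetical and simply asks whether five "from / under" classes of a given width, starting at a given value, contain both the lowest and the highest receipt.

```python
low, high, k = 5.3, 15.1, 5        # lowest value, highest value, number of classes

raw_width = (high - low) / k       # smallest width that could possibly work

def covers(start, width, classes, lowest, highest):
    """True if classes [start, start + width), ... contain the whole data range."""
    upper_end = start + classes * width
    return start <= lowest and highest < upper_end

print(round(raw_width, 2))
print(covers(5.0, 2.0, k, low, high))   # starting at 5 with width 2
print(covers(5.2, 2.0, k, low, high))   # first set of intervals above
print(covers(4.0, 2.5, k, low, high))   # second set of intervals above
```

Rounding the raw width up is necessary but not sufficient; the starting point also has to be chosen so the top class reaches past the largest value.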
Before we start, most of you seem to have no idea what ‘3 standard deviations from the mean’ signifies. Nevertheless, one student paper put it this way.
$\bar{x} \pm s = 10.28 \pm 2.045$ or 8.235 to 12.325
$\bar{x} \pm 2s = 10.28 \pm 4.090$ or 6.190 to 14.370
$\bar{x} \pm 3s = 10.28 \pm 6.135$ or 4.145 to 16.415
Two of these should appear in your answer below.
- The description above says that 64% of the receipts are within $\pm 1$ standard deviations of the mean. Between what numbers does this mean? How does this compare with the empirical rule? Why might there be a discrepancy? (3)
Empirical rule (for symmetrical unimodal distributions only): 68% within one standard deviation of the mean, 95% within two and almost all (99.7%) within three. This is lower and could be because the distribution is not quite symmetric.
- The description above says that 94% of the receipts are within $\pm 2$ standard deviations of the mean. Between what numbers does this mean? How does this compare with the Chebyshev rule? Why might there be a discrepancy? (3) 23
Chebyshef’s Inequality: $P(|x - \mu| \ge k\sigma) \le \frac{1}{k^2}$ or $P(|x - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}$. This means that at least 3/4 should be within 2 standard deviations of the mean and at least 8/9 should be within 3 standard deviations of the mean. In the real world the number is almost always larger.
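The percentages quoted from the solution manual can be reproduced directly from the Exhibit 1 data. A minimal sketch (function names are illustrative; the data are typed in from the exhibit):

```python
import statistics

# Sales tax receipts ($000) from Exhibit 1
receipts = [
    10.3, 11.1, 9.6, 9.0, 14.5, 13.0, 6.7, 11.0, 8.4, 10.3,
    13.0, 11.2, 7.3, 5.3, 12.5, 8.0, 11.8, 8.7, 10.6, 9.5,
    11.1, 10.2, 11.1, 9.9, 9.8, 11.6, 15.1, 12.5, 6.5, 7.5,
    10.0, 12.9, 9.2, 10.0, 12.8, 12.5, 9.3, 10.4, 12.7, 10.5,
    9.3, 11.5, 10.7, 11.6, 7.8, 10.5, 7.6, 10.1, 8.9, 8.6,
]

mean = statistics.mean(receipts)
s = statistics.stdev(receipts)          # sample standard deviation

def pct_within(k):
    """Per cent of receipts within k standard deviations of the mean."""
    lo, hi = mean - k * s, mean + k * s
    count = sum(1 for r in receipts if lo <= r <= hi)
    return 100 * count / len(receipts)

def chebyshev_bound(k):
    """Minimum per cent within k standard deviations, for any distribution."""
    return 100 * (1 - 1 / k ** 2)

for k in (1, 2, 3):
    print(k, pct_within(k))
print(chebyshev_bound(2), round(chebyshev_bound(3), 1))
```

The observed 94% and 100% sit comfortably above the Chebyshev minimums, as expected, since the inequality is a floor that holds for any distribution.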
ECO251 QBA1
FIRST EXAM
October 6, 2004
TAKE HOME SECTION
Name: ______
Student Number: ______
Throughout this exam show your work! Please indicate clearly what sections of the problem you are answering and what formulas you are using. Turn this in with your in-class exam.
Part IV. Do all the Following (11 Points) Show your work!
1. The frequency distribution below represents the braking distance for a sample of foreign-made cars. Personalize the data as follows. Write down your student number. Take the last two digits of the number. Add the larger of the two digits to the frequency for 300-310 and the smaller to the frequency for 310-320. Use the results as your frequencies. For example, Seymour Butz’s student number is 876509, so he adds 0 to the last frequency and 9 to the second-to-last frequency and uses (1, 3, 12, 15, 22, 7, 7, 5, 2, 10, 1).
Distance (feet) frequency
210 - 220 1
220 - 230 3
230 – 240 12
240 – 250 15
250 – 260 22
260 - 270 7
270 - 280 7
280 - 290 5
290 - 300 2
300 – 310 1
310 - 320 1
a. Calculate the Cumulative Frequency (0.5)
b. Calculate The Mean (0.5)
c. Calculate the Median (1)
d. Calculate the Mode (0.5)
e. Calculate the Variance (1.5)
f. Calculate the Standard Deviation (1)
g. Calculate the Interquartile Range (1.5)
h. Calculate a Statistic showing Skewness and Interpret it (1.5)
i. Make an ogive of the data showing relative or percentage cumulative frequency (Neatness Counts!)(1.5)
j. Extra credit: Put a (horizontal) box plot below the ogive using the same scale. (1)
Solution: $x$ is the midpoint of the class. Our convention is to use the midpoint of 50 to 60, not 50 to 59.999. Note also that if the midpoints have been divided by 10, most numbers should be multiplied by 10, the variance should be multiplied by 100 and $k_3$ by 1000. Calculations follow for both the computational and definitional formulas. (Don’t do both.) Seymour’s frequencies are used below.
If you used computational formulas, you should have the following.
Row   Class     $f$   $F$   $x$   $fx$    $fx^2$    $fx^3$
1     210-220   1     1     215   215     46225     9938375
2     220-230   3     4     225   675     151875    34171875
3     230-240   12    16    235   2820    662700    155734500
4     240-250   15    31    245   3675    900375    220591875
5     250-260   22    53    255   5610    1430550   364790250
6     260-270   7     60    265   1855    491575    130267375
7     270-280   7     67    275   1925    529375    145578125
8     280-290   5     72    285   1425    406125    115745625
9     290-300   2     74    295   590     174050    51344750
10    300-310   10    84    305   3050    930250    283726250
11    310-320   1     85    315   315     99225     31255875
Total           85                22155   5822325   1543144875
251y0452 10/07/04
If you used definitional formulas, you should have the following.
Row   Class     $f$   $F$   $x$   $fx$    $x-\bar{x}$   $f(x-\bar{x})$   $f(x-\bar{x})^2$   $f(x-\bar{x})^3$
1     210-220   1     1     215   215     -45.6471      -45.647          2083.7             -95113
2     220-230   3     4     225   675     -35.6471      -106.941         3812.1             -135892
3     230-240   12    16    235   2820    -25.6471      -307.765         7893.3             -202439
4     240-250   15    31    245   3675    -15.6471      -234.706         3672.5             -57463
5     250-260   22    53    255   5610    -5.6471       -124.235         701.6              -3962
6     260-270   7     60    265   1855    4.3529        30.471           132.6              577
7     270-280   7     67    275   1925    14.3529       100.471          1442.0             20698
8     280-290   5     72    285   1425    24.3529       121.765          2965.3             72214
9     290-300   2     74    295   590     34.3529       68.706           2360.2             81081
10    300-310   10    84    305   3050    44.3529       443.529          19671.8            872504
11    310-320   1     85    315   315     54.3529       54.353           2954.2             160572
Total           85                22155                 0.000            47689.4            712778
So $\sum f(x-\bar{x}) = 0$ (except for a possible rounding error), $\sum f(x-\bar{x})^2 = 47689.4$ and $\sum f(x-\bar{x})^3 = 712778$.
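a. Calculate the Cumulative Frequency (1): (See above) The cumulative frequency is the whole $F$ column.
b. Calculate the Mean (1): $\bar{x} = \frac{\sum fx}{n} = \frac{22155}{85} = 260.647$
c. Calculate the Median (2): position $= pN' = .5(86) = 43$. This is above $F = 31$ and below $F = 53$, so the interval is the 5th one, 250 – 260. The median is $x_{.5} = L_p + \left[\frac{pN - F}{f_p}\right]w$ with $L_p = 250$, $pN = .5(85)$, $F = 31$, $f_p = 22$ and $w = 10$.
d. Calculate the Mode (1): The mode is the midpoint of the largest group. Since 22 is the largest frequency, the modal group is 250 to 260 and the mode is 255.
e. Calculate the Variance (3): $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{5822325 - (22155)^2/85}{84} = 567.731$ or $s^2 = \frac{\sum f(x-\bar{x})^2}{n-1} = \frac{47689.4}{84} = 567.731$. The computer got 567.731.
f. Calculate the Standard Deviation (2): $s = \sqrt{567.731} = 23.8271$.
g. Calculate the Interquartile Range (3): First Quartile: position $= pN' = .25(86) = 21.5$. This is above $F = 16$ and below $F = 31$, so the interval is 240-250. $Q_1 = 240 + \left[\frac{.25(85) - 16}{15}\right]10$ gives us $Q_1 = 243.5$.
Third Quartile: position $= pN' = .75(86) = 64.5$. This is above $F = 60$ and below $F = 67$, so the interval is 270-280. $Q_3 = 270 + \left[\frac{.75(85) - 60}{7}\right]10 = 275.357$. $IQR = Q_3 - Q_1 = 275.357 - 243.5 = 31.857$.
Note that an answer for the mean, median, mode, first quartile or third quartile that is not between the highest and lowest number, in this case 210 and 320, is not reasonable!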
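h. Calculate a Statistic showing Skewness and interpret it (3):
We had $\sum fx^3 = 1543144875$ and $\sum f(x-\bar{x})^3 = 712778$.
$k_3 = \frac{n}{(n-1)(n-2)}\sum f(x-\bar{x})^3 = \frac{85}{(84)(83)}(712778) = 8689.9$.
The computer gets 8689.93.
or $g_1 = \frac{k_3}{s^3} = \frac{8689.93}{(23.8271)^3} = .6424$
or Pearson's Measure of Skewness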
Because of the positive sign, the measures imply skewness to the right.
i. An ogive is a simple line graph with cumulative frequency between zero and one on the y-axis and the numbers 200-340 on the x-axis. The data Seymour showed is:
$x$    $F$   $F/n$
210    0     0
220    1     .012
230    4     .047
240    16    .188
250    31    .365
260    53    .624
270    60    .706
280    67    .788
290    72    .847
300    74    .871
310    84    .988
320    85    1.000
330    85    1.000
Each number in the $F/n$ column is the corresponding number in the $F$ column divided by $n = 85$. The y-axis should be marked from zero to 1.00. In spite of the fact that the question tells you that an ogive shows cumulative frequency, many of you gave me a frequency polygon, most of you did not obey the convention that the curve starts at zero and most of you did not convert to per cent.
j. The box plot should show the median and the quartiles and use the same x axis as the ogive.
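The grouped-data calculations in parts b through h can be reproduced from the class limits and Seymour's frequencies. This is a sketch under stated assumptions, not the official method: the function name `grouped_quantile` is invented, and its rule (find the class using position $p(n+1)$, then interpolate with $pn$) is the one implied by the quartile values 243.5 and 275.357 in this key.

```python
# Lower class limits 210, 220, ..., 310 and Seymour's personalized frequencies
lower = list(range(210, 320, 10))
freq = [1, 3, 12, 15, 22, 7, 7, 5, 2, 10, 1]
width = 10
n = sum(freq)
mid = [low + width / 2 for low in lower]      # class midpoints 215, ..., 315

mean = sum(f * x for f, x in zip(freq, mid)) / n
var = sum(f * (x - mean) ** 2 for f, x in zip(freq, mid)) / (n - 1)
sd = var ** 0.5

# Third-moment statistic k3 and relative skewness g1 = k3 / s^3
k3 = n / ((n - 1) * (n - 2)) * sum(f * (x - mean) ** 3 for f, x in zip(freq, mid))
g1 = k3 / sd ** 3

def grouped_quantile(p):
    """Quantile for grouped data: locate the class with p(n+1), interpolate with pn."""
    position = p * (n + 1)
    below = 0                                  # cumulative frequency below the class
    for low, f in zip(lower, freq):
        if below + f >= position:
            return low + (p * n - below) / f * width
        below += f
    raise ValueError("p must be between 0 and 1")

q1 = grouped_quantile(0.25)
q3 = grouped_quantile(0.75)
iqr = q3 - q1

print(round(mean, 3), round(var, 3), round(sd, 4))
print(q1, round(q3, 3), round(iqr, 3))
print(round(k3, 2), round(g1, 4))
```

The quartiles found this way are also the two ends of the box in the box plot for part j.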
2. Use the frequencies you used in problem 1 in this problem as values of $x$.
For these eleven numbers, compute the a) Geometric mean, b) Harmonic mean, c) Root-mean-square (1 point each). Label each clearly. If you wish, d) compute the geometric mean using natural or base 10 logarithms (1 point extra credit each). While you’re at it, compute the sample mean and bring it and the numbers that you used on this take-home exam to the in-class exam (no credit until you get to the exam, but it won’t hurt).
Solution: Note that Seymour used the eleven numbers 1, 3, 12, 15, 22, 7, 7, 5, 2, 10, 1. He found $\sum x = 85$, or $\bar{x} = \frac{85}{11} = 7.7273$. This is not used in any of the following calculations and there is no reason why you should have computed it except to use in class! Note that an answer that is not between the highest and lowest number is not reasonable!
a) The Geometric Mean.
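$x_g = \left(\prod x\right)^{1/n} = \sqrt[11]{(1)(3)(12)(15)(22)(7)(7)(5)(2)(10)(1)} = \sqrt[11]{58212000} = 5.0805$. At least, not many of you tried to get the answer by dividing 58212000 by 11, but a number of you seem to have convinced yourselves that you could take a square root instead of an 11th root.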
b) The Harmonic Mean.
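$\frac{1}{x_h} = \frac{1}{n}\sum\frac{1}{x} = \frac{1}{11}(3.61450) = 0.328591$. So $x_h = \frac{1}{0.328591} = 3.0433$.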
Of course many of you decided that
. This is, of course, an easier way to do the problem, but I warned you that it wouldn’t work. . It is equivalent to believing that
c) The Root-Mean-Square.
$x_{rms}^2 = \frac{1}{n}\sum x^2 = \frac{1}{11}(1091) = 99.1818$. So $x_{rms} = \sqrt{99.1818} = 9.9590$.
Of course many of you decided that . This is, of course, an easier way to do the problem, but I warned you that it wouldn’t work. It is equivalent to believing that .
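d) (i) Geometric mean using natural logarithms
$\ln(x_g) = \frac{1}{n}\sum\ln(x) = \frac{1}{11}(17.8796) = 1.62542$. So $x_g = e^{1.62542} = 5.0805$.
(ii) Geometric mean using logarithms to the base 10
$\log_{10}(x_g) = \frac{1}{n}\sum\log_{10}(x) = \frac{1}{11}(7.76501) = 0.705910$. So $x_g = 10^{0.705910} = 5.0805$.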
Notice that the original numbers and all the means are between 1 and 22.
It’s probably more efficient to handle a problem this large in columns. The arithmetic mean is also computed below.
Row     $x$       $1/x$      $x^2$     $\log_{10}(x)$   $\ln(x)$
1       1         1.00000    1         0.00000          0.00000
2       3         0.33333    9         0.47712          1.09861
3       12        0.08333    144       1.07918          2.48491
4       15        0.06667    225       1.17609          2.70805
5       22        0.04545    484       1.34242          3.09104
6       7         0.14286    49        0.84510          1.94591
7       7         0.14286    49        0.84510          1.94591
8       5         0.20000    25        0.69897          1.60944
9       2         0.50000    4         0.30103          0.69315
10      10        0.10000    100       1.00000          2.30259
11      1         1.00000    1         0.00000          0.00000
Total   85        3.61450    1091      7.76501          17.8796
Mean    7.72727   0.328591   99.1818   0.705910         1.62542
So, as before, $x_g = 5.0805$, $x_h = 3.0433$ and $x_{rms} = 9.9590$.
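All of the means in this problem can be checked in a few lines. A minimal sketch using Seymour's eleven numbers (variable names are illustrative; `math.prod` assumes Python 3.8 or later):

```python
import math

# Seymour's eleven numbers
x = [1, 3, 12, 15, 22, 7, 7, 5, 2, 10, 1]
n = len(x)

arithmetic = sum(x) / n
product = math.prod(x)                                   # product of all eleven values
geometric = product ** (1 / n)                           # 11th root, not a square root
geometric_ln = math.exp(sum(math.log(v) for v in x) / n)
geometric_log10 = 10 ** (sum(math.log10(v) for v in x) / n)
harmonic = n / sum(1 / v for v in x)                     # reciprocal of the mean reciprocal
rms = math.sqrt(sum(v * v for v in x) / n)               # root of the mean square

print(product)
print(round(geometric, 4), round(harmonic, 4), round(rms, 4), round(arithmetic, 4))
```

For positive numbers that are not all equal, the harmonic mean is always the smallest of these and the root-mean-square the largest, with the geometric and arithmetic means in between, which gives one more reasonableness check.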