251y0112 10/08/01 ECO251 QBA1 Name ______key______

FIRST HOUR EXAM SECTION MWF 10 11 TR 11 12:30

OCTOBER 2, 2001

Part I. (10 points)

1. Indicate whether the following are: Nominal Data, Ordinal Data, Interval Data, Continuous Ratio Data or Discrete Ratio Data. (3)

a. Price to earnings ratio of your stock. Ans: Continuous Ratio.

b. Number of customers who said that service was unsatisfactory in a survey. Ans: Discrete Ratio.

c. The Likert Scale rates customer satisfaction with your firm's service on a one to five scale where 1 is exceptional and 5 is unsatisfactory. Ans: Ordinal.

(Note: See text p. 13 for most of this - discrete/continuous was defined in class)

2.(D-68) Which of the following explains the shape of a distribution best? (1)

a. Mean

b. Median

c. Box Plot.

*d. Stem-and-leaf plot

e. Mode

(Note: See text pp. 39-44)

3. Make a diagram of a table and show where the field is. (1)

4. The accompanying box plot shows the sale prices of homes (in thousands) in a Pennsylvania town

0 30 60 70 80 110 140

a. What percent of home prices fall between $60 thousand and $80 thousand - why? (2)

Ans: Since 60 is the first quartile, with 25% of the prices below it, and 80 is the third quartile, with 25% above it, 50% must lie between them.

b. If the mean price is $69 thousand, is the data skewed to the left or right? (1)

Ans: For data that are skewed to the left, mean < median < mode. The diagram shows that the median is 70 and the mean (69) is lower, so the data must be skewed to the left.

5. Which of the following is a graph that shows cumulative frequency? (2)

a. Histogram

*b. Ogive

c. Frequency Polygon

d. Pie chart

e. None of the above



Part II. Compute an appropriate answer, showing your work (15+ Points)

a) A distribution of 89 home sale prices has a mean of $67500, a median of $72500 and a standard deviation of $10000. What is the maximum number of homes that have prices that could be below $37500? (2)

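Ans: Since 37500 is 3 standard deviations below the mean (\(67500 - 3(10000) = 37500\)), according to Chebyshev there could only be \(\frac{1}{k^2} = \frac{1}{9}\) of the homes below $37500 or above $97500. Since \(\frac{89}{9} = 9.89\), this is less than 10 homes.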
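The Chebyshev bound in a) can be checked with a few lines of arithmetic. This is only a sketch; the variable names are illustrative and not part of the exam.

```python
import math

n = 89            # number of home sale prices
mean = 67500.0
sd = 10000.0
cutoff = 37500.0

# Distance of the cutoff from the mean, in standard deviations.
k = (mean - cutoff) / sd

# Chebyshev: at most 1/k^2 of the data lie k or more standard
# deviations away from the mean (both tails combined).
max_fraction = 1 / k**2

# A count of homes must be a whole number, so round down.
max_homes = math.floor(n * max_fraction)

print(k, round(max_fraction, 4), max_homes)
```

The bound covers both tails together, so it is a safe (if generous) upper limit for the lower tail alone.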

b) Assume that the distribution above is symmetrical and unimodal. Give a rough answer to the

question in a) and explain your reasoning. (2)

Ans: Since 37500 is 3 standard deviations below the mean, the Empirical rule says that there will

be almost none below $37500.
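For a rough number behind "almost none", one can assume (beyond what the question states) that the symmetrical, unimodal distribution is close to a Normal curve, which is the shape the Empirical rule is based on. The sketch below uses that assumption; `normal_cdf` is a helper written here for illustration.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n = 89
mean, sd, cutoff = 67500.0, 10000.0, 37500.0

z = (cutoff - mean) / sd          # -3 standard deviations
tail = normal_cdf(z)              # share expected below the cutoff (Normal assumption)
expected_homes = n * tail

print(z, round(tail, 5), round(expected_homes, 2))
```

Under this assumption the expected count below $37500 is well under one home, which agrees with the rough answer above.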

c) The smallest selling price in the distribution above was $25,000 and the largest was $146,000 (Note correction!). If these data are to be presented in five classes, what intervals would you use? Explain your reasoning using an appropriate formula and use it to fill in the table below.(3)

Ans: \(\frac{146000 - 25000}{5} = 24200\) so use 25000. This is only a suggestion. Any number somewhat above 24200 will work, as long as you cover the range.

Class / From / to
A / 25000 / 49999
B / 50000 / 74999
C / 75000 / 99999
D / 100000 / 124999
E / 125000 / 149999
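The class table above can be generated from the range and the number of classes. A minimal sketch, assuming the suggested width of 25000 and a starting point of 25000:

```python
low, high = 25000, 146000   # smallest and largest selling price
num_classes = 5

# Minimum workable class width: range divided by number of classes.
min_width = (high - low) / num_classes

# Round up to a convenient number (the key suggests 25000).
width = 25000
start = 25000

# Each class runs from its lower limit to just below the next lower limit.
classes = [(start + i * width, start + (i + 1) * width - 1)
           for i in range(num_classes)]

for label, (lo, hi) in zip("ABCDE", classes):
    print(label, lo, hi)

# The classes must cover the whole range of the data.
covers_range = classes[0][0] <= low and classes[-1][1] >= high
```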

d) WIM technology weighs and measures trucks driving at highway speeds. Trucks are classified in a report as follows:

A 'WIM gross weight above 70,000 lbs.' B 'WIM gross weight 70,000 lbs. or less.'

C 'WIM total length above 60 ft.' D 'WIM total length no more than 60 ft.'

Which of the following classes are mutually exclusive? (Circle) (1.5)

A and C , *C and D, A, B, and C

Which of the following classes are collectively exhaustive? (Circle) (1.5)

A and C , *C and D, *A, B, and C

(Note: This was graded at 0.5 for each item correctly marked or not marked.)
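The two definitions can be tested mechanically: classes are mutually exclusive if no truck falls in two of them, and collectively exhaustive if every truck falls in at least one. The sketch below checks this on a small grid of hypothetical trucks (the weights and lengths are made up for illustration).

```python
from itertools import combinations, product

# Each class is a rule on (weight in lbs, length in ft).
classes = {
    "A": lambda w, l: w > 70000,
    "B": lambda w, l: w <= 70000,
    "C": lambda w, l: l > 60,
    "D": lambda w, l: l <= 60,
}

# Hypothetical trucks covering every weight/length combination.
trucks = list(product([50000, 70000, 90000], [40, 60, 80]))

def mutually_exclusive(names):
    """True if no truck belongs to two of the named classes."""
    return all(not (classes[a](w, l) and classes[b](w, l))
               for a, b in combinations(names, 2)
               for w, l in trucks)

def collectively_exhaustive(names):
    """True if every truck belongs to at least one named class."""
    return all(any(classes[c](w, l) for c in names) for w, l in trucks)

for group in (["A", "C"], ["C", "D"], ["A", "B", "C"]):
    print(group, mutually_exclusive(group), collectively_exhaustive(group))
```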



e) For the numbers 3, 103, 203, 303 and 403, compute the i) Root-mean-square ii) Harmonic mean, iii) Geometric mean (2.5 each)

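Solution: Note that \(\bar{x} = \frac{\sum x}{n} = \frac{1015}{5} = 203\). This is not used in any of the following calculations and there is no reason why you should have computed it!

(i) The Root-Mean-Square.

\(\frac{1}{n}\sum x^2 = \frac{1}{5}\left(3^2 + 103^2 + 203^2 + 303^2 + 403^2\right) = \frac{306045}{5} = 61209\). So \(x_{rms} = \sqrt{61209} = 247.40\).

(ii) The Harmonic Mean.

\(\frac{1}{n}\sum \frac{1}{x} = \frac{1}{5}\left(\frac{1}{3} + \frac{1}{103} + \frac{1}{203} + \frac{1}{303} + \frac{1}{403}\right) = \frac{0.353750}{5} = 0.070750\). So \(x_h = \frac{1}{0.070750} = 14.134\).

(iii) The Geometric Mean.

\(x_g = \left[(3)(103)(203)(303)(403)\right]^{1/5} = (7659531243)^{0.2} = 94.81\).

Or

\(\ln(x_g) = \frac{1}{n}\sum \ln(x) = \frac{22.75922}{5} = 4.55184\). So \(x_g = e^{4.55184} = 94.81\).

Or

\(\log(x_g) = \frac{1}{n}\sum \log(x) = \frac{9.88420}{5} = 1.97684\). So \(x_g = 10^{1.97684} = 94.81\).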

Notice that the original numbers and all the means are between 3 and 403.
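The three means in e) are easy to verify numerically. The following is a minimal sketch using only the standard `math` module; the geometric mean is computed through logarithms, as in the second method above.

```python
import math

data = [3, 103, 203, 303, 403]
n = len(data)

# Root-mean-square: square root of the mean of the squares.
rms = math.sqrt(sum(x * x for x in data) / n)

# Harmonic mean: reciprocal of the mean of the reciprocals.
harmonic = n / sum(1 / x for x in data)

# Geometric mean: antilog of the mean of the natural logs.
geometric = math.exp(sum(math.log(x) for x in data) / n)

print(round(rms, 2), round(harmonic, 3), round(geometric, 2))
```

For positive data that are not all equal, the harmonic mean is the smallest and the root-mean-square the largest of the three, which gives a quick sanity check on hand calculations.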

Part III. Do the following problems (25 Points)

1. I have the following data for sales clerk work hours at a sample of 8 stores.

310 254 180 170 116 100 96 320

Compute the following:

a) The Median (1)

b) The Standard Deviation (4)

c) The 3rd Decile (2)

Solution: Compute the following. Note that \(x\) is in order.

Index / \(x\) / \(x^2\) / \(x - \bar{x}\) / \((x - \bar{x})^2\)
1 / 96 / 9216 / -97.25 / 9457.6
2 / 100 / 10000 / -93.25 / 8695.6
3 / 116 / 13456 / -77.25 / 5967.6
4 / 170 / 28900 / -23.25 / 540.6
5 / 180 / 32400 / -13.25 / 175.6
6 / 254 / 64516 / 60.75 / 3690.6
7 / 310 / 96100 / 116.75 / 13630.6
8 / 320 / 102400 / 126.75 / 16065.6
Total / 1546 / 356988 / 0.00 / 58223.5

Note that, to be reasonable, the mean, median and 3rd decile must fall between 96 and 320.

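\(n = 8\), \(\sum x = 1546\), \(\sum x^2 = 356988\), \(\sum (x - \bar{x})^2 = 58223.5\), \(\bar{x} = \frac{\sum x}{n} = \frac{1546}{8} = 193.25\).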

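a) Just put the numbers in order and average the middle numbers, \(\frac{170 + 180}{2} = 175\).

Or formally: position \(= 0.5(n + 1) = 0.5(9) = 4.5\) so median \(= x_4 + 0.5(x_5 - x_4) = 170 + 0.5(180 - 170) = 175\).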

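b) \(s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{356988 - 8(193.25)^2}{7} = \frac{58223.5}{7} = 8317.643\) or \(s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{58223.5}{7} = 8317.643\). So \(s = \sqrt{8317.643} = 91.201\).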

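c) The 3rd decile has 30% below it. position \(= 0.3(n + 1) = 0.3(9) = 2.7\)

so 3rd decile \(= x_2 + 0.7(x_3 - x_2) = 100 + 0.7(116 - 100) = 111.2\)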

(New Formula: . .

so )
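The answers to problem 1 can be checked with Python's `statistics` module. The decile helper below is a sketch that assumes the position rule \(p(n+1)\) with linear interpolation between neighboring ordered values; other textbooks and software use slightly different position rules, so the decile may differ a little elsewhere. The function name is illustrative.

```python
import statistics

hours = [310, 254, 180, 170, 116, 100, 96, 320]
x = sorted(hours)
n = len(x)

median = statistics.median(x)   # average of the two middle values
s = statistics.stdev(x)         # sample standard deviation (divides by n - 1)

def quantile_n_plus_1(sorted_x, p):
    """Value with fraction p below it, using position p*(n+1).

    Assumes 1 <= p*(n+1) < n so that both neighbors exist.
    """
    pos = p * (len(sorted_x) + 1)
    a = int(pos)                # whole part of the position
    frac = pos - a              # fractional part of the position
    # Positions are 1-based; Python lists are 0-based.
    return sorted_x[a - 1] + frac * (sorted_x[a] - sorted_x[a - 1])

third_decile = quantile_n_plus_1(x, 0.3)

print(median, round(s, 3), round(third_decile, 1))
```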



2. A bank is investigating the amount of time customers are put on hold when they call. The times are tabulated below. (Assume that the numbers are a sample.)


amount / frequency
less than 30 seconds / 2200
30 - 59.99 seconds / 800
60 - 89.99 seconds / 770
90 - 119.99 seconds / 200
120 - 149.99 seconds / 20
150 - 179.99 seconds / 10

a. Calculate the Cumulative Frequency (1)

b. Calculate The Mean (1)

c. Calculate the Median (2)

d. Calculate the Mode (1)

e. Calculate the Variance (3)

f. Calculate the Standard Deviation (2)

g. Calculate the Interquartile Range (3)

h. Calculate a Statistic showing Skewness and Interpret it (3)

i. Make a frequency polygon of the Data (Neatness Counts!)(2)


(Note - It may make things easier to move the decimal point to the left in the midpoint column before you start calculating, but be careful with the median etc. if you do. For a printout doing things this way, see 251z0112.)

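Solution: \(x\) is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not 1.99999.

Class / Interval / \(f\) / \(F\) / \(x\) / \(fx\) / \(fx^2\) / \(fx^3\) / \(x - \bar{x}\) / \(f(x - \bar{x})\) / \(f(x - \bar{x})^2\) / \(f(x - \bar{x})^3\)
A / 0-29.99 / 2200 / 2200 / 15 / 33000 / 495000 / 7425000 / -23.025 / -50655.0 / 1166331 / -26854784
B / 30-59.99 / 800 / 3000 / 45 / 36000 / 1620000 / 72900000 / 6.975 / 5580.0 / 38920 / 271470
C / 60-89.99 / 770 / 3770 / 75 / 57750 / 4331250 / 324843744 / 36.975 / 28470.7 / 1052706 / 38923800
D / 90-119.99 / 200 / 3970 / 105 / 21000 / 2205000 / 231524992 / 66.975 / 13395.0 / 897130 / 60085288
E / 120-149.99 / 20 / 3990 / 135 / 2700 / 364500 / 49207500 / 96.975 / 1939.5 / 188083 / 18239350
F / 150-179.99 / 10 / 4000 / 165 / 1650 / 272250 / 44921248 / 126.975 / 1269.7 / 161227 / 20471734
Total / / 4000 / / / 152100 / 9288000 / 730822528 / / 0.0 / 3504398 / 111136864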

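\(n = \sum f = 4000\), \(\sum fx = 152100\), \(\sum fx^2 = 9288000\), \(\sum fx^3 = 730822528\), \(\sum f(x - \bar{x})^2 = 3504398\) and \(\sum f(x - \bar{x})^3 = 111136864\). Note that, to be reasonable, the mean, median and quartiles must fall between 0 and 180. (If you moved your decimal point one place to the left before you started, your \(x\) column is now in tens, \(fx\) is in tens, \(fx^2\) is in hundreds, \(fx^3\) is in thousands, \(x - \bar{x}\) is in tens, \(f(x - \bar{x})\) is in tens, \(f(x - \bar{x})^2\) is in hundreds and \(f(x - \bar{x})^3\) is in thousands.)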

a. Calculate the Cumulative Frequency (1): (See above) The cumulative frequency is the whole \(F\) column: 2200, 3000, 3770, 3970, 3990, 4000.
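The cumulative frequency is just a running total of the class frequencies. A short sketch using the standard library:

```python
from itertools import accumulate

freqs = [2200, 800, 770, 200, 20, 10]   # class frequencies, A through F

# Running total: each entry is the number of calls at or below that class.
cumulative = list(accumulate(freqs))

print(cumulative)
```

The last entry must equal the sample size, which is a quick check on the addition.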

b. Calculate the Mean (1): \(\bar{x} = \frac{\sum fx}{n} = \frac{152100}{4000} = 38.025\)

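c. Calculate the Median (2): position \(= 0.5(n + 1) = 0.5(4001) = 2000.5\). This is above \(F = 0\) and below \(F = 2200\) so the interval is A, 0-29.99. With \(L\) the lower limit of the interval, \(F\) the cumulative frequency below it, \(f\) its frequency and \(w\) its width, median \(= L + \left[\frac{pn - F}{f}\right]w\) so median \(= 0 + \left[\frac{0.5(4000) - 0}{2200}\right]30 = 27.27\)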

d. Calculate the Mode (1) The mode is the midpoint of the largest group. Since 2200 is the largest frequency, the modal group is 0 to 29.99 and the mode is 15.000.

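e. Calculate the Variance (3): \(s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n - 1} = \frac{9288000 - 4000(38.025)^2}{3999} = \frac{3504397.5}{3999} = 876.318\) or \(s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{3504398}{3999} = 876.318\)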



f. Calculate the Standard Deviation (2): \(s = \sqrt{876.318} = 29.603\)
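The grouped mean, variance and standard deviation can be reproduced from the class midpoints and frequencies. This sketch computes the variance both ways (computational and definitional formula) so the two can be compared; variable names are illustrative.

```python
import math

midpoints = [15, 45, 75, 105, 135, 165]   # class midpoints x
freqs = [2200, 800, 770, 200, 20, 10]     # class frequencies f

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

mean = sum_fx / n

# Computational formula for the sample variance.
var_computational = (sum_fx2 - n * mean**2) / (n - 1)

# Definitional formula: weighted squared deviations from the mean.
var_definitional = sum(f * (x - mean) ** 2
                       for f, x in zip(freqs, midpoints)) / (n - 1)

sd = math.sqrt(var_computational)

print(n, sum_fx, sum_fx2, mean, round(var_computational, 3), round(sd, 3))
```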

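g. Calculate the Interquartile Range (3): First Quartile: position \(= 0.25(n + 1) = 0.25(4001) = 1000.25\). This is above \(F = 0\) and below \(F = 2200\) so the interval is A, 0-29.99. \(L + \left[\frac{pn - F}{f}\right]w\) gives us \(Q_1 = 0 + \left[\frac{0.25(4000) - 0}{2200}\right]30 = 13.636\).

Third Quartile: position \(= 0.75(n + 1) = 0.75(4001) = 3000.75\). This is above \(F = 3000\) and below \(F = 3770\) so the interval is C, 60-89.99. \(Q_3 = 60 + \left[\frac{0.75(4000) - 3000}{770}\right]30 = 60\). \(IQR = Q_3 - Q_1 = 60 - 13.636 = 46.364\).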

(New Formula: For the median, this gives the same result as above. For the first quartile, this leads to interval A and the same result as above. For the third quartile, this leads to interval C and the same result as above.)
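The grouped median and quartiles all use the same interpolation inside a class. The sketch below assumes the target count is \(p \cdot n\) and picks the first class whose cumulative frequency reaches it; when the target falls exactly on a class boundary (as for the third quartile here) this selects the lower class, but the interpolated value is the same as starting from interval C. The function name is illustrative.

```python
from itertools import accumulate

lower_limits = [0, 30, 60, 90, 120, 150]
freqs = [2200, 800, 770, 200, 20, 10]
width = 30

n = sum(freqs)
cumulative = list(accumulate(freqs))

def grouped_quantile(p):
    """Interpolate the value with fraction p of the data below it."""
    target = p * n
    for i, cum_through_class in enumerate(cumulative):
        if cum_through_class >= target:
            below = cumulative[i - 1] if i > 0 else 0   # F: count below the class
            return lower_limits[i] + (target - below) / freqs[i] * width
    raise ValueError("p must be between 0 and 1")

median = grouped_quantile(0.5)
q1 = grouped_quantile(0.25)
q3 = grouped_quantile(0.75)
iqr = q3 - q1

print(round(median, 2), round(q1, 3), q3, round(iqr, 3))
```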

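h. Calculate a Statistic showing Skewness and interpret it (3): \(k_3 = \frac{n}{(n-1)(n-2)}\sum f(x - \bar{x})^3 = \frac{4000}{(3999)(3998)}(111136864) \approx 27805\).

or \(k_3 = \frac{n}{(n-1)(n-2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{4000}{(3999)(3998)}\left[730822528 - 3(38.025)(9288000) + 2(4000)(38.025)^3\right] \approx 27805\)

or \(g_1 = \frac{k_3}{s^3} = \frac{27805}{(29.603)^3} \approx 1.07\)

or Pearson's Measure of Skewness \(SK = \frac{3(\text{mean} - \text{median})}{\text{std. deviation}} = \frac{3(38.025 - 27.27)}{29.603} \approx 1.09\)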

Because of the positive sign, the measures imply skewness to the right.
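The skewness measures can be checked numerically. This is a sketch under stated assumptions: it uses the sample third-moment formula with the \(n/((n-1)(n-2))\) correction, the relative skewness \(k_3/s^3\), and the form of Pearson's measure based on three times the difference between mean and median; the median is taken from the grouped interpolation in interval A.

```python
import math

midpoints = [15, 45, 75, 105, 135, 165]
freqs = [2200, 800, 770, 200, 20, 10]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
var = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
sd = math.sqrt(var)

# Third-moment measure of skewness.
sum_cubed_dev = sum(f * (x - mean) ** 3 for f, x in zip(freqs, midpoints))
k3 = n / ((n - 1) * (n - 2)) * sum_cubed_dev

# Relative skewness (unit-free).
g1 = k3 / sd**3

# Grouped median: interval A starts at 0, holds 2200 calls, width 30.
median = 0 + (0.5 * n - 0) / 2200 * 30

# Pearson's measure of skewness (mean/median version).
pearson = 3 * (mean - median) / sd

print(round(k3), round(g1, 2), round(pearson, 2))
```

All three statistics come out positive, consistent with the long tail toward the longer hold times.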

i. Make a frequency polygon of the Data (Neatness Counts!)(2) A frequency polygon is a line graph of the frequency \(f\) against the class midpoint \(x\). It should hit zero on the left at \(x = -15\), but this point will not show if the x axis starts at zero. The next point is at \(x = 15\), where the height is 2200. It falls after that (the next point is 800 at \(x = 45\)) and hits zero at \(x = 195\), which may be hard to show. In general, it is difficult to put a consistent scale on the y-axis because of the extreme differences in the values of \(f\). Putting the y-axis on a logarithmic scale with the distances 1 to 10, 10 to 100, 100 to 1000 and 1000 to 10000 equal would help. This might be a bit hard and messy, however, without some appropriate graph paper.

A copy of the frequency polygon as done by Minitab appears on the next page, but I would prefer to see the x-axis and the y-axis start at zero and the x-points marked as 15, 45, 75, etc.
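The points to be plotted can be listed before drawing. The sketch below assumes the usual convention of closing the polygon at zero one class width beyond the first and last midpoints; it only prints the coordinates and leaves the drawing to graph paper or a plotting package.

```python
midpoints = [15, 45, 75, 105, 135, 165]
freqs = [2200, 800, 770, 200, 20, 10]
width = 30

# Close the polygon with zero-frequency points at the midpoints of the
# empty classes just below the first class and just above the last one.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + freqs + [0]

points = list(zip(xs, ys))

for x, y in points:
    print(f"{x:>4} {y:>5}")
```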

