Practice Problems for Exam 2
Paraphrased problems from EMBS.
30. (p 484) A large auto insurance company selected random samples of single and married male policy-holders and recorded the number who made a claim over the preceding three-year period.
Single Male Policy-Holders / Married Male Policy-Holders
n=400 / n=900
Number making claims = 76 / Number making claims = 90
Is the observed difference in claim rates statistically significant?
Policy-holder / claim / none / total
single / 76 / 324 / 400
married / 90 / 810 / 900
total / 166 / 1134 / 1300
EXPECTED
51.1 / 348.9
114.9 / 785.1
DISTANCES
12.2 / 1.8
5.4 / 0.8
Calculated Chi-squared
20.1
Pvalue = chidist(20.1,1)
7.20624E-06
pvalue = chitest(observed,expected)
7.20624E-06
YES, we reject the null hypothesis of independence because the p-value is less than 0.05. The difference in sample claim rates is statistically significant.
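The worksheet steps above (expected counts, distances, chi-squared, p-value) can be retraced in a short Python sketch. It uses only the standard library, and the variable names are illustrative rather than taken from the book. The p-value line stands in for Excel's chidist and relies on the identity that, for 1 degree of freedom, the chi-squared upper tail equals erfc(sqrt(x/2)).

```python
import math

# Observed counts from problem 30: rows = single, married; columns = claim, none.
observed = [[76, 324],
            [90, 810]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# "Distance" for each cell = (observed - expected)^2 / expected.
distances = [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
             for obs_row, exp_row in zip(observed, expected)]

# The chi-squared statistic is the sum of the four distances.
chi_squared = sum(sum(row) for row in distances)

# For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_squared / 2))

print(round(chi_squared, 1), p_value)
```

With a 2-by-2 table the degrees of freedom are (2-1)*(2-1) = 1, which is why the one-line erfc shortcut is enough here.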
40. (p 486) The Wall Street Journal Subscriber Study gathered data on the employment status of a random sample of subscribers. Sample results were broken out for subscribers of the eastern and western editions.
Region
EMPLOYMENT STATUS / Eastern Edition / Western Edition
Full-Time / 1105 / 574
Part-Time / 31 / 15
Self Employed / 229 / 186
Not employed / 485 / 344
Test the hypothesis that employment status is independent of region.
EXPECTED
1046.2 / 632.8 / 1679
28.7 / 17.3 / 46
258.6 / 156.4 / 415
516.6 / 312.4 / 829
1850 / 1119 / 2969
DISTANCES
3.31 / 5.46
0.19 / 0.32
3.39 / 5.60
1.93 / 3.19
Calculated Chi-square
23.4
p-value=chidist(23.4,3)
3.3759E-05
p-value=chitest(observed,expected)
3.3759E-05
We reject the null hypothesis that employment status and region are independent.
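As a cross-check on the table above, here is a minimal Python sketch of the same test for a table of any size. The function names are hypothetical helpers, not library calls. The tail formula is a closed form that holds only for 3 degrees of freedom, which is what a 4-by-2 table gives; other table sizes would need a general chi-squared routine.

```python
import math

# Observed counts from problem 40: rows = full-time, part-time, self employed,
# not employed; columns = eastern edition, western edition.
observed = [[1105, 574],
            [31, 15],
            [229, 186],
            [485, 344]]

def chi_squared_statistic(table):
    """Return (statistic, degrees of freedom) for a test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n   # expected under independence
            stat += (obs - exp) ** 2 / exp            # this cell's "distance"
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi_squared_tail_df3(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

stat, df = chi_squared_statistic(observed)
p_value = chi_squared_tail_df3(stat)
print(round(stat, 1), df, p_value)
```

The two inputs to the p-value are the calculated statistic and the degrees of freedom, (4-1)*(2-1) = 3; the sample size enters only through the expected counts.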
PEP Problem NOT in the Book. THE MARK and UPTOWN KITCHEN are separate restaurants owned by the same firm. The weekend revenue generated by THE MARK is normally distributed with mean $80,000 and standard deviation $10,000. The UPTOWN KITCHEN revenues are normally distributed with mean $100,000 and standard deviation of $20,000. The following questions refer to the total weekend revenue enjoyed by the firm.
a. What is the mean?
b. What is the standard deviation?
c. What is the shape of the distribution?
e. Which of your answers to a,b,c requires revenues from the two restaurants to be independent?
f. Which of your answers to a,b,c requires that revenues at each store are normal?
g. (Difficult?) Next weekend, which restaurant will bring in more revenue?
a. mean of the sum is the sum of the means, always. The mean total revenue is $180,000.
b. The variance of the sum is the sum of the variances IF independent. So if independent revenues, the standard deviation of total revenue is (10000^2+20000^2)^.5 = $22,361
c. Normal. Sums of normal are always normal.
e. Only the answer to b requires independence, so that we can add the variances.
f. Only the answer to c requires normality of individual revenues. The central limit theorem does not apply because n is only 2.
g. The trick to answering this is to realize it is a question about the difference in revenues. The difference in revenues (UPTOWN – MARK) will be normal with mean $20,000 and standard deviation $22,361 (given independence, the variance of a difference is the sum of the variances, so the difference is just as unpredictable as the sum). The difference will be negative (MARK will bring in more revenue) with probability NORMDIST(0,20000,22361,true) = 0.186. So the MARK will bring in more revenue than UPTOWN with probability 0.186 (given independence, which is NOT a reasonable assumption, but without it we cannot do much).
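The arithmetic in a, b, and g can be checked with Python's statistics.NormalDist, whose + and - operators combine two normal variables under the same independence assumption the solution makes. This is only a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

# Weekend revenue for each restaurant, as given in the problem.
mark = NormalDist(mu=80_000, sigma=10_000)
uptown = NormalDist(mu=100_000, sigma=20_000)

# Adding or subtracting NormalDist objects assumes the two are independent:
# means add (or subtract) and variances add in both cases.
total = mark + uptown
difference = uptown - mark

# Probability that the difference is negative, i.e. THE MARK out-earns UPTOWN.
prob_mark_wins = difference.cdf(0)

print(total.mean, round(total.stdev))
print(difference.mean, round(difference.stdev))
print(round(prob_mark_wins, 3))
```

The sum and the difference have the same standard deviation, which is the point made in part g.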
46 (p 341) AARP estimated that the average annual expenditure on restaurants and carryout food was $1,873 for people 50 and over. Suppose this estimate is based on a random sample of 80 people and that the sample standard deviation is $550.
b. What is the 95% confidence interval for the population mean amount spent on restaurants and carryout?
t.inv.2t(.05,79) / 1.99045
s / 550
s/n^.5 / 61.5
margin of error / 122.4
Lower limit / 1750.6
Upper limit / 1995.4
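The interval above is the sample mean plus or minus the critical t value times the standard error. A minimal Python sketch of that arithmetic follows. The critical value is copied from the worksheet rather than computed, because the standard library has no inverse t function; the variable names are illustrative.

```python
import math

sample_mean = 1873      # dollars, from the problem statement
sample_sd = 550         # dollars
n = 80

# Critical value copied from the worksheet: t.inv.2t(.05, 79) = 1.99045.
t_critical = 1.99045

standard_error = sample_sd / math.sqrt(n)
margin_of_error = t_critical * standard_error
lower_limit = sample_mean - margin_of_error
upper_limit = sample_mean + margin_of_error

print(round(standard_error, 1), round(margin_of_error, 1))
print(round(lower_limit, 1), round(upper_limit, 1))
```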
d. If the distribution of amounts is positively skewed (not normal), would you expect the sample median amount to be greater or less than $1,873?
We would expect the sample median to be less than the sample mean for a positively skewed distribution.
e. (not in the book). If the distribution of amounts is positively skewed (not normal), does this make your answer to b.) invalid? Briefly explain.
A normal population is not required for our confidence interval to be valid. By the central limit theorem, sample means are approximately normal when n is large, regardless of the shape of the population distribution. Here n = 80, which is large enough for the central limit theorem to apply.
52 (p 390). The chamber of commerce of a Florida Gulf Coast community advertises that residential property is available at a mean cost of $125,000 or less per acre. A random sample of 32 properties has a sample mean cost per acre of $130,000 and sample standard deviation of $12,500. Comment on the validity of the advertising statement.
H0 is mean=125 and Ha is mean > 125 (in thousands of dollars). T-stat is (130-125)/(12.5/32^.5) = 2.26. Pvalue = t.dist.rt(2.26,31) = 0.015. This is statistically significant at the 0.05 level. We can reject the hypothesis that the claim is truthful in favor of the alternative hypothesis that the mean is greater than $125K.
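The one-sample t-test above can be reproduced without a statistics package. In the sketch below, t_pdf and t_right_tail are hypothetical helpers standing in for Excel's t.dist.rt: the right-tail probability is obtained by numerically integrating the Student t density with Simpson's rule, so it is an approximation.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, via Simpson's rule on the density from 0 to t."""
    h = t / steps                       # steps must be even for Simpson's rule
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability lies above 0; subtract the part between 0 and t.
    return 0.5 - area * h / 3

# Problem 52, in thousands of dollars.
claimed_mean, sample_mean, sample_sd, n = 125, 130, 12.5, 32

t_stat = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))
p_value = t_right_tail(t_stat, n - 1)
print(round(t_stat, 2), round(p_value, 3))
```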
56 (p391). Virtual call centers are staffed by individuals working from home. Regional Airways is considering switching from its traditional call center to a virtual one but only if a level of customer satisfaction greater than 80% can be maintained. In a test of home agents, 252 out of 300 randomly chosen customers reported being satisfied with their experience with the call center.
a. Formulate a relevant null and alternative hypothesis.
b. What is the sample proportion of satisfied customers?
c. What is the p-value of your test of hypothesis?
d. What is your conclusion?
Let H0 be P (the satisfaction probability) = 0.8 and Ha: P>0.8. Here we hope to reject the null hypothesis. We have a variety of ways to calculate the p-value:
Using the binomial
p-value = Pr(X>=252 given H0) = 1-BINOMDIST(251,300,.8,true) = 0.046.
Using the normal approximation to the binomial
p-value = Pr(X>=252 given H0) = 1-NORMDIST(252,300*.8,(300*.8*.2)^.5,true) = 0.042.
Using the Z-statistic calculated using number satisfied
p-value = Pr(Z>=(252-240)/(300*.8*.2)^.5) = Pr(Z>=1.73) = 1-NORM.S.DIST(1.73,true) = 0.042.
Using the Z-statistic calculated using sample proportion satisfied
p-value = Pr(Z>= (252/300-.8)/(.8*.2/300)^.5) = 1-NORM.S.DIST(1.73,true) = 0.042.
The sample proportion satisfied is 252/300 = 0.84, and our conclusion is that this sample proportion is statistically significantly higher than 0.8.
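The p-value calculations listed above can be compared side by side in a short Python sketch. The exact binomial tail is summed directly with math.comb, and the normal approximations use statistics.NormalDist. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, satisfied, p0 = 300, 252, 0.8   # sample size, successes, null proportion

# Exact binomial: P(X >= 252) when X ~ Binomial(300, 0.8).
p_exact = sum(math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
              for k in range(satisfied, n + 1))

# Normal approximation on the count scale (no continuity correction,
# matching the NORMDIST calculation in the solution).
mean_count = n * p0
sd_count = math.sqrt(n * p0 * (1 - p0))
p_normal = 1 - NormalDist(mean_count, sd_count).cdf(satisfied)

# Z-statistic on the proportion scale; it equals the count-scale z.
p_hat = satisfied / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_from_z = 1 - NormalDist().cdf(z)

print(round(p_hat, 2), round(z, 2))
print(round(p_exact, 3), round(p_normal, 3), round(p_from_z, 3))
```

The three normal-based calculations are the same number reached by different routes; only the exact binomial differs slightly, because it does not rely on the approximation.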
42 (p446). Mutual funds are either load or no-load. Because load mutual funds charge fees not charged by no load funds, the question is whether load funds provide a higher mean return. Returns from a random sample of 30 load and 30 no load mutual funds are provided in an accompanying Excel spreadsheet.
a. Formulate a null and alternative hypothesis such that rejection of the null leads to the conclusion that load funds have higher mean returns.
b. Use the data to test your hypothesis. What is the p-value and your conclusion?
H0: mean returns are equal. Ha: mean return is greater for load funds. This will be a two-sample, one-tailed t-test of the difference in means. The resulting p-value is 0.28, which means we cannot reject the null in favor of load funds having a higher mean return. The difference in sample means is not statistically significant.
44 (p 446). Typical prices of single-family homes in Florida are shown for a random sample of 15 metropolitan areas (Naples Daily News, February 23, 2003). Data are in thousands of dollars and are in the spreadsheet.
Metro Area / Jan-03 / Jan-02
Daytona Beach / 117 / 96
Fort Lauderdale / 207 / 169
Fort Myers / 143 / 129
Fort Walton Beach / 139 / 134
Gainesville / 131 / 119
Jacksonville / 128 / 119
Lakeland / 91 / 85
Miami / 193 / 165
Naples / 263 / 233
Ocala / 86 / 90
Orlando / 134 / 121
Pensacola / 111 / 105
Sarasota-Bradenton / 168 / 141
Tallahassee / 140 / 130
Tampa-St. Petersburg / 139 / 129
Have mean prices changed across the two years? Formulate and test an appropriate hypothesis.
H0: means are equal. Ha: they are not. Use a two sample paired test.
The p-value is 0.000166, so we reject the null hypothesis. There is a statistically significant difference in means.
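A paired test works on the 15 within-city differences rather than on the two columns separately. The sketch below computes the t statistic from the table and approximates the two-tailed p-value by numerically integrating the Student t density. The helpers t_pdf and t_two_tailed are hypothetical stand-ins for a spreadsheet function, not library calls.

```python
import math
from statistics import mean, stdev

# Typical home prices (thousands of dollars) for the 15 metro areas, in table order.
jan_03 = [117, 207, 143, 139, 131, 128, 91, 193, 263, 86, 134, 111, 168, 140, 139]
jan_02 = [96, 169, 129, 134, 119, 119, 85, 165, 233, 90, 121, 105, 141, 130, 129]

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_tailed(t, df, steps=2000):
    """P(|T| >= |t|), via Simpson's rule on the density from 0 to |t|."""
    t = abs(t)
    h = t / steps                       # steps must be even for Simpson's rule
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * area * h / 3

# One difference per metro area: Jan-03 price minus Jan-02 price.
differences = [new - old for new, old in zip(jan_03, jan_02)]
n = len(differences)

mean_diff = mean(differences)
t_stat = mean_diff / (stdev(differences) / math.sqrt(n))
p_value = t_two_tailed(t_stat, n - 1)

print(mean_diff, round(t_stat, 2), p_value)
```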
46 (p 448). A study claimed that self-employed individuals do not experience greater job satisfaction than individuals who are not self-employed. Job satisfaction was measured using 18 questions with answers ranging from 1 to 5. The total score was the measure of job satisfaction. Scores for individuals in four separate professions are given below and in the spreadsheet.
Lawyer / Physical Therapist / Cabinetmaker / Systems Analyst
44 / 55 / 54 / 44
42 / 78 / 65 / 73
74 / 80 / 79 / 71
42 / 86 / 69 / 60
53 / 60 / 79 / 64
50 / 59 / 64 / 66
45 / 62 / 59 / 41
48 / 52 / 78 / 55
64 / 55 / 84 / 76
38 / 50 / 60 / 62
Are the differences in sample mean job satisfaction scores across the four professions statistically significant?
Yes, the p-value from the ANOVA single factor test of equality of the four means is only 0.0061. The differences in sample means ARE statistically significant.
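The single-factor ANOVA compares the variation between the four profession means with the variation within each profession. The sketch below computes the F statistic from the table and approximates its p-value by numerically integrating the F density. The helpers f_pdf and f_right_tail are hypothetical and assume a numerator degrees of freedom above 2, so that the density is finite at zero.

```python
import math

scores = {
    "Lawyer": [44, 42, 74, 42, 53, 50, 45, 48, 64, 38],
    "Physical Therapist": [55, 78, 80, 86, 60, 59, 62, 52, 55, 50],
    "Cabinetmaker": [54, 65, 79, 69, 79, 64, 59, 78, 84, 60],
    "Systems Analyst": [44, 73, 71, 60, 64, 66, 41, 55, 76, 62],
}

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    coef = math.exp(-log_beta) * (d1 / d2) ** (d1 / 2)
    return coef * x ** (d1 / 2 - 1) * (1 + d1 * x / d2) ** (-(d1 + d2) / 2)

def f_right_tail(f, d1, d2, steps=20000):
    """P(F >= f), via Simpson's rule on the density from 0 to f (assumes d1 > 2)."""
    h = f / steps                       # steps must be even for Simpson's rule
    area = f_pdf(0, d1, d2) + f_pdf(f, d1, d2)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1 - area * h / 3

groups = list(scores.values())
all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
p_value = f_right_tail(f_stat, df_between, df_within)

print(round(f_stat, 2), df_between, df_within, round(p_value, 4))
```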