252solngr3 3/7/08 (Open this document in 'Page Layout' view!)
Name:
Class days and time:
Please include this on what you hand in!
Graded Assignment 3
Solution (15 pages)
Part 1: In your outline there are 6 methods to compare means or medians: methods D1, D2, D3, D4, D5a and D5b. Methods D6a and D6b compare proportions and method D7 compares variances or standard deviations. In the following cases, identify $H_0$ and $H_1$ and identify which method to use. If the hypotheses involve a mean, state the hypotheses in terms of both $\mu_1, \mu_2$ and $D = \mu_1 - \mu_2$. If the hypotheses involve a proportion, state them in terms of both $p_1, p_2$ and $\Delta p = p_1 - p_2$. If the hypotheses involve standard deviations or variances, state them in terms of both $\sigma_1^2, \sigma_2^2$ and $\sigma_1^2/\sigma_2^2$ or $\sigma_2^2/\sigma_1^2$. All the questions involve means, medians, proportions or variances. One of these problems is a chi-squared test.
Note: Look at 252thngs on the syllabus supplement part of the website before you start (and before you take exams). Neatness and clarity of explanation are expected. Note that from now on neatness means paper neatly trimmed on the left side if it has been torn, multiple pages stapled and paper written on only one side. This is very similar to Problem D8.
------
Example: This may seem long but it appears on an old graded assignment 3.
A group of supervisors are given exams on management skills before and after taking a course in management. Scores are as follows.
Supervisor / Before / After
1 / 63 / 78
2 / 93 / 92
3 / 84 / 91
4 / 72 / 80
5 / 65 / 69
6 / 72 / 85
7 / 91 / 99
8 / 84 / 82
9 / 71 / 81
10 / 80 / 87
11 / 68 / 93
If we assume that the distribution of results is Normal, what method should we use to answer the question “Has the course improved the scores of the managers?”
Solution: You are comparing means before and after the course. You can get away with using means because the parent distributions are Normal. If $\mu_2$ is the mean of the second sample, you are hoping that $\mu_1 < \mu_2$, which, because it contains no equality, is an alternate hypothesis. So your hypotheses are $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$ or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$. If $D = \mu_1 - \mu_2$, then $H_0: D \ge 0$, $H_1: D < 0$. The important thing to notice here is that the data are in before and after pairs, so you use Method D4.
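The paired calculation for this example can be sketched as follows. This is a minimal illustration, not the outline's worked solution: it assumes the usual paired-difference t statistic, takes each difference as before minus after (so an improvement shows up as a negative mean difference), and the variable names are invented for the sketch.

```python
from math import sqrt
from statistics import mean, stdev

# Scores copied from the table in the example.
before = [63, 93, 84, 72, 65, 72, 91, 84, 71, 80, 68]
after = [78, 92, 91, 80, 69, 85, 99, 82, 81, 87, 93]

# Paired differences d = before - after, matching D = mu1 - mu2.
d = [b - a for b, a in zip(before, after)]

n = len(d)
d_bar = mean(d)               # sample mean of the differences
s_d = stdev(d)                # sample standard deviation (n - 1 divisor)
t = d_bar / (s_d / sqrt(n))   # paired t statistic
df = n - 1                    # degrees of freedom

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.3f}, df = {df}")
```

Because the alternative is $D < 0$, the t statistic would be compared with a left-tail critical value of t with 10 degrees of freedom.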
------
General considerations.
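1) All methods in section D are methods that can be used only for comparison of 2 samples. This is because, if $\theta$ (theta) is a parameter like $\mu$ or $p$, $\Delta\theta = \theta_1 - \theta_2$ is easy to define and will be zero if $\theta_1$ and $\theta_2$ are equal. If we go to more than two samples, say 3, a simple difference such as $\theta_1 - \theta_2 - \theta_3$ will not be zero when $\theta_1 = \theta_2 = \theta_3$; we need something like $\sum (\theta_i - \bar\theta)^2$, where $\bar\theta$ is some sort of average of the parameters of the samples. This will equal zero if all the parameters are equal and will not allow positive discrepancies in one sample to cancel out negative discrepancies in another. This is what takes us to chi-squared and ANOVA methods.
Saying $\theta_1 < \theta_2$ is not the same as saying $\theta_1 - \theta_2 > 0$, because $\theta_1 - \theta_2$ would be negative if $\theta_1 < \theta_2$, but saying $\theta_1 < \theta_2$ is the same as saying $\theta_1 - \theta_2 < 0$. (Try proving this; it’s simple algebra.)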
2) You can always substitute a method for the median for a method for the mean, but not vice versa.
However, if a Normal distribution applies, a method involving means will be more efficient and powerful.
3) The computer will use Method D3 when it is not told what method to use. This is quite general because if the sample variances are similar, it gives results like D2 and if the sample sizes are large, it gives results like D1. However, if variances are equal D2 is easier to use and if the samples are large D1 is easier to use.
4) The K-S and Lilliefors methods only exist because chi-square performs so poorly for small samples. K-S needs $\mu$ and $\sigma$ or other parameters. Lilliefors uses $\bar{x}$ and $s$ and only works to test for a Normal distribution.
5) ‘Significant’ in statistics means that we have rejected a hypothesis like $H_0: \theta = 0$ and ‘significantly different’ means that we have rejected a hypothesis like $H_0: \theta_1 = \theta_2$. Of course, if two parameters are significantly different, their difference is significant. Remember, if we are saying that a difference is significant, we are saying that a difference as large or larger than what we observe is very unlikely under our null hypothesis, and that a p-value tells us the probability of getting a difference as large or larger if the null hypothesis is true.
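6) Be careful of inequalities. If $\mu_1 < \mu_2$ or, equivalently, $\mu_2 > \mu_1$, and $D = \mu_1 - \mu_2$, then $D < 0$. Please remember:
A hypothesis containing $<$, $>$ or $\ne$ is an alternative hypothesis. The null hypothesis will contain $=$, $\le$ or $\ge$.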
7) In most problems you are better off trying to figure out what the alternative hypothesis is before you try to state the null hypothesis.
8) Do not lose sight of the fact that the purpose of samples is to compare populations. We may look at numbers in methods D6b and chi-squared tests, but our purpose is to deal with proportions of a population.
1. Dora Jarr and Daughters is a maker of components for automobile dashboards. When Dora retired, her company’s stock became publicly traded. A sample of 160 stock analysts were asked whether they rated the stock as a ‘buy’ in 2007 and again in 2008. 79 analysts rated the stock a ‘buy’ in both 2007 and 2008. 15 analysts recommended it as a ‘buy’ in 2007 but not in 2008. 9 analysts upgraded the stock to a ‘buy’ in 2008. The remaining analysts did not consider the stock a ‘buy’ in either year. Can we say that the proportion of analysts who favor the stock has fallen?
Solution: This can be called a paired comparison of proportions and the method is D6b, the McNemar Test. Let $p_1$ represent the proportion of analysts that rated the stock as a ‘buy’ in 2007 and $p_2$ represent the proportion of analysts that rated the stock as a ‘buy’ in 2008. We want to test to see if $p_1 > p_2$. This is an alternative hypothesis because it does not contain an equality. The null hypothesis is the opposite, $p_1 \le p_2$. 79 analysts rated the stock a ‘buy’ in both 2007 and 2008. 15 analysts recommended it as a ‘buy’ in 2007 but not in 2008. 9 analysts upgraded the stock to a ‘buy’ in 2008. The remaining $160 - 79 - 15 - 9 = 57$ analysts did not consider the stock a ‘buy’ in either year. Our hypotheses are given along with the table to be analyzed. $H_0: p_1 \le p_2$, $H_1: p_1 > p_2$ or, if $\Delta p = p_1 - p_2$, $H_0: \Delta p \le 0$, $H_1: \Delta p > 0$.
2007 / 2008 Buy / 2008 Not buy / Total
Buy / 79 / 15 / 94
Not buy / 9 / 57 / 66
Total / 88 / 72 / 160
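As a rough illustration of the McNemar calculation, the sketch below uses only the two discordant counts from the problem (15 analysts who dropped the ‘buy’ rating and 9 who upgraded). It assumes the common large-sample form of the statistic, $z = (x_{12} - x_{21})/\sqrt{x_{12} + x_{21}}$, which may differ in detail from the version in the outline; the variable names are invented here.

```python
from math import erf, sqrt

# Discordant pairs from the problem statement.
x12 = 15  # 'buy' in 2007 but not in 2008
x21 = 9   # not a 'buy' in 2007 but a 'buy' in 2008

# Large-sample McNemar statistic (assumed form, see lead-in).
z = (x12 - x21) / sqrt(x12 + x21)

# Right-tail p-value P(Z > z) for H1: p1 > p2, using the standard Normal cdf.
p_value = 0.5 * (1.0 - erf(z / sqrt(2.0)))

print(f"z = {z:.4f}, one-sided p-value = {p_value:.4f}")
```

The 79 and 57 analysts whose rating did not change play no part in the statistic, which is why only the off-diagonal cells appear in the code.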
2. Of a sample of 200 MBA students, 110 are males. Of a sample of 500 managers, 300 are males. Is there a significant difference between the fraction of males in the population of MBA students and the population of managers? (What are $H_0$ and $H_1$ and what is the identifier of the method you would use?)
Solution: You are comparing two proportions. If $p_1$ refers to the proportion of males among MBA students and $p_2$ refers to the proportion of males among managers, $H_0: p_1 = p_2$, $H_1: p_1 \ne p_2$ or $H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 \ne 0$. If $\Delta p = p_1 - p_2$, then $H_0: \Delta p = 0$, $H_1: \Delta p \ne 0$. Since we are comparing proportions of unrelated samples, use Method D6a.
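A minimal sketch of the two-sample proportion test for these counts is given below. It assumes the usual pooled-proportion z statistic for a null hypothesis of equal proportions; the names are illustrative and not taken from the outline.

```python
from math import sqrt

# Sample counts from the problem statement.
x1, n1 = 110, 200   # males among MBA students
x2, n2 = 300, 500   # males among managers

p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled proportion, appropriate when H0 says p1 = p2.
p0 = (x1 + x2) / (n1 + n2)
std_err = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))

z = (p1_hat - p2_hat) / std_err

print(f"p1_hat = {p1_hat:.3f}, p2_hat = {p2_hat:.3f}, z = {z:.4f}")
```

Since the alternative is two-sided, the absolute value of z would be compared with the two-tail Normal critical value for the chosen significance level.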
3. We add a sample of 100 CEOs to the data in 2. 80 of them are males. Can we say that there is a significant difference between the proportion of males in all three groups?
Solution: You are comparing three proportions. If $p_1$ refers to the proportion of males among MBA students, $p_2$ refers to the proportion of males among managers and $p_3$ refers to the proportion of males among CEOs, $H_0: p_1 = p_2 = p_3$ and $H_1$: not all of the proportions are equal. Since we are comparing proportions of more than two unrelated samples, use a chi-squared test of homogeneity.
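The chi-squared test of homogeneity for the three groups can be sketched as below. This is an illustration under the standard Pearson formula, with expected counts computed as row total times column total divided by the grand total; the layout and names are assumptions made for the sketch.

```python
# Counts from problems 2 and 3: MBA students, managers, CEOs.
males = [110, 300, 80]
sizes = [200, 500, 100]

# Observed table: first row males, second row females.
observed = [males, [n - m for n, m in zip(sizes, males)]]
grand_total = sum(sizes)

chi_sq = 0.0
for row in observed:
    row_total = sum(row)
    for obs, col_total in zip(row, sizes):
        expected = row_total * col_total / grand_total
        chi_sq += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(sizes) - 1)

print(f"chi-squared = {chi_sq:.3f} with {df} degrees of freedom")
```

The statistic would be compared with the upper-tail chi-squared critical value for 2 degrees of freedom (5.991 at the 5% level).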
4. You have two machines that plop fruit into bottles, a new one and an old one. A sample of weights of 10 bottles from the old machine is taken; the average weight is 971.375 grams with a standard deviation of 15.250 grams. A sample of weights is taken from the new machine and the average weight turns out to be 971.374 grams with a standard deviation of 11.001 grams. If variability is a measure of reliability, can we say that the new machine is more reliable than the old one?
Solution: The variance (or standard deviation) is a measure of variability, which is the opposite of consistency or reliability, especially in this case where the difference in means hardly exists. You need to test the equality of the variances. ‘Less consistent or reliable’ means a larger variance, so the new machine (machine 2) being more reliable than the old one translates as $\sigma_1^2 > \sigma_2^2$, so we have $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$ or $H_0: \sigma_1 \le \sigma_2$, $H_1: \sigma_1 > \sigma_2$. In terms of the variance ratio you are testing $H_0: \sigma_1^2/\sigma_2^2 \le 1$, $H_1: \sigma_1^2/\sigma_2^2 > 1$, and you will do this by comparing $s_1^2/s_2^2$ against $F_{\alpha}^{(n_1-1,\,n_2-1)}$. This is Method D7.
Question for a Later Exam: The F-test assumes that the underlying distribution is Normal. What if you doubt that the Normal Distribution applies? Solution: Levene Test.
Question for a Later Exam: What if there are 3 machines? Solution: Levene or Bartlett Test.
5. The Wallaby Shock Absorber company takes 6 of its own shock absorbers and tests them for durability by driving different cars 20000 miles with them. The mean and standard deviation of the strength of the shocks were recorded, giving a mean of 10.716 and a standard deviation of 3.069. 6 of a competitor’s shocks were tested the same way, and a mean of 10.300 and a standard deviation of 3.304 were found. The manufacturer wants to compare the means, and assumes an underlying Normal distribution, but needs to find out first whether to use method D2 or D3. What should the manufacturer do to decide?
Solution: To make the decision as to which method to use, you need to test the equality of the variances. $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$ or $H_0: \sigma_1 = \sigma_2$, $H_1: \sigma_1 \ne \sigma_2$. In terms of the variance ratio $\sigma_1^2/\sigma_2^2$ or $\sigma_2^2/\sigma_1^2$, you are testing $H_0: \sigma_1^2/\sigma_2^2 = 1$ or $H_0: \sigma_2^2/\sigma_1^2 = 1$. In practice, this is a right-sided test because $s_1 = 3.069$ and $s_2 = 3.304$, so the larger sample variance goes in the numerator. So you will compare $s_2^2/s_1^2$ with $F_{\alpha/2}^{(5,5)}$. This is Method D7.
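The variance-ratio comparison for problem 5 can be sketched as follows. It is only an illustration: the significance level of 0.05 is an assumption (the problem does not state one), the critical value is a table lookup rather than something the code computes, and the names are invented for the sketch.

```python
# Sample standard deviations and sizes from the problem statement.
s1, n1 = 3.069, 6   # Wallaby
s2, n2 = 3.304, 6   # competitor

# Put the larger sample variance in the numerator so the ratio is at least 1.
if s2 >= s1:
    f_ratio = s2 ** 2 / s1 ** 2
    df_num, df_den = n2 - 1, n1 - 1
else:
    f_ratio = s1 ** 2 / s2 ** 2
    df_num, df_den = n1 - 1, n2 - 1

# Assumed alpha = 0.05, so each tail gets 0.025.
# Table value for the upper 0.025 tail of F with (5, 5) df is about 7.15.
f_critical = 7.15

reject_equal_variances = f_ratio > f_critical

print(f"F = {f_ratio:.4f} with ({df_num}, {df_den}) df; "
      f"reject H0: {reject_equal_variances}")
```

Under these assumptions, a ratio below the table value would leave the hypothesis of equal variances standing, which points toward method D2 rather than D3.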
6. The manufacturer in the previous example never did decide what to do. Instead Wallaby continued the experiment by testing 120 of its own shocks and 90 of the competitor’s. For Wallaby’s shocks the mean was now 10.701 with a standard deviation of 3.051. For the competitor the mean was now 10.422 with a standard deviation of 3.043. What method can they now use to compare the average strength of the shocks?
Solution: The word ‘average’ makes you think of the mean. You are comparing two means, with a total sample size of 210. There is no reason to assume that one mean is larger than the other. So $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$. If $D = \mu_1 - \mu_2$, then $H_0: D = 0$, $H_1: D \ne 0$. You can certainly get away with method D1, which works for large sample sizes, and this would be preferred if you must work with a calculator. D3 would also work and would be used on a computer, but it’s more effort. Of course, you could still test for equal variances and use method D2.
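A minimal sketch of the large-sample comparison for problem 6 follows. It assumes the usual large-sample z statistic in which the sample standard deviations stand in for the population values; the names are illustrative.

```python
from math import sqrt

# Summary statistics from the problem statement.
x_bar1, s1, n1 = 10.701, 3.051, 120   # Wallaby
x_bar2, s2, n2 = 10.422, 3.043, 90    # competitor

# Standard error of the difference between two independent sample means.
std_err = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Test statistic for H0: mu1 - mu2 = 0.
z = (x_bar1 - x_bar2) / std_err

print(f"difference = {x_bar1 - x_bar2:.3f}, std error = {std_err:.4f}, z = {z:.4f}")
```

For the two-sided alternative, the absolute value of z would be compared with the two-tail Normal critical value (1.96 at the 5% level).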
Question for a Later Exam: What if we want to compare three or more firms’ shock absorbers? Solution: One-way ANOVA.
7. Assume that the situation is identical to problem 5 above, but that an analysis of the data indicates that the distribution of strengths is highly skewed to the right. What method should be used now to compare the strength of the shocks?
Solution: The skewness of the data should alert you to the fact that you should compare medians rather than means, since it is unlikely for this small a sample that the sample means are normally distributed. If $\eta_1$ is the median for Wallaby’s shocks and $\eta_2$ is the median for the competitor’s shocks, we have $H_0: \eta_1 = \eta_2$, $H_1: \eta_1 \ne \eta_2$. Since we are comparing medians and the data are not paired, use Method D5a.
Question for a Later Exam: What if we want to compare three or more firms’ shock absorbers? Solution: Kruskal-Wallis test.
8. A sample of ten customers are asked to rate their experience with two service firms on a scale of one to ten. Scores are as follows.
Customer / Firm 1 / Firm 2
1 / 4 / 7
2 / 5 / 6
3 / 9 / 6
4 / 5 / 7
5 / 4 / 5
6 / 9 / 8
7 / 5 / 4
8 / 7 / 6
9 / 3 / 5
10 / 3 / 6
If we assume that the distribution of results is Normal, what method should we use to see if there is a difference between average ratings customers give to the firms?
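Solution: You are comparing the mean ratings that the same ten customers give to the two firms. You can get away with using means because the parent distributions are Normal. If $\mu_1$ is the mean rating of the first firm and $\mu_2$ is the mean rating of the second firm, you are asking whether $\mu_1 \ne \mu_2$, which, because it contains no equality, is an alternate hypothesis. So your hypotheses are $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$. If $D = \mu_1 - \mu_2$, then $H_0: D = 0$, $H_1: D \ne 0$. The important thing to notice here is that each customer rates both firms, so the data are in pairs and you use Method D4.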
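For the customer ratings, the paired-difference statistic can be sketched as below. This is an illustration only: it assumes the first column of scores belongs to the first firm and the second column to the second firm, and it uses the usual paired t statistic with invented variable names.

```python
from math import sqrt
from statistics import mean, stdev

# Ratings copied from the table, one pair per customer.
firm1 = [4, 5, 9, 5, 4, 9, 5, 7, 3, 3]
firm2 = [7, 6, 6, 7, 5, 8, 4, 6, 5, 6]

# Differences within each customer (firm 1 minus firm 2).
d = [a - b for a, b in zip(firm1, firm2)]

n = len(d)
d_bar = mean(d)
s_d = stdev(d)                # sample standard deviation (n - 1 divisor)
t = d_bar / (s_d / sqrt(n))   # paired t statistic for H0: D = 0
df = n - 1

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.4f}, df = {df}")
```

Because the question only asks whether there is a difference, the absolute value of t would be compared with a two-tail critical value of t with 9 degrees of freedom.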