STAT 3900/4950 Homework/Lab 6 Spring, 2015
Question #1
(a)
Table 1 Table 2
a. The observed counts are listed in the Observed N column of Table 1.
b. In Table 2, the p-value is reported as .000 (that is, less than .0005).
c. In Table 2, the chi-square test statistic is 27.125.
Since the p-value is very small (less than 0.01), we reject the null hypothesis of equal preference and conclude that the three cooking types are not equally preferred.
(b) The expected frequency is 16 for each chip cooking type.
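The expected count and the test statistic in part (a) can be cross-checked by hand. The following is a minimal Python sketch, not part of the original SPSS/SAS workflow. It assumes the observed counts 7, 33 and 8 from the appendix data, and it relies on the fact that the upper-tail probability of a chi-square distribution with 2 degrees of freedom is exp(-x/2). The function name is illustrative only.

```python
import math

def gof_equal_proportions(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    n = sum(observed)
    expected = n / len(observed)  # same expected count in every category
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return expected, stat

# Assumption: counts taken from the datalines in the SAS appendix.
observed = [7, 33, 8]
expected, stat = gof_equal_proportions(observed)

# For df = 2 the chi-square upper-tail probability has the closed form exp(-x/2).
p_value = math.exp(-stat / 2)

print(f"expected = {expected}, chi-square = {stat}, p = {p_value:.2e}")
```

The p-value comes out on the order of one in a million, which is why SPSS displays it as .000.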
(c) After combining type 1 and type 3:
Chip Count
Canola oil 33.00
Others 15.00
a. The observed counts are listed in the Observed N column of the left-hand table above.
b. In the right-hand table above, the p-value is .009.
c. In the right-hand table above, the chi-square test statistic is 6.750.
Expected frequencies of canola type and the other two combined are 24 and 24, respectively.
Since the (two-sided) p-value is very small (less than 0.01), we reject the null hypothesis and conclude that chips fried in canola oil are not chosen at the same rate as chips prepared by the other two methods combined. Since the observed frequency for canola oil (33) is higher than its expected frequency (24), the data give strong support to Kristen's hypothesis: individuals prefer potato chips that are fried in canola oil over those that are fried in animal fat or baked.
(d) In summary, chips fried in canola oil were preferred significantly more often than those fried in animal fat or baked. Chips fried in animal fat and baked chips were chosen about equally often (7 and 8 of the 48 choices), so preference between those two types does not differ much.
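The two-category test in part (c) can be reproduced the same way. This is a hedged Python sketch rather than the SPSS procedure itself. It assumes the combined counts 33 and 15 shown above, and it uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal, so its upper-tail probability is erfc(sqrt(x/2)).

```python
import math

# Assumption: combined counts from part (c); "others" pools types 1 and 3.
observed = {"canola": 33, "others": 15}

n = sum(observed.values())
expected = n / len(observed)  # 50/50 split under the null hypothesis

stat = sum((o - expected) ** 2 / expected for o in observed.values())

# Upper-tail probability of a chi-square distribution with df = 1.
p_value = math.erfc(math.sqrt(stat / 2))

print(f"expected = {expected}, chi-square = {stat}, p = {p_value:.4f}")
```

The result agrees with the reported statistic of 6.750 and a p-value that rounds to .009.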
Question #2
(a) From the SPSS output
1. The % of female students who took some advanced math classes is 13.1%
2. Among female students raised by their fathers, the % who took no advanced math classes is 70%
3. The % of female students raised by their father is 23.1%
4. The Chi-square statistic is 9.82
5. The two variables are dependent since all the p-values for these tests are smaller than 0.05.
(b) The side-by-side bar graph is shown below:
(c) The proportion of female high school students who take advanced math courses depends on how they were raised. A larger proportion of the female students raised by both their father and mother took advanced math courses than of those raised primarily by their father.
Question #3
H0: There is marginal homogeneity. The use of vitamins is not associated with having disease X.
HA: There is no marginal homogeneity.
SPSS output:
SAS output:
McNemar's Test
Statistic (S)           11.4286
DF                      1
Pr > S                  0.0007

Simple Kappa Coefficient
Kappa                   0.3348
ASE                     0.0450
95% Lower Conf Limit    0.2466
95% Upper Conf Limit    0.4229
Conclusion: Reject H0. The McNemar chi-square statistic, 11.4286 (df = 1), is large, with p-value = 0.0007. Cases were less likely than controls to use vitamins (34.1% vs. 43.2%; see the table below). In other words, not using vitamins is associated with a higher likelihood of having disease X.
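The McNemar statistic depends only on the discordant pairs, so it is easy to verify by hand. The following is a minimal Python sketch, assuming the matched-pair counts given in the SAS appendix (case/control vitamin use: y y 100, y n 50, n y 90, n n 200). The variable names are illustrative and the continuity correction is not applied, matching the uncorrected value reported by SAS.

```python
import math

# Assumption: (case, control) -> number of matched pairs, from the SAS appendix.
pairs = {("y", "y"): 100, ("y", "n"): 50, ("n", "y"): 90, ("n", "n"): 200}
n = sum(pairs.values())

# Discordant pairs: exactly one member of the pair used vitamins.
b = pairs[("y", "n")]
c = pairs[("n", "y")]

# Uncorrected McNemar statistic, chi-square with df = 1 under H0.
stat = (b - c) ** 2 / (b + c)
p_value = math.erfc(math.sqrt(stat / 2))

# Marginal proportions of vitamin use among cases and among controls.
case_rate = (pairs[("y", "y")] + pairs[("y", "n")]) / n
control_rate = (pairs[("y", "y")] + pairs[("n", "y")]) / n

print(f"S = {stat:.4f}, p = {p_value:.4f}")
print(f"case use = {100 * case_rate:.1f}%, control use = {100 * control_rate:.1f}%")
```

The marginal percentages printed here are the 34.1% and 43.2% quoted in the conclusion.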
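SPSS output: Analyze > Descriptive Statistics > Crosstabs, choose the row and column variables, and then tick Kappa under “Statistics”
The kappa coefficient for agreement in vitamin use between the case and control groups is .335 (95% CI .247 to .423). By the commonly used Landis and Koch benchmarks (0.21 to 0.40), this is a fair level of agreement rather than a strong one.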
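The kappa value can be recomputed from the same 2x2 table of matched pairs. This is a short Python sketch under the assumption that the counts are those in the SAS appendix; it uses the standard definition kappa = (po - pe) / (1 - pe), where po is the observed agreement and pe is the agreement expected by chance from the margins.

```python
# Assumption: (case, control) -> number of matched pairs, from the SAS appendix.
pairs = {("y", "y"): 100, ("y", "n"): 50, ("n", "y"): 90, ("n", "n"): 200}
n = sum(pairs.values())
levels = ["y", "n"]

# Observed agreement: both members of the pair give the same answer.
p_observed = sum(pairs[(k, k)] for k in levels) / n

# Chance agreement: product of the case and control marginal proportions.
case_margin = {k: sum(pairs[(k, j)] for j in levels) / n for k in levels}
control_margin = {k: sum(pairs[(i, k)] for i in levels) / n for k in levels}
p_chance = sum(case_margin[k] * control_margin[k] for k in levels)

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"po = {p_observed:.4f}, pe = {p_chance:.4f}, kappa = {kappa:.4f}")
```

The computed value matches the Kappa entry in the SAS output to four decimal places.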
Appendix: SAS Code
* Question 1;
/* (a), (b)*/
data chip;
input type $ count;
datalines;
1 7
2 33
3 8
;
run;
proc freq data=chip;
weight count;
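/* CHISQ on a one-way table tests equal proportions (expected count 16 each), */
/* which avoids the rounded percentages 33.3, 33.3, 33.4 in TESTP=            */
table type / chisq;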
run;
/* (c) */
data chip2;
set chip;
if type='1' or type='3' then type2=0;
else type2=1;
run;
proc freq data=chip2;
weight count;
table type2/testp=(50 50);
run;
* Question 3;
data matched;
input case $ control $ count;
datalines;
y y 100
y n 50
n y 90
n n 200
;
run;
proc freq data=matched;
weight count;
table case*control/agree;
run;