Assignment 2, Inferential Statistics I
Student Name:
Grade:
[Please download a copy of this document and put all your answers in this document.]
[You can study with other students on this assignment, but you must write the answers yourself. If your assignment paper is the same as another student's, both of your grades will be low.] When a question asks for software output, place it right after your answer for each part of the question. Save the assignment using your last name and first-name initial as part of the file name. For example, if I were to submit the assignment, the file name would be A2_ChangA.doc.
Part I
In a study, the researchers wish to see whether the percentage of registered voters in a population who were in favor of candidate A in an election was more than 40%. A random sample of registered voters in this population was taken. Among the 800 people who participated in the survey, 380 voted for candidate A.
a) Report the 95% confidence interval for estimating the percentage of people in the population who would vote for candidate A. Please use the point estimate ± margin of error format to report the confidence interval. (Use the asymptotic method in the R Hmisc package as described in the R instruction video.)
[Place your R output here.]
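As a reminder of what the point estimate ± margin of error format means, the arithmetic of an asymptotic (Wald) interval can be sketched as follows. This is only an illustration in Python using the standard library, not the R Hmisc output the question asks for; the function name wald_interval and the counts 120 out of 300 are hypothetical and are not the survey data.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, conf_level=0.95):
    """Return (point_estimate, margin_of_error) for a proportion."""
    p_hat = successes / n
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin

# Hypothetical counts, chosen only to show the reporting format.
p_hat, margin = wald_interval(120, 300)
print(f"{p_hat:.3f} ± {margin:.3f}")
```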
b) If the researchers would like to estimate a sample size for the study, and there is no prior knowledge of the proportion, how large a sample would be needed to construct a confidence interval for estimating the percentage of people who will be in favor of candidate A with a 95% confidence level and a 2% margin of error?
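One common way to approach this kind of question, assuming the usual normal-approximation formula n = z² p(1 − p) / E² with the conservative choice p = 0.5 when nothing is known about the proportion, can be sketched as follows. The function name and the example inputs (90% confidence, 5% margin of error) are hypothetical and deliberately differ from the values in the question; check the formula against the lecture note before relying on it.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, conf_level=0.95, p=0.5):
    """Smallest n whose Wald margin of error is at most `margin`.

    p = 0.5 is the conservative choice when no prior estimate exists,
    because p * (1 - p) is largest there.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Always round up: rounding down would exceed the target margin.
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Hypothetical inputs: 90% confidence, 5% margin of error.
print(sample_size_for_proportion(0.05, conf_level=0.90))
```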
Part II
A research team collected information from a random sample of 50 students drawn from 11th graders in Ohio. The survey questions can be found at the following link:
The data in csv format can be downloaded from the following link:
A “csv” file is a comma-separated values text file. Each data value is separated by a comma. Therefore, you need to check Comma for Field Separator when loading the file. Survey data often have errors. If you find an error in a data value, you must remove it with justification.
a) Perform a t-test for the mean, using 5% as the level of significance, to see if the average arm span for the 11th graders in Ohio is more than 160 cm. You must state the null and alternative hypotheses, check the normality assumption, report the test statistic value, report the p-value, and draw a proper conclusion. (Please complete the analysis even if the normality assumption is not satisfied at the 5% level of significance.)
Null hypothesis:
Alternative hypothesis:
Report p-value from the normality test, make a quantile-comparison plot, and draw a conclusion:
Report the value of the t-test statistic =
Report p-value from the t-test and use it to draw the conclusion:
[Place your R output here.]
b) Find the 95% confidence interval for estimating the average arm span for the sampled population. (Please compute the confidence interval even if the normality assumption is not satisfied at the 5% level of significance.)
[Place your R output here.]
c) Find the sample size so that one can have 95% power to detect the difference in average arm span if the actual average were 165 cm, at the 5% level of significance, using the estimated standard deviation from the sample. (Use the sample size calculation formula in the lecture note.)
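The lecture-note formula is not reproduced in this document. A minimal sketch, assuming it is the standard normal-approximation formula for a one-sided, one-sample test, n = ((z_alpha + z_beta) · sigma / delta)², is shown below. The function name and the inputs sigma = 12 and delta = 4 are hypothetical placeholders, not values from the arm span data; if the lecture note uses a different formula, follow the lecture note.

```python
from math import ceil
from statistics import NormalDist

def sample_size_one_sided(sigma, delta, alpha=0.05, power=0.95):
    """Approximate n for a one-sided, one-sample test of a mean.

    sigma: estimated standard deviation
    delta: difference between the actual and hypothesized means
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)       # quantile matching the power
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical inputs, only to show how the pieces fit together.
print(sample_size_one_sided(sigma=12, delta=4))
```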
Part III
A group of investigators is studying a treatment that can help reduce LDL cholesterol levels. The following data show the LDL at the beginning and the end of the observation period from a sample of participants randomly selected from a specific patient population who received the treatment.
Subject ID / 1 / 2 / 3 / 4 / 5 / 6 / 7
Begin LDL / 189 / 142 / 154 / 241 / 161 / 217 / 195
After LDL / 180 / 122 / 112 / 190 / 126 / 181 / 170
a) Perform a t-test to test whether the average reduction in LDL (use LDL at the beginning minus LDL at the end) is greater than 20, at the 5% level of significance. You must state the null and alternative hypotheses, check the normality assumption, report the test statistic value, present the p-value, and use it to conclude the t-test.
Null hypothesis:
Alternative hypothesis:
Report p-value from the normality test and conclusion:
Report the value of the t-test statistic =
Report p-value from the t-test and the conclusion:
[Place your R output here.]
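For reference, a paired analysis reduces to a one-sample t-test on the within-subject differences. A minimal sketch of the test statistic, t = (mean of differences − hypothesized reduction) / (sd of differences / √n), is given below. The function name and the four before/after pairs are hypothetical and are not the LDL data above; the p-value still has to come from the t distribution with n − 1 degrees of freedom in your R output.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after, hypothesized_diff=0.0):
    """t statistic for the mean of (before - after) against a hypothesized value."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    standard_error = stdev(diffs) / sqrt(n)
    return (mean(diffs) - hypothesized_diff) / standard_error

# Hypothetical measurements for four subjects.
before = [200, 180, 210, 190]
after = [170, 160, 175, 165]
print(round(paired_t_statistic(before, after, hypothesized_diff=20), 4))
```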
b) Find the 95% confidence interval for estimating the reduction in LDL for the sampled population from the treatment.
[Place your R output here.]
c) Find the sample size so that one can have 95% power to detect an average reduction in LDL that is one half of the standard deviation of the LDL reduction measures, at the 5% level of significance for a one-sided t-test.
Part IV
A team of industrial psychologists studies the emotional maturity of high school graduates and high school dropouts employed in the same type of work. The test scores shown below are results from a standardized test given to random samples of subjects selected from the two populations.
Graduates / 89 / 85 / 77 / 88 / 57 / 69 / 62
Dropouts / 49 / 54 / 72 / 58 / 41 / 53 / 64
a) Check the normality assumption and report your finding using the p-value from a normality test.
[Place your R output here.]
b) Perform a t-test to see if there is a significant difference between the population means. Also, properly conclude your analysis by providing your comments on the findings using the p-value.
Null hypothesis:
Alternative hypothesis:
Report the value of the t-test statistic =
Report the degrees of freedom of the test statistic =
Report p-value from the t-test and the conclusion:
[Place your R output here.]