AMS 572 Lecture Notes

Oct. 10, 2011.

Review: Inference on two population means

1. Two normal pops, $\sigma_1^2$ & $\sigma_2^2$ are known. → exact Z

2. Two large samples → approximate Z

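3. Two normal pops, $\sigma_1^2$ & $\sigma_2^2$ are unknown but $\sigma_1^2 = \sigma_2^2$

Pooled variance t (exact)

P.Q. $T = \dfrac{(\bar{X}-\bar{Y}) - (\mu_1-\mu_2)}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \sim t_{n_1+n_2-2}$, where $S_p^2 = \dfrac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}$

Exact (SAS)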

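4. Two normal pops, $\sigma_1^2$ & $\sigma_2^2$ are unknown, $\sigma_1^2 \neq \sigma_2^2$ → approximate t

P.Q. $T = \dfrac{(\bar{X}-\bar{Y}) - (\mu_1-\mu_2)}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}} \overset{\cdot}{\sim} t_{df}$

More accurate d.f. ⇒ Satterthwaite method (SAS)

Quick & dirty d.f. $= \min(n_1-1,\ n_2-1)$ ⇒ in-class exam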

5. other situations ⇒ nonparametric method

Mann-Whitney U-test = Wilcoxon Rank-Sum Test (SAS)

6. Modern nonparametric method ⇒ Bootstrap resampling method

7. Transformation to normal distribution ⇒ Box-Cox transformation

e.g. X & Y are not normal, but ln(X) & ln(Y) are normal.
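
For reference, the Satterthwaite degrees of freedom mentioned in case 4 can be written out explicitly. This is the standard textbook formula, stated here with $s_1^2, s_2^2$ for the sample variances and $n_1, n_2$ for the sample sizes; the notes themselves leave the computation to SAS.

```latex
\[
\nu \;=\;
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1-1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2-1}},
\qquad
\min(n_1-1,\ n_2-1) \;\le\; \nu \;\le\; n_1+n_2-2 .
\]
```

In general $\nu$ is not an integer; rounding it down is the conservative choice when reading a t-table by hand.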

Inference on two population variances

* Both pop’s are normal, two independent samples

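Sample 1 : $X_1, \dots, X_{n_1} \overset{iid}{\sim} N(\mu_1, \sigma_1^2)$ ⇒ $\dfrac{(n_1-1)S_1^2}{\sigma_1^2} \sim \chi^2_{n_1-1}$

Sample 2 : $Y_1, \dots, Y_{n_2} \overset{iid}{\sim} N(\mu_2, \sigma_2^2)$ ⇒ $\dfrac{(n_2-1)S_2^2}{\sigma_2^2} \sim \chi^2_{n_2-1}$

1. point estimator : $S_1^2/S_2^2$ (parameter of interest : $\sigma_1^2/\sigma_2^2$)

Def. F-distribution: Let $W_1 \sim \chi^2_{k_1}$ and $W_2 \sim \chi^2_{k_2}$ be independent.

Then $F = \dfrac{W_1/k_1}{W_2/k_2} \sim F_{k_1,k_2}$. In particular, $\dfrac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}$.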

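2. $100(1-\alpha)\%$ CI for $\sigma_1^2/\sigma_2^2$ : $\left[\dfrac{S_1^2/S_2^2}{F_{n_1-1,\,n_2-1,\,\alpha/2}},\ \dfrac{S_1^2/S_2^2}{F_{n_1-1,\,n_2-1,\,1-\alpha/2}}\right]$, where $F_{k_1,k_2,\gamma}$ denotes the upper $\gamma$ critical value of the $F_{k_1,k_2}$ distribution

3. Test $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_a: \sigma_1^2 \neq \sigma_2^2$

Test Statistic $F_0 = S_1^2/S_2^2 \sim F_{n_1-1,\,n_2-1}$ under $H_0$

At the significance level $\alpha$, we reject $H_0$ if $F_0$ is too large or too small.

* conventional boundaries / thresholds : reject $H_0$ if $F_0 > F_{n_1-1,\,n_2-1,\,\alpha/2}$ or $F_0 < F_{n_1-1,\,n_2-1,\,1-\alpha/2}$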
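
The thresholds for the two-sided F-test can be looked up in SAS instead of a table. The following is a minimal sketch; the values of `alpha`, `n1` and `n2` are illustrative assumptions, not taken from the notes.

```sas
/* Sketch: critical values for the two-sided F-test of equal variances.
   alpha, n1 and n2 are illustrative values (assumptions). */
data _null_;
   alpha = 0.05;
   n1 = 5;
   n2 = 5;
   /* value exceeded with probability alpha/2 (upper threshold) */
   upper = quantile('F', 1 - alpha/2, n1 - 1, n2 - 1);
   /* value exceeded with probability 1 - alpha/2 (lower threshold) */
   lower = quantile('F', alpha/2, n1 - 1, n2 - 1);
   put upper= lower=;
run;
```

The two thresholds are written to the SAS log. PROC TTEST also reports a folded F-test under "Equality of Variances", so in practice the test rarely needs to be coded by hand.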

SAS program for test on 2 pop means

1. paired samples

sample 1 10 23 16 18 … 33

sample 2 15 28 21 29 … 58

data paired;

input IQ1 IQ2;

diff=IQ1-IQ2;

datalines;

10 15

23 28

…

33 58

;

run;

proc univariate data=paired normal;

var diff;

run;
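
PROC UNIVARIATE on diff reports the normality tests together with the t-test, sign test and signed-rank test for a zero mean or median. If only the paired t-test is wanted, one possible alternative is the PAIRED statement of PROC TTEST, sketched here on the same data set:

```sas
/* Sketch: paired t-test computed directly from the two columns */
proc ttest data=paired;
   paired IQ1*IQ2;   /* tests H0: mean(IQ1 - IQ2) = 0 */
run;
```

This should be used only when the normality check on diff looks acceptable; otherwise the signed-rank result from PROC UNIVARIATE is the safer choice.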

2. independent Samples

sample 1 10 23 16 18 … 33 (group 1)

sample 2 15 28 21 29 … 58 (group 2)

data indept;

input group IQ;

datalines;

1 10

1 23

1 16

…

1 33

2 15

2 28

…

2 58

;

proc sort data= indept;

by group;

run;

proc univariate data= indept normal;

var IQ;

by group;

run;

/* If both are normal */

proc ttest data= indept;

var IQ;

class group;

run;

/* If at least 1 pop is NOT normal */

proc NPAR1WAY data= indept;

var IQ;

class group; /* Wilcoxon rank-sum test */

run;
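
With small samples, the normal approximation behind the default Wilcoxon p-value may be rough. A hedged sketch that restricts the output to the Wilcoxon analysis and requests an exact p-value (the choice to use an exact test is an assumption here, not something the notes prescribe):

```sas
/* Sketch: Wilcoxon rank-sum test with an exact p-value */
proc npar1way data=indept wilcoxon;
   class group;
   var IQ;
   exact wilcoxon;   /* exact test; can be slow for larger samples */
run;
```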

Power and Sample Size Determination – Exact or Large Sample Z-test

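1. Based on the maximum error E, or equivalently the length of the CI.

Suppose we are using the exact or the large sample approximate z-test, with equal sample sizes $n_1 = n_2 = n$ ;

Suppose the maximum error is E with probability $1-\alpha$

$100(1-\alpha)\%$ CI for $\mu_1-\mu_2$ : $(\bar{X}-\bar{Y}) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2+\sigma_2^2}{n}}$, so $E = z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2+\sigma_2^2}{n}}$ ⇒ $n = \dfrac{z_{\alpha/2}^2\,(\sigma_1^2+\sigma_2^2)}{E^2}$

2. Based on the power of the test, for detecting a difference $\mu_1-\mu_2 = \delta$ with power $1-\beta$

$n = \dfrac{(z_{\alpha}+z_{\beta})^2\,(\sigma_1^2+\sigma_2^2)}{\delta^2}$ (1-sided test)

or $n \approx \dfrac{(z_{\alpha/2}+z_{\beta})^2\,(\sigma_1^2+\sigma_2^2)}{\delta^2}$ (2-sided test)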
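
A small worked example may help fix the power-based calculation. It uses the standard equal-sample-size formula $n = (z_\alpha + z_\beta)^2(\sigma_1^2+\sigma_2^2)/\delta^2$ for a 1-sided z-test; all numbers below ($\alpha = 0.05$, $\beta = 0.10$, $\sigma_1^2 = \sigma_2^2 = 4$, $\delta = 2$) are hypothetical and chosen only for illustration.

```latex
\[
n \;=\; \frac{(z_{0.05} + z_{0.10})^2\,(\sigma_1^2 + \sigma_2^2)}{\delta^2}
  \;=\; \frac{(1.645 + 1.282)^2\,(4 + 4)}{2^2}
  \;=\; \frac{8.567 \times 8}{4}
  \;\approx\; 17.1
  \quad\Longrightarrow\quad n = 18 \text{ per group.}
\]
```

The result is always rounded up, since rounding down would give a power slightly below the target.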

Power and Sample Size Determination – Pooled Variance T-test

1. Sample size calculation in a C.I. scenario (Maximum error)

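P.Q: $T = \dfrac{(\bar{X}-\bar{Y}) - (\mu_1-\mu_2)}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \sim t_{n_1+n_2-2}$

100(1-α)% CI for $\mu_1-\mu_2$ is $(\bar{X}-\bar{Y}) \pm t_{n_1+n_2-2,\,\alpha/2}\; S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$

The length of the CI : $L = 2\, t_{n_1+n_2-2,\,\alpha/2}\; S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$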

2. Inference on the test situation

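Data: Two independent samples, $X_1, \dots, X_{n_1} \sim N(\mu_1, \sigma_1^2)$ and $Y_1, \dots, Y_{n_2} \sim N(\mu_2, \sigma_2^2)$

Here $n_1 = n_2 = n$ and $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

For a given α (e.g. 0.05 or 0.01) and a power=(1-β) (e.g. 85%), calculate the sample size.

Def.: Effect size $= \dfrac{\mu_1-\mu_2}{\sigma}$ (e.g. Eff=1)

T.S : $T_0 = \dfrac{\bar{X}-\bar{Y}}{S_p\sqrt{\frac{1}{n}+\frac{1}{n}}} = \dfrac{\bar{X}-\bar{Y}}{S_p\sqrt{2/n}}$

At α=0.05, reject $H_0: \mu_1 = \mu_2$ in favor of $H_a: \mu_1 > \mu_2$ iff $T_0 \ge t_{2n-2,\,0.05}$

Power=(1-β)=P(reject $H_0$ | $H_a: \mu_1-\mu_2 = \Delta$) $= P\!\left(\dfrac{\bar{X}-\bar{Y}-\Delta}{S_p\sqrt{2/n}} \ge t_{2n-2,\,0.05} - \dfrac{\Delta}{S_p\sqrt{2/n}}\right)$

$\approx P\!\left(T \ge t_{2n-2,\,0.05} - \text{Eff}\cdot\sqrt{n/2}\right)$, where $T \sim t_{2n-2}$ (Effect size $= \Delta/\sigma$, taking $S_p \approx \sigma$)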
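
The same calculation can be delegated to PROC POWER. The sketch below uses the values quoted in this section (α=0.05, power=85%, Eff=1, entered as a mean difference of 1 with standard deviation 1); treating the test as 1-sided is an assumption.

```sas
/* Sketch: per-group sample size for the pooled-variance t-test.
   meandiff/stddev = 1 encodes effect size 1; sides=1 is an assumption. */
proc power;
   twosamplemeans test=diff
      sides     = 1
      alpha     = 0.05
      meandiff  = 1
      stddev    = 1
      power     = 0.85
      npergroup = .;   /* the missing value marks the quantity to solve for */
run;
```

PROC POWER works with the noncentral t distribution, so its answer can differ slightly from a hand calculation based on the approximation above.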

Example A new method of making concrete blocks has been proposed. To test whether or not the new method increases the compressive strength, 5 sample blocks are made by each method.

New Method / 14 / 15 / 13 / 15 / 16
Old Method / 13 / 15 / 13 / 12 / 14

a. Get a 95% CI for the difference in mean compressive strength between the 2 methods.

b. At α=0.05, can you conclude the new method is better? Provide the p-value.

Write the SAS program for part (b)

Solution

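a. Assume both populations are normal. Let 1 = new method and 2 = old method, so $n_1 = n_2 = 5$, $\bar{x}_1 = 14.6$, $\bar{x}_2 = 13.4$, $s_1^2 = s_2^2 = 1.3$.

First, we check whether $\sigma_1^2 = \sigma_2^2$

Test Statistic : $F_0 = s_1^2/s_2^2 = 1.3/1.3 = 1$, which lies between $1/F_{4,4,0.025} = 0.104$ and $F_{4,4,0.025} = 9.60$

It is reasonable to assume $\sigma_1^2 = \sigma_2^2$

Pooled-variance statistic (PQ) $T = \dfrac{(\bar{X}_1-\bar{X}_2) - (\mu_1-\mu_2)}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \sim t_{8}$, with $s_p^2 = \dfrac{4(1.3)+4(1.3)}{8} = 1.3$

95% CI for $\mu_1-\mu_2$ is $(14.6-13.4) \pm t_{8,\,0.025}\sqrt{1.3\left(\tfrac{1}{5}+\tfrac{1}{5}\right)} = 1.2 \pm 2.306(0.721) = (-0.463,\ 2.863)$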

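b. Assume both populations are normal. We test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 > \mu_2$ (1 = new method, 2 = old method).

First, we check whether $\sigma_1^2 = \sigma_2^2$

By part (a), we found that it is reasonable to assume $\sigma_1^2 = \sigma_2^2$

Test Statistic : $T_0 = \dfrac{\bar{x}_1-\bar{x}_2}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} = \dfrac{14.6-13.4}{\sqrt{1.3\left(\tfrac{1}{5}+\tfrac{1}{5}\right)}} = \dfrac{1.2}{0.721} \approx 1.664$

At α=0.05, we reject $H_0$ if $T_0 \ge t_{8,\,0.05} = 1.860$.

But $T_0 \approx 1.664 < 1.860$, and the p-value is $P(t_8 \ge 1.664) \approx 0.067$

We cannot reject $H_0$ at α=0.05.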

SAS

data block ;

input method $ strength ;

datalines ;

new 14

new 15

new 13

new 15

new 16

old 13

old 15

old 13

old 12

old 14

;

run ;

proc univariate data=block normal plot ;

class method ;

var strength ;

run ;

proc ttest data=block ;

class method ;

var strength ;

run ;

proc npar1way data=block ;

class method ;

var strength ;

run ;
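
Part (b) asks a 1-sided question, while PROC TTEST reports a 2-sided p-value by default. One possible way to get the 1-sided p-value directly is the SIDES= option, sketched below. It assumes that SAS orders the class levels alphabetically, so that the reported difference is new minus old; the output labels the difference as "Diff (1-2)", which should be checked.

```sas
/* Sketch: upper-tailed pooled and Satterthwaite t-tests for new - old.
   Assumes class levels sort as new (1), old (2). */
proc ttest data=block sides=U alpha=0.05;
   class method;
   var strength;
run;
```

Without SIDES=U, the 1-sided p-value can be recovered by halving the 2-sided one, since the observed difference is in the direction of the alternative.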

Example An experiment was done to determine the effect on dairy cattle of a diet supplemented with liquid whey. While no differences were noted in milk production between the group with a standard diet (hay + grain + water) and the experimental group with whey supplement (hay + grain + whey), a considerable difference was noted in the amount of hay ingested. For a 2-tailed test with α=0.05, determine the approximate number of cattle that should be included in each group if we want $\beta \le$ … for $|\mu_1-\mu_2| \ge$ …. A previous study has shown $\sigma =$ …

Solution

2. Either both populations are normal or both sample sizes are large.

Example Do fraternities help or hurt your academic progress at college? To investigate this question, 5 students who joined fraternities in 1998 were randomly selected. It was shown that their GPA before and after they joined the fraternities are as follows.

Student / 1 / 2 / 3 / 4 / 5
Before / 3 / 4 / 3 / 3 / 2
After / 2 / 3 / 3 / 2 / 1
Diff. / 1 / 1 / 0 / 1 / 1

Please test the hypothesis at α=0.05

Solution

Assumption : the difference follows a normal distribution.

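Test statistic : $T_0 = \dfrac{\bar{d}}{s_d/\sqrt{n}} = \dfrac{0.8}{\sqrt{0.2}/\sqrt{5}} = \dfrac{0.8}{0.2} = 4$, with $n-1 = 4$ d.f.

We reject $H_0: \mu_d = 0$ at α=0.05 (since $T_0 = 4 > t_{4,\,0.025} = 2.776$) and conclude fraternities do hurt…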

SAS

data frat ;

input before after ;

diff = before - after ;

datalines ;

3 2

4 3

3 3

3 2

2 1

;

run ;

proc univariate data=frat normal ;

var diff ;

run ;
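
PROC UNIVARIATE already reports a t-test for the mean of diff. If a confidence interval for the mean difference is also wanted, a one-sample PROC TTEST on diff is one option. This is a sketch; the H0= and ALPHA= values simply restate the defaults.

```sas
/* Sketch: one-sample t-test of H0: mean(diff) = 0, with a 95% CI */
proc ttest data=frat h0=0 alpha=0.05;
   var diff;
run;
```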