Confidence Intervals and Testing on Means
What about continuous data?
EX: Instead of asking whether you walk/bike, we might be interested in how far you travel to campus.
The pertinent questions:
· How many should we sample?
We might calculate that we need 100 for a certain
accuracy but have no time to sample that many.
· What hypothesis test should we do?
· Build a confidence interval
We use the Central Limit Theorem: it assures us that sample means cluster in a bell shape about the population mean, with wiggle room of (stdev / sqrt of sample size).
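To see the wiggle room in action, here is a small simulation sketch. The population (exponential, mean 5 and stdev 5), the sample size 25, and the function name are all assumptions chosen for illustration; none of them come from the class data. Even though the population is skewed, the sample means should center near 5 with a spread near 5/sqrt(25) = 1.

```python
import random
import statistics

def sample_mean_spread(n, reps=5000, pop_mean=5.0, seed=1):
    """Draw `reps` samples of size n from a skewed (exponential) population
    and return (average of the sample means, stdev of the sample means)."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        # expovariate takes the rate, which is 1/mean; the stdev equals the mean
        sample = [rng.expovariate(1.0 / pop_mean) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.fmean(means), statistics.stdev(means)

center, spread = sample_mean_spread(n=25)
print(f"average of the sample means: {center:.2f}")  # compare with 5
print(f"spread of the sample means:  {spread:.2f}")  # compare with 5/sqrt(25) = 1
```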
With continuous data we will use t-values instead of z-values (so the multiplier for being 95% sure will no longer be about 2, but something larger). There is a different t-distribution for every sample size n. We say that we have a t with df = degrees of freedom equal to n − 1 (called ν).
[Figure: the t-distribution (dashed curve) compared with the normal distribution (solid curve)]
The t-tables are found in the text
                  Upper tail probability
df        .10      .05      .025     .01     …
1         3.078    6.314    12.706   31.821  …
2         1.886    2.920    4.303    6.965   …
3         1.638    2.353    3.182    4.541   …
.         .        .        .        .
45
50        1.299    1.676    2.009    2.403
.         .        .        .        .
1000      1.282    1.646    1.962    2.330
          ------------------------------------
          80%      90%      95%      98%
                  Confidence level
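The t-table must be read differently than the Normal tables: the body of the table holds the t-values, and the column headings hold the tail probabilities (or the confidence levels along the bottom).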
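When the printed table skips a row or you want to check an entry, a t-value can be computed numerically. The sketch below is one way to do it with only the Python standard library: it integrates the t density with Simpson's rule and then searches for the t-value that leaves the requested upper tail probability. The function names and the step sizes are our own choices, not something from the text.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps_per_unit=400):
    """P(T > t), using Simpson's rule on the area between 0 and t."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps_per_unit)
    if t == 0:
        return 0.5
    n = max(2, 2 * math.ceil(t * steps_per_unit / 2))  # even number of slices
    h = t / n
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(upper_tail_prob, df):
    """t-value with the given upper tail probability (found by bisection)."""
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, df) > upper_tail_prob:
        hi *= 2  # widen until the tail beyond hi is small enough
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Compare with the table rows df = 3 and df = 50
print(round(t_critical(0.05, 3), 3))
print(round(t_critical(0.025, 50), 3))
```

Calling t_critical(0.05, 45) in the same way gives the value for a row the printed table leaves out.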
Q. Conduct a hypothesis test with α = .05 to test H0: μ = 12 versus HA: μ < 12, where μ is the mean commute miles to campus.
Step 1. see hypotheses above
Step 2. what type of data will you collect-- Discrete (binomial) or continuous?
Since it is continuous I will be relying on the t-distribution.
I still want to answer whether the data's mean is far enough below my guess of 12 for me to say the campus average is lower. I will do so by forming a z-score (now called a t-score)
and seeing if it is a big negative value,
OR EQUIVALENTLY if the t-score's corresponding p-value is small compared to α = .05.
Step 4.
Is the t-score from the data > −(t-value) in the table? Then our data is consistent with μ = 12 (we do not reject H0).
Is the t-score < −(t-value) in the table? Then our data rejects μ = 12 (and we decide μ < 12).
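The Step 4 comparison can be sketched in a few lines. The four mileage values below are HYPOTHETICAL placeholders (the class data is collected in the blanks on this sheet), and the function name is our own. The t-score is (sample mean − 12) divided by (s / sqrt(n)), and 2.353 is the table entry for df = 3 with upper tail probability .05.

```python
import math
import statistics

def t_score(data, mu0):
    """t-score of the sample mean against the hypothesized mean mu0."""
    n = len(data)
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return (xbar - mu0) / (s / math.sqrt(n))

miles = [8.0, 15.0, 5.0, 10.0]  # HYPOTHETICAL data, not the class sample
t = t_score(miles, 12)

t_table = 2.353          # t-table: df = n - 1 = 3, upper tail probability .05
reject = t < -t_table    # left-tailed test, HA: mu < 12

print(f"t-score = {t:.3f}")
print("reject mu = 12" if reject else "do not reject mu = 12")
```

With these made-up numbers the t-score is negative but not below −2.353, so the data would not reject μ = 12.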
Extension of Step 4. Can we do p-values? (This is hard with t-tables. Note: the p-value from the t-tables is sometimes only a rough approximation, not as good as from the z-tables.)
Find an approximate p-value for the t-score from your data.
Is the p-value > α = .05? Then our data is consistent with μ = 12 (we do not reject H0).
Is the p-value < α = .05? Then our data rejects μ = 12 (and we decide μ < 12).
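Because the t-table only lists a few tail probabilities per row, the best we can usually do is trap the p-value between two column headings. A minimal sketch of that bracketing for the left-tailed test, using the df = 3 row of the table; the function name and the example t-scores are hypothetical.

```python
# Row df = 3 of the t-table: (upper tail probability, t-value)
ROW_DF3 = [(0.10, 1.638), (0.05, 2.353), (0.025, 3.182), (0.01, 4.541)]

def bracket_left_tail_p(t, row):
    """Return (low, high) bounds on the p-value for HA: mu < mu0,
    using only the entries of one t-table row."""
    if t >= 0:
        return 0.5, 1.0          # t-score on the wrong side for HA
    size = abs(t)
    low, high = 0.0, 0.5
    for tail_prob, t_value in row:
        if size >= t_value:
            high = min(high, tail_prob)  # at least this far out in the tail
        else:
            low = max(low, tail_prob)    # not as far out as this column
    return low, high

for t in (-1.19, -2.8, -5.0):    # HYPOTHETICAL t-scores
    low, high = bracket_left_tail_p(t, ROW_DF3)
    print(f"t = {t}: p-value is between {low} and {high}")
```

For example, a t-score of −2.8 with df = 3 falls between the .05 and .025 columns, so its p-value is below α = .05 even though we cannot read off its exact value.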
Miles=
Miles=
Miles=
Miles=
Do you think Miles from campus follow a normal distribution? Remember: if the original data is normal, then the t-score
(x̄ − μ)/(s/√n) follows a t-distribution with df = n − 1 (which identifies the row in the t-table).
Go back and do the computations.
Now let’s instead build a 90% confidence interval for the mean number of miles traveled to campus.
Recall that x̄ is about equal to μ with a ruler (wiggle room) of s/√n.
So 90% of the x̄ are within
t (df=n-1) * s/√n of μ.
What value does t (df=n-1) take? (Note: CIs are always 2-tailed in this class, so 90% confidence leaves .05 in each tail.)
We have n=4
We have x̄ =
We have t (df=n-1) =
We have s =
So, we have s/√n =
We can be 90% sure the population mean (over all AASUers) is between these miles to campus:
x̄ ± t (df=n-1) * s/√n =
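The interval arithmetic can be sketched as follows. The mileage values are HYPOTHETICAL stand-ins for the four class measurements, and the function name is our own; 2.353 is the table entry for df = 3 in the 90% confidence column.

```python
import math
import statistics

def t_confidence_interval(data, t_value):
    """Return (low, high) = sample mean -/+ t_value * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.fmean(data)
    margin = t_value * statistics.stdev(data) / math.sqrt(n)
    return xbar - margin, xbar + margin

miles = [8.0, 15.0, 5.0, 10.0]  # HYPOTHETICAL data, not the class sample
low, high = t_confidence_interval(miles, t_value=2.353)  # df = 3, 90% column
print(f"90% CI for the mean: {low:.2f} to {high:.2f} miles")
```

With only n = 4 values the interval is wide; a larger sample shrinks both s/√n and the t-value.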
Which is the proper conclusion?
1. We are 90% sure that all AASUers travel between:
2. 90% of all AASUers travel this many miles to campus:
3. We are 90% sure that a randomly selected AASUer will have traveled between
miles to campus
4. The mean miles traveled to campus will be in: 90% of the time.
5. We are 90% sure that the mean number of miles traveled to campus by AASUers is in:
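When weighing these statements, it may help to see what the "90%" describes under repeated sampling. The simulation sketch below assumes a normal population with mean 12 and stdev 4 (both made up for illustration), draws many samples of size 4, builds the 90% interval from each, and counts how often the interval captures the population mean.

```python
import math
import random
import statistics

def coverage(mu=12.0, sigma=4.0, n=4, t_value=2.353, reps=10000, seed=1):
    """Fraction of simulated 90% t-intervals that contain the true mean mu.
    mu and sigma are ASSUMED population values, chosen only for illustration."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        margin = t_value * statistics.stdev(sample) / math.sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

print(f"fraction of intervals containing the mean: {coverage():.3f}")
```

The fraction should land close to 0.90: the confidence level describes how often the interval-building procedure captures the population mean, not where individual commuters fall.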