6.5 Interval Estimation

Recall,


Sampling Distribution of x̄, the Sample Mean

If a simple random sample of size n is taken from a population having population mean μ and population standard deviation σ, and if the original population is normally distributed, then

x̄ ~ N(μ, σ/√n),

where the second entry of N( , ) is the standard deviation of x̄.

If the original population is not necessarily normal, but the sample size n is large enough (a common rule of thumb is n ≥ 30), then

x̄ is approximately N(μ, σ/√n) (central limit theorem).

DEFINITION:

A parameter is a numerical value that would be calculated using all of the values of the units in the population.

A statistic is a numerical value that is calculated using all of the values of the units in a sample.

Tip: One way to remember this distinction is this: The letter p is for population and parameter, while the letter s is for statistic and sample.

Population size N = 16

Sample size n=4

ESTIMATION:

•  What is the mean number of delay hours of Northwest Airlines flights to Chicago?

•  What is the mean weight of actresses living in Hollywood?

•  What is the mean number of times per day a person in the U.S. uses a pain reliever?

Each of these questions is asking, “What is the value of the parameter?”

A confidence interval estimate for the population mean μ is an interval of values, computed from the sample data, for which we can be quite confident that it contains the population mean.

The confidence level is the probability that the estimation method will give an interval that contains the parameter (μ in this case). The confidence level is denoted by 1 − α, where α has common values of 0.10, 0.05, and 0.01, for 90%, 95%, and 99% confidence levels respectively.
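The meaning of "confidence level" can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population values (μ = 100, σ = 15), the sample size, and the function name coverage_rate are hypothetical, and the interval used is the known-σ form x̄ ± z·σ/√n. It repeats the sampling many times and counts how often the interval contains μ.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated known-sigma intervals that contain mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)        # standard normal percentile
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and average it.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 100, sigma = 15; samples of size 25.
rate = coverage_rate(mu=100, sigma=15, n=25, confidence=0.95, trials=2000)
print(f"Fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should land close to 0.95. The 95% describes the method over many samples, not any single interval.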

In class, you were requested to construct the distribution of the average of all possible samples of size 2 from a small population of size 4. At the end of the activity we had the following conclusions:

- The distribution of x̄ has a bell shape (normal).

- The mean of x̄ is the same as the population mean μ.

- The standard deviation of x̄ is σ/√n.

This leads us to the conclusion that x̄ ~ N(μ, σ/√n) for large samples.
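The class activity can be re-created in a few lines. The sketch below is an illustration only: the four population values are hypothetical, and it assumes the samples of size 2 are drawn with replacement, so there are 16 ordered samples. It checks the conclusions about the mean and the standard deviation of x̄. The bell shape only becomes visible for larger n.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]      # hypothetical population of size 4
n = 2                          # sample size used in the activity

# All ordered samples of size n, drawn with replacement (an assumption).
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)              # population mean
sigma = pstdev(population)         # population standard deviation
mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print("number of samples:", len(samples))
print("population mean:", mu, " mean of sample means:", mean_of_means)
print("sigma/sqrt(n):", round(sigma / sqrt(n), 4),
      " sd of sample means:", round(sd_of_means, 4))
```

Both comparisons agree: the sample means average to μ, and their standard deviation equals σ/√n.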

Let’s Think About It!

Recall, the CLT states that for sufficiently large samples, x̄ ~ N(μ, σ/√n).

Use the Empirical Rule to answer the following questions

a.  68% of the x̄ values fall within 1 standard deviation (σ/√n) of the mean μ. This is equivalent to saying that the mean μ is within 1 standard deviation of the average of a sample 68% of the time.

Based on this fact, can you construct the interval that has a 68% chance of containing the mean?

b.  95% of the x̄ values fall within about 2 standard deviations of the mean μ. This is equivalent to saying that the mean μ is within about 2 standard deviations of the average of a sample 95% of the time.

Based on this fact, can you construct the interval that has a 95% chance of containing the mean?

c.  99.7% of the x̄ values fall within about 3 standard deviations of the mean μ. This is equivalent to saying that the mean μ is within about 3 standard deviations of the average of a sample 99.7% of the time.

Based on this fact, can you construct the interval that has a 99.7% chance of containing the mean?
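The three statements above all lead to an interval of the form x̄ ± k·σ/√n with k = 1, 2, 3. The sketch below is an illustration, not part of the activity: the numbers x̄ = 50, σ = 10, n = 25 and the function names are hypothetical. It also uses the standard normal distribution to show where 68%, 95%, and 99.7% come from.

```python
from math import sqrt
from statistics import NormalDist

def coverage(k):
    """Probability that a normal value lies within k standard deviations."""
    z = NormalDist()               # standard normal, N(0,1)
    return z.cdf(k) - z.cdf(-k)

def interval(xbar, sigma, n, k):
    """Interval xbar +/- k standard deviations of the sample mean."""
    margin = k * sigma / sqrt(n)
    return (xbar - margin, xbar + margin)

# Hypothetical summary values: xbar = 50, sigma = 10, n = 25.
for k in (1, 2, 3):
    low, high = interval(50, 10, 25, k)
    print(f"k = {k}: coverage {coverage(k):.4f}, interval ({low:.1f}, {high:.1f})")
```

The exact coverages are 0.6827, 0.9545, and 0.9973, which the Empirical Rule rounds to 68%, 95%, and 99.7%.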

Let's Do It! 1

What is the 95% confidence interval for μ using a sample having the following statistics?

What is the 90% confidence interval for μ using a sample having the following statistics?

Let's Do It! 2

The height of a random sample of 50 college students showed a mean of 174.5 cm. Construct a 99% confidence interval for the mean height of college students if the population of college students has a standard deviation of 6.9 cm.
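One way to check an answer to this exercise is the known-σ interval x̄ ± z·σ/√n, where z is the standard normal percentile 1 − α/2. The sketch below uses the numbers from the exercise. The function name z_interval is hypothetical, and the z value comes from Python's standard library rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.576 for 99%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Values from the exercise: n = 50, xbar = 174.5 cm, sigma = 6.9 cm.
low, high = z_interval(174.5, 6.9, 50, 0.99)
print(f"99% CI for the mean height: ({low:.2f}, {high:.2f}) cm")
```

This gives roughly (171.99, 177.01) cm.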

Population Standard Deviation σ Is Unknown

When σ is unknown, we use the sample standard deviation s instead. The replacement of σ by s changes the distribution of the standardized sample mean.

When σ is unknown, the distribution of (x̄ − μ)/(s/√n) is no longer N(0,1), particularly when the sample size n is small. Instead, for a random sample from a normal population, it is said to follow another distribution called the Student’s t-distribution:

(x̄ − μ)/(s/√n) ~ the Student’s t-distribution with (n-1) degrees of freedom

DEFINITION:

When data is used to estimate the standard deviation of a statistic, the result is called the standard error of the statistic.

The standard error of the mean (SEM) is the estimated standard deviation of the sample mean, SEM = s/√n.
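As a quick illustration of the definition, the sketch below computes s and the SEM for a small data set. The five measurements are hypothetical.

```python
from math import sqrt
from statistics import stdev

data = [12.1, 11.8, 12.4, 12.0, 11.7]   # hypothetical measurements

s = stdev(data)                # sample standard deviation (divides by n - 1)
sem = s / sqrt(len(data))      # standard error of the mean, s / sqrt(n)

print(f"s = {s:.3f}, SEM = {sem:.3f}")
```

Here s is about 0.274 and the SEM is about 0.122. The SEM is smaller than s because averages vary less than individual values.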

Properties of the Student’s t-distribution

•  The t-distribution has a symmetric bell-shaped density centered at 0, similar to the N(0,1) distribution.

•  The t-distribution is “flatter” and has “heavier tails” than the N(0,1) distribution.

•  As the sample size increases, the t-distribution approaches the N(0,1) distribution.
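These properties can be seen numerically by comparing the t density with the N(0,1) density at the center (0) and in the tail (3). The sketch below is an illustration: the helper t_pdf is a hypothetical name that implements the standard t density formula with the standard library, and the degrees of freedom shown are arbitrary choices.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

z = NormalDist()   # standard normal, N(0,1)

print("df      density at 0   density at 3")
for df in (2, 5, 30, 1000):
    print(f"{df:<7} {t_pdf(0, df):.4f}         {t_pdf(3, df):.4f}")
print(f"N(0,1)  {z.pdf(0):.4f}         {z.pdf(3):.4f}")
```

For small degrees of freedom the t density is lower at 0 (flatter) and higher at 3 (heavier tails). By 1000 degrees of freedom it is nearly identical to N(0,1).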

Confidence Interval for a Population Mean

x̄ ± t(n-1, u) · s/√n

where t(n-1, u) is the appropriate u-percentile of the t-distribution with (n-1) degrees of freedom, and u = 1 − α/2 for confidence level 1 − α.

This interval gives potential values for the population mean μ based on just one sample mean x̄. This interval is based on the assumption that the data are a random sample from a normal population with unknown population standard deviation σ. If the sample size is large, the assumption of normality is not so crucial.
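For readers who want to see where the interval comes from, here is a short sketch of the usual derivation. It assumes the standardized sample mean follows the t-distribution with n − 1 degrees of freedom, and it writes t_{n-1,u} for the percentile called t(n-1, u) in the text, with u = 1 − α/2.

```latex
\[
P\!\left(-t_{n-1,u} \;\le\; \frac{\bar{x}-\mu}{s/\sqrt{n}} \;\le\; t_{n-1,u}\right) = 1-\alpha ,
\qquad u = 1-\frac{\alpha}{2}
\]
% Multiply through by s/\sqrt{n}, then solve the two inequalities for \mu:
\[
P\!\left(\bar{x} - t_{n-1,u}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{n-1,u}\,\frac{s}{\sqrt{n}}\right) = 1-\alpha
\]
```

The endpoints of the second statement are exactly x̄ ± t(n-1, u) · s/√n.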

How to use Table F to find t(n-1, u):

Example: Find the t(n-1, u) value for a 95% confidence interval when the sample size is 22.

Solution

The d.f. = (22 − 1) = 21, and u = 1 − 0.05/2 = 0.975.

Find 21 in the far left column and .975 in the row labeled u. The intersection where the two meet gives the value for t(21, 0.975), which is 2.080.
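The table value can be double-checked without a table. The sketch below is one possible numerical approach, not how Table F was produced: it integrates the t density with Simpson's rule and then searches for the percentile by bisection. The function names t_pdf, t_cdf, and t_percentile are hypothetical, and the search range 0 to 100 assumes u is at least 0.5.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area under the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_percentile(u, df):
    """Value t with P(T <= t) = u, found by bisection (assumes u >= 0.5)."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < u:
            low = mid
        else:
            high = mid
    return (low + high) / 2

print(f"t(21, 0.975) = {t_percentile(0.975, 21):.3f}")
```

The result agrees with the table value 2.080 to three decimal places.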

Let's Do It! 3

A random sample of 25 bottles of buffered aspirin contained on average 325.05 mg of aspirin with a standard deviation of 0.5 mg. Assume that the aspirin content is normally distributed.

•  What is the distribution to be used for interval estimation of the mean aspirin content? Why?

•  Construct a 90% confidence interval for the mean content of aspirin.

Let's Do It! 4 Skin Cancer

A dermatologist is investigating a certain skin cancer. Twenty-five rats have this cancer and are treated with a new drug. The dermatologist is interested in the number of hours until the cancer is gone. He found that the sample produced an average of 322 hours and a standard deviation of 101 hours. Assuming normality,

a.  Compute a 90% confidence interval for the mean number of hours.

b.  Interpret the confidence interval constructed above.

Let's Do It! 5 Jogging and Pulse Rate

A random sample of 21 US adult males who jog at least 15 miles per week is taken and their pulse rate is measured. The sample had an average pulse rate of 52.6 beats/minute with a standard deviation of 3.22 beats/minute.

a.  Find a 95% confidence interval for the mean pulse rate of all US adult males who jog at least 15 miles per week, assuming pulse rate is normally distributed.

b.  Interpret the interval obtained above.

c.  If the mean pulse rate of all US adult males is approximately 72 beats/minute, does it appear that jogging at least 15 miles per week reduces the mean pulse rate? Explain.

Using the TI

Construct a 99% confidence interval estimate for the mean μ if we have n = 61 observations with sample mean x̄ = 12.6 and sample standard deviation s = 4.4.

For the TI, a confidence interval for a population mean when the population standard deviation is unknown is called the one-sample t-Interval and abbreviated TInterval. This is option 8 under the TESTS menu obtained from the STAT button. You can have the sample data entered into a list, say L1, or just enter the Stats (the sample mean, sample standard deviation, and sample size). The steps are summarized below and the corresponding input and output screens are shown. Note that the Calculate option produces an output screen that provides the confidence interval (11.101, 14.099), the sample mean x̄ = 12.6, sample standard deviation s = 4.4, and the sample size n = 61.
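The calculator output can be reproduced by hand from the summary statistics. The sketch below is an illustration: the function name t_interval is hypothetical, and the critical value 2.660 is the commonly tabled 0.995 percentile of the t-distribution with 60 degrees of freedom, assumed here to match Table F.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Interval xbar +/- t_crit * s / sqrt(n) from summary statistics."""
    sem = s / sqrt(n)              # standard error of the mean
    margin = t_crit * sem
    return xbar - margin, xbar + margin

# Summary statistics from the TI example: xbar = 12.6, s = 4.4, n = 61.
# t_crit = 2.660 is the assumed table value for d.f. = 60 and u = 0.995.
low, high = t_interval(12.6, 4.4, 61, 2.660)
print(f"99% CI: ({low:.3f}, {high:.3f})")
```

This matches the TInterval output (11.101, 14.099).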

Statistic Humor:

Did you hear about the statistician who was thrown in jail? He now has zero degrees of freedom.

Homework page 217: 13, 14, 15, 16, 29, 30, 33, 41.