Chapter 9 Estimating the Value of a Parameter KEY
Review on symbols:
$\bar{x}$ the mean of a sample
$\hat{p}$ the proportion of a sample
$\mu$ the true population mean
$p$ the true population proportion
$\sigma$ the true population standard deviation
$s$ the sample standard deviation
$n$ the size of the sample (number of data collected)
Which of the above is a parameter?
Which of the above is a statistic?
Chapter 9.1 Estimating a Population Proportion
Objective A : Point Estimate
A point estimate is the value of a statistic that estimates the value of a parameter.
The best point estimate of the population proportion is a sample proportion ($\hat{p}$).
The best point estimate of the population mean is a sample mean ($\bar{x}$).
Since $\hat{p}$ varies from sample to sample, we use an interval based on $\hat{p}$ to capture the unknown
population proportion with a level of confidence.
Objective B : Confidence Interval
A confidence interval for an unknown parameter consists of an interval of numbers based on
a point estimate.
The level of confidence represents the expected proportion of intervals that will contain the
parameter if a large number of different samples is obtained.
The level of confidence is denoted as $(1-\alpha)\cdot 100\%$. The level of confidence controls the width
of the interval.
Confidence interval estimates for a parameter are of the form:
Point estimate ± margin of error.
Confidence interval for $p$:
$\hat{p} \pm z_{\alpha/2}\cdot SE$, where $SE = \sqrt{\hat{p}(1-\hat{p})/n}$, provided that $n\hat{p}(1-\hat{p}) \ge 10$. Used when $p$ (the true proportion) is not known. Can be written as $\hat{p} \pm E$.
The value of $z_{\alpha/2}$ is called the critical value of the distribution.
The margin of error, $E$, in a $(1-\alpha)\cdot 100\%$ confidence interval for a population proportion is
given by $E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The width of the interval is determined by the margin of error.
Note: More confidence leads to a wider interval.
Ex. 100% confidence vs. 50% confidence:
Example 1: Use StatCrunch to determine the critical value $z_{\alpha/2}$ that corresponds to the given level
of confidence. Stat-calc-normal-between-μ=0, σ=1
(a) 90%: P(−1.645 < x < 1.645) = 0.90, so $-z_{\alpha/2}$ = −1.645 and $z_{\alpha/2}$ = 1.645
(b) 95%: P(−1.96 < x < 1.96) = 0.95, so $-z_{\alpha/2}$ = −1.96 and $z_{\alpha/2}$ = 1.96
(c) 98%: P(−2.33 < x < 2.33) = 0.98, so $-z_{\alpha/2}$ = −2.33 and $z_{\alpha/2}$ = 2.33
(d) 92%: P(−1.75 < x < 1.75) = 0.92, so $-z_{\alpha/2}$ = −1.75 and $z_{\alpha/2}$ = 1.75
Note: for 95%, z is close to 2 standard deviations.
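The critical values above can also be reproduced outside StatCrunch. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`; the helper name `critical_z` is made up for illustration and is not part of these notes.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Return z such that P(-z < Z < z) = confidence for a standard normal Z."""
    alpha = 1 - confidence
    # The upper critical value leaves alpha/2 in the right tail.
    return NormalDist(mu=0, sigma=1).inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.92):
    print(f"{level:.0%}: z = {critical_z(level):.3f}")
```

Rounded to two or three decimals, the results should agree with the StatCrunch values listed in this example.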
Example 2: Determine the margin of error (E) for a sample with $\hat{p} = 0.60$ and $n = 900$ at a 99% level
of confidence. Assume this represents the proportion who admitted to having texted while driving in the last month, from a sample of 900 people.
$E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
StatCrunch gives z = 2.58 for a 99% level of confidence.
$E = 2.58\sqrt{0.60(0.40)/900} = 0.0421$
Note: the 99% CI is 0.60 ± 0.0421.
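As a check on the arithmetic in Example 2, here is a short sketch of the margin-of-error formula for a proportion. The function name `margin_of_error` is an illustrative choice, and z = 2.58 is entered by hand as in the notes rather than computed.

```python
from math import sqrt

def margin_of_error(p_hat, n, z):
    """Margin of error E = z * sqrt(p_hat * (1 - p_hat) / n) for a proportion."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

# Example 2 values: sample proportion 0.60, n = 900, z = 2.58 for 99% confidence.
E = margin_of_error(0.60, 900, 2.58)
print(round(E, 4))  # 0.0421
```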
Example 3: A Rasmussen Reports national survey of 1000 adult Americans found that 18% dreaded
Valentine's Day. Construct a 95% confidence interval for the population proportion of adult
Americans who dread Valentine's Day. Explain what the interval means.
n = 1000, $\hat{p}$ = 0.18
95% CI:
0.18 ± 2SE (You can use 1.96 for z to be more exact. For 68% you can use z = 1; for 99.7% use z = 3.)
Standard deviation of the sample proportion (or standard error) = $\sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{0.18(0.82)/1000}$ = 0.01215
So 0.18 ± 2SE = 0.18 ± 2(0.01215) = 0.18 ± 0.024
CI: (0.18 – 0.024, 0.18 + 0.024) = (0.156, 0.204) or between 15.6% and 20.4%
We are 95% confident that the true proportion of adults who dread Valentine’s Day is between 15.6% and 20.4%.
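The by-hand steps for a proportion interval (check the condition, compute SE, then add and subtract z·SE) can be sketched as follows. The function name `proportion_ci` is hypothetical, and z is passed in by hand so that either z = 2 or z = 1.96 can be tried.

```python
from math import sqrt

def proportion_ci(p_hat, n, z):
    """Return (lower, upper) for p_hat +/- z * SE.

    Raises ValueError when n * p_hat * (1 - p_hat) < 10, since the
    normal approximation used here is then not justified.
    """
    if n * p_hat * (1 - p_hat) < 10:
        raise ValueError("n * p_hat * (1 - p_hat) must be at least 10")
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion
    return p_hat - z * se, p_hat + z * se

# Example 3 values: 18% of n = 1000, using z = 2 for 95% confidence.
lower, upper = proportion_ci(0.18, 1000, 2)
print(f"({lower:.3f}, {upper:.3f})")  # (0.156, 0.204)
```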
Summary of CI's
Confidence level | z
68% | 1
90% | 1.65
95% | 2 (or 1.96)
98% | 2.33
99% | 2.58
99.7% | 3
95% confidence interval: About 95 samples out of 100 will capture the true proportion and about 5 samples will not.
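The "about 95 out of 100" interpretation can be illustrated with a small simulation. This is only a rough sketch: it assumes a hypothetical true proportion of 0.18 (in practice the true value is unknown), draws many samples of size 1000, and counts how often the interval captures that value. The function name `coverage` and the seed are arbitrary choices.

```python
import random
from math import sqrt

def coverage(p_true, n, z, repetitions, seed=0):
    """Fraction of simulated intervals p_hat +/- z*SE that contain p_true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and count the "successes".
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        e = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - e <= p_true <= p_hat + e:
            hits += 1
    return hits / repetitions

print(coverage(p_true=0.18, n=1000, z=1.96, repetitions=1000))
```

The printed fraction should land near 0.95, though it will not be exactly 0.95 because the simulation itself is random.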
Example 4: Construct a confidence interval of the population proportion at the given level
of confidence where 80 students out of 200 came late to class for a lecture on a randomly selected day.
96% confidence
Stat-proportion stat-one sample-with summary-80 successes-200 observations-confidence interval 96%-compute
Lower limit, upper limit = (0.329, 0.471)
*We are 96% confident that the true population proportion of students who come late is between 32.9% and 47.1%.
Example 5: In a study of 1228 randomly selected medical malpractice lawsuits, it is found that 856
of them were later dropped or dismissed.
(a) What is the best point estimate of the proportion of medical malpractice lawsuits
that are dropped or dismissed?
$\hat{p} = 856/1228 \approx 0.697$
(b) Construct a 99% confidence interval (by hand) for the population proportion
of medical malpractice lawsuits that are dropped or dismissed.
99%: use z = 2.58
CI = $\hat{p} \pm z\cdot SE$ where SE = $\sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{0.697(0.303)/1228} \approx 0.0131$
CI = 0.697 ± 2.58(0.0131) = 0.697 ± 0.0338
CI: (0.697 − 0.034, 0.697 + 0.034) = (0.663, 0.731)
(c) Interpret the interval.
We are 99% confident that the true proportion of medical malpractice lawsuits that are dropped/dismissed is between 66.3% and 73.1%.
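Objective C : Sample Size Needed for Estimating the Population Proportion
The sample size required to obtain a $(1-\alpha)\cdot 100\%$ confidence interval for $p$ with a
margin of error $E$ is given by
$n = \hat{p}(1-\hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^2$
Round up to the next integer.
$\hat{p}$ is a prior estimate of $p$.
If a prior estimate of $p$ is unavailable, the sample size required is
$n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^2$
Round up to the next integer.
So you can use $\hat{p} = 0.5$ if a proportion is not given.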
Example 1 : An urban economist wishes to estimate the proportion of Americans who own
their homes. What size sample should be obtained if he wishes the estimate to be
within 0.02 with 90% confidence if:
(a) he uses a 2010 estimate of 0.669 obtained from the U.S. Census Bureau?
He wants E = 0.02 with 90% confidence, so use z = 1.65.
0.02 = 1.65 SE
0.02 = 1.65 $\sqrt{0.669(0.331)/n}$
$n = 0.669(0.331)(1.65/0.02)^2$ ≈ 1507.17, round up to 1508. He should sample 1508 Americans so that the estimate is within a 0.02 margin of error at a 90% confidence level.
Note: For MML you might have to be more precise and use z = 1.645 for a 90% CI and z = 1.96 for a 95% CI.
(b) he does not use any prior estimates?
If no estimate is given for the proportion, then we use $\hat{p}$ = 0.50.
0.02 = 1.65 $\sqrt{0.50(0.50)/n}$
$n = 0.25(1.65/0.02)^2$ ≈ 1701.56, round up to 1702. He should sample 1702 Americans. Note: rounding intermediate values (such as the standard error) before solving for n can inflate the answer, for example to 1708, so carry full precision until the final step.
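The sample-size formula for a proportion can be sketched in a few lines. The function name `sample_size_proportion` is illustrative only; when no prior estimate is supplied it falls back to 0.5, as described in Objective C. Carrying full precision with z = 1.65, part (b) comes out to 1702.

```python
from math import ceil

def sample_size_proportion(E, z, p_prior=0.5):
    """Smallest n with margin of error at most E: p(1 - p)(z / E)^2, rounded up."""
    return ceil(p_prior * (1 - p_prior) * (z / E) ** 2)

# Example 1 values: E = 0.02, z = 1.65 for 90% confidence.
print(sample_size_proportion(0.02, 1.65, p_prior=0.669))  # 1508
print(sample_size_proportion(0.02, 1.65))                 # 1702
```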
Example 2: In a Gallup poll conducted in October 2010, 64% of the people polled answered
"more strict" to the following question: "Do you feel that the laws covering the sale
of firearms should be made more strict, less strict, or kept as they are now?" Suppose the margin of
error in the poll was 3.5% and the estimate was made with 95% confidence. At
least how many people were surveyed?
Use $\hat{p}$ = 0.64 from the poll and E = 0.035
with 95% confidence, so use z = 1.96 or 2.
0.035 = 1.96 SE
0.035 = 1.96 $\sqrt{0.64(0.36)/n}$
$n = 0.64(0.36)(1.96/0.035)^2$ ≈ 722.5, round up to 723. So at least 723 people were surveyed.
Note: If z = 2 is used instead, then n = 753 people.
Example 3: A Gallup poll conducted in November 2010 found that 493 of 1050 adult Americans
believe it is the responsibility of the federal government to make sure all Americans
have healthcare coverage.
(a) Obtain a point estimate for the proportion of adult Americans who believe it is the
responsibility of the federal government to make sure all Americans have healthcare
coverage.
$\hat{p} = 493/1050 \approx 0.47$
(b) Verify the requirements for constructing a confidence interval for $p$ are satisfied.
$n\hat{p}(1-\hat{p}) \ge 10$
1050(0.47)(0.53) ≥ 10
261.6 ≥ 10 yes
The condition is met, so the sampling distribution of $\hat{p}$ will be approximately normally distributed.
(c) Construct a 95% confidence interval for the proportion of adult
Americans who believe it is the responsibility of the federal government to make
sure all Americans have healthcare coverage. Interpret the interval.
By hand: 95%: use z = 2
CI = $\hat{p} \pm z\cdot SE$ where SE = $\sqrt{0.47(0.53)/1050} \approx 0.0154$
CI = 0.47 ± 2(0.0154) = 0.47 ± 0.0308 ≈ 0.47 ± 0.031
CI: (0.47 − 0.031, 0.47 + 0.031) = (0.439, 0.501)
We are 95% confident that the true proportion of adults who believe the federal government should cover healthcare is between 43.9% and 50.1%.
Using StatCrunch (which will use z = 1.96):
Stat-proportion stat-one sample-with summary-493 successes-1050 observations-confidence interval 95%-compute
CI = (0.439, 0.500) or between 43.9% and 50.0%
(d) You wish to conduct your own study for the proportion of adult Americans who
believe it is the responsibility of the federal government to make sure all Americans
have healthcare coverage. What sample size would be needed for the estimate to be
within 3 percentage points with 90% confidence if you use the estimate obtained in
part (a)? (Use StatCrunch.)
By hand it would be: 0.47 ± 0.03, setting 0.03 = $z\sqrt{0.47(0.53)/n}$ with z = 1.65 for 90% confidence.
StatCrunch: Stat-proportion-one sample-power/sample size-confidence interval 0.90-target proportion 0.47-width 0.06-compute
n = 749 (rounded up). (The total width is 0.06 since it is 3% on each side of the point estimate.)
(e) You wish to conduct your own study for the proportion of adult Americans who
believe it is the responsibility of the federal government to make sure all Americans
have healthcare coverage. What sample size would be needed for the estimate to be
within 3 percentage points with 90% confidence if you do not have a prior estimate?
By hand it would be: 0.50 ± 0.03, setting 0.03 = $z\sqrt{0.50(0.50)/n}$ with z = 1.65 for 90% confidence, using $\hat{p}$ = 0.50.
StatCrunch: Stat-proportion-one sample-power/sample size-confidence interval 0.90-target proportion 0.50-width 0.06-compute
n = 752 (rounded up)
Note this is still close to our previous answer but larger.
Chapter 9.2 Estimating a Population Mean
Objective A : Point Estimate
The best point estimate of the population mean, $\mu$, is the sample mean, $\bar{x}$.
Objective B : Student's $t$-distribution
Properties of the $t$-distribution
1. The $t$-distribution is different for different degrees of freedom ($df = n - 1$).
2. The $t$-distribution has the same general symmetric bell shape as the standard normal
distribution but its area in the tails is a little greater than the area in the tails of the standard
normal distribution due to the greater variability that is expected with small samples.
3. The $t$-distribution has a mean of 0 at the center of the distribution.
4. As the sample size gets larger, the $t$-distribution gets closer to the standard normal
distribution.
Example 1: Use StatCrunch to determine the $t$-value.
*Note: There is more variability for smaller sample sizes.
(a) Using StatCrunch, find the $t$-value such that the area in the right tail is 0.05 with 19 degrees of freedom.
Stat-Calc-T-degrees of freedom 19-P(x ≥ ___) = 0.05-compute
t value = 1.73
Note: n = 20 (sample size)
You could also use P(x ≤ ___) = 0.95
(b) Find the $t$-value such that the area left of the $t$-value is 0.02 with 6 degrees of freedom.
Stat-Calc-T-degrees of freedom 6-P(x ≤ ___) = 0.02-compute
t value = −2.61
(c) Find the critical $t$-value that corresponds to 95% confidence. Assume 12 degrees of freedom.
Stat-Calc-T-between-degrees of freedom 12-P(___ ≤ x ≤ ___) = 0.95-compute
(note sample size n = 13)
t value = ± 2.18
(d) Find the critical $t$-value that corresponds to 95% confidence. Assume 50 degrees of freedom.
Stat-Calc-T-between-degrees of freedom 50-P(___ ≤ x ≤ ___) = 0.95-compute
(note sample size n = 51)
t value = ± 2.0086
(e) What happened to the width of the interval as the sample size increased from 13 to 51?
As the sample size increased, the interval became narrower. The $t$-distribution became closer to the standard normal distribution.
In general, the population standard deviation is unknown when estimating a population mean based
on a sample mean. The $t$-distribution is used to offset the additional variability introduced by using
$s$ in place of $\sigma$.
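Objective C : Confidence Interval for a Population Mean
Constructing a $(1-\alpha)\cdot 100\%$ Confidence Interval for $\mu$
Point estimate ± margin of error
$\bar{x} \pm t_{\alpha/2}\cdot\frac{s}{\sqrt{n}}$ where $t_{\alpha/2}$ is computed with $n-1$ degrees of freedom,
provided the data come from a population that is normally distributed, or the sample size is large.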
Example 1: A simple random sample of size $n < 30$ has been obtained. From the normal probability
plot and boxplot, judge whether a $t$-interval should be constructed.
(a)
Yes. There are no outliers in the normal probability plot and the boxplot is roughly symmetric. Thus we can use the t-distribution regardless of the sample size.
(b)
No. There is one slight outlier in the normal probability plot and the boxplot is skewed to the left. Since the sample size is less than 30, we cannot use a t-distribution.
Example 2: A simple random sample of size $n$ is drawn from a population that is normally distributed to investigate the age when working people start thinking about retirement.
The sample mean, $\bar{x}$, is found to be 50, and the sample standard deviation, $s$, is found to be 8.
(a) Construct a 98% confidence interval for $\mu$ if the sample size, $n$, is 20. (That is, 20 people are surveyed.)
By hand: for 98%, use t = 2.54 from StatCrunch (stat-calc-t) with 19 degrees of freedom.
CI: $\bar{x} \pm t\cdot SE$ where SE = $s/\sqrt{n}$
CI = $50 \pm 2.54(8/\sqrt{20})$ = 50 ± 4.54
CI = (50 − 4.54, 50 + 4.54) ≈ (45.46, 54.54) ≈ (45.5, 54.5)
One can be 98% confident that the true mean age when people start thinking about retirement is between 45.5 and 54.5 years of age.
StatCrunch:
Stat-t stats-one sample-with summary-mean 50, SD 8, n=20, CI .98-compute CI = (45.46, 54.54) ≈ (45.5, 54.5)
(b) Use StatCrunch to construct a 98% confidence interval for $\mu$ if the sample size, $n$, is 15.
How does decreasing the sample size affect the margin of error, $E$?
Stat-t stats-one sample-with summary-mean 50, SD 8, n=15, CI .98-compute CI = (44.6, 55.4). This interval is wider. So when the sample size is decreased, the CI becomes wider.
E = $t\cdot s/\sqrt{n} = 2.62(8/\sqrt{15})$ ≈ 5.42, using t = 2.62 for 14 degrees of freedom. The margin of error increased when the sample size decreased (compared to 4.54 in part a).
You can also find E from the CI: E = (55.4 − 44.6)/2 = 5.4 (the small difference is due to rounding the interval bounds).
(c) Construct a 95% confidence interval for $\mu$ if the sample size, $n$, is 20.
Compare the results to those obtained in part (a). How does decreasing the level of
confidence affect the margin of error, $E$?
Stat-t stats-one sample-with summary-mean 50, SD 8, n=20, CI .95-compute CI = (46.3, 53.7). This CI is narrower than the one in part a: (45.5, 54.5). So as the level of confidence increases, the interval becomes wider.
E for the 95% CI is E = (53.7 − 46.3)/2 = 3.7. This is smaller than E in part a: 4.5.
So the margin of error decreases as the level of confidence decreases.
In part a: About 98 samples out of 100 will capture the true mean.
In part c: About 95 samples out of 100 will capture the true mean.
(d) Could we have computed the confidence intervals in parts (a) to (c) if the population
had not been normally distributed? Why? No, because the conditions to apply this formula would not have been met. The sample size, n, was less than 30 in every part.
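The three intervals in this example can be compared side by side with a short sketch. Python's standard library has no t-distribution, so the critical values below (2.539, 2.624, 2.093) are assumed inputs read from a t table or StatCrunch for n − 1 degrees of freedom; the function name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper, E) for x_bar +/- t_crit * s / sqrt(n)."""
    e = t_crit * s / sqrt(n)  # margin of error
    return x_bar - e, x_bar + e, e

# (sample size, critical t for df = n - 1); t values are taken from a t table.
cases = {
    "(a) 98%, n = 20": (20, 2.539),
    "(b) 98%, n = 15": (15, 2.624),
    "(c) 95%, n = 20": (20, 2.093),
}
for label, (n, t_crit) in cases.items():
    lower, upper, e = t_interval(50, 8, n, t_crit)
    print(f"{label}: ({lower:.1f}, {upper:.1f}), E = {e:.2f}")
```

The output makes the two patterns visible at once: a smaller sample widens the interval, and a lower confidence level narrows it.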
Example 3: Determine the point estimate of the population mean and margin of error for the
following confidence interval.
Lower bound: 5
Upper bound: 23
E = (23 − 5)/2 = 9 (so 5 + 9 = 14 and 23 − 9 = 14). The point estimate of the population mean is $\bar{x}$ = 14.
You can also compute $\bar{x}$ by finding the midpoint of the interval: (5 + 23)/2 = 14
Margin of error, E = 9
Point estimate of the population mean, $\bar{x}$ = 14
Note CI = 14 ± 9
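Recovering the point estimate and margin of error from the bounds takes only two lines of arithmetic, sketched here with a hypothetical helper name `estimate_from_bounds`.

```python
def estimate_from_bounds(lower, upper):
    """Return (point estimate, margin of error) for a symmetric interval."""
    point_estimate = (lower + upper) / 2  # midpoint of the interval
    margin = (upper - lower) / 2          # half the width of the interval
    return point_estimate, margin

print(estimate_from_bounds(5, 23))  # (14.0, 9.0)
```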
Example 4 : How much time do Americans spend eating or drinking? Suppose for a random
sample of 1001 Americans age 15 or older, the mean amount of time spent eating
or drinking per day is 1.22 hours with a standard deviation of 0.65 hour.
(a) A histogram of time spent eating and drinking each day is skewed right. Use this
result to explain why a large sample size is needed to construct a confidence
interval for the mean time spent eating and drinking each day.
Since the population is not normally distributed, you need a large sample size so that the sampling distribution of $\bar{x}$ will be approximately normally distributed. Thus, use n > 30.
(b) Determine a 95% confidence interval for the mean amount of
time Americans age 15 or older spend eating and drinking each day. Interpret the interval.
By hand: First find the t-value using StatCrunch for 95% confidence: stat-calc-t-between, df 1000, P(___ ≤ x ≤ ___) = 0.95-compute
t = 1.96 (this is close to the z value for 95% since the sample size was large, so we could also use 2 SEs)
CI: $\bar{x} \pm t\cdot SE$ where SE = $s/\sqrt{n}$
CI = $1.22 \pm 1.96(0.65/\sqrt{1001})$ = 1.22 ± 0.040
CI = (1.22 − 0.040, 1.22 + 0.040) = (1.18, 1.26)
Using Statcrunch: Stat-t stats-one sample-with summary-mean 1.22, SD 0.65, n=1001, CI 0.95-compute
CI = (1.18, 1.26)
*One can be 95% confident that the true mean time spent eating and drinking each day is between 1.18 and 1.26 hours.
(c) Could the interval be used to estimate the mean amount of time a 9-year-old American
spends eating and drinking each day? Explain. No. The study was conducted using people who were 15 years old or older. Therefore, the point estimate of the mean applies only to the population aged 15 or older.
Objective D : Determining the Sample Size
The sample size required to estimate the population mean, $\mu$, with a level of confidence $(1-\alpha)\cdot 100\%$
within a specified margin of error, $E$, is given by
$n = \left(\frac{z_{\alpha/2}\cdot s}{E}\right)^2$ where $n$ is rounded up to the nearest whole number.
Note: *The $t$-distribution approaches the standard normal $z$-distribution as the sample size increases. z is used in this formula instead of t to approximate n.
Example 1: A researcher wanted to determine the mean number of hours per week (Sunday through
Saturday) the typical person watches television. Results from the Sullivan Statistics Survey
indicate that $s = 7.5$ hours.
(a) How many people are needed to estimate the number of hours people watch television
per week within 2 hours with 95% confidence?
The standard deviation is $s = 7.5$ hours. Note the mean was not provided.
Want CI: $\bar{x} \pm 2$
2 = $z(7.5/\sqrt{n})$. Can use z instead of t: z = 1.96 (or approximately 2).
n = $(1.96(7.5)/2)^2$ ≈ 54.02, round up to 55 people
55 people need to be surveyed so that the margin of error is within 2 hours (at a 95% confidence level).
(b) How many people are needed to estimate the number of hours people watch television
per week within 1 hour with 95% confidence?
Let’s do this one using Statcrunch: Stat-z stat-one sample- power/sample size-select ‘confidence interval width’-confidence level 0.95, SD 7.5, width 2-compute
n = 217
(c) What effect does doubling the required accuracy have on the sample size?
If you want to be more accurate (within 1 hour instead of within 2 hours), increase the sample size. In this case it was increased at a ratio of 217/55 ≈ 4. If you double the accuracy, the sample has to be about 4 times as large.
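The effect of halving the margin of error can be checked directly from the sample-size formula. This is a minimal sketch using s = 7.5 hours and z = 1.96 from this example; the function name `sample_size_mean` is illustrative.

```python
from math import ceil

def sample_size_mean(s, E, z):
    """Smallest n with margin of error at most E: (z * s / E)^2, rounded up."""
    return ceil((z * s / E) ** 2)

n_within_2 = sample_size_mean(7.5, 2, 1.96)
n_within_1 = sample_size_mean(7.5, 1, 1.96)
print(n_within_2, n_within_1)             # 55 217
print(round(n_within_1 / n_within_2, 2))  # 3.95
```

Because E is squared in the denominator, cutting E in half multiplies the unrounded sample size by exactly 4; the ratio of the rounded values is only approximately 4.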
Chapter 9 Estimating a Population Standard Deviation (Supplementary Materials)
Finding CI for standard deviations
Objective A : Point Estimate
The best point estimate of the population variance, $\sigma^2$, is the sample variance, $s^2$.
Objective B : Chi-Square Distribution
Example 1: Use StatCrunch to find the critical values $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ for the given level of confidence
and sample size.
(a) 90% confidence, $n = 23$
Stat-Calculators-Chi-Square-DF 22 (n − 1)
Between-P(___ ≤ x ≤ ___) = 0.90-compute
The critical values are 12.338 and 33.924. (These play the role that the z-values played for proportions.)
Objective C : Confidence Interval for a Population Variance or Standard Deviation
$(1-\alpha)\cdot 100\%$ of the values of $\chi^2$ will lie between $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$. (Recall: $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$)
To find a confidence interval about $\sigma$, take the square root of the lower bound and upper bound.
Example 1: A simple random sample of size $n$ is drawn from a population that is known to be normally distributed. The sample variance, $s^2$, is determined to be 19.8. (Thus the standard deviation is $s = \sqrt{19.8} \approx 4.45$.)
(a) Use StatCrunch to construct a 95% confidence interval for $\sigma^2$ if the sample size, $n$, is 10.
Stat-Variance Stats-One Sample-with summary-Sample variance: 19.8, sample size: 10-Confidence interval for $\sigma^2$: 0.95-compute and record the results.
95% confidence interval results:
σ²: Variance of population
Parameter | Sample variance | DF | Lower limit | Upper limit
σ² | 19.8 | 9 | 9.367722 | 65.99048
*One can be 95% confident that the true variance is between 9.37 and 65.99.
(b) If the sample size is increased to n = 25, how does increasing the sample size affect the width of the interval? It will decrease the width (the interval becomes narrower).
(c) If the confidence level is increased to 99%, how does increasing the level of
confidence affect the width of the confidence interval?
The interval becomes wider if you want higher confidence.
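The StatCrunch result in part (a) can be reproduced by hand from the relation $\chi^2 = (n-1)s^2/\sigma^2$. The sketch below is hedged in two ways: the function name `variance_ci` is made up for illustration, and the chi-square critical values for 9 degrees of freedom at 95% confidence (2.7004 and 19.0228) are assumed inputs taken from a chi-square table, since Python's standard library does not provide them.

```python
from math import sqrt

def variance_ci(s2, n, chi2_small, chi2_large):
    """Return (lower, upper) bounds for the population variance.

    Dividing by the larger critical value gives the lower bound and
    dividing by the smaller one gives the upper bound.
    """
    df = n - 1
    return df * s2 / chi2_large, df * s2 / chi2_small

# s^2 = 19.8, n = 10; critical values for df = 9 read from a chi-square table.
lower, upper = variance_ci(19.8, 10, chi2_small=2.7004, chi2_large=19.0228)
print(f"variance: ({lower:.2f}, {upper:.2f})")
print(f"standard deviation: ({sqrt(lower):.2f}, {sqrt(upper):.2f})")
```

Taking square roots of the two bounds, as in the last line, gives the corresponding interval for the standard deviation.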
Example 2: Travelers pay taxes for flying, car rentals, and hotels. The following data represent the total
travel tax for a 3-day business trip in eight randomly selected cities. It was verified that the
data are normally distributed. Use StatCrunch to construct a 90% confidence interval for the
standard deviation travel tax for a 3-day business trip. Interpret the interval.
First we need to compute the variance since the raw data has been provided.
Stat Input given data Summary Statistics Columns Var1 Variance compute
Summary statistics: