Review 3
Chapter 8
1. A statistic is any quantity computed from values in a sample (for example, $\bar{x}$, $s$, the sample median, the sample interquartile range, and so on). The distribution of a statistic is called its sampling distribution.
A population parameter is any quantity computed from values in a population (for example, $\mu$, $\sigma$, the population median, the population interquartile range, and so on).
- The difference between a statistic and a population parameter.
(1) A statistic is a sample characteristic, whereas a population parameter is a population characteristic.
(2) The observed value of a statistic varies from sample to sample. However, a population parameter is a fixed number, which is generally unknown.
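A small simulation can make point (2) concrete. The sketch below assumes a made-up population of 1,000 values (the numbers, the seed, and the sample size of 25 are illustrative choices, not from the notes): the population mean is computed once and never changes, while the sample mean is different for each new sample.

```python
import random
import statistics

# Hypothetical population of 1,000 values, made up purely for illustration.
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(1000)]

# The parameter: a single fixed number for the whole population.
mu = statistics.mean(population)

# The statistic: recomputed for each new random sample of size 25.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(5)]

print(f"population mean (fixed): {mu:.2f}")
for i, xbar in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {xbar:.2f}")
```

In practice the population mean is unknown, which is why the sample mean is used to estimate it.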
2. Properties of the sampling distribution of $\bar{x}$
Let $\bar{x}$ denote the mean of the observations in a random sample of size n from a population having mean $\mu$ and standard deviation $\sigma$. Denote the mean value of the $\bar{x}$ distribution by $\mu_{\bar{x}}$ and the standard deviation of the $\bar{x}$ distribution by $\sigma_{\bar{x}}$. Then the following rules hold.
Rule 1: $\mu_{\bar{x}} = \mu$
Rule 2: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
Rule 3: When the population distribution is normal, the sampling distribution of $\bar{x}$ is also normal for any sample size n. Thus, the standardized variable
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
has the standard normal (z) distribution.
Rule 4: (Central Limit Theorem) When n is sufficiently large ($n \ge 30$), the sampling distribution of $\bar{x}$ is well approximated by a normal curve, even when the population distribution is not itself normal. So, the standardized variable
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
has approximately the standard normal (z) distribution.
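As a worked illustration of Rules 1 to 4, the sketch below computes the mean and standard deviation of the sampling distribution of the sample mean and then uses the standardized variable to find a probability. The population values (mean 100, standard deviation 15), the sample size 36, and the cutoff 105 are hypothetical numbers chosen for easy arithmetic; the function name is also just an illustration.

```python
from math import sqrt
from statistics import NormalDist


def xbar_sampling_distribution(mu, sigma, n):
    """Return (mean, standard deviation) of the sampling distribution of x-bar."""
    return mu, sigma / sqrt(n)   # Rule 1 and Rule 2


# Hypothetical population: mean 100, standard deviation 15; sample size 36.
mu_xbar, sigma_xbar = xbar_sampling_distribution(mu=100, sigma=15, n=36)

# Standardize the cutoff 105 and look up the standard normal probability.
# With n = 36 >= 30, Rule 4 says this is at least approximately valid.
z = (105 - mu_xbar) / sigma_xbar
prob_below_105 = NormalDist().cdf(z)

print(mu_xbar, sigma_xbar, z, round(prob_below_105, 4))
```

Here the standard deviation of the sample mean is 15/6 = 2.5, so a sample mean of 105 lies 2 standard deviations above the mean of its sampling distribution.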
3. Properties of the sampling distribution of p
Let p be the proportion of S’s in a random sample of size n from a population whose proportion of S’s is $\pi$. Denote the mean value of p by $\mu_p$ and the standard deviation of p by $\sigma_p$. Then the following rules hold.
Rule 1: $\mu_p = \pi$
Rule 2: $\sigma_p = \sqrt{\pi(1-\pi)/n}$
Rule 3: (Central Limit Theorem) When n is large and $\pi$ is not too near 0 or 1 ($n\pi \ge 10$ and $n(1-\pi) \ge 10$), the sampling distribution of p is approximately normal. Thus, the standardized variable
$$z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$$
has approximately the standard normal (z) distribution.
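The three rules for p can be sketched the same way. The function below is a hypothetical helper (its name and the example values 0.2, 0.05, 100, and 0.26 are assumptions made for illustration); it returns the mean and standard deviation of p and reports whether the conditions in Rule 3 hold.

```python
from math import sqrt


def p_sampling_distribution(pop_prop, n):
    """Return (mean, standard deviation, normal_ok) for the sample proportion p.

    pop_prop is the population proportion of S's (pi in the notes).
    """
    mean_p = pop_prop                               # Rule 1
    sd_p = sqrt(pop_prop * (1 - pop_prop) / n)      # Rule 2
    # Rule 3 conditions: n*pi >= 10 and n*(1 - pi) >= 10
    normal_ok = n * pop_prop >= 10 and n * (1 - pop_prop) >= 10
    return mean_p, sd_p, normal_ok


# Hypothetical case: 20% of the population are S's, samples of size 100.
mean_p, sd_p, normal_ok = p_sampling_distribution(0.2, 100)

# Standardized value of an observed sample proportion of 0.26.
z = (0.26 - mean_p) / sd_p

# With pi = 0.05 and n = 100, n*pi = 5 < 10, so Rule 3 does not apply.
_, _, small_ok = p_sampling_distribution(0.05, 100)

print(mean_p, round(sd_p, 4), normal_ok, round(z, 2), small_ok)
```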
Chapter 9
4. A point estimate of a population characteristic is a single number, computed from sample data, that represents a plausible value of the characteristic. A point estimate is obtained by (i) selecting an appropriate statistic and (ii) computing the value of the statistic for the given sample.
A statistic whose mean is equal to the value of the population characteristic being estimated is said to be an unbiased statistic. A statistic that is not unbiased is said to be biased.
5. Criteria for choosing among competing statistics
a) First, we choose an unbiased statistic if there is one.
b) If several unbiased statistics could be used for estimating a population characteristic, we choose the one with the smallest standard deviation.
6. Statistics used to estimate some important population characteristics
Population characteristic to be estimated / Statistic to use / Unbiasedness
Population proportion, $\pi$ / p / Unbiased
Population mean, $\mu$ / $\bar{x}$ / Unbiased
Population variance, $\sigma^2$ / $s^2$ / Unbiased
Population standard deviation, $\sigma$ / s / Biased
Population median / Sample median / Unbiased if the population distribution is symmetric; biased if it is skewed
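The table entries for the sample variance and the sample standard deviation can be checked exactly on a tiny example. The sketch below assumes a made-up four-value population and sampling with replacement (so the draws are independent); both the population values and the sample size 3 are illustrative assumptions. It enumerates every possible sample and averages the sample variance and the sample standard deviation over all of them.

```python
from itertools import product
from math import sqrt
from statistics import mean, pvariance, stdev, variance

# Hypothetical population, chosen only to keep the enumeration small.
population = [1.0, 2.0, 4.0, 9.0]
n = 3

sigma_sq = pvariance(population)   # population variance
sigma = sqrt(sigma_sq)             # population standard deviation

# All 4**3 = 64 equally likely ordered samples of size 3, drawn with replacement.
samples = list(product(population, repeat=n))

# Mean of each statistic over its full sampling distribution.
mean_of_s_sq = mean(variance(sample) for sample in samples)   # uses the n-1 divisor
mean_of_s = mean(stdev(sample) for sample in samples)

print(f"sigma^2 = {sigma_sq:.4f}, mean of s^2 = {mean_of_s_sq:.4f}")
print(f"sigma   = {sigma:.4f}, mean of s   = {mean_of_s:.4f}")
```

The mean of the sample variance matches the population variance, while the mean of the sample standard deviation falls below the population standard deviation, which is the sense in which s is biased.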
7. A confidence interval for a population characteristic is an interval of plausible values for the characteristic. It is constructed so that, with a chosen degree of confidence, the value of the characteristic will be captured inside the interval.
The confidence level associated with a confidence interval estimate is the success rate of the method used to construct the interval.
The standard error of a statistic is the estimated standard deviation of the statistic.
If the sampling distribution of a statistic is (at least approximately) normal, the bound on error of estimation, B, associated with a confidence interval is
B = (z critical value)(standard deviation of the statistic).
8. The large-sample confidence interval for $\pi$
When
(1) p is the sample proportion from a random sample, and
(2) the sample size n is large ($np \ge 10$ and $n(1-p) \ge 10$),
the general formula for a confidence interval for a population proportion $\pi$ is
$$p \pm (z \text{ critical value})\sqrt{\frac{p(1-p)}{n}}$$
The desired confidence level determines the z critical value. The three most commonly used confidence levels, 90%, 95%, and 99%, use z critical values 1.645, 1.96, and 2.58, respectively.
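A minimal sketch of the large-sample interval follows. The function name and the example counts (60 S's in a sample of 100) are hypothetical; the z critical value is taken from Python's standard-library normal distribution instead of a table, which reproduces the 1.645, 1.96, and 2.58 values after rounding.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n
    # Sample-size condition from the notes: np >= 10 and n(1 - p) >= 10.
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("sample too small for the large-sample interval")
    # z critical value: central area equal to the confidence level.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    bound = z_crit * sqrt(p * (1 - p) / n)
    return p - bound, p + bound


# Hypothetical data: 60 S's in a random sample of 100.
low, high = proportion_ci(60, 100, confidence=0.95)
print(f"95% CI for the population proportion: ({low:.3f}, {high:.3f})")

# The z critical values for the three common confidence levels.
z_values = [NormalDist().inv_cdf(1 - (1 - c) / 2) for c in (0.90, 0.95, 0.99)]
print([round(z, 3) for z in z_values])
```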
9. The sample size required to estimate a population proportion $\pi$ to within an amount B at a chosen confidence level is
$$n = \pi(1-\pi)\left(\frac{z \text{ critical value}}{B}\right)^2$$
The value of $\pi$ may be estimated using prior information. In the absence of any such information, using $\pi = .5$ in this formula gives a conservatively large value for the required sample size.
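The sample-size calculation can be sketched as below. The function name, the bound B = 0.05, and the prior estimate 0.3 are hypothetical; rounding up to the next whole number is an assumption of this sketch, made because a sample size must be an integer at least as large as the formula's value.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(bound, confidence=0.95, pop_prop=0.5):
    """Sample size needed to estimate a proportion to within `bound`.

    pop_prop is a prior guess at the population proportion; the default 0.5
    is the conservative choice when no prior information is available.
    """
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = pop_prop * (1 - pop_prop) * (z_crit / bound) ** 2
    return ceil(n)   # round up so the bound is still met


# Hypothetical target: within 0.05 of the true proportion, 95% confidence.
n_conservative = sample_size_for_proportion(0.05)
n_with_prior = sample_size_for_proportion(0.05, pop_prop=0.3)
print(n_conservative, n_with_prior)
```

The conservative choice of 0.5 gives the larger of the two sample sizes, since the product of a proportion and its complement is largest at 0.5.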
10. The one-sample z confidence interval for $\mu$
When
1. $\bar{x}$ is the sample mean of a random sample,
2. the population distribution is normal or the sample size n is large (generally $n \ge 30$), and
3. the population standard deviation $\sigma$ is known,
the formula for a confidence interval for a population mean $\mu$ is
$$\bar{x} \pm (z \text{ critical value})\left(\frac{\sigma}{\sqrt{n}}\right)$$
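A short sketch of the one-sample z interval, under the conditions listed above. The function name and the example numbers (sample mean 50, known population standard deviation 8, n = 64) are hypothetical and chosen so the standard deviation of the sample mean is exactly 1.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """One-sample z confidence interval for a population mean (sigma known)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    bound = z_crit * sigma / sqrt(n)   # bound on the error of estimation
    return xbar - bound, xbar + bound


# Hypothetical summary values: x-bar = 50, sigma = 8, n = 64.
low, high = z_interval(xbar=50, sigma=8, n=64, confidence=0.95)
print(f"95% CI for the population mean: ({low:.2f}, {high:.2f})")
```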
11. Let $x_1, x_2, \ldots, x_n$ be a random sample from a normal population distribution. Then the standardized variable
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
has the t distribution with n-1 df.
12. The one-sample t confidence interval for $\mu$
When
1. $\bar{x}$ is the sample mean of a random sample,
2. the population distribution is normal or the sample size n is large (generally $n \ge 30$), and
3. the population standard deviation $\sigma$ is unknown,
the formula for a confidence interval for a population mean $\mu$ is
$$\bar{x} \pm (t \text{ critical value})\left(\frac{s}{\sqrt{n}}\right)$$
where the t critical value is based on n-1 df and can be found in Appendix Table 3 on page 708.
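The t interval has the same shape as the z interval, with s in place of the population standard deviation and a t critical value in place of z. Python's standard library has no t distribution, so this sketch assumes the critical value is looked up in a t table and passed in; the function name and the summary values (sample mean 20, s = 4, n = 16) are hypothetical, and 2.131 is the usual tabled value for 95% confidence with 15 df.

```python
from math import sqrt


def t_interval(xbar, s, n, t_critical):
    """One-sample t confidence interval for a population mean (sigma unknown).

    t_critical must be looked up for n - 1 degrees of freedom.
    """
    bound = t_critical * s / sqrt(n)
    return xbar - bound, xbar + bound


# Hypothetical summary values: x-bar = 20, s = 4, n = 16, so df = 15.
# t critical value for 95% confidence with 15 df, taken from a t table.
low, high = t_interval(xbar=20.0, s=4.0, n=16, t_critical=2.131)
print(f"95% CI for the population mean: ({low:.3f}, {high:.3f})")
```

For the same data, the t interval is wider than a z interval would be, because the t critical value exceeds 1.96 when the df is small.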
13. The sample size required to estimate a population mean $\mu$ to within an amount B at a chosen confidence level is
$$n = \left(\frac{(z \text{ critical value})\,\sigma}{B}\right)^2$$
If $\sigma$ is unknown, it may be estimated based on previous information or, for a population that is not too skewed, by using (range)/4.
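A sketch of this calculation, including the (range)/4 rule of thumb, is given below. The function name, the anticipated range of 40 units, and the bound B = 2 are hypothetical; rounding up to a whole number is an assumption of the sketch.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(bound, sigma, confidence=0.95):
    """Sample size needed to estimate a population mean to within `bound`."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_crit * sigma / bound) ** 2)   # round up to a whole sample


# Hypothetical planning values: observations are expected to span about
# 40 units, so sigma is roughly estimated as range / 4 = 10.
sigma_guess = 40 / 4
n_required = sample_size_for_mean(bound=2, sigma=sigma_guess, confidence=0.95)
print(n_required)
```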
14. Important examples in the Notes:
Examples: 8.1, 8.2, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4.
15. Exercise in class: A random sample of n = 12 four-year-old red pine trees was selected, and the diameter (in inches) of each tree's main stem was measured. The resulting observations are as follows:
11.3 10.7 12.4 15.2 10.1 12.1 16.2 10.5 11.4 11.0 10.7 12.0
(a) Give a point estimate of $\mu$, the population mean diameter.
(b) Give a point estimate of the population median diameter.
(c) Give a point estimate of $\pi$, the population proportion of trees whose main stem diameters are at least 12 inches.
(d) Compute a point estimate of $\sigma$, the population standard deviation of main stem diameter.
(e) Suppose that the diameter distribution is normal. Then the 90th percentile of the diameter distribution is $\mu + 1.28\sigma$ (so 90% of all trees have diameters less than this value). Compute a point estimate for this percentile.
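One way the five estimates could be computed is sketched below. It assumes the statistics from the table in item 6 are the intended choices (sample mean, sample median, sample proportion, and s), and that part (e) is estimated by substituting the point estimates from (a) and (d) into the percentile formula; this is a sketch of the arithmetic, not an official solution.

```python
from statistics import mean, median, stdev

# The 12 observed main stem diameters (inches) from the exercise.
diameters = [11.3, 10.7, 12.4, 15.2, 10.1, 12.1,
             16.2, 10.5, 11.4, 11.0, 10.7, 12.0]
n = len(diameters)

xbar = mean(diameters)                                 # (a) estimate of the population mean
sample_median = median(diameters)                      # (b) estimate of the population median
p_at_least_12 = sum(d >= 12 for d in diameters) / n    # (c) estimate of the population proportion
s = stdev(diameters)                                   # (d) estimate of sigma (n - 1 divisor)
percentile_90 = xbar + 1.28 * s                        # (e) plug-in estimate of the 90th percentile

print(f"(a) {xbar:.3f}  (b) {sample_median:.2f}  (c) {p_at_least_12:.3f}")
print(f"(d) {s:.3f}  (e) {percentile_90:.3f}")
```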