2.2 Normal Distributions (Bell Curve)
In many natural processes, random variation conforms to a particular probability distribution known as the normal distribution, which is the most commonly observed probability distribution. The normal curve was first used in the 1700s by French mathematicians and in the early 1800s by the German mathematician and physicist Carl Friedrich Gauss. The curve is known as the Gaussian distribution and is also sometimes called a bell curve.
Normal curves
Curves that are symmetric, single-peaked, and bell-shaped. They are used to describe normal distributions.
The mean is at the center of the curve.
The standard deviation controls the spread of the curve.
The bigger the standard deviation, the wider the curve.
There are roughly 6 standard deviations across the width of a normal curve, 3 on one side of center and 3 on the other side.
All normal curves have the same overall shape, described by the mean (μ) and standard deviation (σ).
Empirical Rule (68/95/99.7 Rule)
68% of observations are within 1 σ of μ (approx.!!! Really .6827)
95% of observations are within 2 σ of μ
99.7% of observations are within 3 σ of μ
Questions about “area”, “percent”, or “relative frequency” are answered with this rule.
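As a quick check of the 68/95/99.7 percentages, here is a minimal Python sketch (not part of the original notes). It uses the standard library's statistics.NormalDist; the helper name area_within is made up for illustration.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Area under N(mu, sigma) between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"within {k} sigma: {area_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```

The areas do not depend on μ or σ, which is why the same rule works for every normal curve.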
EXAMPLE 1: The distribution of the heights of women is normal with a mean of 64.5 and a standard deviation of 2.5. What percent of women are in the following ranges?
1) P(x < 64.5) =    2) P(x < 69.5) =    3) P(x > 62) =
4) P(x > 57) =    5) P(57 < x < 67) =    6) P(59.5 < x < 67) =
Notations: N(μ, σ); example above is N (64.5, 2.5)
FYI:dx Homework p137 23 - 26
What if the area you are interested in is not 1 or 2 standard deviations away from the mean? (reality…)
Standard Normal Distribution (z)
The conversion z = (x - μ) / σ changes a normal distribution into the standard normal distribution.
The standard normal distribution is N(0, 1) and uses Table A.
z-score: how many standard deviations away from the mean a score is, and in what direction.
If sample data, what would the z-score formula look like? z =
EXAMPLE: Stat test scores are: 92, 91, 85, 77, 79, 88, 99, 69, 73, 84 If you scored ____, how did you do relative to the class?
a) 91:
b) 88:
c) 73:
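One way to check the three answers is a short Python sketch. It assumes the ten scores are treated as a sample, so the sample mean and the sample standard deviation (dividing by n - 1) take the place of μ and σ. The function name z_score is only illustrative.

```python
from statistics import mean, stdev

scores = [92, 91, 85, 77, 79, 88, 99, 69, 73, 84]
x_bar = mean(scores)   # sample mean
s = stdev(scores)      # sample standard deviation (divides by n - 1)

def z_score(x):
    """Number of sample standard deviations x lies above (+) or below (-) the mean."""
    return (x - x_bar) / s

for x in (91, 88, 73):
    print(f"score {x}: z = {z_score(x):+.2f}")
```

A positive z means the score is above the class mean, a negative z means it is below.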
EXAMPLE: A student took a math test and got an 80. He took a Latin test and got a 90. If the math scores had a mean of 70 with a standard deviation of 8 and Latin had a mean of 95 with a standard deviation of 3, in which class did he do relatively better?
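The comparison can be sketched in Python by standardizing each score with its own class mean and standard deviation. The variable names are illustrative, not from the notes.

```python
def z_score(x, mu, sigma):
    """Standardize x against a distribution with mean mu and standard deviation sigma."""
    return (x - mu) / sigma

z_math = z_score(80, 70, 8)    # math: mean 70, SD 8
z_latin = z_score(90, 95, 3)   # Latin: mean 95, SD 3

print(f"math z = {z_math:.2f}, Latin z = {z_latin:.2f}")
better = "math" if z_math > z_latin else "Latin"
print(f"Relative to each class, he did better in {better}.")
```

The raw Latin score is higher, but it is below that class's mean, so its z-score is negative.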
Homework p 118 1-4
To find the approximate probability for a test score from the example above, the z-score needs to be looked up in Table A or found with the calculator.
Example using Table A:
What proportion of all young women are taller than 68 inches, given that the distribution of heights for all young women follows N(64.5, 2.5)?
Step one – State: P(x > 68) on N(64.5, 2.5). Draw and label a normal curve.
Step two – Standardize x and label the picture with the z-score.
z =
Step three – find the probability by using Table A, and the fact that the total area is equal to 1.
Step four – Write a conclusion:
The proportion of young women that are ______ than ______ inches is approximately ______.
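The four steps can be mirrored in a few lines of Python as a check on the table work. This is a sketch, with statistics.NormalDist standing in for Table A: the standard normal cdf plays the role of the table entry (area to the left of z).

```python
from statistics import NormalDist

mu, sigma, x = 64.5, 2.5, 68     # Step one: P(x > 68) on N(64.5, 2.5)

z = (x - mu) / sigma             # Step two: standardize
area_left = NormalDist().cdf(z)  # Step three: area to the left of z, as Table A reports it
area_right = 1 - area_left       # total area under the curve is 1

print(f"z = {z:.2f}, P(x > {x}) is about {area_right:.4f}")
# z = 1.40, P(x > 68) is about 0.0808
```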
Use Table A to convert the following z-scores to probability. Draw a picture!!
1) P(z < 2.3) =    2) P(z-1.52) =    3) P(z > -0.43) =
4) P(z > 3.1) =    5) P(-1.52 < z < 2.3) =    6) P(-3 < z < 3) =
Example of whole process:
A man’s wife is pregnant and due in 100 days. The corresponding probability distribution for the day of the birth is approximately normal with mean 100 and standard deviation 8. The man has a business trip and will return in 85 days, and he has to go on another business trip in 107 days.
What is the probability that the birth will occur before his second trip?
1. Told births follow an approximately normal distribution.
2. Want:
3. Compute:
Now have: 107
Table gives:
Or
on calculator: normalcdf ( –10000,107,100,8) gives ______
4. There is about an ______ chance that the baby will be born before the second business trip.
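For readers without the calculator, the normalcdf command can be imitated in Python. The function below is a hypothetical stand-in written for this sketch, not the calculator's implementation; it subtracts two cumulative areas from statistics.NormalDist.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """Area under N(mu, sigma) between lower and upper (stand-in for the calculator command)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

z = (107 - 100) / 8                     # standardize day 107
p = normalcdf(-10000, 107, 100, 8)      # same arguments as the calculator line

print(f"z = {z:.3f}, P(x < 107) is about {p:.4f}")
# z = 0.875, P(x < 107) is about 0.8092
```

Table A needs z rounded to two decimal places, so the table answer can differ slightly from the calculator answer in the third decimal place.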
EXAMPLE #3:
For 14 year old boys, cholesterol levels are ~ N(170,30).
a) What percent of boys have a level of 240 or more?
b) What percent of boys are between 170 and 240?
Normal Distribution Calculations
Process:
1) Normality: the table is only for normal (or at least approximately normal) distributions.
2) State the problem in terms of x and draw the curve. Label with µ, σ, x.
3) Standardize with a new graph (turn x into z).
4) Use Table A or the calculator: normalcdf(lowerbound, upperbound, µ, σ)
5) Answer the question (remember, if the distribution is approximately normal, you have an approximate probability).
HW p 118 1 – 4, p 121 6 – 8 (not 7d)
Finding a Data Value from a z-score:
Z = x =
He is able to cancel his second business trip, and his boss tells him that he can return home from his first trip so that there is a ______ chance that he will make it back for the birth. When must he return home?
1. Told that distribution of births is approximately normal
Hint: use the table backwards (on the calculator, use invNorm(area, mean, SD))
2. We are given the probability and we want the raw score (day to return). First, remember that if there is a ______ chance that he will make it on time, then there is a ______ chance that he will not (the table gives only “less than” values).
Probability statement: P(X < ? ) = ______
Use the table in reverse—find a z-score that gives .01 as the probability.
z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-2.4   .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
-2.3   .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
-2.2   .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
Search for the probability value that is closest to ______ and find ______ and ______. Since ______ is closer to ______, use this value.
The corresponding z-score is -2.33. Now find the x that produces this z.
3. Have: -2.33 = (x - 100) / 8
x =
or
on calculator: invNorm( )
4. He must return from his business trip in _____ days.
Note: All four steps must still be shown. The calculator only replaces the z-calculation.
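Both routes to the answer can be compared in a short Python sketch. It assumes the left-tail area of .01 used in the table step. The table route solves -2.33 = (x - 100) / 8 for x, and NormalDist.inv_cdf plays the role of invNorm.

```python
from statistics import NormalDist

mu, sigma = 100, 8
area = 0.01                      # left-tail area from the table step

# Table route: the closest table entry gives z = -2.33; solve z = (x - mu) / sigma for x.
z_table = -2.33
x_table = mu + z_table * sigma

# Calculator route: inverse normal on the original N(100, 8) scale.
x_calc = NormalDist(mu, sigma).inv_cdf(area)

print(f"table: x = {x_table:.2f}, inverse cdf: x = {x_calc:.2f}")
# table: x = 81.36, inverse cdf: x = 81.39
```

The small difference comes from the table rounding z to two decimal places.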
EXAMPLE #4:
SAT-V scores are ~ N(505,110)
1) How high must a student score to be at the 30th percentile?
2) How high must a student score to get in the top 10%?
3) What scores contain the middle 50% of scores?
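A sketch of how these three questions translate into inverse-normal calculations is shown below, with Python's NormalDist.inv_cdf in place of invNorm. It assumes "top 10%" means the 90th percentile and "middle 50%" means the range from the 25th to the 75th percentile.

```python
from statistics import NormalDist

sat = NormalDist(505, 110)       # SAT-V scores ~ N(505, 110)

p30 = sat.inv_cdf(0.30)          # 1) 30th percentile: 30% of scores fall below this
top10 = sat.inv_cdf(0.90)        # 2) top 10% starts where 90% of scores fall below
q1 = sat.inv_cdf(0.25)           # 3) middle 50% runs from the 25th percentile...
q3 = sat.inv_cdf(0.75)           #    ...to the 75th percentile

print(f"30th percentile: {p30:.0f}")
print(f"top 10% cutoff:  {top10:.0f}")
print(f"middle 50%:      {q1:.0f} to {q3:.0f}")
```

The written solution should still show the four steps, with the z-scores found by using the table backwards.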
HW p 142 29 – 30, p 147 31 – 36 (not 32b)
Assessing Normality
The normal table (and normalcdf) ______ because if the distribution is not approximately normal, the probabilities will be wrong. Sometimes we are told we have a normal distribution. Sometimes we are given data and can use histograms (or dotplots or stemplots) to check for normality. It is often easier to use normal probability plots and look for linearity: ______.
Normal probability plots give a visual way to determine if a distribution is approximately normal. These plots are produced by doing the following:
1. The data are arranged from smallest to largest.
2. The percentile of each data value is determined.
3. From these percentiles, normal calculations are done to determine their corresponding z-scores.
4. Each z-score is plotted against its corresponding data value.
If the distribution is close to normal, the plotted points will lie ______.
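The four steps can be sketched in Python as follows. The notes do not say how the percentile of each value is computed, so this sketch assumes the common convention (i - 0.5) / n for the i-th smallest value. Software and calculators may use a slightly different formula. The function name and the sample data are made up for illustration.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (data value, z-score) pairs for a normal probability plot."""
    xs = sorted(data)                       # step 1: smallest to largest
    n = len(xs)
    standard_normal = NormalDist()          # N(0, 1)
    points = []
    for i, x in enumerate(xs, start=1):
        percentile = (i - 0.5) / n          # step 2: assumed percentile convention
        z = standard_normal.inv_cdf(percentile)  # step 3: percentile -> z-score
        points.append((x, z))               # step 4: pair to plot, z against x
    return points

sample = [4.1, 5.0, 3.8, 4.6, 4.4]          # made-up data
for x, z in normal_plot_points(sample):
    print(f"x = {x:.1f}, z = {z:+.2f}")
```

Plotting these pairs with any graphing tool gives the normal probability plot.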
Systematic deviations from a line indicate a non-normal distribution. In the first example below, candy bar weights, an approximate normal distribution is shown.
Weights of Mounds Candy Bars
Computer output of a normal probability plot shows lines as boundaries; if the data fall within the lines, the distribution is approximately normal.
In this example, the histogram and the normal probability plot both show that the data are not approximately normal.
Assessing Normality
Method 1:
1) make a histogram or stemplot to check for big outliers, skews, gaps, etc.
2) calculate x̄ ± s, then use the 68/95/99.7 rule to see if the data look normal.
Method 2:
1) make a normal probability plot (also called a normal quantile plot).
You have a plot of z vs. x.
2) if the plot is close to a line, the distribution is close to normal.
Using the calculator: STATPLOT, bottom right graph
right skew: largest observations are above a line drawn through the body of the data.
left skew: smallest are below the line.
EXAMPLE #5:
Is the following data normally distributed? Use both methods to check:
550 561 488 507 526 555 536 529 558 565
557 553 562 529 544 534 579 510 527 539
542 547 563 534 546 530 575 568 585 550
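Method 1 can be sketched in Python for this data set. The sketch assumes the 30 values are three-digit numbers, computes x̄ and s, and counts what fraction of the data fall within 1, 2, and 3 standard deviations of the mean for comparison with 68/95/99.7. The helper name fraction_within is illustrative.

```python
from statistics import mean, stdev

data = [
    550, 561, 488, 507, 526, 555, 536, 529, 558, 565,
    557, 553, 562, 529, 544, 534, 579, 510, 527, 539,
    542, 547, 563, 534, 546, 530, 575, 568, 585, 550,
]
x_bar = mean(data)   # sample mean
s = stdev(data)      # sample standard deviation

def fraction_within(k):
    """Fraction of observations between x_bar - k*s and x_bar + k*s."""
    low, high = x_bar - k * s, x_bar + k * s
    return sum(low <= x <= high for x in data) / len(data)

print(f"mean = {x_bar:.1f}, s = {s:.1f}")
for k, rule in ((1, 68), (2, 95), (3, 99.7)):
    print(f"within {k}s: {fraction_within(k):.1%} (rule says about {rule}%)")
```

With only 30 observations the percentages will not match the rule exactly, so a histogram or stemplot is still needed to judge shape.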