Module III – A Bridge to Inference

Unit 6: Randomness, Probability, and Sampling Distributions

Randomness

Parameter: a number that describes the population. We do not generally know the values of parameters.

Statistic: a number we calculate from the sample data. We use a statistic to estimate the value of an unknown parameter.

Example: A car shop has a box full of ball bearings with mean diameter 2.5003 cm. This is within the specification for acceptance of the lot by the purchaser. By chance, an inspector chooses 100 bearings from the lot that have mean diameter 2.5009 cm. Because this is outside the specified limits, the lot is mistakenly rejected.

To distinguish between a sample and a population, recall the following notation:

Mean of a population: μ (a parameter)

Mean of a sample: x̄ (a statistic)

The value of a statistic changes from sample to sample. In repeated random sampling the value of the statistic varies; this is called sampling variability. The statistic varies unpredictably in the short run, but it has a regular and predictable pattern in the long run.

The concept of probability

Something is random if individual outcomes are uncertain but there is a regular distribution of outcomes in a large number of repetitions.

The probability of an outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions.

Trials are independent when the outcome of one trial does not influence the outcome of another.

Probability Models:

Now that we know what randomness is, we want to describe the patterns that random phenomena exhibit. To describe the pattern, list all the possible outcomes and their probabilities.

Example: Tossing 2 coins.

All possible outcomes:

HH, HT, TH, TT, where each outcome has a ¼ chance of happening.

The sample space, S, of a random phenomenon is the set of all possible outcomes.

An event is any outcome (or set of outcomes) of a random phenomenon. An event is a subset of the sample space.

Example: Referring back to our last example.

S = {HH, HT, TH, TT}

Event: any subset of S, for example A = {HT, TH} (exactly one head)

A probability model is a mathematical description of a random phenomenon consisting of a sample space S and a way of assigning probabilities to events.

Probability Rules

If A is any event, the probability of A is written as P(A).

Example: P(H) or P(T)

PROBABILITY RULES:

  1. Let A be any event (a subset of the sample space S). Then 0 ≤ P(A) ≤ 1.
  2. P(S) = 1
  3. P(A does not occur) = 1 – P(A)
  4. Two events A and B are disjoint (or mutually exclusive) if they have no outcomes in common. If A and B are disjoint, then P(A or B) = P(A) + P(B).
  • This is the addition rule for disjoint events.

Assigning Probabilities – Finite Number of outcomes

Assign a probability to each individual outcome. Each probability must be a number between 0 and 1, and their sum must be 1.

The probability of any event is the sum of the probabilities of the outcomes making up the event.

Definition: Two events A and B are exhaustive of the sample space if, combined they cover all possible outcomes in the sample space.

Example: Rolling a die

S = {1, 2, 3, 4, 5, 6}

Outcome:      1    2    3    4    5    6

Probability:  1/6  1/6  1/6  1/6  1/6  1/6

P(5) = 1/6

P(5 or 6) = P(5) + P(6) (the outcomes 5 and 6 are disjoint)

= 1/6 + 1/6

= 2/6 = 1/3

P(neither 5 nor 6) = 1 – P(5 or 6)

= 1 – 1/3

= 2/3

P(even no.) = P(2) + P(4) + P(6)

= 1/6 + 1/6 + 1/6

= 3/6 = ½
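The die calculations above can be checked with a short Python sketch. It stores the probability model as a dictionary and finds the probability of an event by summing the probabilities of its outcomes. The names model and prob are illustrative choices, not part of the notes.

```python
from fractions import Fraction

# Probability model for one roll of a fair die: outcome -> probability.
model = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event, model):
    """P(event): sum of the probabilities of the outcomes in the event."""
    return sum(model[outcome] for outcome in event)

p_5_or_6 = prob({5, 6}, model)     # addition rule for disjoint outcomes
p_neither = 1 - p_5_or_6           # complement rule
p_even = prob({2, 4, 6}, model)

print(p_5_or_6, p_neither, p_even)  # 1/3 2/3 1/2
```

Using exact fractions avoids rounding, so the results match the hand calculation exactly.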

Example: 4.21

All human blood can be typed as one of O, A, B, or AB, but the distribution of the types varies a bit with race. Here is the probability model for the blood type of a randomly chosen African American:

Blood type:   O     A     B     AB

Probability:  0.49  0.27  0.20  ?

What is the probability of type AB blood?

P(AB) = 1 – (0.49 + 0.27 + 0.20)

= 1 – 0.96

= 0.04

Maria has type B blood. She can safely receive blood transfusions from people with blood types O and B. What is the probability that a randomly chosen African American can donate blood to Maria?

P(O or B) = P(O) + P(B)

= 0.49 + 0.20

= 0.69

Multiplication Rule for Independent Events

Two events A and B are independent if knowing that one occurs does not change the probability that the other occurs.

If A and B are independent:

P(A and B) = P(A)P(B)

Note: The multiplication rule holds only if A and B are independent.

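Example: rolling a die twice

P(5 on the first roll and 6 on the second roll) = (1/6)(1/6) = 1/36

This is an example of independence: the result of the first roll does not change the probabilities for the second roll.

On a single roll, P(5 or 6) = 1/6 + 1/6 = 1/3

This is an example of disjoint events: one roll cannot show both a 5 and a 6.

A Venn diagram can also show disjoint events: the regions for A and B do not overlap.

Note: the multiplication rule

P(A and B) = P(A)P(B) holds if A and B are independent but not otherwise.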

The addition rule:

P(A or B) = P(A) + P(B) holds if A and B are disjoint but not otherwise.

If A and B are disjoint, then the fact that A occurs tells us that B cannot occur. So disjoint events with nonzero probabilities are never independent.

The general addition rule for any two events:

We know that if A and B are disjoint events, then

P(A or B) = P(A) + P(B). The addition rule extends to more than two events that are disjoint in the sense that no two have any outcomes in common.

Here is the addition rule for any two events, disjoint or not:

P(A or B) = P(A) + P(B) – P(A and B)
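As a quick check of the general addition rule, the sketch below uses one roll of a fair die with two overlapping events. The events A and B are chosen here only for illustration; they do not come from the notes.

```python
from fractions import Fraction

# One roll of a fair die; the events A and B are illustrative choices.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is at least 4

def prob(event):
    """Equally likely outcomes: P(event) = (outcomes in event) / (outcomes in S)."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                        # P(A or B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)    # general addition rule
print(lhs, rhs)  # 2/3 2/3
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.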

If we toss a coin four times, we can record the outcome as a string of heads and tails, such as HTTH. Often we are interested only in the number of heads:

  • Let X be the number of heads.
  • The possible values of X are 0, 1, 2, 3, 4. X is a random variable because its value can vary when the four tosses are repeated. We can define a random variable as:

A random variable is a variable whose value is a numerical outcome of a random phenomenon.

There are two main ways of assigning probabilities to the values of a random variable.

  1. Discrete random variable – X has a finite number of possible values
  • The probability distribution of X lists the values and their probabilities

i.e.

Value of X:   x1  x2  …  xk
Probability:  p1  p2  …  pk

The probabilities must be between 0 and 1, and the sum of the probabilities must be 1.

Note: we can find the probability of any event by adding the probabilities that make up the event.

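The mean of a discrete random variable X is defined to be:

μ = x1p1 + x2p2 + … + xkpk = Σ xi pi

and its variance is:

σ² = (x1 – μ)²p1 + (x2 – μ)²p2 + … + (xk – μ)²pk = Σ (xi – μ)² pi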

Example

Consider the distribution of the amount X (in $) won on a spin of a slot machine:

X -1 5 10 50

P(x) 0.93 0.05 0.015 0.005

P(X≥$10) = 0.015+0.005

P(X≥$10) = 0.02

P(X<$5) = 0.93
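The mean and variance of the slot machine winnings follow from the definitions of the mean and variance of a discrete random variable. A minimal Python sketch, with variable names chosen here for illustration:

```python
# Distribution of winnings X (in $) from the slot machine example.
values = [-1, 5, 10, 50]
probs = [0.93, 0.05, 0.015, 0.005]

# A valid discrete distribution has probabilities that sum to 1.
assert abs(sum(probs) - 1) < 1e-9

# Mean: each value weighted by its probability.
mean = sum(x * p for x, p in zip(values, probs))
# Variance: squared distance from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
sd = variance ** 0.5

print(round(mean, 2), round(variance, 2), round(sd, 2))  # -0.28 16.1 4.01
```

The mean of about –$0.28 says that, in the long run, a player loses roughly 28 cents per spin.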

  2. Continuous random variable – X can take an infinite set of values, so we have no way of listing or counting the individual values.

Ex. Thermometer – an infinite number of values lie along the thermometer and we cannot count them all.

A continuous random variable X takes all values in an interval of numbers.

  • The probability distribution of x is described by a density curve. The probability of any event is the area under the density curve and above the values of x that make up the event. The probability model for a continuous random variable assigns probabilities to an interval of outcomes rather than to individual outcomes.

Note: Each individual outcome of a continuous random variable is assigned a probability of 0.

Normal distributions are probability distributions. Recall the notation N(µ, σ): if X has the N(µ, σ) distribution, then the standardized variable

Z = (X – µ)/σ

is a standard normal random variable having the distribution N(0, 1).

Example: The heights of women appear to be normally distributed with mean μ = 64.5 inches and standard deviation σ = 2.5 inches. We choose one woman at random from the population and observe her height X. Upon repeated sampling, we observe that the distribution of values of X is the same normal distribution.

P(63 ≤ X ≤ 65)

= P((63 – 64.5)/2.5 ≤ Z ≤ (65 – 64.5)/2.5)

= P(–0.6 ≤ Z ≤ 0.2)

= 0.5793 – 0.2743 = 0.3050
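The table lookup for the heights example can be reproduced with NormalDist from Python's standard statistics module. This is a sketch for checking the arithmetic. The variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)   # N(64.5, 2.5) from the example

# Standardize the endpoints: z = (x - mu) / sigma
z_low = (63 - heights.mean) / heights.stdev
z_high = (65 - heights.mean) / heights.stdev

# P(63 <= X <= 65) is the area under the density curve between 63 and 65.
p = heights.cdf(65) - heights.cdf(63)

print(round(z_low, 2), round(z_high, 2), round(p, 4))  # -0.6 0.2 0.305
```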

Example: The normal distribution with mean 6.84 and standard deviation 1.6 is a good description of the Iowa test vocabulary scores of seventh graders in Gary, Indiana. Let the random variable x be the Iowa test score of one Gary seventh grader chosen at random.

Write the event “the student chosen has a score of 10 or higher” in terms of X.

X ≥ 10

Find the probability of this event, rounding the model for the scores to N(6.8, 1.6):

P(X ≥ 10)

= P(Z ≥ (10 – 6.8)/1.6)

= P(Z ≥ 2) = 1 – 0.9772 = 0.0228

Sampling Distribution

We use the statistic x̄ (known) to estimate the unknown parameter μ, and s to estimate the unknown parameter σ.

An SRS should represent the population fairly well, so the mean x̄ of the sample should be somewhere near the population mean μ.

But x̄ changes, depending on the sample.

Law of Large Numbers: Draw observations at random from any population with finite mean μ.

As the number of observations drawn increases, the mean x̄ of the observed values gets closer and closer to the mean μ of the population.
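A small simulation can illustrate the law of large numbers. The sketch below assumes a population of fair die rolls (mean 3.5), which is an illustrative choice rather than an example from the notes, and uses a fixed seed so the run is repeatable.

```python
import random
from statistics import mean

rng = random.Random(0)          # fixed seed so the run is repeatable
population_mean = 3.5           # mean of one roll of a fair die

# Draw more and more observations and watch the sample mean settle near 3.5.
rolls = [rng.randint(1, 6) for _ in range(100_000)]
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(mean(rolls[:n]), 3))

# Distance between the mean of all 100,000 rolls and the population mean.
final_gap = abs(mean(rolls) - population_mean)
```

The means for small n can wander noticeably, while the mean of all 100,000 rolls should sit very close to 3.5.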

The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

Mean and Standard Deviation of x̄

Suppose x̄ is the mean of an SRS of size n drawn from a large population with mean μ and standard deviation σ.

Then the mean of the sampling distribution of x̄ is μ and its standard deviation is σ/√n.

Notes:

  • The sampling distribution of x̄ has the same mean but smaller spread than the distribution of the observations in the population.

Example:

X ~ N(μ, σ) where σ = 30 and n = 9. We will estimate μ with x̄. Now the standard deviation becomes

σ/√n = 30/√9 = 10

  • Averages are less variable than individual observations.
  • For fixed σ, as the sample size increases, σ/√n gets smaller. The results of large samples are less variable than the results of smaller samples.

Note: To cut the standard deviation of x̄ in half, the sample size must be multiplied by 4.

Example: n = 36, σ/√n = 30/√36 = 5
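The effect of sample size on the spread of the sample mean can be sketched in a few lines of Python, using σ = 30 from the example. The function name is an illustrative choice.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

print(sd_of_sample_mean(30, 9))    # 10.0
print(sd_of_sample_mean(30, 36))   # 5.0 (4 times the sample size, half the spread)
```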

Now we will look at:

The Central Limit Theorem:

The shape of the sampling distribution of x̄ depends on the shape of the population distribution. If the population distribution is normal, then so is the distribution of the sample mean.

Sampling Distribution of a Sample Mean

If a population is distributed N(μ, σ), then the sample mean x̄ of n independent observations has the N(μ, σ/√n) distribution.

If the population is not normal, the distribution of x̄ changes shape as n increases.

The shape looks more and more like the normal distribution (as long as the population has finite σ).

When the shape of the population distribution is far from normal, a larger sample is needed for the distribution of x̄ to be close to normal.

In other words, the Central Limit Theorem says: Draw an SRS of size n from any population with mean μ and finite standard deviation σ. When n is large, the sampling distribution of x̄ is approximately normal:

x̄ is approximately N(μ, σ/√n).

Notes:

  • The CLT allows us to use normal probability calculations to answer questions about sample means (n large enough) even when the population is not normal.
  • The CLT says that the distribution of a sum or average of many small random quantities is close to normal.
  • This is true even if the quantities are not independent (as long as they are not too highly correlated).
  • This is true even if they have different distributions (as long as no single quantity dominates the others).
  • How large is large? n ≥ 20 is a reasonable rule of thumb, but it really depends on how non-normal the population distribution is.

Example

NASA is producing a batch of nuts for their new space shuttle. The distribution of the diameters of the nuts is unknown, but it is known from past experience that the mean of the distribution is 1.8 mm and the standard deviation is 0.3 mm.

What is the probability of observing a sample of 100 nuts with mean diameter less than 1.75 mm?

NOTE: We use the Central Limit Theorem to find this probability. The sample size we have is sufficiently large (n ≥ 20).

P(x̄ < 1.75)

= P(Z < (1.75 – 1.8)/(0.3/√100))

= P(Z < –0.05/0.03)

= P(Z < –1.67) = 0.0475

Example: An educational researcher selects a random sample of 400 students’ scores from the population of scores on a national exam. The population mean is 485 points, and its standard deviation is 80 points. What is the probability that the sample of students has a mean score greater than 500?

P(x̄ > 500)

= P(Z > (500 – 485)/(80/√400))

= P(Z > 15/4)

= P(Z > 3.75) ≈ 0.00 (less than 0.0001)
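Both Central Limit Theorem examples can be checked with Python's standard library. The sketch below treats the sample mean as approximately N(μ, σ/√n). The helper name sample_mean_dist is an assumption made for illustration. Because the code does not round z to two decimals, the first probability comes out near 0.048 rather than the table value 0.0475.

```python
import math
from statistics import NormalDist

def sample_mean_dist(mu, sigma, n):
    """Approximate sampling distribution of the sample mean: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / math.sqrt(n))

# Nuts example: P(sample mean < 1.75) with mu = 1.8, sigma = 0.3, n = 100
p_nuts = sample_mean_dist(1.8, 0.3, 100).cdf(1.75)

# Exam example: P(sample mean > 500) with mu = 485, sigma = 80, n = 400
p_exam = 1 - sample_mean_dist(485, 80, 400).cdf(500)

print(round(p_nuts, 4), round(p_exam, 5))
```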

Statistical Process Control

Statistical process control is a collection of tools that, when used together, can result in process stability and variability reduction.

A process is a chain of activities that turns inputs into outputs.

The goal is to make a process stable over time.

A control chart is a statistical tool that monitors data over time.

A typical control chart:

A control chart has:

  • A horizontal line called the centre line (μ) around which the data vary.
  • Two horizontal lines called control limits: an upper control limit (UCL) at μ + 3σ/√n and a lower control limit (LCL) at μ – 3σ/√n.
  • Any x̄ that does not fall between the control limits gives evidence that the process is out of control.

To make a control chart:

1. Take samples of size n from the process at regular intervals. Plot the means x̄ of these samples against the order in which the samples were taken.

2. We know the sampling distribution of x̄ under the process-monitoring conditions is normal with mean µ and standard deviation σ/√n.

  • Draw a centre line at µ

3. The 99.7 part of the 68-95-99.7 rule for normal distributions says that, as long as the process remains in control, 99.7% of the values of x̄ will fall between

µ ± 3(σ/√n)
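The control chart recipe above can be sketched in Python. The function names and the process values (µ = 10, σ = 2, samples of size 4, and the four sample means) are hypothetical numbers made up for illustration. They are not taken from the notes.

```python
import math

def control_limits(mu, sigma, n):
    """Return (LCL, centre line, UCL) for an x-bar chart with 3-sigma limits."""
    margin = 3 * sigma / math.sqrt(n)
    return mu - margin, mu, mu + margin

def out_of_control(sample_means, mu, sigma, n):
    """Positions (0-based) of sample means that fall outside the control limits."""
    lcl, _, ucl = control_limits(mu, sigma, n)
    return [i for i, xbar in enumerate(sample_means) if not lcl <= xbar <= ucl]

# Hypothetical process (made-up numbers): mu = 10, sigma = 2, samples of size 4
print(control_limits(10, 2, 4))                            # (7.0, 10, 13.0)
print(out_of_control([9.5, 10.2, 13.4, 8.1], 10, 2, 4))    # [2]
```

Here only the third sample mean (13.4) lies above the upper control limit, so it is the one flagged as evidence that the process is out of control.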

Example: In a study of voter turnout, 15 people of voting age are randomly selected. The mean numbers who actually voted are listed below. Construct a control chart and determine whether the process is within statistical control. It is known that the mean number of voters is 468.73 and the standard deviation is 83.

Mean numbers who voted:

608  466  552  382  536  372

526  398  531  364  501  365

551  388  491

LCL:

UCL:

CL = μ = 468.73


Binomial Distribution

We want a probability model for a count of successful outcomes. The binomial distribution is a common model.

Binomial Setting
  1. There is a fixed number n of observations.
  2. The n observations are all independent: knowing the result of one tells you nothing about the others.
  3. Each observation falls into one of only two categories – success or failure.
  4. The probability of success, p, is the same for each observation.

Example: Toss a coin 4 times and count the number of heads.

X = # of heads

X is a random variable

The distribution of the count x of successes in the binomial setting is the binomial distribution with parameters n and p.

n is the number of observations

p is the probability of a success on any one observation (trial).

The count x can take on values from 0 to n.

Warning – not all counts have a binomial distribution. All four conditions of the binomial setting must be satisfied.

Example: A couple has 4 kids. What is the probability that the first kid (x = 1) is a boy?

P(boy) = 0.5

Binomial Probabilities

Binomial Coefficient

The number of ways of arranging x successes among n observations is given by the binomial coefficient

C(n, x) = n! / (x!(n – x)!)

  • for x = 0, 1, …, n
  • C(n, x) is called “n choose x”
  • n! indicates a factorial

For any positive whole number n, its factorial (n!) is:

n! = n × (n – 1) × (n – 2) × … × 3 × 2 × 1

Example:

2! = (2)*(1)=2

4! = (4)*(3)*(2)*(1) = 24

Note: 0!=1

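  • If doing this by hand, many factors will cancel out.

Example:

C(6, 2) = 6! / (2! 4!) = (6 × 5)/(2 × 1) = 15

Also note that the binomial coefficient C(n, k) is not the fraction n/k. It counts the number of different ways that k successes can be arranged among n observations.

C(n, k) = n! / (k!(n – k)!)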

Binomial Probability

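If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, …, n. If k is any one of those values,

P(X = k) = C(n, k) p^k (1 – p)^(n – k)

where C(n, k) is the binomial coefficient “n choose k”.

Example: with n = 5 and p = 0.6,

P(X = 4) = C(5, 4)(0.6)^4(1 – 0.6)^(5 – 4)

= 5 × 0.1296 × 0.4 = 0.2592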
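The binomial probability formula translates directly into Python using math.comb for the binomial coefficient. This is a minimal sketch. The function name binomial_pmf is an illustrative choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial count X with n observations and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

print(round(binomial_pmf(4, 5, 0.6), 4))   # 0.2592
```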

Example 5.22: A factory employs several thousands of workers, of whom 30% are Hispanic. If the 15 members of the union executive committee were chosen from the workers at random, the number of Hispanics on the committee would have the binomial distribution with n=15 and p=0.3.

  1. What is the probability that exactly 3 members of the committee are Hispanic?
  2. What is the probability that 3 or fewer members of the committee are Hispanic?

Answer:

  1. P(X = 3): we know p = 0.3 and n = 15

P(X = k) = C(n, k) p^k (1 – p)^(n – k), where C(n, k) = n!/(k!(n – k)!)

P(X = 3) = C(15, 3)(0.3)^3(1 – 0.3)^(15 – 3)

= [15!/(3! 12!)](0.3)^3(0.7)^12

= 455 × 0.027 × 0.014 = 0.17

Therefore the probability that exactly 3 members of the committee are Hispanic is 0.17, or 17%.

  2. P(X ≤ 3)

= P(X = 3) + P(X = 2) + P(X = 1) + P(X = 0)

= C(15, 3)(0.3)^3(1 – 0.3)^(15 – 3) + … + C(15, 0)(0.3)^0(1 – 0.3)^(15 – 0)

= 0.17 + 0.0916 + 0.0305 + 0.0047

= 0.2968

Therefore the probability that three or fewer members of the committee are Hispanic is 0.2968, or about 30%.
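The committee example can be verified by summing binomial probabilities in Python. This sketch uses n = 15 and p = 0.3 from the example. The function and variable names are illustrative. The cumulative value comes out near 0.2969, which differs from 0.2968 in the last digit only because the hand calculation adds rounded terms.

```python
from math import comb

n, p = 15, 0.3   # committee size and proportion of Hispanic workers

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial count."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_exactly_3 = binomial_pmf(3, n, p)
p_at_most_3 = sum(binomial_pmf(k, n, p) for k in range(4))   # k = 0, 1, 2, 3

print(round(p_exactly_3, 4), round(p_at_most_3, 4))
```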

BINOMIAL MEAN AND STANDARD DEVIATION

If a count X has the binomial distribution with n observations and probability of success p, the mean and standard deviation of X are:

μ = np

σ = √(np(1 – p))

Note: X has a binomial distribution with parameters n and p.

Example: Previous example where

n = 15, p = 0.3

So, we have

μ = np = 15(0.3) = 4.5

σ = √(np(1 – p)) = √(15(0.3)(0.7)) = √3.15 ≈ 1.77

Note: these formulas only work for the binomial distribution!

Sample Proportions

In statistical sampling we often want to estimate the proportion p of “successes” in a population. Our estimator is the sample proportion of successes:

p̂ = X/n

where X is the count of successes in the sample and n is the sample size.

  • Be careful to distinguish the proportion p̂ from the count X. The count takes whole-number values between 0 and n, but a proportion is always a number between 0 and 1.
  • In the binomial setting, the count X has a binomial distribution. The proportion p̂ does not have a binomial distribution. We can, however, do probability calculations about p̂ by restating them in terms of the count X and using binomial methods.

The mean and standard deviation of p̂ are:

µ = p

σ = √(p(1 – p)/n)

Note: we use this formula for the standard deviation only when the population is at least 20 times as large as the sample.

  • p̂ in an SRS is an unbiased estimator of the population proportion p.
  • The variability of p̂ about its mean decreases as the sample size increases. Therefore a sample proportion from a large sample will usually be quite close to the population proportion p.
  • The √n in the denominator means the sample size must be multiplied by 4 if we wish to divide the standard deviation in half.
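The mean and standard deviation of a sample proportion, and the effect of multiplying the sample size by 4, can be sketched as follows. The values p = 0.3, n = 100 and n = 400 are illustrative assumptions. They are not taken from the notes.

```python
import math

def proportion_mean_sd(p, n):
    """Mean and standard deviation of the sample proportion for an SRS of size n."""
    return p, math.sqrt(p * (1 - p) / n)

# Illustrative values (not from the notes): p = 0.3 with n = 100 and n = 400
mean_100, sd_100 = proportion_mean_sd(0.3, 100)
mean_400, sd_400 = proportion_mean_sd(0.3, 400)

print(round(sd_100, 4), round(sd_400, 4))   # 0.0458 0.0229
```

The mean stays at p for both sample sizes, while the standard deviation for n = 400 is half the standard deviation for n = 100.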

One continuous distribution that we do not discuss much in this course is the uniform distribution. The density curve of the uniform distribution on the interval from 0 to 1:

  • Has a constant height of 1 over the interval and a height of 0 elsewhere.
  • The area under the density curve is 1, the area of a square with base 1 and height 1. The probability of any event is the area under the density curve and above the event in question.

Example: Many random number generators allow users to specify the range of the random numbers to be produced. Suppose that you specify that the range is to be all numbers between 0 and 2. Call the random number generated Y. Then the density curve of the random variable Y has a constant height between 0 and 2 and a height of 0 elsewhere.

  1. What is the height of the density curve between 0 and 2?

Height = 1 / (2 – 0)

= ½

  2. What is P(Y ≤ 1)?

= (1 – 0) × ½

= ½

  3. What is P(Y ≥ 0.8)?

= (2 – 0.8) × ½

= 0.6
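The area calculations for the uniform example can be written as a small Python function. This is a sketch. The name uniform_prob and its default range of 0 to 2 are illustrative choices matching the example.

```python
def uniform_prob(lo, hi, a=0.0, b=2.0):
    """P(lo <= Y <= hi) for Y uniform on [a, b]: overlap width times height 1/(b - a)."""
    height = 1 / (b - a)
    width = max(0.0, min(hi, b) - max(lo, a))   # part of [lo, hi] inside [a, b]
    return width * height

print(uniform_prob(0, 1))              # 0.5
print(round(uniform_prob(0.8, 2), 2))  # 0.6
```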
