Chapter 2

Probability

2.1 Introduction

When we study Statistics, we must learn some basic concepts of Probability. Almost all of Inferential Statistics is grounded in the concepts of probability distributions. And of course, it is hard to understand the concepts of probability distributions without understanding the concepts of probability first. We will therefore learn some basic concepts of probability in this chapter so that in the next chapter we can understand the concepts of probability distributions, so that in future chapters we can understand inferential statistics.

The study of Probability is basically the study of chance phenomena. The world that we live in is full of uncertainty. The world is always unfolding and one or more outcomes out of many possible outcomes occur at various times. For example, as a day unfolds, it may or may not rain. It may become warmer or cooler. It may become freezing cold or extremely hot. One or more of many possible outcomes will occur. As another example, suppose the average time for you to commute to work is 20 minutes. On a given day it may take you only 15 minutes or it may take you 28 minutes. Each time you set out to drive to work, you don’t know how long it will take, because it depends on traffic and traffic lights and many other factors that depend upon chance. When the stock market opens, it may go up or go down. If it goes up, it could go up by 1% or 2% or some other percent. There is always an element of chance that determines which outcome or outcomes occur. An understanding of the concepts of probability helps us understand our world better and hence allows us to make more informed decisions.

2.2 Experiments, Events and Outcomes

When studying probability, it is important to understand the difference between an experiment, an event and an outcome. An outcome is just the observed value of some variable in an experiment. An experiment results in one or more outcomes. A collection of one or more outcomes is called an event. We usually talk about the probability of an event. We may also talk about the probability of an outcome. If an event consists of only one outcome, then there is no difference between the probability of the event and the probability of the outcome. The set of all possible outcomes is also called the collectively exhaustive set of outcomes of an experiment. Examples of experiments, events and outcomes:

Experiment / Set of all possible outcomes in the experiment (Collectively Exhaustive outcomes) / Event (Defined as) / Outcomes in the Event
Flip a coin once / Head, Tail (2 outcomes) / Head appears / Head (1 outcome)
Flip a coin twice / HH, TT, HT, TH (4 outcomes) / Exactly one Head appears / HT, TH (2 outcomes)
Flip a coin twice / HH, TT, HT, TH (4 outcomes) / Two Heads appear / HH (1 outcome)
Roll a six-sided die / 1,2,3,4,5,6 (6 outcomes) / Get a number below 3 / 1,2 (2 outcomes)
Roll a six-sided die / 1,2,3,4,5,6 (6 outcomes) / Get an odd number / 1,3,5 (3 outcomes)
Draw a card from a deck of 52 playing cards / 52 possible outcomes (no need to list them all here) / A Queen is drawn / Queen of Hearts or Diamonds or Clubs or Spades (4 outcomes)
Watch the gender of the next person entering through the door / Male, Female (2 outcomes) / A male enters / Male (1 outcome)
Watch the weather for the next four hours / No rain, no snow, light drizzle, light rain, heavy rain, light snow, heavy snow, icy rain, No rain and sunny, no rain and cloudy, etc. / Precipitation / Light drizzle, light rain, heavy rain, light snow, heavy snow, icy rain

So if my experiment is to flip a coin twice, there are four possible outcomes in this experiment (HH, TT, HT, TH). We can define many different events. If we define our event as two heads appear, there is one possible outcome in that event (HH). If we define our event as getting only one head, then there are two possible outcomes in that event (TH, HT). If we define our event as getting at least one head, then there are three possible outcomes in that event (TH, HT, HH).
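The listing of outcomes described above can also be done mechanically. The following is a minimal sketch in Python, assuming outcomes are written as strings such as "HT"; the variable names are chosen here for illustration only.

```python
from itertools import product

# Sample space for two flips of a coin: every ordered pair of H and T.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

# Each event is the subset of outcomes that satisfies the event's definition.
two_heads = [o for o in sample_space if o.count("H") == 2]
exactly_one_head = [o for o in sample_space if o.count("H") == 1]
at_least_one_head = [o for o in sample_space if o.count("H") >= 1]

print(sample_space)       # the 4 outcomes of the experiment
print(two_heads)          # the 1 outcome in the event "two heads appear"
print(exactly_one_head)   # the 2 outcomes in the event "only one head"
print(at_least_one_head)  # the 3 outcomes in the event "at least one head"
```

The same experiment supports many events; only the filtering condition changes.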

2.3 Calculating Probability of an Event

There are three main approaches for calculating probabilities: (1) the classical approach, (2) the relative frequency approach and (3) the subjective approach. The appropriate approach to use depends on the nature of the event. Some types of events lend themselves to the classical approach, others to the relative frequency approach and some to the subjective approach. For example, the probability of obtaining 2 heads in 2 flips of a coin can be obtained using the classical approach. The probability that 3 or more students in a particular class of 30 students will receive a grade of A can be calculated using the relative frequency approach. The probability that the stock market will gain at least 10% in value during the next 12 months would require a subjective approach.

The classical approach

The classical approach is also called the theoretical approach. There is a straightforward formula:

Probability(The Event) = (# of possible outcomes in the event) / (# of all possible outcomes in the experiment)

This formula assumes that each outcome is equally likely. It can be used only in situations where this equal likelihood assumption holds; otherwise it should not be used.

An Example of the classical approach: Let our experiment be: Roll a six-sided die once. Let the event of interest be: obtaining an even number. There are three possible outcomes in this event (2, 4 and 6). There are six possible outcomes in this experiment (1, 2, 3, 4, 5 and 6). All these outcomes are equally likely, so we can apply the classical approach formula.

The probability of the event of getting an even number = 3/6 or 0.5 because the number of all possible outcomes in the event is 3 and the number of all possible outcomes in the experiment is 6.

Another Example of the classical approach: Let the experiment be: Flip a coin two times. Let the event of interest be: getting two heads. There is only one possible outcome in this event (HH). There are four possible outcomes in this experiment (HH, TT, HT and TH), each with equal likelihood. The probability of the event of obtaining two heads is therefore ¼.

Another Example of the classical approach: Let the experiment be: Toss three coins once (or one coin thrice). There are 8 possible outcomes (HHH, HHT, HTH, HTT, THH, THT, TTH, TTT), each with equal likelihood. Let event A be defined as obtaining three heads. P(A) = 1/8 or 0.125. Let event B be defined as obtaining exactly two heads (HHT, HTH, THH). P(B) = 3/8 or 0.375. Let event C be defined as getting more than one head (HHT, HTH, THH, HHH). P(C) = 4/8 or 0.5. Please make sure you understand how these probabilities are obtained.
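The three-coin probabilities can be checked by counting outcomes directly. The sketch below assumes equally likely outcomes, as the classical approach requires; the helper name classical_probability is hypothetical and is not part of any library.

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, sample_space):
    """Outcomes in the event divided by outcomes in the experiment.

    Valid only when every outcome in sample_space is equally likely.
    """
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

# All 8 outcomes of tossing three coins, written as strings such as "HTH".
three_coins = ["".join(flips) for flips in product("HT", repeat=3)]

p_a = classical_probability(lambda o: o.count("H") == 3, three_coins)  # three heads
p_b = classical_probability(lambda o: o.count("H") == 2, three_coins)  # exactly two heads
p_c = classical_probability(lambda o: o.count("H") > 1, three_coins)   # more than one head

print(p_a, p_b, p_c)  # 1/8 3/8 1/2
```

Exact fractions are used instead of decimals so that the results can be compared with the hand calculation without rounding.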

The Relative Frequency Approach:

In the relative frequency approach, we use historical data, if available, to compute probabilities. Suppose we want to know the probability that a student will obtain an A grade in a particular class. Why can we not use the classical approach in this example? We cannot use the classical approach because not every grade is equally likely. If there are five grades (A, B, C, D and F), we cannot say that the probability of getting an A grade is 1/5. If it were, then we could use the classical approach. But getting a grade is not a random phenomenon. It can be controlled by how much you study and how well you understand the concepts in the course. So we have to resort to the relative frequency approach. We will assume that we have some historical data. We will look at the historical percentage of students who received an A grade from this professor in similar courses in the past. Let’s say it is 10%. So the probability of a student selected at random getting an A grade is 10%.

Example of relative frequency approach: Based on several years of historical data, suppose the relative frequency table for the grades received by students in a particular course is as follows:

Table 2.1

Grade / Relative Frequency
A / 0.2
B / 0.3
C / 0.2
D / 0.1
F / 0.1

Based on this table, we can say that the probability that a randomly picked student will get either an A or a B grade is 0.5.

Another example of relative frequency approach: Based on several years of historical data, suppose the relative frequency table for the age of customers entering a particular store is as follows:

Table 2.2

Age / Relative Frequency
0 - 9 / 0
10-19 / 0.1
20-29 / 0.2
30-39 / 0.5
40-49 / 0.1
50+ / 0.1

The probability that the next person walking into this store is in their 30s is 0.5. The probability that the next person walking into this store is 50+ is 0.1. The probability that the next person walking into the store is below 30 is 0 + 0.1 + 0.2 = 0.3. In the relative frequency approach, we basically use the relative frequency of an outcome as a proxy for the probability of that outcome.
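The lookups on Table 2.2 amount to summing the relative frequencies of the age groups that make up the event. A minimal sketch, assuming the table is stored as a Python dictionary; the function name probability_of is made up for this illustration.

```python
# Relative frequencies from Table 2.2 (age group -> share of customers).
age_rel_freq = {
    "0-9": 0.0,
    "10-19": 0.1,
    "20-29": 0.2,
    "30-39": 0.5,
    "40-49": 0.1,
    "50+": 0.1,
}

def probability_of(groups, table):
    """Estimate P(event) by summing the relative frequencies of its groups."""
    return sum(table[g] for g in groups)

p_thirties = probability_of(["30-39"], age_rel_freq)
p_fifty_plus = probability_of(["50+"], age_rel_freq)
p_below_30 = probability_of(["0-9", "10-19", "20-29"], age_rel_freq)

# Rounding hides tiny floating-point errors from adding decimals.
print(round(p_thirties, 2), round(p_fifty_plus, 2), round(p_below_30, 2))  # 0.5 0.1 0.3
```

A useful sanity check on any relative frequency table is that its entries add up to 1, as they do here.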

The Subjective Approach

There are many events for which neither approach can be applied: the classical approach fails because the outcomes are not equally likely, and the relative frequency approach fails because there is no historical data. For example, what is the probability that the stock market will give a return of 10 percent or more this year? It is hard to have any historical data for the exact economic environment prevalent this year, because the exact set of all economic factors hardly ever repeats. In such cases, we rely on the subjective judgments of experts.

Examples of subjective approach

What is the probability that a particular team will win tomorrow’s game?

What is the probability that your firm will earn a profit of more than 1 million dollars this year?

2.4 Some Basic Laws of Probability

Probability is either expressed as a fraction between 0 and 1 (including 0 and 1) or as a percentage between 0 and 100 (including 0 and 100). A probability of 1 (or 100%) implies a certain event. A probability of 0 (or 0%) implies that the event will certainly not occur.

Probability of Complements, Unions and Intersections of Events

The complement of an event is the opposite of the event. If the event A is defined as obtaining a head in one toss of a coin, then the complement of event A is not obtaining a head in that toss. The complement of an event A is represented as A^C. It is easy to see that P(A) + P(A^C) = 1. Therefore, P(A^C) = 1 – P(A).

A Union of two events implies that at least one of the two events occurs in an experiment (either one, or both). The union of two events A and B is denoted as A U B. For example, suppose the experiment is to roll a die once. Suppose event A is defined as obtaining an even number and suppose event B is defined as obtaining a number less than or equal to 3. The possible outcomes in event A are 2, 4 and 6 and the possible outcomes in event B are 1, 2 and 3. The possible outcomes in A U B are 1, 2, 3, 4 and 6. P(A U B) in our example is 5/6.

An Intersection of two events implies that both events occur. The intersection of two events A and B is denoted as A ∩ B. In the above example, where event A’s possible outcomes are 2, 4 and 6 and event B’s possible outcomes are 1, 2 and 3, there is only one possible outcome in A ∩ B, namely 2. The probability of A ∩ B, therefore, is 1/6.
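Complements, unions and intersections of events map directly onto set operations. The following sketch reproduces the die example with Python sets, assuming a fair die so that the classical formula applies; prob is a helper defined here for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair six-sided die
A = {2, 4, 6}                      # event A: an even number
B = {1, 2, 3}                      # event B: a number less than or equal to 3

def prob(event):
    """Classical probability: outcomes in the event / outcomes in the experiment."""
    return Fraction(len(event), len(sample_space))

A_complement = sample_space - A    # outcomes that are not in A

print(prob(A) + prob(A_complement))  # 1
print(sorted(A | B), prob(A | B))    # [1, 2, 3, 4, 6] 5/6
print(sorted(A & B), prob(A & B))    # [2] 1/6
```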

Need for Counting Formulas for the Classical Approach

In the classical approach, we need to calculate the total number of possible outcomes in an experiment and the total number of outcomes in an event. In the examples above, we used small numbers such as one or two flips of a coin or one roll of a die. It was easy to list all possible outcomes for the experiment and also for the event of interest. But what if our experiment involves, say, flipping a coin 20 times and our event is defined as obtaining 8 heads? Now it becomes impractical to list all possible outcomes, as there are over a million possible outcomes in the experiment (2^20 = 1,048,576) and too many outcomes in the event to count by hand. So we need some formulas for counting.

Factorials and Combinations

What is the factorial of a number n? The factorial of n is n * (n-1) * (n-2) * … * 1. It is denoted by an exclamation mark (!) written after the number, so n! = n*(n-1)*(n-2)*…*1. For example, 3! = 3*2*1 = 6. As another example, 5! = 5*4*3*2*1 = 120. By convention, 0! = 1. Calculating factorials manually beyond 5 or 6 can become tedious. But Excel provides a very easy way to compute a factorial using the =FACT() function. So in Excel, =FACT(5) will give you 120.

Factorials are used in formulas for counting. For example, the number of ways r heads show up in n flips of a coin = n!/(r!(n-r)!)

In mathematical notation, we call this nCr, i.e., the number of combinations with r heads in n flips of a coin, or more generally the number of combinations with r successes in n trials.

So nCr = n!/(r!(n-r)!). This is also read as “n Choose r” or “n Combination r”.

Example: Number of ways of obtaining 2 heads in 3 flips of a coin = 3C2 = 3!/(2!(1!)) = 6/2 = 3.

Another Example: Number of ways of obtaining 8 heads in 20 tosses of a coin = 20C8 = 20!/(8!12!) = 20*19*18*17*16*15*14*13/(8*7*6*5*4*3*2*1) = 19*17*15*14*13/7 = 125,970.

Clearly, without the formula, it would have been quite cumbersome to figure out that there are 125,970 ways of getting 8 heads in 20 tosses of a coin.

Using Excel: Excel makes it very easy to compute nCr with the function =COMBIN(n,r). For example, =COMBIN(20,8) returns 125970.
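For readers who prefer code to a spreadsheet, the same counts can be obtained in Python. This is a sketch using the standard library functions math.factorial and math.comb (the latter needs Python 3.8 or later); the final step, which turns the count into a probability with the classical formula, assumes a fair coin.

```python
import math

print(math.factorial(5))  # 120, same as =FACT(5)
print(math.comb(3, 2))    # 3, ways of getting 2 heads in 3 flips
print(math.comb(20, 8))   # 125970, same as =COMBIN(20,8)

# nCr written out with factorials gives the same count.
n, r = 20, 8
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# Classical approach for 8 heads in 20 flips of a fair coin:
# outcomes in the event / outcomes in the experiment.
outcomes_in_event = math.comb(20, 8)
outcomes_in_experiment = 2 ** 20
p_8_heads = outcomes_in_event / outcomes_in_experiment
print(round(p_8_heads, 4))
```

Integer division (//) is used in the factorial formula because the result of nCr is always a whole number.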

2.5 Computing Probabilities for Two or More Events

Sometimes we are interested in dealing with two or more events simultaneously. For example, when dealing with two events, we may want to know the probability that both events occur or one or the other occurs etc. To handle such questions we need to first understand the concepts of Mutually Exclusive and Independent events.

Mutually Exclusive Events

Two events are said to be mutually exclusive if both events cannot occur simultaneously. The simplest example of two mutually exclusive events is: Event A is defined as obtaining a Head in a coin toss and event B is defined as obtaining a Tail. Clearly, events A and B can never occur simultaneously and are hence mutually exclusive.

An example of two events that are not mutually exclusive:

Suppose the experiment is rolling a die once, event A is defined as obtaining an even number and event B is defined as obtaining a number less than 3. If the outcome of the experiment is 2, then both events A and B have occurred at the same time, and hence they are not mutually exclusive.

Probability formulas for Union of two events:

If two events, A and B are mutually exclusive then P(A U B) = P(A) + P(B)

If two events, A and B are not mutually exclusive then P(A U B) = P(A) + P(B) – P(A ∩ B)

P(A U B) can also be written as P(A or B)

Example of union of two mutually exclusive events:

Suppose I roll a die and define event A as getting a number less than 3 and event B as getting a number greater than 3. Events A and B are clearly mutually exclusive. P(A) = 2/6 = 1/3. P(B) = 3/6 = ½. P(A U B) = 1/3 + ½ = 5/6. If you think about it, there are 5 possible outcomes in A U B: 1, 2, 4, 5, 6. So P(A U B) = 5/6.

Example of union of two not mutually exclusive events:

Suppose I roll a die once and define event A as obtaining an even number and event B as obtaining a number less than 3. Then P(A) = ½, P(B) = 1/3 and P(A ∩ B) = 1/6. Using the probability formula, we can see that P(A U B) = ½ + 1/3 – 1/6 = 3/6 + 2/6 – 1/6 = 4/6 = 2/3. This agrees with a direct count: there are 4 possible outcomes in A U B (1, 2, 4, 6).
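Both union examples can be verified by comparing the formula with a direct count of the outcomes in the union. The sketch below assumes a fair die; the function names are illustrative only.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair six-sided die

def prob(event):
    """Classical probability of an event given as a set of outcomes."""
    return Fraction(len(event), len(sample_space))

def union_by_formula(a, b):
    """P(A U B) = P(A) + P(B) - P(A and B); the last term is 0 if A and B are mutually exclusive."""
    return prob(a) + prob(b) - prob(a & b)

less_than_3 = {1, 2}
greater_than_3 = {4, 5, 6}
even = {2, 4, 6}

# Mutually exclusive events: the intersection is empty.
print(union_by_formula(less_than_3, greater_than_3), prob(less_than_3 | greater_than_3))  # 5/6 5/6

# Not mutually exclusive: the shared outcome 2 must not be counted twice.
print(union_by_formula(even, less_than_3), prob(even | less_than_3))  # 2/3 2/3
```

The general formula covers both cases, which is why a single function is enough here.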

Probability formulas for Intersection of two events:

If two events A and B are mutually exclusive, then P(A ∩ B) = P(A and B) = 0

Example: Suppose I roll a die and define event A as getting a number less than 3 and event B as getting a number greater than 3. Events A and B are clearly mutually exclusive. There is no outcome common to A and B, therefore P(A ∩ B) = 0.

If A and B are not mutually exclusive and all outcomes are equally likely, then P(A ∩ B) = (# of outcomes common to both A and B) / (# of outcomes in the experiment).

Example: Suppose I roll a die once and define event A as obtaining an even number and event B as obtaining a number less than 3. Then P(A) = ½, P(B) = 1/3 and P(A ∩ B) = 1/6, because there is one outcome (2) common to events A and B.

Independent Events

Two events A and B are said to be independent if the occurrence of one has no effect on the probability of occurrence of the other. For example, suppose event A is defined as obtaining a head in a coin toss and event B is defined as getting an even number in a roll of a die. Clearly, the two events are independent of each other.

If events A and B are independent, then, P(A ∩ B) = P(A and B) = P(A).P(B)

Example: Suppose event A is defined as obtaining a head in a coin toss and event B is defined as getting an even number in a roll of a die. P(A) = 0.5 and P(B) = 0.5. The probability that both events A and B occur is P(A and B) = 0.5 * 0.5 = 0.25.
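The multiplication rule for independent events can be checked by listing the combined experiment. The following sketch assumes a fair coin and a fair die, so that all 12 (coin, die) pairs are equally likely.

```python
from fractions import Fraction
from itertools import product

# Combined experiment: one coin toss and one die roll, as (coin, die) pairs.
sample_space = list(product("HT", range(1, 7)))

def prob(event):
    """Classical probability of the outcomes for which event(outcome) is True."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

p_a = prob(lambda o: o[0] == "H")                            # head on the coin
p_b = prob(lambda o: o[1] % 2 == 0)                          # even number on the die
p_a_and_b = prob(lambda o: o[0] == "H" and o[1] % 2 == 0)    # both occur

print(p_a, p_b, p_a_and_b)     # 1/2 1/2 1/4
print(p_a_and_b == p_a * p_b)  # True
```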

Note that if events A and B are mutually exclusive (and each has a nonzero probability), they cannot be independent: if one occurs, the probability of the other drops to 0. If two events A and B are not independent, we talk about conditional probability.

Conditional Probability

If two events are not independent, then the occurrence of one event changes the probability of occurrence of the other event. Suppose A and B are two events with probabilities P(A) and P(B), and suppose it is known that event B has occurred. The probability of A given this knowledge is denoted by P(A | B) and is read as Probability of A, given B.

P(A | B) = P(A ∩ B) / P(B)

Also,

P(B | A) = P(A ∩ B) / P(A)

From the above two formulas, we can also say that P(A ∩ B) = P(A | B).P(B) = P(B | A).P(A)

If two events A and B are independent, then P(A|B)=P(A), similarly, P(B | A) = P(B).

You can see that for two independent events A and B, P(A ∩ B) = P(A).P(B)

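Example: Suppose I roll a die two times and define event A as getting a total of 7 from the two rolls added together and event B as getting a number less than 3 in the first roll. Here, P(B) = 1/3, P(A) = 1/6 and P(A ∩ B) = 2/36 or 1/18, because only the outcomes (1, 6) and (2, 5) belong to both events. Using the formula, P(A | B) = P(A ∩ B) / P(B) = (1/18) / (1/3) = 1/6. Since P(A | B) = P(A) here, knowing that B has occurred does not change the probability of A, so these two events happen to be independent.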
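The two-roll example can be reproduced by listing all 36 ordered pairs of rolls and applying the conditional probability formula. This is a sketch assuming a fair die; the names used are for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs.
rolls = list(product(range(1, 7), repeat=2))

A = {r for r in rolls if sum(r) == 7}  # total of 7 over the two rolls
B = {r for r in rolls if r[0] < 3}     # first roll is less than 3

def prob(event):
    """Classical probability of an event given as a set of roll pairs."""
    return Fraction(len(event), len(rolls))

# P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

print(sorted(A & B))                               # [(1, 6), (2, 5)]
print(prob(A), prob(B), prob(A & B), p_a_given_b)  # 1/6 1/3 1/18 1/6
```

Conditioning on B can also be read as shrinking the sample space to the 12 pairs in B, of which 2 give a total of 7, and 2/12 is again 1/6.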

A Summary of the above formulas

Independent? / Mutually Exclusive: Yes / Mutually Exclusive: No
Yes / Not applicable / P(A ∩ B) = P(A).P(B); P(A U B) = P(A) + P(B) - P(A).P(B); P(A | B) = P(A)
No / P(A ∩ B) = 0; P(A U B) = P(A) + P(B); P(A | B) = 0 / P(A ∩ B) = use classical approach; P(A U B) = P(A) + P(B) - P(A ∩ B); P(A | B) = P(A ∩ B)/P(B)

2.6 Bayes’ Rule

Bayes’ Rule is used to compute P(A | B) if P(B | A), P(B | A^C) and P(A) are known. Here A^C is the complement of A, so P(A^C) = 1 – P(A).

P(A | B) = P(B | A).P(A) / (P(B | A).P(A) + P(B | A^C).P(A^C))

Similarly,

P(B | A) = P(A | B).P(B) / (P(A | B).P(B) + P(A | B^C).P(B^C))

Example of Bayes Rule:

Suppose that 5% of the people with blood type O are left handed, 10% of those with other blood types are left handed and 40% of the people have blood type O. If you randomly select a left-handed person, what is the probability that he or she will have blood type O?

Suppose event A is that a person has blood type O and event B is that the person is left handed. The question is what is P(A | B).

We know that P(B | A) = 0.05, P(A) = 0.40, P(B | A^C) = 0.10 and P(A^C) = 0.60.

According to Bayes’ Rule, P(A | B) = 0.05*0.40 / (0.05*0.40 + 0.10*0.60) = 0.02/(0.02+0.06) = 0.02/0.08 = 2/8 = ¼.
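The blood type calculation can be written as a small function. This is a sketch of the two-event form of the rule given in this section; the function name bayes and its parameter names are chosen here for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A | B) from P(B | A), P(A) and P(B | complement of A)."""
    p_not_a = 1 - p_a                                   # P(complement of A)
    numerator = p_b_given_a * p_a                       # P(B | A).P(A)
    denominator = numerator + p_b_given_not_a * p_not_a # total probability of B
    return numerator / denominator

# A: the person has blood type O; B: the person is left handed.
p_type_o_given_left = bayes(
    p_b_given_a=Fraction("0.05"),      # left handed among blood type O
    p_a=Fraction("0.40"),              # share of people with blood type O
    p_b_given_not_a=Fraction("0.10"),  # left handed among other blood types
)
print(p_type_o_given_left)  # 1/4
```

The denominator is the overall probability of being left handed (0.08 in this example), which is why the answer differs from the 40% share of blood type O in the general population.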

Chapter Summary

  1. Probability is the study of chance phenomena.
  2. We study probability so that we can understand probability distributions, which are the basis for inferential statistics.
  3. We usually talk about probability of an event.
  4. An event is a set of some outcomes; an experiment has some possible outcomes.
  5. Probabilities can be estimated using the classical approach or the relative frequency approach or the subjective approach.
  6. The classical approach can be applied if the outcomes of an experiment are all equally likely.
  7. In the relative frequency approach, the relative frequency of an outcome in historical data is used as the probability of that outcome.
  8. We also want to estimate joint probability of two events – either union or intersection of two events. The probability calculation of two events depends on whether the events are mutually exclusive or independent.
  9. Two events are mutually exclusive if they cannot both occur simultaneously.
  10. Two events are independent if the occurrence of one event does not affect the probability of occurrence of the other event.
  11. If two events are mutually exclusive (and each has a nonzero probability), they cannot be independent.
  12. Two non-mutually exclusive events may be independent or may not be independent.
  13. If two events are independent, they must not be mutually exclusive.
  14. If two events are not independent, then we are often interested in calculating the conditional probability of one event given that the other event has occurred.
