Chapter 2

Probability

Introduction

This chapter introduces the concept of probability. It is a central part of statistics and one that leads many students to think about dropping the course. Some of the basic concepts you will find straightforward (e.g., sample space, subjective and objective probabilities, and complements of an event), while other concepts (joint and conditional probabilities and Bayes’ theorem) may prove somewhat confusing at first glance. Perseverance on your part will get you through the chapter and on to the more application-oriented topics in the subsequent chapters.

Applicable Excel Templates used in this Chapter:

Bayes Revision.xls

Contingency Table.xls

Probability of at Least 1.xls

Permutation & Combination.xls

Applicable MegaStat commands:

MEGASTAT→ Probability → Counting Rules

Applicable MINITAB commands:

None

2-7.

Sample Space
First Toss / Second Toss > First
1 / 2, 3, 4, 5, 6
2 / 3, 4, 5, 6
3 / 4, 5, 6
4 / 5, 6
5 / 6
6 / none

There are 36 possible outcomes when tossing two dice.

There are 15 possible outcomes where the second toss is greater than the first.

P(Second Toss > First) = 15/36 = 0.417
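The count of 15 out of 36 can be cross-checked by listing the sample space programmatically. The following is a minimal Python sketch, assuming two fair six-sided dice; the variable names are illustrative and are not part of the Excel templates.

```python
from fractions import Fraction
from itertools import product

# Every ordered (first toss, second toss) pair for two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Keep the pairs in which the second toss is greater than the first.
favorable = [(first, second) for first, second in outcomes if second > first]

# Each outcome is assumed equally likely, so the probability is a ratio of counts.
probability = Fraction(len(favorable), len(outcomes))

print(len(outcomes), len(favorable))   # 36 15
print(round(float(probability), 3))    # 0.417
```

Using Fraction keeps the ratio exact (15/36 reduces to 5/12) until the final rounding step.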

2-8. Let R be “exposed to radio advertisement.” Let T be “exposed to television advertisement.”

a. Then R ∪ T represents the event that a randomly selected person will be exposed to either a radio or a television advertisement, or both.

b. Then R ∩ T represents the event that a randomly selected person will be exposed to both a radio and a television advertisement.

2-12. We are given that 5 million Blackberry users were unable to use their devices. We also know that there are 18 million users of handheld devices of this kind. If a user is chosen at random, what is the probability that their device will not work?

P(device not working) = 5M / 18M = 0.2778

2-13.  Continuing with the Blackberry problem in 2-12, 3 million of the 18 million users could not use their devices as cellphones. An additional 1 million could not use their devices as either a cellphone or a data device. Determine the probability of a randomly selected user not being able to use their device as either a cellphone or a data device.

P(nonfunctioning data device) = 5/18 = 0.2778 (from 2-12)

P(nonfunctioning cell phone) = 3/18 = 0.1667

P(nonfunctioning data device and nonfunctioning cell phone) = 1/18 = 0.0556

P(nonfunctioning data device or nonfunctioning cell phone) = 0.2778 + 0.1667 – 0.0556 = 0.3889

2-18.  Given that P(Detect 1st) = .98 and P(Detect 2nd) = .94 and P(Detect 1st and Detect 2nd) = .93, compute the probability that at least one of the two will be detected (that is, the 1st or the 2nd or both will be detected).

In general, P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

Here, P(Detect 1st ∪ Detect 2nd) =

P(Detect 1st) + P(Detect 2nd) - P(Detect 1st ∩ Detect 2nd) = .98 + .94 - .93 = 0.99.
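The addition rule used here (and in 2-13) is easy to wrap in a small helper. This is only an illustrative Python sketch; the function name prob_union is an assumption made for the example.

```python
def prob_union(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# Problem 2-18: detection by the 1st system, the 2nd system, and both.
p_detected = prob_union(0.98, 0.94, 0.93)
print(round(p_detected, 2))  # 0.99
```

Subtracting the intersection avoids double-counting the cases in which both detections occur.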

2-20. We are given age and sex data for 20 managers.

34F, 49M, 27M, 63F, 33F, 29F, 45M, 46M, 30F, 39M, 42M, 30F, 48M, 35F, 32F, 37F, 48F,

50M, 48F, 61F

A manager will be chosen at random.

a.  Compute the probability the manager will be either a woman or over 50 years old, or both:

P(F ∪ 50) = P(F) + P(50) - P(F ∩ 50)

= 12/20 + 2/20 - 2/20 = 12/20 = 0.60.
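The three counts behind this answer (women, managers over 50, and women over 50) can be tallied directly from the listed data. The sketch below is one hedged way to do so in Python; the parsing of each record into an age and a sex code is an assumption about the "34F" notation.

```python
from fractions import Fraction

# The 20 records from the problem, written as age followed by sex (F or M).
records = ("34F 49M 27M 63F 33F 29F 45M 46M 30F 39M "
           "42M 30F 48M 35F 32F 37F 48F 50M 48F 61F").split()
managers = [(int(r[:-1]), r[-1]) for r in records]

n = len(managers)
women = sum(1 for age, sex in managers if sex == "F")
over_50 = sum(1 for age, sex in managers if age > 50)
women_over_50 = sum(1 for age, sex in managers if sex == "F" and age > 50)

# Addition rule: P(F or over 50) = P(F) + P(over 50) - P(F and over 50).
p_union = Fraction(women, n) + Fraction(over_50, n) - Fraction(women_over_50, n)

print(n, women, over_50, women_over_50)  # 20 12 2 2
print(float(p_union))                    # 0.6
```

Note that the 50-year-old manager is not counted as "over 50", and both managers over 50 are women, so the union reduces to the probability of selecting a woman.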

2-24.  Continuing with problem 2-12, determine the probability that a randomly chosen user could use their Blackberry device.

From 2-12, we know the probability of a nonfunctioning device is 0.2778. To calculate the probability of a functioning device, we need to use the complement rule:

P(functioning data device) = 1 – P(nonfunctioning data device) = 1 – 0.2778 = 0.7222

2-27. If a large competitor buys a small firm, the firm’s stock will increase with probability 0.85. In other words: P(stock will Rise | Bought by large firm) = 0.85.

The purchase of the company has a probability of 0.40 of taking place, or P(being Bought by a large firm) = 0.40.

The probability that the purchase will take place and the firm’s stock will rise is determined by the intersection of the two events, P(R ∩ B), found by:

P(R ∩ B) = P(R | B) P(B) = (.85)(.40) = 0.34

2-28.  If interest rates decrease, then the probability the market will go up is 0.80. Using the symbol “|” for “given,” this may be written P(market goes up | interest rates decrease) = 0.80. That is, the condition is the interest rates: the 0.80 applies only when they decrease, so the given part is interest rates decreasing. Also, we assume that the probability is 0.40 that interest rates will decrease, which may be written P(interest rates decrease) = 0.40. Then, we wish to compute the probability that the market will go up and the interest rates go down, so we seek P(market goes up ∩ interest rates go down). The conditional law is

P(A|B) = P(A ∩ B) / P(B),

so P(Market goes up | interest rates decrease) = P(Market goes up ∩ interest rates decrease) / P(interest rates decrease),

or P(M.up | Int.down) = 0.80 = P(M.up ∩ Int.down) / 0.40,

and solving for the intersection, P(M.up ∩ Int.down) = 0.80(0.40) = .32.
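Problems 2-27 and 2-28 both rearrange the conditional law into the multiplication rule, P(A ∩ B) = P(A|B)P(B). A brief Python sketch of that step follows; the function name prob_intersection is hypothetical and chosen only for illustration.

```python
def prob_intersection(p_a_given_b, p_b):
    """Multiplication rule: P(A and B) = P(A | B) * P(B)."""
    return p_a_given_b * p_b

# Problem 2-27: stock rises given a purchase (0.85), purchase occurs (0.40).
p_rise_and_bought = prob_intersection(0.85, 0.40)

# Problem 2-28: market up given rates decrease (0.80), rates decrease (0.40).
p_up_and_rates_down = prob_intersection(0.80, 0.40)

print(round(p_rise_and_bought, 2), round(p_up_and_rates_down, 2))  # 0.34 0.32
```

The results are rounded when printed because floating-point multiplication can leave a tiny trailing error.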

2-33. Given the following table of counts:

Dividends / Price Increase / No Price Increase / Total
Paid / 34 / 78 / 112
Not Paid / 85 / 49 / 134
Total / 119 / 127 / 246

a. Compute the probability a randomly selected stock increased in price:

P(price increase) = 119/246 = .484.

b. Compute P(paid dividends) = 112/246 = .455.

c. Compute P(price increase ∩ paid dividends)

= 34/246 = .138.

d. Compute P(not paid ∩ no price increase)

= 49/246 = .199.

e. Given a price increase, compute the probability it also paid dividends:

P(paid dividends|price increase) = P(paid dividends ∩ price increase) / P(price increase)

= (34/246) / (119/246) = .138 / .484 = .285.

Another way to view this is from a reduced-space perspective.

Compute P(paid dividends|price increase) = 34/119 = .286.

In the previous version we considered the 34 out of 246 compared to 119 out of 246, but in the reduced space version, we recognize that “given price increase” restricts us to the 119 which had a price increase, and then out of this 119, what proportion also paid dividends?

It was 34 out of 119, or a proportion of .286. The viewpoints are equivalent; the answers differ only because the first version was computed from rounded intermediate probabilities.

f.  P(increased in price|paid no dividends)

= P(price increase ∩ not paid) / P(not paid)

= (85/246) / (134/246) = .3455 / .5447 = .6343.

g. Compute P(price increase ∪ paid dividends) (i.e., either price increase or paid dividends or both) =

P(price increase) + P(paid dividends) - P(price increase and paid dividends)

= 119/246 + 112/246 - 34/246 = 197/246 = .801.
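All of the parts of 2-33 come from the same four cell counts, so they can be reproduced together. The following Python sketch is an illustration only; the dictionary layout and the helper names marginal and joint are assumptions, not the layout of the Contingency Table.xls template.

```python
from fractions import Fraction

# Cell counts keyed by (dividend status, price movement).
counts = {
    ("paid", "increase"): 34,
    ("paid", "no increase"): 78,
    ("not paid", "increase"): 85,
    ("not paid", "no increase"): 49,
}
total = sum(counts.values())  # 246 stocks in all

def marginal(position, label):
    """Marginal probability; position 0 is dividend status, 1 is price movement."""
    matching = sum(n for key, n in counts.items() if key[position] == label)
    return Fraction(matching, total)

def joint(dividend, price):
    """Joint probability of one dividend status together with one price movement."""
    return Fraction(counts[(dividend, price)], total)

p_increase = marginal(1, "increase")                                   # part a
p_paid = marginal(0, "paid")                                           # part b
p_paid_and_increase = joint("paid", "increase")                        # part c
p_not_paid_and_no_increase = joint("not paid", "no increase")          # part d
p_paid_given_increase = p_paid_and_increase / p_increase               # part e
p_increase_given_not_paid = joint("not paid", "increase") / marginal(0, "not paid")  # part f
p_increase_or_paid = p_increase + p_paid - p_paid_and_increase         # part g

for value in (p_increase, p_paid, p_paid_and_increase, p_not_paid_and_no_increase,
              p_paid_given_increase, p_increase_given_not_paid, p_increase_or_paid):
    print(round(float(value), 4))
```

Because the fractions are kept exact, part e comes out as 34/119 (about .286), which matches the reduced-space calculation rather than the figure obtained from rounded intermediate values.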

2-36. According to a report, 65% of Americans are overweight or obese. The problem asks you to determine the probability that in a group of five randomly selected Americans at least one is overweight or obese. How do we start? Let’s determine what we do know. First, a random sample of five people is selected, which implies independence. Second, it is given that 65% of all Americans are overweight or obese. If this is so, then we also know that 35% are not overweight or obese.

So how do we proceed? We could set up the entire sample space of all the possible outcomes for five people being overweight or obese, from none being overweight up to all five being overweight, calculate the probability of each outcome, and add up the relevant probabilities. This is the more time-consuming way to do it. Instead, let’s first determine the probability that no one in the group of five is overweight or obese, which is one of the outcomes in the sample space. The probability that none of the five selected people is overweight is:

P(none overweight) = (0.35)(0.35)(0.35)(0.35)(0.35) = (0.35)^5 = 0.00525 (approximately).

Since the probabilities of all the possible outcomes in our sample space must add up to 1.00, we can use the complement rule to determine the sum of the probabilities of the remaining outcomes. The remaining outcomes include one person being overweight, two being overweight, three being overweight, and so on. Since the question asks us to determine the probability that at least one is overweight, we simply subtract the probability of none being overweight from 1.00:

P(at least one is overweight) = 1.00 – 0.00525 = 0.9947 (approximately).

Using the template (Probability of at least 1.xls), enter the probability of success (overweight) for each of the five people in column C. The result is shown in cell H4.

Probability of at least one success from many independent trials.
Success Probs
1 / 0.65 / Prob. of at least one success / 0.9947
2 / 0.65
3 / 0.65
4 / 0.65
5 / 0.65
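The solution to 2-36 mentions two routes: summing over the full sample space or using the complement rule. The Python sketch below, offered only as an illustration under the same independence assumption, computes both and confirms they agree; the variable names are not taken from the template.

```python
from itertools import product
from math import isclose, prod

p_overweight = 0.65
group_size = 5

# Complement rule: 1 minus the probability that nobody in the group is overweight.
p_none = (1 - p_overweight) ** group_size
p_at_least_one = 1 - p_none

# Long way: add the probability of every outcome with at least one overweight person.
# Each outcome is a tuple of True/False flags, one per person.
p_enumerated = sum(
    prod(p_overweight if flag else 1 - p_overweight for flag in outcome)
    for outcome in product([True, False], repeat=group_size)
    if any(outcome)
)

print(round(p_none, 5))           # 0.00525
print(round(p_at_least_one, 4))   # 0.9947
print(isclose(p_at_least_one, p_enumerated))
```

The enumeration visits 31 of the 32 possible outcomes, which shows why the complement rule is the quicker route.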

2-38. We want to be sure that a package is delivered within one day so we decide to send the same package by three different delivery services. The three delivery firms have different success rates for on-time delivery: Firm A has a 90% success rate [P(A) = 0.90], Firm B has an 88% success rate [P(B) = 0.88], and Firm C has a 91% success rate [P(C) = 0.91]. We want to determine the probability that at least one of the packages arrives on time.

First, we assume independence in the events; i.e., the delivery by one service has no impact on the delivery of the other two services.

Second, let’s determine the probability that none of the three packages is delivered on time. To do this we need to calculate the failure rate for each firm. The failure rate for each firm is found by subtracting its success rate, expressed in decimal format, from 1.00, or:

Firm A failure rate: 1.00 - 0.90 = 0.10

Firm B failure rate: 1.00 - 0.88 = 0.12

Firm C failure rate: 1.00 - 0.91 = 0.09


The probability that none of the packages will be delivered on time is found by the multiplication of the three failure rates, since we assumed the events were independent of each other.

P(none are delivered on time) = (0.10)(0.12)(0.09) = 0.00108

Finally, the probability that at least one of the packages is delivered on time is the same as the probability that one or two or all three packages are delivered on time. The easiest way to find it is to use the “complement rule.” To determine the probability that at least one of the packages is delivered on time, we subtract the probability that none of the packages is delivered on time from 1.00.

P(at least one arrives on time) = 1 - P(all three fail to arrive)

= 1.00 - (1 - .90)(1 - .88)(1 - .91) = 1.00 - 0.00108 = 0.99892

Using the template (Probability of at least 1.xls), enter the probability of success for each package delivery company in column C. The result is in cell H4.

Probability of at least one success from many independent trials.
Success Probs
1 / 0.9 / Prob. of at least one success / 0.9989
2 / 0.88
3 / 0.91

There is a 99.89% chance that at least one of the packages will be delivered on time.

2-42.  We are given the probabilities that each of three credit derivatives will make a profit, and we want to know the probability that at least one will make a profit.

Using template: Probability of at least 1.xls

Enter the three probabilities in cells: C4:C6

Probability of at least one success from many independent trials.
Success Probs
1 / 0.9 / Prob. of at least one success / 0.9900
2 / 0.75
3 / 0.6

The probability that at least one of the three investments makes a profit is 0.9900.
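The calculation the template performs in 2-38 and 2-42 can be approximated with a short function. This is a hedged Python sketch that assumes independent trials, as the solutions do; prob_at_least_one is a hypothetical name and not the template's own formula.

```python
from math import prod

def prob_at_least_one(success_probs):
    """1 minus the product of the failure probabilities of independent trials."""
    return 1 - prod(1 - p for p in success_probs)

# Problem 2-38: on-time delivery rates for Firms A, B, and C.
p_delivery = prob_at_least_one([0.90, 0.88, 0.91])

# Problem 2-42: profit probabilities for the three credit derivatives.
p_profit = prob_at_least_one([0.90, 0.75, 0.60])

print(round(p_delivery, 5), round(p_profit, 4))  # 0.99892 0.99
```

The function accepts any number of trials with differing success probabilities, which mirrors entering one value per row in column C.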

2-44. This problem pertains to the data of problem 2-31, which provided the following table of the number of claims at an insurance company.

Type of Claim / East / South / Midwest / West / Totals
Hospitalization / 75 / 128 / 29 / 52 / 284
Physician’s visit / 233 / 514 / 104 / 251 / 1,102
Outpatient treatment / 100 / 326 / 65 / 99 / 590
Totals / 408 / 968 / 198 / 402 / 1,976

The question is whether the event “hospitalization” is independent of the event “Midwest.” We know that in general (whether the events are independent or not) P(A∩B) = P(A|B)P(B), but if A and B are independent, P(A|B) = P(A), since knowing that B occurred makes no difference to the probability of A. Therefore, if P(A|B) = P(A), then events A and B are independent, and consequently, P(A∩B) = P(A)P(B).

a.  One test for independence: The events are independent if

P(hospitalization|Midwest) = P(hospitalization).

1.  P(hospitalization|Midwest) = P(hospitalization∩Midwest) / P(Midwest)

= (29/1976) / (198/1976)

P(hospitalization|Midwest) = 0.1465.

(This can also be computed more directly by 29/198 = 0.1465.)

2. Also, P(hospitalization) = 284/1976 = 0.1437

3.  Then: is P(hospitalization|Midwest) = P(hospitalization)? Here 0.1465 ≠ 0.1437; therefore the two events are not statistically independent. Where the hospitalization occurs does make a difference.

b. Another test for independence: The two events are independent if P(hospitalization∩Midwest) = P(hospitalization)P(Midwest).

1. P(hospitalization∩Midwest) = 29/1976 = 0.0147.

2. P(hospitalization)P(Midwest) = (284/1976)(198/1976) = 0.0144.
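Both independence checks in 2-44 can be run from the table of claims. The Python sketch below is illustrative only; the nested-dictionary layout is an assumption, and the comparison is an exact equality test on the observed proportions, as in the solution, not a statistical significance test.

```python
from fractions import Fraction

# Number of claims by type (rows) and region (columns).
claims = {
    "Hospitalization":      {"East": 75,  "South": 128, "Midwest": 29,  "West": 52},
    "Physician's visit":    {"East": 233, "South": 514, "Midwest": 104, "West": 251},
    "Outpatient treatment": {"East": 100, "South": 326, "Midwest": 65,  "West": 99},
}
total = sum(sum(row.values()) for row in claims.values())

p_hosp = Fraction(sum(claims["Hospitalization"].values()), total)
p_midwest = Fraction(sum(row["Midwest"] for row in claims.values()), total)
p_joint = Fraction(claims["Hospitalization"]["Midwest"], total)

# Test a: compare P(hospitalization | Midwest) with P(hospitalization).
p_hosp_given_midwest = p_joint / p_midwest
print(round(float(p_hosp_given_midwest), 4), round(float(p_hosp), 4))  # 0.1465 0.1437

# Test b: compare P(hospitalization and Midwest) with P(hospitalization) * P(Midwest).
print(round(float(p_joint), 4), round(float(p_hosp * p_midwest), 4))   # 0.0147 0.0144

independent = p_joint == p_hosp * p_midwest
print(independent)  # False
```

The two tests are algebraically the same condition, so they always give the same verdict.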