Conditional Probability and Independence

In this chapter, we deal with problems in which some information given may influence the probabilities in question. The resulting conditional probabilities arise frequently in poker problems and in other applications. In the particular case where the observed events do not influence the probabilities of relevant future events, we say the events are independent. This situation rarely occurs in problems involving one particular hand of Texas Hold’em, because the deck is finite and is not reshuffled after each card is dealt, so the appearance of one or more cards substantially changes the distribution for future cards. However, since the deck is reshuffled between hands, what occurs on one hand may typically be considered independent of what occurs on other hands, so problems involving independence can arise when considering collections of hands.

3.1 Conditional Probability

Often one is given information that can provide some insight into the probability that an event will occur, and it can be important to take this information into account. In poker, for instance, you may be able to compute that the probability of a particular player being dealt AA on a given hand is 1/221, but if you were dealt an ace on the same hand, then the probability of your opponent having AA drops substantially, and if you were dealt AA, then the chance of your opponent having AA decreases dramatically. Alternatively, you may note that your probability of being dealt KK is 1/221, but if you have already looked at one card and have seen that it is a king, then your probability of being dealt KK improves considerably. The resulting probabilities are called conditional, since they depend on the information received when you look at your cards. We will approach the solutions to the conditional probability problems described above momentarily, but first, let us discuss the notation and terminology for dealing with conditional probabilities.

The conditional probability that event A occurs, given that event B occurs, is written P(A | B), and is defined as P(AB) ÷ P(B). Recall from Chapter 1 that P(AB) is the probability that both event A and event B occur. Thus the conditional probability P(A | B) is the probability that both events occur divided by the probability that event B occurs. In the Venn diagram analogy, P(A | B) represents the proportion of the area covered by shape B that is covered by both A and B. In other words, if you were to throw a pencil at a paper and the pencil were equally likely to hit any spot on the paper, P(A | B) represents the probability that the pencil lands in the region covered by both shapes A and B, given that it lands in shape B.

If P(B) = 0, then P(A | B) is undefined, just as division by zero is undefined in arithmetic. This makes sense, since if event B never happens, then it does not make much sense to discuss the frequency with which event A happens given that B also happens.

In some probability problems, both P(B) and P(AB) are given or are easy to derive, and thus the conditional probability P(A | B) is also easy to calculate (see Example 3.1.1, which was mentioned at the beginning of this section).

Example 3.1.1 — Suppose you have seen that your first card is the K♥. What is the conditional probability that you have KK, given this information?

Answer — Let A represent the event that you are dealt KK, and let B be the event that the first card dealt to you is the K♥. We seek P(A | B), which by definition is equal to P(AB) ÷ P(B). P(B) is clearly 1/52, since each card is equally likely to appear as the first card dealt to you. Note that in computing P(B), we are ignoring our extra information that our first card is the K♥. In general, whenever we use the notation P(B), we mean the probability of B occurring, given no special information about this hand. In other words, we may interpret P(B) as the proportion of hands in which event B will occur, and clearly this is 1/52. P(AB) is the probability that you are dealt one of the permutations [K♥, K♣], [K♥, K♦], or [K♥, K♠], with the cards appearing in the order specified. Thus P(AB) = 3/(52 × 51) = 3/2,652, or 1 in 884. The resulting conditional probability P(A | B) is therefore 1/884 ÷ 1/52 = 1/17. ■

Like many probability problems, this question can be tackled in different ways. One alternative is to consider that, given that your first card is the K♥, each of the other 51 cards is now equally likely to be dealt to you as your next card. Since exactly three of these give you KK, the conditional probability of you being dealt KK, given that your first card is the K♥, is simply 3/51 = 1/17.
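As a rough check of Example 3.1.1, the definition P(A | B) = P(AB) ÷ P(B) can be applied by brute force over all ordered two-card deals. The sketch below is only an illustration: the card encoding (rank character followed by suit character, so "Kh" is the K♥) and the helper names prob, is_kk, and first_is_kh are assumptions made here, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical encoding: rank character + suit character, e.g. "Kh" = king of hearts.
RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]

# All ordered two-card deals (first card, second card); each is equally likely.
DEALS = list(permutations(DECK, 2))


def prob(event):
    """Probability of an event, given as a predicate on an ordered deal."""
    return Fraction(sum(1 for d in DEALS if event(d)), len(DEALS))


def is_kk(deal):
    """Event A: both cards are kings."""
    return deal[0][0] == "K" and deal[1][0] == "K"


def first_is_kh(deal):
    """Event B: the first card dealt is the king of hearts."""
    return deal[0] == "Kh"


p_b = prob(first_is_kh)                               # P(B)
p_ab = prob(lambda d: is_kk(d) and first_is_kh(d))    # P(AB)
p_a_given_b = p_ab / p_b                              # P(A | B) by definition

print(p_b, p_ab, p_a_given_b)  # 1/52 1/884 1/17
```

Exact fractions are used rather than floating-point numbers so that the output can be compared directly with the values in the worked answer.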

Example 3.1.2 — What is the probability that you are dealt KK, given that your first card is a king?

Answer — The only difference between this problem and Example 3.1.1 is that now your first card can be any king, not necessarily the K♥. However, this problem can be handled just as Example 3.1.1, and in fact the answer is the same. Again, let A represent the event that you are dealt KK, and let B be the event that the first card dealt to you is a king. P(B) is 4/52 = 1/13, and now AB is the event that both cards are kings and the first card is a king, which is the same as the event that both cards are kings. In other words, in this example, if A occurs then B must occur also, so AB = A. Thus, P(AB) = P(A) = C(4,2)/C(52,2) = 6/1,326 or 1/221, and P(A | B) = 1/221 ÷ 1/13 = 1/17. ■

Note that it makes sense that this probability would be the same as in Example 3.1.1. The fact that your first card is the K♥ should not make it any more or less likely that you have KK than if your first card were any other king.

Example 3.1.3 — What is the probability that you are dealt KK, given that you are dealt a king? That is, imagine you have not seen your cards, and a friend looks at your cards instead. You ask the friend, “Do I have at least one king?” and your friend responds, “yes.” Given this information, what is the probability that you have KK?

Answer — Note how the information given to you is a bit different from Example 3.1.2. It should be obvious that the event that your first card is a king is less likely than the event that your first or second card is a king. Let A represent the event that you are dealt KK and B the event that at least one of your cards is a king. We seek P(A | B) = P(AB) ÷ P(B), and as in Example 3.1.2, AB = A, so P(AB) = P(A) = 1/221.

P(B) = P(your 1st card is a king or your 2nd card is a king)

= P(1st card is a king) + P(2nd card is a king) – P(both are kings)

= 4/52 + 4/52 – 1/221 = 33/221.

Therefore, P(A | B) = 1/221 ÷ 33/221 = 1/33. ■

The difference between the answers to Examples 3.1.2 and 3.1.3 sometimes confuses even those well versed in probability. One might think that, given that you have a king, it shouldn’t matter whether it is your first card or your second card (just as it did not matter in Examples 3.1.1 and 3.1.2 whether the king was a heart), and that therefore the probability of your having KK given that either of your cards is a king should also be 1/17. It is surprising that the answer is instead 1/33.

The difference between Examples 3.1.2 and 3.1.3 can perhaps best be explained in terms of areas in Venn diagrams (Figure 3.1). Recall that one may equate the conditional probability P(A | B) with the probability of a pencil hitting a target A, given that the pencil randomly falls somewhere in shape B. In Examples 3.1.2 and 3.1.3, the size of the target (A) is the same, but the area of shape B is almost twice as large in Example 3.1.3, where P(B) = 33/221, as in Example 3.1.2, where P(B) = 1/13 = 17/221.
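The contrast between Examples 3.1.2 and 3.1.3 can also be seen by direct counting. The following sketch enumerates every ordered two-card deal and computes each conditional probability as the fraction of deals in B that are also in A. The card encoding and the helper names (conditional, kk, first_king, any_king) are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical encoding: rank character + suit character, e.g. "Kh".
DECK = [r + s for r in "23456789TJQKA" for s in "cdhs"]
DEALS = list(permutations(DECK, 2))  # ordered (first card, second card) deals


def conditional(a, b):
    """P(A | B) by counting: (# deals in both A and B) / (# deals in B)."""
    in_b = [d for d in DEALS if b(d)]
    return Fraction(sum(1 for d in in_b if a(d)), len(in_b))


def kk(d):
    """Event A: both cards are kings."""
    return d[0][0] == "K" and d[1][0] == "K"


def first_king(d):
    """Event B in Example 3.1.2: the first card is a king."""
    return d[0][0] == "K"


def any_king(d):
    """Event B in Example 3.1.3: at least one card is a king."""
    return d[0][0] == "K" or d[1][0] == "K"


p_given_first = conditional(kk, first_king)
p_given_any = conditional(kk, any_king)

# Sizes of the two conditioning events, in numbers of ordered deals.
n_first = sum(1 for d in DEALS if first_king(d))
n_any = sum(1 for d in DEALS if any_king(d))

print(p_given_first, p_given_any)  # 1/17 1/33
print(n_first, n_any)              # 204 396
```

The target (12 ordered deals containing two kings) is the same in both cases; only the conditioning event grows, from 204 deals to 396, which is what drives the probability down.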

Note that the three axioms of basic probabilities from Chapter 1 apply to conditional probabilities as well. That is, no matter what events A and B are, P(A | B) is always non-negative, P(A | B) + P(Ac | B) = 1, and P(A1 or A2 or … or An | B) = P(A1 | B) + P(A2 | B) + … + P(An | B) for any mutually exclusive events A1, A2, …, An. As a result, the rules for basic probabilities given in Chapters 1 and 2 must also apply to conditional probabilities. Thus, for instance, for any events A1, A2, and B, P(A1 or A2 | B) = P(A1 | B) + P(A2 | B) – P(A1A2 | B). Similarly, conditional on B, if events A1, …, An are equally likely and exactly one of them must occur, then each has conditional probability 1/n, and if one wishes to determine the conditional probability of some event A that contains exactly k of these elements, then the conditional probability of A given B is simply k/n. Thus, as in Chapter 2, sometimes conditional probabilities can be derived simply by counting.

In the next two examples, we return to the scenarios described earlier in this section.

Example 3.1.4 — Suppose you have A♥ 7♦. Given only this information about your two cards, what is the probability that the player on your left has pocket aces?

Answer — Let A represent the event that the player on your left has pocket aces and let B be the event that you have A♥ 7♦. Given B, each combination of 2 of the remaining 50 cards is equally likely to be dealt to the player on your left. There are C(50,2) = 1,225 distinct combinations of these 50 cards, so each has conditional probability 1/1225. The event A contains exactly C(3,2) = 3 of these events (A♣ A♦, A♣ A♠, or A♦ A♠), so P(A | B) = 3/1,225 or approximately 1 in 408.33. ■

Example 3.1.5 — In an incredible hand from the final table of the $3,000 no-limit Texas Hold’em WSOP event in 2007, with only eight players left, after Brett Richey raised, Beth Shak went all in with A♥ A♦, and the player on her left, Phil Hellmuth, quickly re-raised all-in. Hellmuth had the other two aces. What is the chance of this happening? Specifically, suppose you have A♥ A♦. Given only this information about your two cards, what is the probability that the player on your left has pocket aces?

Answer — Again, let A represent the event that the player on your left has pocket aces. Let B = the event that you have A♥ A♦. Given B, each of the C(50,2) = 1,225 equally likely combinations of 2 of the remaining 50 cards has conditional probability 1/1,225. The event A now contains just C(2,2) = 1 of these events (namely A♣ A♠), so P(A | B) = 1/1,225. ■

Incidentally, it turned out that Richey called with K♣ K♠, and the board came 10♠ 3♦ 7♠ 8♣ 4♣, eliminating Richey in eighth place, while Hellmuth and Shak split the pot.
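The counting argument used in Examples 3.1.4 and 3.1.5 (remove your own two cards, then count the equally likely two-card combinations left for your neighbor) can be sketched in a few lines. The function name p_opponent_aa and the card encoding are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical encoding: rank character + suit character, e.g. "Ah".
DECK = [r + s for r in "23456789TJQKA" for s in "cdhs"]


def p_opponent_aa(my_hand):
    """P(one specific opponent holds pocket aces | my two hole cards)."""
    remaining = [c for c in DECK if c not in my_hand]  # 50 unseen cards
    hands = list(combinations(remaining, 2))           # C(50,2) = 1,225 hands
    aa = sum(1 for h in hands if h[0][0] == "A" and h[1][0] == "A")
    return Fraction(aa, len(hands))


print(p_opponent_aa(["Ah", "7d"]))  # 3/1225
print(p_opponent_aa(["Ah", "Ad"]))  # 1/1225
```

Here unordered combinations are enough, because the order in which the neighbor receives the two cards does not affect whether the hand is AA.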

Example 3.1.6 — On day 4 of the 2015 WSOP Main Event, after Mike Cloud raised to 15,000 with A♣ A♠, Phil Hellmuth Jr. called with A♥ K♠, and Daniel Negreanu called from the big blind with 6♦ 4♥. The flop came K♣ 8♥ K♥, giving Hellmuth the lead, and when the hand was over he ended up winning a pot of 161,000 chips. Before the flop, given only the cards Cloud, Hellmuth, and Negreanu had, what was the probability of Hellmuth flopping a better hand than Cloud?

Answer — Note first that if the flop came KKA, then Cloud would still be in the lead with a higher full house. The only ways for Hellmuth to have the lead on the flop are for the flop to contain three kings, QJ10, or two kings and x, where x is any card other than an ace or king. Given the 6 cards the players had, each of the C(46,3) = 15,180 possible flop combinations is equally likely. One such combination contains three kings, and 4 × 4 × 4 = 64 correspond to QJ10. Of the 46 unseen cards, 3 are kings and 1 is an ace, leaving 42 choices for x, so C(3,2) × 42 = 126 combinations correspond to KKx. Thus, given the hole cards of the 3 players, the probability of Hellmuth taking the lead on the flop is (1 + 64 + 126) ÷ 15,180 = 191/15,180 ≈ 1.26%. ■
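The flop count in the answer above can be reproduced by enumerating all three-card flops from the 46 unseen cards. Note that this sketch does not evaluate poker hands; it simply encodes the three cases named in the answer (three kings, QJ10, or two kings with a non-ace), so it confirms the arithmetic rather than the case analysis itself. The card encoding and the function name hellmuth_leads are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical encoding: rank character + suit character ("T" stands for 10).
DECK = [r + s for r in "23456789TJQKA" for s in "cdhs"]

# Hole cards: Cloud Ac As, Hellmuth Ah Ks, Negreanu 6d 4h.
KNOWN = {"Ac", "As", "Ah", "Ks", "6d", "4h"}
UNSEEN = [c for c in DECK if c not in KNOWN]  # 46 cards


def hellmuth_leads(flop):
    """True if the flop falls in one of the three cases listed in the answer."""
    ranks = sorted(c[0] for c in flop)
    kings = ranks.count("K")
    if kings == 3:                      # KKK: Hellmuth has four kings
        return True
    if ranks == sorted("QJT"):          # QJ10: Hellmuth has an ace-high straight
        return True
    if kings == 2 and "A" not in ranks: # KKx with x not an ace (and not a king)
        return True
    return False


flops = list(combinations(UNSEEN, 3))
favorable = sum(1 for f in flops if hellmuth_leads(f))
p = Fraction(favorable, len(flops))

print(favorable, len(flops))  # 191 15180
print(round(float(p), 4))     # 0.0126
```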

3.2 Independence

Independence is a key concept in probability, and most probability books discuss enormous numbers of examples and exercises involving independent events. In poker, however, where cards are dealt from a finite deck and hence any information about a particular card tends to change the probabilities for events involving other cards, it is more common for events to be dependent. Because the deck is shuffled between hands, one may think of events involving separate hands as independent, but for events in the same hand, one typically would come to the wrong conclusion by computing probabilities as if the events were independent. It is similarly important in many scientific disciplines to be wary of applying rules for independent events in situations involving dependence.

Independence is defined as follows. Events A and B are independent if P(B | A) = P(B). Note that, if this is the case, then P(A | B) = P(A) provided P(B) > 0, so the order of the two events A and B, i.e., the decision about which event is called A and which is called B, is essentially irrelevant in the definition of independence.

The definition agrees with our notion of independence in the sense that, if A and B are independent, then the occurrence of A does not influence the probability that event B will happen. As a simple example, suppose you play two hands of Texas Hold’em. A is the event that you are dealt pocket aces on the first hand, and B is the event that you are dealt pocket aces on the second hand. Because the cards are shuffled between hands, knowledge of A’s occurrence should not influence the probability of B, which is 1/221 regardless of whether A occurred or not.

It is generally accepted that the following events may be assumed independent:

Outcomes on different rolls of a die

Results of different flips of a coin

Outcomes on different spins of a spinner

Cards dealt on different poker hands

Sampling from a population is analogous to drawing cards from a deck. In the case of cards, the population is only of size 52, whereas in scientific studies, the population sampled may be enormous. Dealing cards and then replacing them and reshuffling the deck before the next deal is called sampling with replacement. When sampling with replacement, events before and after each replacement are independent. However, when sampling without replacement, such events are dependent.

When dealing two cards from a deck, for instance, one typically does not record and replace the first card before dealing the second card. In such situations, what happens on the first card provides some information about the possibilities remaining for the second card, so the outcomes on the two cards are dependent. If the first card is the ace of spades, then you know for certain that the second card cannot be the ace of spades. Similarly, when sampling at random from a population in a scientific study, one typically samples without replacement, and thus technically the outcomes of such samples are dependent. If, for instance, you are measuring people’s hand sizes and the first person in your sample has the largest hands in the population, then you know that the second person in your sample does not have the largest hands, so the two hand sizes are dependent. However, when sampling from a large population such as a city, state, or country with several million residents, it is often reasonable to model the outcomes as independent even when sampling without replacement, because information about one observation provides so little information about the next observation. Thus, in scientific studies involving samples without replacement from large populations, technically the observations are dependent, but one typically assumes independence of the observations because calculations performed on the basis of independence are so close to correct. When dealing with a population of 52, however, this is not the case.
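To get a feel for how much the population size matters, one can compare the chance that a second draw is "marked" (for example, an ace) given that the first draw was marked, with and without replacement. The sketch below is illustrative only: the function names are hypothetical, and the population of 5,200,000 with 400,000 marked members is an invented example chosen to have the same 1/13 proportion as aces in a deck.

```python
from fractions import Fraction


def p_second_marked(population, marked, with_replacement):
    """P(2nd draw is marked | 1st draw was marked)."""
    if with_replacement:
        # The first draw is put back, so the second draw sees the full population.
        return Fraction(marked, population)
    # One marked item is gone, and the population is one smaller.
    return Fraction(marked - 1, population - 1)


def gap(population, marked):
    """How far off the independence (with-replacement) answer is."""
    return (p_second_marked(population, marked, True)
            - p_second_marked(population, marked, False))


deck_gap = gap(52, 4)                # aces in a 52-card deck
city_gap = gap(5_200_000, 400_000)   # invented large population, same proportion

print(deck_gap, float(deck_gap))  # 4/221, about 0.018
print(float(city_gap))            # well under one in a million
```

For the deck, assuming independence gives 1/13 where the correct conditional probability is 1/17, a large relative error. For the large population, the two answers agree to several decimal places.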

A few more general words about independent events are in order. First, if A and B are independent, then so are the pairs (Ac, B), (A, Bc), and (Ac, Bc). This makes sense intuitively: if knowledge about A occurring does not influence the chance of B occurring, then knowledge about Ac should not influence the chance of B either. Mathematically, it is easy to see that, for instance, P(Ac | B) = P(AcB)/P(B) = [P(B) – P(AB)]/P(B) = 1 – P(AB)/P(B) = 1 – P(A | B), which equals 1 – P(A) = P(Ac) if A and B are independent.
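The claim about complements can be checked on a small concrete case. The sketch below uses a single card drawn from a full deck, with A the event that the card is a king and B the event that it is a heart; this particular pair of events is chosen here for illustration and is not an example from the text. The helper names are likewise hypothetical.

```python
from fractions import Fraction

# Hypothetical encoding: rank character + suit character, e.g. "Kh".
DECK = [r + s for r in "23456789TJQKA" for s in "cdhs"]


def prob(event, given=lambda c: True):
    """P(event | given) for one card drawn uniformly from the deck."""
    pool = [c for c in DECK if given(c)]
    return Fraction(sum(1 for c in pool if event(c)), len(pool))


def king(c):
    return c[0] == "K"


def heart(c):
    return c[1] == "h"


def not_king(c):
    return not king(c)


def not_heart(c):
    return not heart(c)


# Each entry compares a conditional probability with the unconditional one.
checks = {
    "A given B": prob(king, heart) == prob(king),
    "Ac given B": prob(not_king, heart) == prob(not_king),
    "A given Bc": prob(king, not_heart) == prob(king),
    "Ac given Bc": prob(not_king, not_heart) == prob(not_king),
}

print(checks)                 # all four values are True
print(prob(not_king, heart))  # 12/13
```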

3.3 Multiplication Rules

Multiplication rules are useful when calculating the probability that a number of events will all occur, i.e., when considering the probability that event A1 and event A2 and … and event Ak will occur. In such situations, it is important to determine whether the events A1, A2, etc. are independent or dependent, although multiplication rules can be used in both cases. In general, for any sets A and B,

P(AB) = P(A) × P(B|A),

provided P(A) > 0 so that P(B | A) is defined. This is sometimes called the general multiplication rule, and it follows directly from the definition of the conditional probability P(B | A). Slightly more generally, for any sets A, B, C, D, …, as long as all the conditional probabilities are well defined,

P(ABCD…) = P(A) × P(B|A) × P(C|AB) × P(D|ABC) × ….

Note that for independent events A and B, P(B | A) = P(B), so P(AB) = P(A) × P(B). Thus, for independent events, the general multiplication rule simplifies a bit:

P(ABCD…) = P(A) × P(B) × P(C) × P(D) × …, if A, B, C, D, … are independent.

We will refer to the above as the multiplication rule for independent events.
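A small numerical sketch may help show why the two rules should not be confused. Below, the general multiplication rule is applied to the event that your first two cards are both aces, and the result is compared with what one would get by wrongly treating the two cards as independent. The helper name chain is a hypothetical choice for this illustration.

```python
from fractions import Fraction
from math import comb


def chain(*probs):
    """Multiply P(A) x P(B | A) x P(C | AB) x ... in the order given."""
    result = Fraction(1)
    for p in probs:
        result *= p
    return result


# General rule: P(1st is an ace) x P(2nd is an ace | 1st is an ace).
p_aa = chain(Fraction(4, 52), Fraction(3, 51))

# Four conditional factors: the first four cards off the deck are all aces.
p_four_aces = chain(Fraction(4, 52), Fraction(3, 51),
                    Fraction(2, 50), Fraction(1, 49))

# Incorrect: pretends the second card is independent of the first.
p_aa_wrong = chain(Fraction(4, 52), Fraction(4, 52))

print(p_aa, p_aa_wrong)  # 1/221 1/169
print(p_four_aces)       # 1/270725
```

The general rule agrees with the counting answers C(4,2)/C(52,2) and 1/C(52,4), while the independence shortcut overstates the chance of pocket aces because it ignores that one ace has already left the deck.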

The next two examples show the importance of determining whether events are dependent or independent when using the multiplication rules.

Example 3.3.1 — Suppose you and I play two hands of poker. Compare the probabilities of these two events:

Event Y: you get pocket aces on the first hand and I get pocket aces on the second hand.

Event Z: you get pocket aces on the first hand and I get pocket aces on the first hand.

Answer — Your cards on the first hand and my cards on the second hand are independent, so by the multiplication rule for independent events,

P(Y) = P(you get AA on 1st hand) × P(I get AA on 2nd hand) = 1/221 × 1/221 = 1/48,841.