Chapter 5. Basic Concepts of Probability
Part II

Compound Probabilities: Common Sense Expanded by Calculation

The usefulness of "reducing" common-sense concepts of probability to precise statements of proportionality is that they can then be subjected to the powerful analytical apparatus of mathematical calculation. Under certain circumstances they can be added, subtracted, multiplied, or divided. And while these operations themselves are elementary and commonplace, they end up telling us things about probability that go far beyond the raw intuitions of common sense.

There are basically two ways in which individual probability values can be linked together mathematically, and these in turn correspond to two basic kinds of logical linkage. The first is associated with the common-sense meaning of the word "and," and the second with the common-sense meaning of the word "or." In formal logic, the relationships denoted by these words are spoken of as conjunction and disjunction, respectively. Conjunctive probability questions take the general form, "What is the probability of having A and B occur?" and disjunctive questions take the form, "What is the probability of having A or B occur?" (Compound probabilities are sometimes described in the language of set theory. In this case, conjunction will be spoken of as "intersection" and disjunction will be described as "union.")

Conjunctive Probabilities: 'A and B', 'A and B and C', etc.

Here again the basic concept is firmly rooted in common sense. Suppose you were to toss a penny twice. There is a 50% chance of getting a head on the first toss; and then, if you do get a head on the first toss, there is also a 50% chance of getting a head on the second toss. The probability that you will get a head on the first toss and on the second toss is therefore 50% of 50%. The rest is just elementary arithmetic. One-half of one-half is one-quarter; 50% of 50% is 25%; or in decimal form, .5x.5=.25. If you were to toss a penny three times, the probability that all three tosses would come up heads would be 50% of 50% of 50%, which is 12.5%; or again in decimal form, .5x.5x.5=.125. And so on for four tosses, five, ten, a hundred, or a thousand.

The general principle is that conjunctive probabilities are linked together by the mathematical act of multiplication. Thus, in the abstract, the probability of having A and B occur is the product of the two separate component probabilities for A and B:

P(A and B) = P(A) x P(B)

The probability of having A and B and C occur is the product of the three separate component probabilities for A and B and C:

P(A and B and C) = P(A) x P(B) x P(C)

and so on.
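As a rough illustration of the multiplication rule for independent events, the following Python sketch multiplies the per-toss probability of a head once for each toss. The function name `prob_all_heads` and its parameters are assumptions introduced here for illustration only, and a fair coin is assumed.

```python
from fractions import Fraction

def prob_all_heads(n_tosses, p_head=Fraction(1, 2)):
    """Probability that n independent tosses all come up heads."""
    result = Fraction(1)
    for _ in range(n_tosses):
        result *= p_head  # conjunction ("and") corresponds to multiplication
    return result

print(float(prob_all_heads(2)))  # 0.25
print(float(prob_all_heads(3)))  # 0.125
```

Exact fractions are used so that the products match the hand calculations (one-quarter, one-eighth) without rounding error.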

In the simple case of repeatedly tossing a coin, the probability of getting a head on any particular toss is completely independent of the outcome of any other toss, past, present, or future. If you get a head on the first toss, the probability of getting a head on the second toss is P(H)=.5; and if you get a tail on the first toss, the probability of getting a head on the second toss is also P(H)=.5. No matter how many times you have tossed the coin, no matter how many heads have already come up, or how many tails, the probability of getting a head on the next toss is still exactly P(H)=.5. There are many other kinds of situations, however, where the probability of an event is not independent but dependent; that is, where the probability of one event depends on the outcome of some other event.

We will illustrate this distinction with an example that will also show how calculated probabilities sometimes end up telling a story quite different from what common-sense intuitions might have led you to expect. Imagine a room that contains 4 females and 6 males. Question 1: If you were to select 3 persons from this room at random, what is the probability that all 3 would be females? And question 2: If you were to select 3 persons from this room at random, what is the probability that all 3 would be males? Since the room contains more males than females, you will surely see intuitively that the probability for all-3-males will be greater than the probability for all-3-females. But I expect it will not be intuitively obvious to you at all that the probability for all-3-males is fully five times as great as the probability for all-3-females. Here is how it works. The critical point to take note of is that the probability of the outcome for each successive draw, after the first, depends on the outcome(s) of the preceding draw(s).

First for the possibility that all 3 of the persons selected will be females. Clearly, the probability of selecting a female on the first draw is the ratio 4/10, since there are 10 persons in the room, of whom 4 are females. But then, once you have made your first selection, there remain only 9 persons in the room from whom to make your second selection; and if your first selection is a female, then only 3 of the remaining persons are females. Thus, if your first selection happens to be a female, the probability of selecting a female on the second draw is not 4/10, but rather 3/9.

After the second selection there are only 8 persons left in the room; and if both of the persons already drawn are females, then only 2 of these 8 remaining are females. Thus, the probability that the third draw will also be a female is not 4/10 nor 3/9, but rather 2/8. The probability of selecting females in all three draws in this situation is therefore

P(all 3 females) = (4/10)x(3/9)x(2/8) = .033

The logic is the same for the possibility of selecting males on all 3 draws. The probability of selecting a male on the first draw is 6/10, since the room contains 10 persons, of whom 6 are males. But then, once you have made your first selection, there remain only 9 persons in the room from whom to make your second selection; and if your first selection is a male, then only 5 of the persons remaining are males. Thus, if your first selection happens to be a male, the probability of selecting a male on the second draw is not 6/10, but rather 5/9. After the second selection there are only 8 persons left in the room; and if both of the persons already drawn are males, then only 4 of these 8 remaining are males. Thus, the probability that the third draw will also be a male is not 6/10 nor 5/9, but rather 4/8. The probability of selecting males on all three draws in this situation is therefore

P(all 3 males) = (6/10)x(5/9)x(4/8) = .167

which you will note is 5 times the P=.033 probability of selecting females on all three draws (exactly 5 times, in fact, since the unrounded values are 1/6 and 1/30).
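The two calculations for the room of 4 females and 6 males can be sketched in Python as follows. The helper name `prob_all_same` is hypothetical, introduced only for this illustration; it assumes selection without replacement, as in the example.

```python
from fractions import Fraction

def prob_all_same(group_size, room_size, draws):
    """Probability that `draws` successive selections without replacement
    all come from a group of `group_size` in a room of `room_size`."""
    result = Fraction(1)
    for k in range(draws):
        # After k successful draws, the group and the room have each shrunk by k.
        result *= Fraction(group_size - k, room_size - k)
    return result

p_females = prob_all_same(4, 10, 3)  # (4/10) x (3/9) x (2/8)
p_males = prob_all_same(6, 10, 3)    # (6/10) x (5/9) x (4/8)

print(p_females, round(float(p_females), 3))  # 1/30 0.033
print(p_males, round(float(p_males), 3))      # 1/6 0.167
print(p_males / p_females)                    # 5
```

Each factor in the loop is a conditional probability: the chance of the next draw given that all earlier draws came from the same group.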

In both of these calculations the basic principle is the same as the one we outlined at the beginning of this section: the conjunctive probability of any two or more events is equal to the product of their separate component probabilities. It is simply a matter of making sure that the values assigned to the component probabilities are the right ones.

Disjunctive Probabilities: 'A or B', 'A or B or C', etc.

The principle of disjunctive probabilities is even simpler. If you toss a coin, there is a 50% chance that it will come up heads, a 50% chance that it will come up tails, and thus a 100% chance that it will come up either heads or tails. With each lottery ticket you purchase you have a 1 in 100 million chance of winning. With two lottery tickets your chance of winning with either the first or the second is therefore 2 in 100 million. With three tickets, the chance of winning with either the first or the second or the third is 3 in 100 million; and so forth. When two or more chance events are disjunctively linked by the word "or," the corresponding mathematical linkage is just simple addition. Thus, the probability of having A or B occur is equal to the sum of the two component probabilities for A and B:

P(A or B) = P(A) + P(B)

The probability of having A or B or C occur is the sum of the three separate component probabilities for A and B and C:

P(A or B or C) = P(A) + P(B) + P(C)

and so on.

To put it concretely, imagine a class of 30 students composed of 4 freshmen, 12 sophomores, 10 juniors, and 4 seniors. The instructor announces that one of these students will be randomly selected to win the class lottery prize, which is an automatic A in the course. The probability that the lucky winner will be either a freshman or a sophomore is

(4 freshmen / 30 students) + (12 sophomores / 30 students) = 16/30 = .53

and the probability that it will be either a freshman or a sophomore or a junior is

(4 freshmen / 30 students) + (12 sophomores / 30 students) + (10 juniors / 30 students) = 26/30 = .87
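The class-lottery sums can be checked with a short Python sketch. The dictionary `class_counts` and the function `prob_any_of` are names assumed here for illustration. Simple addition is valid in this sketch only because each student belongs to exactly one class year.

```python
from fractions import Fraction

class_counts = {"freshman": 4, "sophomore": 12, "junior": 10, "senior": 4}
total = sum(class_counts.values())  # 30 students

def prob_any_of(categories):
    """P(A or B or ...) for non-overlapping categories: add the parts."""
    return sum(Fraction(class_counts[c], total) for c in categories)

p2 = prob_any_of(["freshman", "sophomore"])
p3 = prob_any_of(["freshman", "sophomore", "junior"])

print(p2, round(float(p2), 2))  # 8/15 0.53
print(p3, round(float(p3), 2))  # 13/15 0.87
```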

The restriction on this additive operation is that component probabilities can be added together in this simple fashion only when they pertain to possibilities that are mutually exclusive. In the example that we have just examined, each of the four academic-class categories excludes all the others. If you are a freshman, you cannot at the same time be a sophomore or a junior or a senior. If you are a sophomore, you cannot at the same time be a freshman or a junior or a senior. And so forth.

But now consider the following variation on this classroom theme. Suppose the class has a total of 26 students, of whom 12 are sophomores and 14 are juniors. Of the sophomores, 7 are females and 5 are males; and of the 14 juniors, 8 are females and 6 are males. Question: In randomly selecting one of the members of this class to win the lottery, what is the probability that the student selected will be either a sophomore or a female?

Here is how not to answer the question. The probability of selecting a sophomore is 12/26 (right!), and the probability of selecting a female is 15/26 (right!). The probability of selecting either a sophomore or a female is therefore

P(soph. or female) = P(soph.) + P(female) = 12/26 + 15/26 = 27/26 = 1.038 [Wrong!]

Applying the simple additive formula in this situation would lead you to conclude that the chance of selecting either a sophomore or a female is about 104%—which is patently absurd. For reasons described earlier, the probability that any particular event or outcome will occur must always fall within the range bounded at the bottom by P=0 (0%) and at the top by P=1.00 (100%).

The reason why simple addition does not work in this situation can be gleaned by going back and looking closely at some details. When you say that the component probability of selecting a sophomore is 12/26, what you are really saying is

P(soph.) = (7 soph. females + 5 soph. males) / 26 students = 12/26

And when you say that the component probability of selecting a female is 15/26, what you are really saying is

P(female) = (7 soph. females + 8 jr. females) / 26 students = 15/26

In and of themselves, these two component probabilities are quite correct—but now see what happens when you add them together.

P(soph.) + P(female)
 = (7 soph. females + 5 soph. males) / 26 students + (7 soph. females + 8 jr. females) / 26 students
 = 12/26 + 15/26 = 27/26 = 1.038

The complication here is that the two components, sophomore and female, are not mutually exclusive. Seven of the 26 students are both sophomores and females, and these 7 students are being counted twice: once because they are sophomores, and then again because they are females. If you are counting apples in a basket and mistakenly count some of them twice, you end up with an inflated measure of the number of apples. If you are counting the elements that enter into a probability calculation and count some of them twice, you end up with an inflated measure of probability.
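The double counting can be made visible by listing the class explicitly and counting. The following Python sketch builds a hypothetical roster from the counts given above (the `roster` list and the variable names are assumptions for illustration) and compares the inflated sum with a direct count.

```python
# Hypothetical roster: one (year, sex) tuple per student, built from the counts in the text.
roster = (
    [("sophomore", "female")] * 7
    + [("sophomore", "male")] * 5
    + [("junior", "female")] * 8
    + [("junior", "male")] * 6
)

n_soph = sum(1 for year, sex in roster if year == "sophomore")
n_female = sum(1 for year, sex in roster if sex == "female")
n_both = sum(1 for year, sex in roster if year == "sophomore" and sex == "female")
n_either = sum(1 for year, sex in roster if year == "sophomore" or sex == "female")

print(len(roster))                    # 26
print(n_soph + n_female)              # 27, more than the whole class
print(n_either)                       # 20, each student counted once
print(n_soph + n_female - n_either)   # 7, the size of the overcount
print(n_both)                         # 7, the sophomore females
```

The overcount of 7 is exactly the number of students who are both sophomores and females.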

The method commonly recommended for correcting this inflation is to calculate the disjunctive probability of "A or B" by simple addition, as we have just done, and then subtract from that inflated sum the conjunctive (multiplicative) probability of "A and B." That is, for cases where the component probabilities are not mutually exclusive

P(A or B) = P(A) + P(B) - P(A and B)

The reason why this method works is that in subtracting the conjunctive probability of "A and B," you are in effect removing from the inflated sum the amount that was counted twice.

There are, however, two limitations of this method. The first is that the elemental probabilities, P(A) and P(B), are sometimes not independent, in which case the conjunctive portion of the above formula, "- P(A and B)," will involve some more or less complicated conditional probabilities. Here again is an illustration of how not to do something.

P(soph. or female)
 = P(soph.) + P(female) - P(soph. and female)
 = (12/26) + (15/26) - [(12/26)(15/26)]
 = 1.038 - .266
 = .772 [Wrong!]

Although this result is not obviously preposterous, as was our earlier P=1.038, a moment's reflection will show that it clearly cannot be correct. Among the 26 students are 6, namely the junior males, who are neither sophomores nor females. The probability of selecting one of these is 6/26=.231; and the probability of selecting someone other than one of these, namely a sophomore or a female, is accordingly 1 - .231 = .769. [Correct!]

The first result, .772, is not merely wrong by a certain small amount. It is wrong fundamentally and in principle, because the probability of selecting someone who is both a sophomore and a female is not simply

P(soph.) x P(female) = (12 soph. / 26 students) x (15 females / 26 students) = .266 [Wrong!]

Sure enough, the probability of selecting a sophomore is 12/26 (the number of sophomores divided by the total number of students). But then, if you do select a sophomore, the probability that this particular one of the 12 sophomores will also be a female is not 15/26 (the number of females divided by the total number of students), but rather 7/12 (the number of female sophomores divided by the total number of sophomores). Hence
P(soph.) x P(female, given soph.) = (12 soph. / 26 students) x (7 soph. females / 12 soph.) = 7 soph. females / 26 students = .269 [Correct!]

So the true probability of selecting either a sophomore or a female, correcting for inflation and taking account of the overlap between "sophomore" and "female," is
P(soph. or female)
 = P(soph.) + P(female) - P(soph. and female)
 = (12/26) + (15/26) - [(12/26)(7/12)]
 = 1.038 - .269
 = .769 [Correct!]
which is of course identical with our earlier calculation of 1 - .231 = .769.
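For readers who want to verify the arithmetic, here is a minimal Python sketch that computes the wrong and the correct versions side by side, using exact fractions. All variable names are assumptions introduced for this illustration.

```python
from fractions import Fraction

total = 26
soph_female, soph_male, jr_female, jr_male = 7, 5, 8, 6

p_soph = Fraction(soph_female + soph_male, total)                     # 12/26
p_female = Fraction(soph_female + jr_female, total)                   # 15/26
p_female_given_soph = Fraction(soph_female, soph_female + soph_male)  # 7/12

naive_overlap = p_soph * p_female             # wrongly treats the two as independent
true_overlap = p_soph * p_female_given_soph   # 7/26

wrong = p_soph + p_female - naive_overlap
right = p_soph + p_female - true_overlap
by_complement = 1 - Fraction(jr_male, total)  # everyone except the junior males

print(round(float(wrong), 3))  # 0.772
print(round(float(right), 3))  # 0.769
print(right == by_complement)  # True
```

The final comparison confirms that the corrected formula and the complement calculation agree exactly, not merely to three decimal places.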

The second limitation is that even when the elemental probabilities are independent, the method will not work when there are more than two items in the disjunction (A or B or C; A or B or C or D; etc.). For example, suppose you are tossing 3 coins, A, B, and C. What is the probability of getting a head (H) on either A or B or C? When you set it up in the manner just described, the result is
P(H[A] or H[B] or H[C])
 = P(H[A]) + P(H[B]) + P(H[C]) - P(H[A] and H[B] and H[C])
 = (.5+.5+.5) - (.5x.5x.5)
 = 1.375 [Wrong!]

which, again, is patently absurd.

The method that will work in such multi-item disjunctions, and which I recommend for two-item disjunctions as well, is actually quite a simple one. In any particular situation, there is a certain probability that the event or outcome in question will occur, and a certain probability that it will not occur; and taken together, these two complementary probabilities must always add up to P(Total)=1.0. Thus, one way of determining the probability that a disjunction ("A or B," "A or B or C," etc.) will occur is to figure out the probability that it will not occur, and then subtract that amount from 1.0. That is

P(that x will occur) = P(Total) - P(that x will not occur) = 1 - P(that x will not occur)

For example, when tossing 3 coins, A, B, and C, the only way you could not get the disjunctive outcome of a head (H) on either A or B or C would be to get the conjunctive outcome of a tail (T) on A and B and C. The probability of getting a tail on all 3 of the tosses is

.5x.5x.5 = .125

So the complementary probability of not getting a tail on all 3 tosses, which can happen only if you get a head on at least one of the tosses—A or B or C—is

1.0 - .125 = .875
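The complement calculation can also be confirmed by brute force. The Python sketch below, which assumes three fair coins, lists all 8 equally likely outcomes and counts those with at least one head. The variable names are assumptions for illustration.

```python
from itertools import product

# All 8 equally likely outcomes for coins A, B, and C (fair coins assumed).
outcomes = list(product("HT", repeat=3))

p_all_tails = sum(1 for o in outcomes if o == ("T", "T", "T")) / len(outcomes)
p_at_least_one_head = sum(1 for o in outcomes if "H" in o) / len(outcomes)

print(p_all_tails)          # 0.125
print(1 - p_all_tails)      # 0.875
print(p_at_least_one_head)  # 0.875
```

Counting the favorable outcomes directly gives the same .875 as subtracting the all-tails probability from 1.0.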

In principle, virtually any probability situation you might ever encounter within the context of inferential statistics can be seen as a more or less complex combination of elemental conjunctive and disjunctive probabilities. In actual practice you will not normally need to see it this way, because most inferential statistical procedures are streamlined to the point where these elemental constituents are no longer visible. Nonetheless, it is important that you have a sense of what is going on behind the scenes in these streamlined procedures. In the next section we will examine some of the basic ways in which conjunctive and disjunctive probabilities can be combined, and then in Chapter 6 we will see how these potentially very complex and unwieldy combinations can be transformed into structures of elegant simplicity.
End of Chapter 5, Part II.