Subchapter 8a.
The Fisher Exact Probability Test

Toward the end of the main body of Chapter 8 we noted that chi-square procedures can be legitimately applied only if all values of E are equal to or greater than 5. The best way to avoid running afoul of this limitation is of course to base your observations on relatively large samples. But alas in real life there are many situations where all we can muster is a relatively small sample, and we must make the best of it. For the special case of two rows by two columns, a useful alternative to chi-square in situations of this sort is the Fisher Exact Probability Test. There is, in fact, one respect in which the Fisher test is plainly superior to a chi-square test, even in those cases where a 2x2 chi-square might be legitimately employed. It is that chi-square is intrinsically non-directional, whereas the Fisher procedure is capable of being applied as either a directional test or a non-directional test.

The bad news about the Fisher test is that the calculations required for it can be quite laborious and about as pleasant as an afternoon of root-canal surgery. The good news is that the computer on which you are now working will happily do all the tedious number-crunching for you, leaving your mind free to wrap itself around the underlying concepts.

First the concepts. To illustrate, suppose that 19 subjects who display a certain medical condition are assessed with respect to whether they do or do not possess two other characteristics, X and Y. On the basis of earlier informal observations on patients of this particular type, the investigators are aiming to test the directional hypothesis that subjects who possess characteristic X will tend also to show characteristic Y, and that those who do not show characteristic X will also tend not to show characteristic Y.

                 X
             No    Yes   Total
Y    Yes      2      7       9
     No       8      2      10
     Total   10      9      19

As indicated in the table above, the results of the assessment are clearly in line with the hypothesis. Of the 9 subjects who show characteristic X, all but 2 also show characteristic Y; and of the 10 subjects who do not show characteristic X, all but 2 also do not show characteristic Y. Alternatively, you can say that in 15 of the 19 subjects the presence or absence of characteristic X corresponds with the presence or absence, respectively, of characteristic Y. All that remains is to determine whether this ostensible association between X and Y reflects anything more than mere chance coincidence.

In case it is not obvious from the bare numbers in the above table, please take a moment to note that any attempt to perform a chi-square test on this array of frequencies would come to a grinding halt as soon as we calculated the values of E, since three of these are smaller than the required minimal value of 5. The next table shows the specific values of E for each cell, as calculated according to the procedure described in the main body of Chapter 8.

                 X
             No      Yes     Total
Y    Yes     E=4.7   E=4.3       9
     No      E=5.3   E=4.7      10
     Total   10      9          19
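
For readers who would like to check these values of E, here is a minimal Python sketch. It assumes the usual rule that the expected frequency of a cell is its row total multiplied by its column total and divided by N; the variable names are illustrative and not part of the chapter.

```python
# Observed 2x2 table from the example.
# Rows are Y (Yes, No); columns are X (No, Yes).
observed = [[2, 7],
            [8, 2]]

row_totals = [sum(row) for row in observed]          # [9, 10]
col_totals = [sum(col) for col in zip(*observed)]    # [10, 9]
n = sum(row_totals)                                  # 19

# Expected frequency for each cell: (row total x column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 1) for e in row])

# Count the cells that fall below the minimum of 5 required for chi-square.
too_small = sum(e < 5 for row in expected for e in row)
print(too_small, "of 4 expected frequencies are below 5")
```

Three of the four cells fall short of 5, which is why chi-square is ruled out for this table.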

The basic precept:
Chi-square procedures can be legitimately applied only if all values of E are equal to or greater than 5.

¶The Logic of the Fisher Test

                 X
             No    Yes   Total
Y    Yes      2      7       9
     No       8      2      10
     Total   10      9      19

The null hypothesis in our illustrative example is that there is no association between X and Y. And the question of statistical significance is accordingly this: If the null hypothesis were true—if any ostensible association between characteristics X and Y were the result of nothing more than mere chance coincidence—how likely is it that we might end up with a result this large or larger?

To develop the logic of the Fisher test, suppose that our investigators have performed their initial assessment and counted up the number of subjects who do and do not show characteristics X and Y, but have not yet sorted their subjects according to the correspondences of X and Y. In this case, all they would have would be the marginal totals shown in the following table.

                 X
             No    Yes   Total
Y    Yes                     9
     No                     10
     Total   10      9      19

Given these marginal totals, there are 10 possible ways in which the specific correspondences between X and Y might sort themselves out by mere chance. We will label these as Ô1, Ô2, Ô3, and so on, with "Ô" standing in as an abbreviation for "outcome."

 Ô1      Ô2      Ô3      Ô4      Ô5      Ô6      Ô7      Ô8      Ô9      Ô10
 9  0    8  1    7  2    6  3    5  4    4  5    3  6    2  7    1  8    0  9
 1  9    2  8    3  7    4  6    5  5    6  4    7  3    8  2    9  1   10  0

Outcomes "this large or larger": Ô8, Ô9, Ô10

Outcomes falling toward the left end of this array would betoken ostensible negative association between X and Y, those falling toward the right end would betoken ostensible positive association, and those lying toward the middle would approximate zero association. I expect it is intuitively obvious to you that the mere-chance probability of an outcome is greatest toward the middle of the range and decreases sharply as you go out toward either extreme. The particular outcome observed by the investigators is Ô8, so the question of statistical significance in this case takes the specific form: If the null hypothesis were true, how likely is it that we could end up with either Ô8 or Ô9 or Ô10?
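
The enumeration of possible outcomes can be reproduced with a short Python sketch. The function name `possible_tables` and the cell labels a, b, c, d are illustrative choices; the output uses a plain "O" in place of "Ô".

```python
# Marginal totals from the example.
row_yes, row_no = 9, 10      # Y = Yes, Y = No
col_no, col_yes = 10, 9      # X = No,  X = Yes

def possible_tables(row_yes, row_no, col_no, col_yes):
    """List every 2x2 table (a, b, c, d) consistent with the marginal totals.

    a = X No / Y Yes, b = X Yes / Y Yes, c = X No / Y No, d = X Yes / Y No.
    """
    tables = []
    # Start from the largest feasible a so the order matches O1, O2, ...
    for a in range(min(row_yes, col_no), -1, -1):
        b = row_yes - a
        c = col_no - a
        d = row_no - c
        if min(b, c, d) >= 0:
            tables.append((a, b, c, d))
    return tables

outcomes = possible_tables(row_yes, row_no, col_no, col_yes)
for i, table in enumerate(outcomes, start=1):
    print(f"O{i}: {table}")
```

Once the marginal totals are fixed, a single cell determines the other three, which is why there are only 10 tables to consider.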

 Ô8      Ô9      Ô10
 2  7    1  8    0  9
 8  2    9  1   10  0

In principle, the application of the Fisher test to a situation of this sort is quite simple: figure out the exact probability for each possible outcome that is "this large or larger," and then add up these separate disjunctive probabilities to get the answer. All the rest is nuts-and-bolts calculation, informed by some basic concepts of probability examined in Chapter 5.

There are several paths we could follow at this point, all of which would eventually lead to the same destination. I choose the one most likely to make sense to the beginning student and least likely to bring the non-mathematical reader to the brink of tears. As you saw in Chapter 5, the mere-chance probability of any particular outcome is fundamentally a proportion of the general form
$$P(\text{outcome}) = \frac{\text{number of possibilities favorable to the occurrence of the outcome}}{\text{total number of pertinent possibilities}}$$

We begin by figuring out the value of the denominator of the expression, which is the total number of pertinent possibilities. Here is something else you will recognize from Chapter 5. If you were to toss 19 coins, the number of possible combinations that would yield exactly 9 heads out of those 19 tosses would be

$$\frac{N!}{k!\,(N-k)!} = \frac{19!}{9!\,10!} = 92{,}378$$


The same concept applies when you are assessing 19 subjects with respect to the possession or non-possession of characteristic X. The total number of combinations that would yield exactly 9 subjects displaying X and 10 subjects not displaying it would be

$$\frac{19!}{9!\,10!} = 92{,}378$$


And similarly for characteristic Y. The total number of combinations that would involve exactly 9 subjects displaying Y and 10 subjects not displaying it would be

$$\frac{19!}{10!\,9!} = 92{,}378$$

The total number of possible outcomes in which you might have exactly 9 out of 19 subjects displaying X and exactly 9 out of the same 19 subjects displaying Y is accordingly

92,378 x 92,378 = 8,533,694,884
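
These counts are easy to confirm in Python. The sketch below uses only the standard library; `math.comb` computes the same N!/(k!(N-k)!) quantity directly, and the variable names are illustrative.

```python
from math import comb, factorial

n, k = 19, 9

# The factorial formula written out in full, using integer division.
by_factorials = factorial(n) // (factorial(k) * factorial(n - k))

# The same count from the built-in combinations function.
ways_x = comb(n, k)    # 9 subjects showing X, 10 not showing it
ways_y = comb(n, k)    # 9 subjects showing Y, 10 not showing it

# Denominator of the probability: every X arrangement paired with every Y arrangement.
total_possibilities = ways_x * ways_y

print(f"{by_factorials:,}")
print(f"{total_possibilities:,}")
```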

So here is how our probability calculation would begin to shape up for any particular one of the 10 possible outcomes enumerated above, Ô1 through Ô10.

$$P(\text{outcome}) = \frac{\text{number of possibilities favorable to the occurrence of the outcome}}{\text{total number of pertinent possibilities}}$$

$$P(\text{outcome}) = \frac{?}{8{,}533{,}694{,}884}$$

All we now have to do for each particular outcome is replace the question mark with a specific numerical value.
                 X
             No    Yes   Total
Y    Yes      0      9       9
     No      10      0      10
     Total   10      9      19

We begin with Ô10, which is the most extreme of the possible "this large or larger" outcomes and by far the easiest to examine. For each of the 92,378 possible combinations that would yield exactly 9 subjects displaying X and 10 subjects not displaying it, there is exactly one of the Y combinations that would involve every instance of X being associated with an instance of Y and every instance of not-X being associated with an instance of not-Y. So the probability of this particular outcome would be

$$P(\text{Ô10}) = \frac{92{,}378}{8{,}533{,}694{,}884} = \frac{1}{92{,}378}$$

                 X
             No    Yes   Total
Y    Yes      1      8       9
     No       9      1      10
     Total   10      9      19

The logic is the same with the less extreme outcomes, although rather more complicated in computational detail. Here, to illustrate, is the next less extreme outcome, Ô9. For each of the 92,378 possible combinations that would yield exactly 9 subjects displaying X and 10 subjects not displaying it, the number of combinations that would produce the particular correspondences of X and Y shown in the first column of the above table (X="No") would be
$$\frac{10!}{1!\,9!} = 10$$

and the number of combinations that would produce the particular correspondences shown in the second column (X="Yes") would be
$$\frac{9!}{8!\,1!} = 9$$

So the number of combinations that would produce the overall array of frequencies in the four interior cells of the table would be

10 x 9 x 92,378 = 8,314,020

and the probability associated with this outcome is accordingly
$$P(\text{Ô9}) = \frac{8{,}314{,}020}{8{,}533{,}694{,}884} = \frac{90}{92{,}378}$$

                 X
             No    Yes   Total
Y    Yes      2      7       9
     No       8      2      10
     Total   10      9      19

One more turn of this wheel and we are finished, at least for the present example. Here is the table for the actually observed outcome, Ô8. For each of the 92,378 possible combinations that would yield exactly 9 subjects displaying X and 10 subjects not displaying it, the number of combinations that would produce the particular correspondences of X and Y shown in the first column of the table would be
$$\frac{10!}{2!\,8!} = 45$$

and the number of combinations that would produce the particular correspondences shown in the second column would be
$$\frac{9!}{7!\,2!} = 36$$

So the number of combinations that would produce the overall array of frequencies in the four interior cells of this table would be

45 x 36 x 92,378 = 149,652,360

and the probability associated with this outcome is accordingly
$$P(\text{Ô8}) = \frac{149{,}652{,}360}{8{,}533{,}694{,}884} = \frac{1620}{92{,}378}$$

The rest of the ride is a smooth one along a familiar path. The probability of getting a result "this large or larger" is the sum of the separate probabilities for Ô8, Ô9, and Ô10:

 Ô8      Ô9      Ô10
 2  7    1  8    0  9
 8  2    9  1   10  0

$$\frac{1620}{92{,}378} + \frac{90}{92{,}378} + \frac{1}{92{,}378} = \frac{1711}{92{,}378} = .0185$$

And that, in a nutshell, is the probability that our investigators can take to the printing press. If the null hypothesis were true, the exact probability of finding a positive association between X and Y as large as the one observed would be a scant P=.0185. The investigators can therefore reject the null hypothesis with a comfortable degree of confidence and conclude that characteristics X and Y do tend to be associated for this particular type of subject.
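
The counting argument for Ô8, Ô9, and Ô10 can be retraced step by step in Python. This is a sketch of the reasoning in this section rather than a general-purpose routine: the helper `favorable` is an illustrative name, and the totals of 10 not-X and 9 X subjects are hard-coded from the example.

```python
from fractions import Fraction
from math import comb

ways_x = comb(19, 9)            # 92,378 ways of assigning X to 9 of 19 subjects
ways_y = comb(19, 9)            # 92,378 ways of assigning Y to 9 of 19 subjects
denominator = ways_x * ways_y   # 8,533,694,884 pertinent possibilities

def favorable(a, b):
    """Favorable possibilities when a of the 10 not-X subjects and
    b of the 9 X subjects show characteristic Y."""
    return comb(10, a) * comb(9, b) * ways_x

# Top-row cell frequencies (a, b) for the three right-tail outcomes.
right_tail = {"O8": (2, 7), "O9": (1, 8), "O10": (0, 9)}

one_tail = Fraction(0)
for name, (a, b) in right_tail.items():
    count = favorable(a, b)
    one_tail += Fraction(count, denominator)
    print(f"P({name}) = {count:,} / {denominator:,} = {count // ways_x:,} / {ways_y:,}")

print(f"one-tailed P = {float(one_tail):.4f}")
```

Working with `Fraction` keeps the arithmetic exact until the final conversion to a decimal.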

¶A Simpler Formulaic Approach

Once you have the concepts, the actual calculations for the Fisher test can be performed by way of a relatively simple formulaic structure. For any 2x2 contingency table, designate the cell frequencies as a, b, c, and d, and the marginal totals as a+b, c+d, etc., with N equal to the sum of the frequencies in the four cells.

  a      b      a+b
  c      d      c+d
 a+c    b+d      N

Given this notation, the exact probability of any particular outcome can then be calculated according to the formula

$$P(\text{outcome}) = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{N!\,a!\,b!\,c!\,d!}$$

In performing factorial operations recall that 0!=1 and 1!=1.

Thus, for outcome 10 in the above example:

  0     9      9
 10     0     10
 10     9     19

$$P(\text{Ô10}) = \frac{9!\,10!\,10!\,9!}{19!\,0!\,9!\,10!\,0!} = .000010825$$

Note that this rounded decimal value is equivalent to the ratio 1/92,378 calculated above.

For outcome 9:

  1     8      9
  9     1     10
 10     9     19

$$P(\text{Ô9}) = \frac{9!\,10!\,10!\,9!}{19!\,1!\,8!\,9!\,1!} = .000974258$$

Equivalent to 90/92,378

And for Ô8, the observed outcome:

  2     7      9
  8     2     10
 10     9     19

$$P(\text{Ô8}) = \frac{9!\,10!\,10!\,9!}{19!\,2!\,7!\,8!\,2!} = .017536643$$

Equivalent to 1620/92,378
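
The factorial formula translates almost literally into Python. In this minimal sketch the function name `table_probability` is an illustrative choice, and a, b, c, d follow the cell notation introduced above.

```python
from math import factorial as f

def table_probability(a, b, c, d):
    """Exact probability of one 2x2 table with cell frequencies a, b, c, d,
    given its marginal totals."""
    n = a + b + c + d
    numerator = f(a + b) * f(c + d) * f(a + c) * f(b + d)
    denominator = f(n) * f(a) * f(b) * f(c) * f(d)
    return numerator / denominator

# The three right-tail outcomes of the example.
for name, cells in [("O10", (0, 9, 10, 0)),
                    ("O9", (1, 8, 9, 1)),
                    ("O8", (2, 7, 8, 2))]:
    print(name, f"{table_probability(*cells):.9f}")
```

Python integers are of unlimited size, so the large factorials are computed exactly before the final division.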

¶The Fisher Test for Directional and Non-Directional Hypotheses

                 X
             No    Yes   Total
Y    Yes                     9
     No                     10
     Total   10      9      19

If you were to perform these calculations for each of the 10 possible outcomes in the example, the composite result would constitute the sampling distribution of outcomes for the general case where you have a 2x2 contingency table with these particular marginal totals. The following graph shows the outlines of this sampling distribution. As usual in such depictions, the vertical axis represents relative frequency, which is of course the same thing as probability.

[Graph: sampling distribution of the 10 possible outcomes. Horizontal axis: Ô1 through Ô10; vertical axis: relative frequency (probability).]
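
The whole sampling distribution can be tabulated in a few lines of Python. The sketch below is tied to the marginal totals of this example (10 not-X and 9 X subjects, 9 of the 19 showing Y), and the row of "#" characters is only a rough text stand-in for a graph.

```python
from math import comb

total = comb(19, 9)   # 92,378

# a = number of the 10 not-X subjects who show Y; a runs 9, 8, ..., 0 for O1 ... O10.
distribution = []
for a in range(9, -1, -1):
    b = 9 - a         # number of the 9 X subjects who show Y
    distribution.append(comb(10, a) * comb(9, b) / total)

for i, p in enumerate(distribution, start=1):
    bar = "#" * round(p * 100)
    print(f"O{i:<3}{p:.6f} {bar}")

print("sum of probabilities:", round(sum(distribution), 10))
```

The probabilities pile up in the middle of the range and sum to 1, as any sampling distribution must.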

As our investigators begin with a directional hypothesis, their probability assessment needs to refer to only one of the two tails of this distribution, namely, the "this large or larger" portion on the right that includes the probabilities for Ô8, Ô9, and Ô10. If they had instead begun with a non-directional hypothesis—in effect, "Let's do this and see if there's an association one way or the other"—they would also have to refer to the portion of the left tail that includes the possible outcomes that are "this large or larger" in the opposite direction.

 Ô1      Ô5      Ô10
 9  0    5  4    0  9
 1  9    5  5   10  0

Within this context, the concept of "this large or larger" is defined by the degree to which the frequencies within the cells are arrayed disproportionately, so as to fall predominantly along the upward diagonal, as in Ô10, or along the downward diagonal as in Ô1. The degree of disproportion within any array of cell frequencies—in effect, the degree of ostensible association in either direction—can be measured by the absolute difference

$$\text{disproportion} = \left|\frac{a}{a+b} - \frac{c}{c+d}\right|$$

The following table shows this value as calculated for each of the 10 possible outcomes. The disproportion measure for Ô8, the observed outcome, is 0.58; the other outcomes whose measures are "this large or larger," in either direction, are Ô1, Ô2, Ô9, and Ô10.

 Ô1      Ô2      Ô3      Ô4      Ô5      Ô6      Ô7      Ô8      Ô9      Ô10
 9  0    8  1    7  2    6  3    5  4    4  5    3  6    2  7    1  8    0  9
 1  9    2  8    3  7    4  6    5  5    6  4    7  3    8  2    9  1   10  0
 0.90    0.69    0.48    0.27    0.06    0.16    0.37    0.58    0.79    1.00

The short of it is that the test of a non-directional hypothesis would also have to fold in the probabilities for Ô1 and Ô2, which you would find upon calculation to be

P(Ô1) = .000108251

and

P(Ô2) = .004384161

Adding these values to the right-tail probability (.0185) already calculated yields a non-directional two-tailed probability of P=.023.
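
The two-tailed sum can be checked with the following Python sketch, which applies the disproportion measure to each of the 10 outcomes and adds up the probabilities of those at least as extreme as Ô8. The function name `disproportion` is illustrative, and the table generation is tied to the marginal totals of this example.

```python
from fractions import Fraction
from math import comb

def disproportion(a, b, c, d):
    """Absolute difference between the row proportions a/(a+b) and c/(c+d)."""
    return abs(Fraction(a, a + b) - Fraction(c, c + d))

total = comb(19, 9)                       # 92,378
threshold = disproportion(2, 7, 8, 2)     # observed outcome, O8

two_tail = Fraction(0)
extreme = []
for i, a in enumerate(range(9, -1, -1), start=1):
    table = (a, 9 - a, 10 - a, a)         # (a, b, c, d) with the fixed marginal totals
    if disproportion(*table) >= threshold:
        extreme.append(f"O{i}")
        two_tail += Fraction(comb(10, a) * comb(9, 9 - a), total)

print("this large or larger:", extreme)
print(f"two-tailed P = {float(two_tail):.3f}")
```

Exact fractions are used for the comparison so that a tie with the observed disproportion is never lost to floating-point rounding.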

¶A Handy Number-Cruncher

Although the concepts are relatively simple, the calculations, as I warned at the outset, can be rather laborious. I am therefore providing you with this handy calculator, which will perform the Fisher exact probability test for any particular 2x2 table of cross-classified frequency data, provided that the numbers are not too large. Just for practice, and to assure yourself that the calculator actually works, you might want to try entering some of the 2x2 frequency arrays for the various outcomes, Ô1, Ô2, etc., of the foregoing example. You need not enter anything into the cells that contain "-----", as these marginal totals will be calculated automatically.

Although the Fisher test is designed for use with relatively small samples, the programming for this calculator will actually handle fairly large samples, up to about N=100, depending on how the frequencies are arrayed within the four cells.
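
For readers working away from the page, a rough stand-in for such a calculator might look like the Python sketch below. It is not the VassarStats implementation: the function name `fisher_exact_2x2` is hypothetical, and it assumes the definition of "this large or larger" given above, namely the disproportion between the two row proportions.

```python
from fractions import Fraction
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Return (one_tail, two_tail) exact probabilities for the table [[a, b], [c, d]]."""
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    if r1 == 0 or r2 == 0:
        raise ValueError("each row must contain at least one observation")
    total = comb(r1 + r2, r1)

    def signed(x):
        # Signed disproportion for the table whose upper-left cell is x.
        return Fraction(x, r1) - Fraction(c1 - x, r2)

    observed = signed(a)
    one_tail = two_tail = Fraction(0)
    # x ranges over every upper-left cell value consistent with the marginal totals.
    for x in range(max(0, c1 - r2), min(r1, c1) + 1):
        p = Fraction(comb(c1, x) * comb(c2, r1 - x), total)
        s = signed(x)
        if abs(s) >= abs(observed):
            two_tail += p
            if s * observed >= 0:     # same direction as the observed table
                one_tail += p
    return float(one_tail), float(two_tail)

one, two = fisher_exact_2x2(2, 7, 8, 2)
print(f"one-tail: {one:.4f}  two-tail: {two:.4f}")
```

Because the counts are exact integers, this sketch has no practical ceiling near N=100, though other software may define the two-tailed probability somewhat differently.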

 You will find a stand-alone version of this calculator on the VassarStats computational site.

End of Subchapter 8a.