©Richard Lowry, 1999
All rights reserved.
Group A:  4.6  4.7  4.9  5.1  5.2  5.5  5.8  6.1  6.5  6.5  7.2    (mean = 5.6)
Group B:  5.2  5.3  5.4  5.6  6.2  6.3  6.8  7.7  8.0  8.1         (mean = 6.5)
Raw Measure   Rank   from Sample
    4.6         1        A
    4.7         2        A
    4.9         3        A
    5.1         4        A
    5.2         5.5      A
    5.2         5.5      B
    5.3         7        B
    5.4         8        B
    5.5         9        A
    5.6        10        B
    5.8        11        A
    6.1        12        A
    6.2        13        B
    6.3        14        B
    6.5        15.5      A
    6.5        15.5      A
    6.8        17        B
    7.2        18        A
    7.7        19        B
    8.0        20        B
    8.1        21        B
N = n_{a} + n_{b} = 11 + 10 = 21
Tied Ranks. Note that two of the entries in the raw-measures column each have the value of 5.2. As these two entries fall in the sequence where ranks #5 and #6 would be in the absence of such a tie, they are each given the average of these two ranks, which is 5.5. Similarly for the two raw-measure entries whose value is 6.5: each is accorded the average of ranks #15 and #16, which is 15.5. If there were three raw-measure entries tied for ranks 8, 9, and 10, each would receive the average of those ranks, which is 9. And so on.
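The tie-handling rule just described can be sketched in a few lines of Python. This is only an illustration under the assumption that ranks run from lowest to highest; the function name average_ranks and the variable names are hypothetical labels, not anything from the original text.

```python
def average_ranks(values):
    """Rank values from lowest to highest; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run for as long as the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end (0-based) correspond to ranks start+1..end+1.
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Raw measures from the example.
group_a = [4.6, 4.7, 4.9, 5.1, 5.2, 5.5, 5.8, 6.1, 6.5, 6.5, 7.2]
group_b = [5.2, 5.3, 5.4, 5.6, 6.2, 6.3, 6.8, 7.7, 8.0, 8.1]

ranks = average_ranks(group_a + group_b)
ranks_a = ranks[:len(group_a)]
ranks_b = ranks[len(group_a):]
print(ranks_a)
print(ranks_b)
```

The two 5.2 entries each come out with rank 5.5 and the two 6.5 entries with rank 15.5, as in the table.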

   Raw Measures          Ranked Measures
Group A   Group B      Group A   Group B
  4.6       5.2           1        5.5
  4.7       5.3           2        7
  4.9       5.4           3        8
  5.1       5.6           4       10
  5.2       6.2           5.5     13
  5.5       6.3           9       14
  5.8       6.8          11       17
  6.1       7.7          12       19
  6.5       8.0          15.5     20
  6.5       8.1          15.5     21
  7.2                    18

                    Group A   Group B   A & B Combined
sum of ranks          96.5     134.5         231
average of ranks       8.8      13.5          11
T_{A} =  the sum of the n_{a} ranks in group A  
T_{B} =  the sum of the n_{b} ranks in group B  
T_{AB} =  the sum of the N ranks in groups A and B combined 
T_{A} =  96.5  [with n_{a}=11]  
T_{B} =  134.5  [with n_{b}=10]  
T_{AB} =  231  [with N=21] 
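As a quick arithmetic check on these three quantities, here is a minimal Python sketch that totals the ranks listed for each group. The variable names are illustrative only.

```python
# Ranks for each group, as listed in the ranked-measures table.
ranks_a = [1, 2, 3, 4, 5.5, 9, 11, 12, 15.5, 15.5, 18]
ranks_b = [5.5, 7, 8, 10, 13, 14, 17, 19, 20, 21]

t_a = sum(ranks_a)   # sum of the n_a ranks in group A
t_b = sum(ranks_b)   # sum of the n_b ranks in group B
t_ab = t_a + t_b     # sum of the N ranks in both groups combined

print(len(ranks_a), t_a)                  # 11 96.5
print(len(ranks_b), t_b)                  # 10 134.5
print(len(ranks_a) + len(ranks_b), t_ab)  # 21 231.0
```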
T_{AB} = N(N+1)/2
for N=4: (4x5)/2 = 10
for N=5: (5x6)/2 = 15
for N=21: (21x22)/2 = 231

mean rank_{AB} = [N(N+1)/2] x (1/N) = (N+1)/2
for N=4: 5/2 = 2.5
for N=5: 6/2 = 3
for N=21: 22/2 = 11
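If you would like to convince yourself that these two shortcuts hold, the following Python sketch compares them with a brute-force sum of the ranks 1 through N. The function names rank_total and mean_rank are hypothetical, chosen only for this illustration.

```python
def rank_total(n):
    """Sum of the ranks 1..n by the shortcut N(N+1)/2."""
    return n * (n + 1) // 2


def mean_rank(n):
    """Mean of the ranks 1..n by the shortcut (N+1)/2."""
    return (n + 1) / 2


for n in (4, 5, 21):
    brute_force = sum(range(1, n + 1))   # 1 + 2 + ... + n, added up directly
    assert brute_force == rank_total(n)
    assert brute_force / n == mean_rank(n)
    print(n, rank_total(n), mean_rank(n))
```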
So if the null hypothesis were true, we would expect the separate averages of the A ranks and the B ranks each to approximate this same overall mean value
T_{A} = n_{a}(N+1)/2 = 11(21+1)/2 = 121  
and  
T_{B} = n_{b}(N+1)/2 = 10(21+1)/2 = 110 
σ_{T} = sqrt[ n_{a}n_{b}(N+1)/12 ]
σ_{T} = sqrt[ (11)(10)(21+1)/12 ] = ±14.2
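The null-hypothesis expectations and the standard deviation shown above can be reproduced with a short Python sketch. The function names null_mean_t and null_sd_t are assumptions made for this illustration; the formulas are the ones given in the text.

```python
import math


def null_mean_t(n_group, n_total):
    """Expected sum of ranks for a group of size n_group under the null hypothesis."""
    return n_group * (n_total + 1) / 2


def null_sd_t(n_a, n_b):
    """Standard deviation of the sampling distribution of T."""
    n_total = n_a + n_b
    return math.sqrt(n_a * n_b * (n_total + 1) / 12)


n_a, n_b = 11, 10
print(null_mean_t(n_a, n_a + n_b))    # 121.0
print(null_mean_t(n_b, n_a + n_b))    # 110.0
print(round(null_sd_t(n_a, n_b), 1))  # 14.2
```

Note that the two expected sums, 121 and 110, add up to 231, the total of all N ranks.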
If the preceding two steps have not been obvious, the next one surely will be. Given the mean and standard deviation of a normally distributed sampling distribution, you can then fold the observed value of either T_{A} or T_{B} into an appropriate version of a
Designating
T_{obs}  as the observed value of either T_{A} or T_{B}; 
μ_{T}  as the mean of the corresponding sampling distribution of T; and 
σ_{T}  as the standard deviation of that sampling distribution, 
the general structure of the ratio is
z = [ (T_{obs} - μ_{T}) ± .5 ] / σ_{T}

correction for continuity:
-.5 when T_{obs} > μ_{T}
+.5 when T_{obs} < μ_{T}

z_{A} = [ (96.5 - 121) + .5 ] / 14.2 = -24/14.2 = -1.69
z_{B} = [ (134.5 - 110) - .5 ] / 14.2 = +24/14.2 = +1.69
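Here is a minimal Python sketch of the same calculation, including the correction for continuity. The function name z_ratio is hypothetical, and the choice to apply no correction when the observed value exactly equals the mean is an assumption of this sketch, since the text does not address that case.

```python
import math


def z_ratio(t_obs, mu_t, sigma_t):
    """z for an observed sum of ranks, with the +/-.5 correction for continuity."""
    if t_obs > mu_t:
        correction = -0.5
    elif t_obs < mu_t:
        correction = 0.5
    else:
        correction = 0.0   # assumption: no correction when t_obs equals the mean
    return (t_obs - mu_t + correction) / sigma_t


n_a, n_b = 11, 10
n_total = n_a + n_b
sigma_t = math.sqrt(n_a * n_b * (n_total + 1) / 12)

z_a = z_ratio(96.5, n_a * (n_total + 1) / 2, sigma_t)
z_b = z_ratio(134.5, n_b * (n_total + 1) / 2, sigma_t)
print(round(z_a, 2), round(z_b, 2))   # -1.69 1.69
```

The two results differ only in sign, which is why it makes no difference whether you work from the sum of ranks for group A or for group B.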
In our example, the investigators begin with the directional hypothesis that Treatment A will prove the more effective. This entails that sample A will tend to have the smaller values among the raw measures (indicating lower levels of claustrophobic tendency), hence the smaller ranks, hence an observed value of T_{A} smaller than its null-hypothesis value of 121, hence a value of z_{A} with a negative sign. With the opposite directional hypothesis
Level of Significance for a
Directional Test        .05     .025    .01     .005    .0005
Non-Directional Test    --      .05     .02     .01     .001
z_{critical}            1.645   1.960   2.326   2.576   3.291
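To show how the table is read, the following Python sketch looks up the most stringent tabled level that an observed z reaches. The function name significance_level and the list Z_CRITICAL are hypothetical. For a directional test the sketch assumes you have already confirmed that the sign of z falls in the predicted direction; it checks only the magnitude.

```python
# Rows of the table: (critical z, directional level, non-directional level).
Z_CRITICAL = [
    (1.645, 0.05, None),
    (1.960, 0.025, 0.05),
    (2.326, 0.01, 0.02),
    (2.576, 0.005, 0.01),
    (3.291, 0.0005, 0.001),
]


def significance_level(z, directional=True):
    """Return the smallest tabled level at which |z| is significant, or None if none is reached."""
    best = None
    for z_crit, one_tail, two_tail in Z_CRITICAL:
        level = one_tail if directional else two_tail
        if level is not None and abs(z) >= z_crit:
            best = level   # rows run from least to most stringent
    return best


print(significance_level(-1.69, directional=True))    # 0.05
print(significance_level(-1.69, directional=False))   # None
```

With z = -1.69 the result is significant at the .05 level for the directional test, because 1.69 exceeds 1.645, but it falls short of the 1.960 required for a non-directional test at that level.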
Here as well we begin with some things that can be known by dint of sheer logic. Continuing with our claustrophobia example, recall that
This first point will be fairly obvious. The maximum possible value of T_{A} in our example would be the sum of the highest
Similarly, the maximum possible value of T_{B} would be the sum of the highest
For any particular combination of n_{a} and n_{b}, these maximum possible values can be reached through the formulas
T_{A[max]} = n_{a}n_{b} + n_{a}(n_{a}+1)/2
T_{A[max]} = (11)(10) + 11(12)/2 = 176

T_{B[max]} = n_{a}n_{b} + n_{b}(n_{b}+1)/2
T_{B[max]} = (11)(10) + 10(11)/2 = 165
U_{A} = T_{A[max]} - T_{A}
U_{A} = n_{a}n_{b} + n_{a}(n_{a}+1)/2 - T_{A}

U_{B} = T_{B[max]} - T_{B}
U_{B} = n_{a}n_{b} + n_{b}(n_{b}+1)/2 - T_{B}
maximum possible value  observed value  
T_{A}  176  96.5  
T_{B}  165  134.5 
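Putting the maximum possible values and the observed values together, here is a minimal Python sketch of the calculation of U. The function names t_max and u_from_t are hypothetical, introduced only for this illustration.

```python
def t_max(n_self, n_other):
    """Maximum possible sum of ranks for a group of size n_self."""
    return n_self * n_other + n_self * (n_self + 1) / 2


def u_from_t(t_obs, n_self, n_other):
    """U for a group: its maximum possible sum of ranks minus its observed sum of ranks."""
    return t_max(n_self, n_other) - t_obs


n_a, n_b = 11, 10
print(t_max(n_a, n_b), t_max(n_b, n_a))   # 176.0 165.0

u_a = u_from_t(96.5, n_a, n_b)
u_b = u_from_t(134.5, n_b, n_a)
print(u_a, u_b)                           # 79.5 30.5
```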
It does not matter which of these values you use, so long as you are consistent. The reason it does not matter is that U_{A} and U_{B} are mirror images of each other. For any given values of n_{a} and n_{b}, the sum of U_{A} and U_{B} will always be equal to the product of
U_{A} + U_{B} = n_{a}n_{b}
U_{A} = n_{a}n_{b} - U_{B}
U_{B} = n_{a}n_{b} - U_{A}
More generally, the null-hypothesis values of U are given by the identity
U_{A} = U_{B} = (n_{a}n_{b})/2 
U_{A} = U_{B} = (11)(10)/2 = 55 
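The mirror-image relationship can be checked numerically for the example. In this Python sketch the values of U are obtained as the maximum possible sum of ranks for each group (176 and 165) minus the observed sum (96.5 and 134.5); the variable names are illustrative.

```python
n_a, n_b = 11, 10

u_a = 176 - 96.5    # maximum possible T_A minus observed T_A
u_b = 165 - 134.5   # maximum possible T_B minus observed T_B
u_null = n_a * n_b / 2

print(u_a + u_b)       # 110.0, the product of n_a and n_b
print(u_a - u_null)    # 24.5
print(u_null - u_b)    # 24.5
```

U_{A} lies 24.5 above the null-hypothesis value of 55 and U_{B} lies 24.5 below it, so either one carries the same information about the departure from the null hypothesis.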
When I tell you that the total number of possible combinations in this example is
N! / (n_{a}! n_{b}!) = 21! / (11! 10!) = 352,716
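The count of possible combinations can be confirmed with Python's standard library. This short sketch evaluates the factorial expression directly and also through math.comb, which computes the same quantity.

```python
import math

n_a, n_b = 11, 10
n_total = n_a + n_b

# N! / (n_a! n_b!) evaluated directly from factorials.
by_factorials = math.factorial(n_total) // (math.factorial(n_a) * math.factorial(n_b))

# The same count as the number of ways to choose n_a items from N.
by_comb = math.comb(n_total, n_a)

print(by_factorials, by_comb)   # 352716 352716
```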
I cannot show you the full scope of this table, because it is under copyright and I would have to pay a fee to reproduce it. Fortunately the underlying principles cannot be held in copyright, so what I have done instead is program the following table to calculate some of the critical values of U directly. Enter any particular values of n_{a} and n_{b} into the designated cells, click the "Calculate" button, and the corresponding critical values of U will appear in the table. The only restriction is that n_{a} and n_{b} must both be between 5 and 20, inclusive. (For cases where the size of either sample is smaller than 5, you will need to consult a table in the back of some hard
Before you start plugging in any new numbers, however, please be sure to read the explanatory text that follows the table.