Assumptions of one-way ANOVA for correlated samples (see text of Chapter 15 for details)
~equal-interval scale of measurement
~independence of measures within each group
~normal distribution of source population(s)
~equal variances among groups
~homogeneity of covariance
We have noted several times that the analysis of variance is quite robust with respect to the violation of its assumptions, providing that the k groups are all of the same size. In the correlated-samples ANOVA this provision is always satisfied, since the number of measures in each of the groups is necessarily equal to the number of subjects in the repeated-measures design, or to the number of matched sets in the randomized-blocks design.
Still, there are certain kinds of correlated-samples situations where the violation of one or more assumptions might be so thoroughgoing as to cast doubt on any result produced by an analysis of variance. In cases of this sort, a useful nonparametric alternative can be found in a rank-based procedure known as the Friedman test.
There are two kinds of correlated-samples situations where the advisability of the nonparametric alternative would be fairly obvious. The first would be the case where the k measures for each subject start out as mere rank-orderings.
E.g.: To assess the likely results of an upcoming election, the 30 members of a presumably representative "focus group" of eligible voters are each asked to rank the 3 candidates, A, B, and C, in the order of their preference (1=most preferred, 3=least preferred).

And the second would be the case where these measures start out as mere ratings.
E.g.: The members of the "focus group" are instead asked to rate the candidates on a 10-point scale (1=lowest rating, 10=highest).

In both of these situations the assumption of an equal-interval scale of measurement is clearly not met. There's a good chance that the assumption of a normal distribution of the source population(s) would also not be met. Other cases where the equal-interval assumption will be thoroughly violated include those in which the scale of measurement is intrinsically nonlinear: for example, the decibel scale of sound intensity, the Richter scale of earthquake intensity, or any logarithmic scale.
                 Violin
subjects     A      B      C
   1        9.0    7.0    6.0
   2        9.5    6.5    8.0
   3        5.0    7.0    4.0
   4        7.5    7.5    6.0
   5        9.5    5.0    7.0
   6        7.5    8.0    6.5
   7        8.0    6.0    6.0
   8        7.0    6.5    4.0
   9        8.5    7.0    6.5
  10        6.0    7.0    3.0
I will illustrate the Friedman test with a rating-scale example that is close to my amateur violinist's heart. The venerable auction house of Snootly & Snobs will soon be putting three fine 17th- and 18th-century violins, A, B, and C, up for bidding. A certain musical arts foundation, wishing to determine which of these instruments to add to its collection, arranges to have them played by each of 10 concert violinists. The players are blindfolded, so that they cannot tell which violin is which; and each plays the violins in a randomly determined sequence (BCA, ACB, etc.).
They are not informed that the instruments are classic masterworks; all they know is that they are playing three different violins. After each violin is played, the player rates the instrument on a 10-point scale of overall excellence (1=lowest, 10=highest). The players are told that they can also give fractional ratings, such as 6.2 or 4.5, if they wish. The results are shown in the adjacent table. For the sake of consistency, the n=10 players are listed as "subjects."
¶Logic and Procedure
                Original Measures          Ranked Measures
subjects       A      B      C            A      B      C
   1          9.0    7.0    6.0          3      2      1
   2          9.5    6.5    8.0          3      1      2
   3          5.0    7.0    4.0          2      3      1
   4          7.5    7.5    6.0          2.5    2.5    1
   5          9.5    5.0    7.0          3      1      2
   6          7.5    8.0    6.5          2      3      1
   7          8.0    6.0    6.0          3      1.5    1.5
   8          7.0    6.5    4.0          3      2      1
   9          8.5    7.0    6.5          3      2      1
  10          6.0    7.0    3.0          2      3      1
The Friedman test begins by rank-ordering the measures for each subject. For the present example we will assign the rank of "3" to the largest of a subject's three measures, "2" to the intermediate of the three, and "1" to the smallest. Thus for subject 1, the largest measure is in column A, the next largest in column B, and the smallest in column C; so the sequence of ranks across the row for subject 1 is 3,2,1. For subject 2 it is 3,1,2. And so forth. (The guidelines for assigning tied ranks are described in Subchapter 11a in connection with the Mann-Whitney test.)
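The per-subject ranking step can be sketched in plain Python. The helper below is hypothetical (not part of the text); it assigns average ranks to tied values, matching the tie rule referenced above:

```python
def avg_ranks(row):
    # Rank one subject's measures: 1 = smallest, k = largest;
    # tied values share the average of the ranks they span.
    ordered = sorted(row)
    return tuple(
        sum(i + 1 for i, v in enumerate(ordered) if v == x) / ordered.count(x)
        for x in row
    )

# Ratings (A, B, C) for subjects 1, 4, and 7 from the table above
print(avg_ranks((9.0, 7.0, 6.0)))  # (3.0, 2.0, 1.0)
print(avg_ranks((7.5, 7.5, 6.0)))  # (2.5, 2.5, 1.0) -- A and B tied
print(avg_ranks((8.0, 6.0, 6.0)))  # (3.0, 1.5, 1.5) -- B and C tied
```

Applying this row by row reproduces the "Ranked Measures" columns of the table.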
The null hypothesis in this scenario is that the three violins do not differ with respect to whatever it is that blindfolded expert players would judge to be the overall excellence of an instrument.
              Ranked Measures
subjects     A      B      C
   1         3      2      1
   2         3      1      2
   3         2      3      1
   4         2.5    2.5    1
   5         3      1      2
   6         2      3      1
   7         3      1.5    1.5
   8         3      2      1
   9         3      2      1
  10         2      3      1

sums        26.5   21.0   12.5
means       2.65   2.10   1.25
This would entail that each of the six possible sequences of A,B,C ranks
1,2,3
1,3,2
2,1,3
2,3,1
3,1,2
3,2,1
is equally likely, hence that the three columns will tend to include a random jumble of 1's, 2's, and 3's, in approximately equal proportions. In this case, the sums and the means of the columns would also tend to come out approximately the same.
In most respects you will find the logic of the Friedman test quite similar to that of the Kruskal-Wallis test examined in Subchapter 14a. For any particular value of k (the number of measures per subject), the mean of the ranks for any particular one of the n subjects is (k+1)/2. Thus for k=3, as in the present example, it is 4/2=2; for k=4, it would be 5/2=2.5; and so on. On the null hypothesis, this would also be the expected value of the mean for each of the k columns. Similarly, the expected value for each of the column sums would be this amount multiplied by the number of subjects: n(k+1)/2. For the present example, with n=10, it would be (10)(4)/2=20.
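These null-hypothesis expectations are simple arithmetic; a minimal sketch (hypothetical helper functions, for illustration only):

```python
# Under the null hypothesis, the expected mean rank in each column
# is (k+1)/2 and the expected column sum is n*(k+1)/2.
def expected_mean(k):
    return (k + 1) / 2

def expected_sum(n, k):
    return n * (k + 1) / 2

print(expected_mean(3), expected_sum(10, 3))  # 2.0 20.0
print(expected_mean(4))                       # 2.5
```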
The following items of symbolic notation are the same ones we used in connection with the Kruskal-Wallis test:

  T_{A} = the sum of the n ranks in column A
  M_{A} = the mean of the n ranks in column A
  T_{B} = the sum of the n ranks in column B
  M_{B} = the mean of the n ranks in column B
  T_{C} = the sum of the n ranks in column C
  M_{C} = the mean of the n ranks in column C
  T_{all} = the sum of the nk ranks in all columns combined
            [in all cases equal to nk(k+1)/2]
  M_{all} = the mean of the nk ranks in all columns combined
            [in all cases equal to (k+1)/2]

·The Measure of Aggregate Group Differences
Also the same as in the Kruskal-Wallis test is the measure of the aggregate degree to which the k group means differ. It is the between-groups sum of squared deviates, defined in Subchapter 14a as SS_{bg(R)}, the "(R)" serving as a reminder that this particular version of SS_{bg} is based on ranks. The following table summarizes the values needed for the calculation of SS_{bg(R)}.
              A       B       C      All
counts       10      10      10      30
sums        26.5    21.0    12.5    60.0
means       2.65    2.10    1.25     2.0

[n=10 subjects; k=3 measures per subject; nk=30]
As in all other cases heretofore examined, the squared deviate for any particular group mean is equal to the squared difference between that group mean and the mean of the overall array of data, multiplied by the number of observations on which the group mean is based. Thus, for each of our current three groups:

  A:  10(2.65 - 2.0)^{2} = 4.225
  B:  10(2.10 - 2.0)^{2} = 0.100
  C:  10(1.25 - 2.0)^{2} = 5.625
                          -------
          SS_{bg(R)}     = 9.950
Once again we can write the conceptual formula for SS_{bg(R)} as

  SS_{bg(R)} = Σ[n_{g}(M_{g} - M_{all})^{2}]

As usual, the subscript "g" means "any particular group."

Except now, since each group is necessarily of the same size, it can be reduced to the simpler form

  SS_{bg(R)} = nΣ(M_{g} - M_{all})^{2}     [n = number of subjects]

For the same reason, the computational formula (less susceptible to rounding errors) can take the simpler form

  SS_{bg(R)} = Σ(T_{g})^{2}/n - (T_{all})^{2}/nk     [n = number of subjects; k = measures per subject]

For the present example, with n=10, k=3, and values of T_{g} and T_{all} as indicated above, this comes out as

  SS_{bg(R)} = [(26.5)^{2}+(21.0)^{2}+(12.5)^{2}]/10 - (60)^{2}/30 = 129.95 - 120 = 9.95
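The computational formula can be checked in a few lines of Python; the T values are the rank sums from the table above:

```python
# Between-groups sum of squared deviates from the rank sums:
# SS_bg(R) = sum(T_g^2)/n - T_all^2/(n*k)
n, k = 10, 3
T = [26.5, 21.0, 12.5]        # rank sums for columns A, B, C
T_all = sum(T)                 # 60.0, equal to n*k*(k+1)/2

ss_bg = sum(t ** 2 for t in T) / n - T_all ** 2 / (n * k)
print(round(ss_bg, 2))         # 9.95
```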

·The Sampling Distribution of SS_{bg(R)}
When we examined the Kruskal-Wallis test in Subchapter 14a, we saw that SS_{bg(R)} can be converted into the measure designated as H, which can then be referred to the sampling distribution of chi-square for df=k-1. The same is true of the Friedman test; the only difference is in the details of the conversion. For the Friedman test, the resulting measure is spoken of simply as a value of chi-square and takes the form

  χ^{2} = SS_{bg(R)} / [k(k+1)/12]

which for the present example comes out as

  χ^{2} = 9.95 / [3(3+1)/12] = 9.95 / 1 = 9.95

When k is equal to 3, the application of this "conversion" formula is merely pro forma, for in this case the denominator of the ratio will always come down to 3(4)/12=1, so the resulting value of chi-square will always be equal to SS_{bg(R)}. This, however, will not be so when k is something other than 3. With k=4 the denominator will be 4(5)/12=1.67; with k=5 it will be 5(6)/12=2.5; and so on.
The following graph is borrowed once again from Chapter 8. As you can see, the observed value of χ^{2}=9.95, when referred to the appropriate sampling distribution of chi-square, is significant beyond the .01 level.

[Graph: Theoretical Sampling Distribution of Chi-Square for df=2]
Our musical arts foundation can therefore conclude with considerable confidence that the observed differences among the mean rankings for the three violins reflect something more than mere random variability, something more than mere chance coincidence among the judgments of the expert players.
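The conversion and the significance check can be sketched in plain Python. For df=2 the chi-square survival function has the exact closed form P(X ≥ x) = exp(-x/2), so no statistics library is needed here:

```python
from math import exp

# Convert SS_bg(R) into the Friedman chi-square, then find the
# probability of a value this large under the null hypothesis.
n, k = 10, 3
ss_bg = 9.95

chisq = ss_bg / (k * (k + 1) / 12)   # 9.95 / 1 = 9.95
p = exp(-chisq / 2)                  # exact only for df = k - 1 = 2
print(round(chisq, 2), round(p, 4))  # 9.95 0.0069 -> beyond the .01 level
```

Note that library routines such as SciPy's Friedman test apply a correction for tied ranks, so with ties present (as in this example) they can report a slightly different chi-square than the uncorrected textbook value.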
·An Alternative Computational Formula
Textbook accounts of the Friedman test usually give a different computational formula for chi-square. Its advantage is that it can be (slightly) more convenient to use. Its disadvantage is that it does not give you the faintest idea of just what the measure is measuring. But here it is anyway, just in case you ever need to recognize it.

  χ^{2} = [12 / nk(k+1)]Σ(T_{g})^{2} - 3n(k+1)


As you can see in connection with our present example, the result comes out quite the same either way.

  χ^{2} = [12 / (10)(3)(4)][(26.5)^{2}+(21.0)^{2}+(12.5)^{2}] - (3)(10)(4)
       = (0.1 × 1299.5) - 120 = 9.95
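A quick check of the alternative formula, using the same rank sums as before:

```python
# Alternative textbook formula for the Friedman chi-square:
# chi2 = 12/(n*k*(k+1)) * sum(T_g^2) - 3*n*(k+1)
n, k = 10, 3
T = [26.5, 21.0, 12.5]

chisq = 12 / (n * k * (k + 1)) * sum(t ** 2 for t in T) - 3 * n * (k + 1)
print(round(chisq, 2))   # 9.95, matching the SS_bg(R)-based route
```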


This web site has a page that will perform all steps of the Friedman test, including the rank-ordering of the raw measures.