You can think of this version of the analysis of variance as an extension of the correlated-samples t-test described in Chapter 12. The most conspicuous similarity between the two is in the way the data are arrayed. In the correlated-samples t-test we typically have a certain number of subjects, each measured under two conditions, A and B. Or alternatively, we have a certain number of matched pairs of subjects, with one member of the pair measured under condition A and the other measured under condition B.
It is the same structure with the correlated-samples ANOVA, except that now the number of conditions is three or more: ABC, ABCD, and so forth. When the analysis involves each subject being measured under each of the k conditions, it is sometimes spoken of as a repeated measures or within subjects design. When it involves subjects matched in sets of three for k=3, four for k=4, and so on, with the subjects in each matched set randomly assigned to one or another of the k conditions, it is described as a randomized blocks design. (In this latter case, each set of k matched subjects constitutes a "block.") Thus,
for k=3:
Repeated Measures

Subject        A                         B                         C
   1      subj1 under condition A   subj1 under condition B   subj1 under condition C
   2      subj2 under condition A   subj2 under condition B   subj2 under condition C
   3      subj3 under condition A   subj3 under condition B   subj3 under condition C

And so on. Each row represents one subject measured under each of the k conditions.

Randomized Blocks

Block          A                          B                          C
  1      subj1a under condition A   subj1b under condition B   subj1c under condition C
  2      subj2a under condition A   subj2b under condition B   subj2c under condition C
  3      subj3a under condition A   subj3b under condition B   subj3c under condition C

And so on. Each row includes k matched subjects, each measured under one or another of the k conditions.

In both versions, repeated measures and randomized blocks, the utility of the correlated-samples ANOVA is the same as for the correlated-samples t-test: it is highly effective in removing the extraneous variability that derives from pre-existing individual differences. It is the same point made in Chapter 12. In some cases individual differences might be the very essence of the phenomena that are of interest. But there are also many situations where they are merely irrelevant clutter.
Up to a point, the logic and procedure of the correlated-samples ANOVA are the same as for the independent-samples version described in Chapters 13 and 14. In the independent-samples ANOVA, SS_{T} is analyzed into two complementary components, SS_{bg} and SS_{wg}. Each of the latter, divided by its respective value of df, then yields a value of MS; and these, in turn, yield the F-ratio. The numerator of the ratio, MS_{bg}, reflects the aggregate differences among the means of the k groups of measures; and the denominator, MS_{wg}, reflects the random variability, commonly described as "error," that exists inside the k groups.
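That chain of calculations, from the two SS components to the F-ratio, can be sketched in a few lines of Python (the function name and its arguments are ours, purely for illustration):

```python
def independent_f(ss_bg, ss_wg, k, n_total):
    """F-ratio for an independent-samples ANOVA with k groups.

    df_bg = k - 1 and df_wg = n_total - k, as in Chapters 13 and 14.
    """
    ms_bg = ss_bg / (k - 1)        # between-groups mean square
    ms_wg = ss_wg / (n_total - k)  # within-groups ("error") mean square
    return ms_bg / ms_wg
```

For the SS values that will emerge later in this chapter, independent_f(120.0, 2285.0, 3, 54) comes to about 1.34, a decidedly puny F.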
In most real-life situations, a certain amount of the variability that exists inside the k groups will reflect pre-existing individual differences among the subjects. In the medication experiment described in Chapter 14, for example, it is possible that the prior aversive conditioning had been more effective in some subjects than in others; or that some subjects were more prone to agitation than others; or simply that some were stronger than others and could therefore pull harder. Each of these sources of variability would be entirely extraneous to the question the investigators were aiming to answer. Of course, one way to avoid the extraneous clutter would be to ensure at the outset that all subjects are equally well conditioned, equally prone to agitation, equally strong, and so forth. But that would surely be more easily said than done.
In the correlated-samples ANOVA, the extraneous clutter is not avoided. It is faced head-on, identified, and removed. Whether by repeated measures or randomized blocks, the correlated-samples design allows us to identify the portion of SS_{wg} that is attributable to pre-existing individual differences. This portion, designated as SS_{subjects}, is dropped from the analysis; and the portion that remains, SS_{error}, is then used as the measure of sheer, cussed random variability.
To illustrate the procedures we will take another example involving sensory-motor coordination. The following picture and diagram represent a device known as a rotary pursuit apparatus, often used in psychological research as an instrument for measuring sensory-motor coordination. On its top is a turntable that can be set to rotate at preselected speeds. Toward the edge of the turntable is a metal disk (a) that revolves around the center of the turntable as the latter rotates. Connected to the apparatus is a hand-held stylus (b) with a metal tip. The subject's task is to pursue the revolving disk with the stylus, keeping the stylus in contact with the disk as much as possible. When the two are in contact an electrical circuit is closed, thus providing a record of how well the subject does on the task.
[Photograph of a rotary pursuit apparatus. Courtesy of Lafayette Instruments.]
An investigator is interested in assessing the effects of rhythmic auditory stimulation on performance of the rotary pursuit task. To this end, she has each of 18 randomly selected human subjects perform the task under each of three conditions. In all conditions, the turntable rotates counterclockwise at a constant rate of one rotation per second. In condition A there is no auditory stimulation other than the sound normally made by the apparatus. In the other two conditions the subject hears a periodic clicking sound. In condition B, the click occurs twice per second; and in condition C, six times per second. (The investigator selects these particular frequencies on certain theoretical grounds, which we need not go into.) To obviate the possibility of sequence effects, three of the subjects perform the task in the sequence ABC; three perform it in the sequence ACB; and so on through the remaining four of the six possible sequences: BAC, BCA, CAB, and CBA.
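The six sequences are simply the permutations of the three condition labels, and with 18 subjects each sequence can be assigned to exactly three of them. A quick check in Python:

```python
from itertools import permutations

# All possible orderings of the three conditions A, B, C.
sequences = [''.join(p) for p in permutations('ABC')]
print(sequences)             # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(18 // len(sequences))  # 3 subjects per sequence
```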
The following table shows the measures of how well each of the 18 subjects performed under each of the k=3 conditions. Also shown are the means of the three groups of measures and the mean performance of each individual subject across the three conditions. To emphasize the substantial individual differences that occur among the subjects, they are listed in the order of their respective individual mean levels of performance, from highest to lowest. Immediately below the table is Figure 15.1, showing the pattern of the group means for conditions A, B, and C in graphic form.
Conditions ("cps" = clicks per second)

Subject   A [0 cps]   B [2 cps]   C [6 cps]   Subject Mean
   1         35          39          32          35.3
   2         32          35          31          32.7
   3         33          32          28          31.0
   4         32          32          29          31.0
   5         31          33          26          30.0
   6         29          30          29          29.3
   7         29          31          27          29.0
   8         27          29          27          27.7
   9         27          31          24          27.3
  10         28          27          24          26.3
  11         27          27          23          25.7
  12         27          26          23          25.3
  13         24          29          19          24.0
  14         24          25          19          22.7
  15         17          16          18          17.0
  16         17          15          17          16.3
  17         14          15          12          13.7
  18         13          13          13          13.0

Group
Means       25.9        26.9        23.4
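If you care to check the table's arithmetic, the three columns can be entered as Python lists and the means recomputed directly:

```python
# Performance scores for the 18 subjects under each condition, from the table above.
A = [35, 32, 33, 32, 31, 29, 29, 27, 27, 28, 27, 27, 24, 24, 17, 17, 14, 13]
B = [39, 35, 32, 32, 33, 30, 31, 29, 31, 27, 27, 26, 29, 25, 16, 15, 15, 13]
C = [32, 31, 28, 29, 26, 29, 27, 27, 24, 24, 23, 23, 19, 19, 18, 17, 12, 13]

group_means = [round(sum(g) / len(g), 1) for g in (A, B, C)]
subject_means = [round((a + b + c) / 3, 1) for a, b, c in zip(A, B, C)]
print(group_means)       # [25.9, 26.9, 23.4]
print(subject_means[0])  # 35.3
```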

Figure 15.1. Mean Performance in Conditions A, B, and C


 Please keep in mind that the data in this example are completely imaginary. I do not know whether an actual experiment of this sort would produce a pattern of group means resembling the one shown here.

So here we are again: on the one hand, ostensible differences among the group means; and on the other, the ever-present possibility that the observed "effect" results from nothing more than mere random variability.

Parallels with the independent-samples ANOVA

The first few steps in the analysis are exactly the same as you saw in Chapter 14. Basic preliminary number-crunching on the data in columns A, B, and C of the above table yields the following values of ∑X_{i} and ∑X^{2}_{i} for the three groups and for the total array of data.
              A [0 cps]   B [2 cps]   C [6 cps]   All groups combined
N                18          18          18              54
∑X_{i}          466         485         421            1372
∑X^{2}_{i}    12800       14021       10443           37264
These in turn allow for the calculation of the following sums of squared deviates.

              A [0 cps]   B [2 cps]   C [6 cps]   All groups combined
SS              735.8       952.9       596.3          2405.0

(Each of these four values comes from the computational formula SS = ∑X^{2} − (∑X)^{2}/N, applied to the corresponding column of the preceding table.)
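All four SS values can be verified with the computational formula SS = ∑X² − (∑X)²/N. In Python, with the three columns of the data table entered as lists:

```python
# The three columns of the data table.
A = [35, 32, 33, 32, 31, 29, 29, 27, 27, 28, 27, 27, 24, 24, 17, 17, 14, 13]
B = [39, 35, 32, 32, 33, 30, 31, 29, 31, 27, 27, 26, 29, 25, 16, 15, 15, 13]
C = [32, 31, 28, 29, 26, 29, 27, 27, 24, 24, 23, 23, 19, 19, 18, 17, 12, 13]

def ss(xs):
    """Sum of squared deviates: SS = sum(X**2) - (sum(X))**2 / N."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)

print(round(ss(A), 1), round(ss(B), 1), round(ss(C), 1))  # 735.8 952.9 596.3
print(round(ss(A + B + C), 1))                            # 2405.0
```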

As in the independent-samples ANOVA, the within-groups SS is the sum of the k separate values of SS_{g} (recall that the subscript "g" means "any particular group"). Thus

  SS_{wg} = SS_{a} + SS_{b} + SS_{c}
          = 735.8 + 952.9 + 596.3
          = 2285.0

Similarly, the between-groups SS can then be calculated as

  SS_{bg} = SS_{T} − SS_{wg}
          = 2405.0 − 2285.0
          = 120.0

Here again it is a good idea to check the accuracy of one's calculations up to this point by also fetching SS_{bg} through the computational formula

  SS_{bg} = (∑X_{ai})^{2}/N_{a} + (∑X_{bi})^{2}/N_{b} + (∑X_{ci})^{2}/N_{c} − (∑X_{Ti})^{2}/N_{T}
          = (466)^{2}/18 + (485)^{2}/18 + (421)^{2}/18 − (1372)^{2}/54
          = 120.0
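Both routes to SS_{bg}, subtraction and the computational formula, can be checked in a few lines:

```python
# The three columns of the data table, and the total array.
A = [35, 32, 33, 32, 31, 29, 29, 27, 27, 28, 27, 27, 24, 24, 17, 17, 14, 13]
B = [39, 35, 32, 32, 33, 30, 31, 29, 31, 27, 27, 26, 29, 25, 16, 15, 15, 13]
C = [32, 31, 28, 29, 26, 29, 27, 27, 24, 24, 23, 23, 19, 19, 18, 17, 12, 13]
T = A + B + C

def ss(xs):
    """Sum of squared deviates: SS = sum(X**2) - (sum(X))**2 / N."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)

ss_wg = ss(A) + ss(B) + ss(C)          # 2285.0
ss_bg_subtraction = ss(T) - ss_wg      # SS_T - SS_wg
ss_bg_formula = (sum(A) ** 2 / len(A) + sum(B) ** 2 / len(B)
                 + sum(C) ** 2 / len(C) - sum(T) ** 2 / len(T))
print(round(ss_bg_subtraction, 1), round(ss_bg_formula, 1))  # 120.0 120.0
```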

As illustrated by the following breakdown, the aggregate differences (SS_{bg}) among the means of the three groups are rather tiny in comparison with the variability (SS_{wg}) that occurs inside the groups. If you were performing the analysis as an independent-samples ANOVA, this would not be great news unless you were hoping to find a non-significant result. For that huge value of SS_{wg} would also give you a large value of MS_{wg}, hence a rather puny value of F.

  SS_{T}  = 2405.0
  SS_{bg} =  120.0
  SS_{wg} = 2285.0



As already mentioned, the virtue of the correlated-samples procedure is that it takes the analysis a step further. It has arranged things in advance so that it can identify the portion of SS_{wg} that derives from pre-existing individual differences. And once identified, this portion (in the present case, a very large portion) can be removed. The details of identification and removal will be described in Part 2.
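The details belong to Part 2, but as a preview, here is a sketch of the removal, assuming the standard repeated-measures formula in which SS_{subjects} is computed from each subject's total across the k conditions (the variable names are ours, for illustration):

```python
# The three columns of the data table.
A = [35, 32, 33, 32, 31, 29, 29, 27, 27, 28, 27, 27, 24, 24, 17, 17, 14, 13]
B = [39, 35, 32, 32, 33, 30, 31, 29, 31, 27, 27, 26, 29, 25, 16, 15, 15, 13]
C = [32, 31, 28, 29, 26, 29, 27, 27, 24, 24, 23, 23, 19, 19, 18, 17, 12, 13]
k = 3

# Each subject's total across the k conditions.
subject_totals = [a + b + c for a, b, c in zip(A, B, C)]
n_total = k * len(subject_totals)

# SS_subjects = sum(T_s^2)/k - (sum X)^2/N_T  (standard repeated-measures formula)
ss_subjects = (sum(t * t for t in subject_totals) / k
               - sum(subject_totals) ** 2 / n_total)
ss_error = 2285.0 - ss_subjects  # SS_wg (computed above) minus SS_subjects
print(round(ss_subjects, 1))  # 2181.7
print(round(ss_error, 1))     # 103.3
```

Most of the within-groups variability in this data set is thus traceable to individual differences, leaving only a small residue of genuine "error."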