The first five columns of the following table are a somewhat streamlined version of the data table you saw in Part 1. The means for the individual subjects are now highlighted in blue, and next to each subject's mean, in a new column, is a calculation whose general form you will surely recognize from Chapter 14. The difference between each subject's mean and the mean of the total array of data is taken as a deviate:

    deviate = M_subj* − M_T

(We will use the subscript "subj*" to mean "any particular subject." The first subject would be subj1; the second, subj2; and so on.)
The square of that difference is accordingly a squared deviate:

    squared deviate = (M_subj* − M_T)²

And that squared deviate is then weighted by k = 3, which is the number of measures (X_ai, X_bi, and X_ci) on which the subject's mean is based:

    weighted squared deviate = k(M_subj* − M_T)²
It is the same general structure we used in Chapter 14 to calculate the value of SS_bg from scratch, to measure the aggregate differences among the k groups. Now what we come out with is SS_subj, which is a measure of the aggregate differences among the 18 individual subjects.
subject |  A |  B |  C | M_subj | 3(M_subj − 25.4)²
--------|----|----|----|--------|-----------------------
   1    | 35 | 39 | 32 |  35.3  | 3(35.3−25.4)² = 294.0
   2    | 32 | 35 | 31 |  32.7  | 3(32.7−25.4)² = 159.9
   3    | 33 | 32 | 28 |  31.0  | 3(31.0−25.4)² = 94.1
   4    | 32 | 32 | 29 |  31.0  | 3(31.0−25.4)² = 94.1
   5    | 31 | 33 | 26 |  30.0  | 3(30.0−25.4)² = 63.5
   6    | 29 | 30 | 29 |  29.3  | 3(29.3−25.4)² = 45.6
   7    | 29 | 31 | 27 |  29.0  | 3(29.0−25.4)² = 38.9
   8    | 27 | 29 | 27 |  27.7  | 3(27.7−25.4)² = 15.9
   9    | 27 | 31 | 24 |  27.3  | 3(27.3−25.4)² = 10.8
  10    | 28 | 27 | 24 |  26.3  | 3(26.3−25.4)² = 2.4
  11    | 27 | 27 | 23 |  25.7  | 3(25.7−25.4)² = 0.3
  12    | 27 | 26 | 23 |  25.3  | 3(25.3−25.4)² = 0.0
  13    | 24 | 29 | 19 |  24.0  | 3(24.0−25.4)² = 5.9
  14    | 24 | 25 | 19 |  22.7  | 3(22.7−25.4)² = 21.9
  15    | 17 | 16 | 18 |  17.0  | 3(17.0−25.4)² = 211.7
  16    | 17 | 15 | 17 |  16.3  | 3(16.3−25.4)² = 248.4
  17    | 14 | 15 | 12 |  13.7  | 3(13.7−25.4)² = 410.7
  18    | 13 | 13 | 13 |  13.0  | 3(13.0−25.4)² = 461.3
        |    |    |    | M_T = 25.4 | SS_subj = 2179.4
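For readers who want to verify the arithmetic, the conceptual calculation above can be sketched in Python (an illustrative sketch, not the text's procedure; the variable names are mine). Note that carrying full precision gives SS_subj = 2181.7 rather than the 2179.4 shown in the table; the small difference comes from rounding the subject means and M_T to one decimal place.

```python
# Conceptual route to SS_subj: weight each subject's squared deviate by k.
# Data are the A, B, C columns of the table above, one entry per subject.
a = [35, 32, 33, 32, 31, 29, 29, 27, 27, 28, 27, 27, 24, 24, 17, 17, 14, 13]
b = [39, 35, 32, 32, 33, 30, 31, 29, 31, 27, 27, 26, 29, 25, 16, 15, 15, 13]
c = [32, 31, 28, 29, 26, 29, 27, 27, 24, 24, 23, 23, 19, 19, 18, 17, 12, 13]

k = 3                                                   # measures per subject
means = [(x + y + z) / k for x, y, z in zip(a, b, c)]   # M_subj for each subject
m_t = sum(a + b + c) / len(a + b + c)                   # grand mean M_T

ss_subj = sum(k * (m - m_t) ** 2 for m in means)        # Σ k(M_subj − M_T)²
print(round(m_t, 1), round(ss_subj, 1))                 # 25.4 2181.7
```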
I show you this particular method for calculating SS_subj only because it gives a clear idea of the conceptual structure of the measure. It would not be the method to use for practical computational purposes, as it is likely to end up with a considerable accumulation of rounding error. Also, if you are doing it by hand, it can be rather laborious. The following method is somewhat less laborious and will in any event minimize the risk of rounding error. You will recognize it as analogous to the computational formula for SS_bg that we worked through in Part 1; namely

    SS_bg = (∑X_ai)²/N_a + (∑X_bi)²/N_b + (∑X_ci)²/N_c − (∑X_Ti)²/N_T
Except now the string of items to the left of the minus sign is replaced by another string that looks like this:

    (∑X_subj1)²/k + (∑X_subj2)²/k + (∑X_subj3)²/k + . . .

and so on, for as many subjects as you have in the analysis.
That is: For each subject in the analysis, take the sum of that subject's scores across all k conditions, square that sum, and then divide the result by k, which is the number of scores on which the sum is based. Since k is the same for each subject, this portion of the formula to the left of the minus sign can be rewritten as

    ∑(∑X_subj*)²/k

(That is: for each subject in the analysis, take the sum of that subject's scores across all k conditions, square that sum, add up these squared sums across all subjects, and then divide the total of squared sums by k.)
and the entire formula can then be written more compactly as

    SS_subj = ∑(∑X_subj*)²/k − (∑X_Ti)²/N_T
The next version of our data table shows the sum for each subject, along with the calculation of the square of each subject's sum.
subject |  A |  B |  C | ∑X_subj | (∑X_subj)²
--------|----|----|----|---------|---------------
   1    | 35 | 39 | 32 |   106   | 106² = 11236
   2    | 32 | 35 | 31 |    98   |  98² = 9604
   3    | 33 | 32 | 28 |    93   |  93² = 8649
   4    | 32 | 32 | 29 |    93   |  93² = 8649
   5    | 31 | 33 | 26 |    90   |  90² = 8100
   6    | 29 | 30 | 29 |    88   |  88² = 7744
   7    | 29 | 31 | 27 |    87   |  87² = 7569
   8    | 27 | 29 | 27 |    83   |  83² = 6889
   9    | 27 | 31 | 24 |    82   |  82² = 6724
  10    | 28 | 27 | 24 |    79   |  79² = 6241
  11    | 27 | 27 | 23 |    77   |  77² = 5929
  12    | 27 | 26 | 23 |    76   |  76² = 5776
  13    | 24 | 29 | 19 |    72   |  72² = 5184
  14    | 24 | 25 | 19 |    68   |  68² = 4624
  15    | 17 | 16 | 18 |    51   |  51² = 2601
  16    | 17 | 15 | 17 |    49   |  49² = 2401
  17    | 14 | 15 | 12 |    41   |  41² = 1681
  18    | 13 | 13 | 13 |    39   |  39² = 1521
        |    |    |    |         | ∑(∑X_subj*)² = 111122
Recalling that ∑X_Ti = 1372 and N_T = 54, our calculation of SS_subj is accordingly

    SS_subj = ∑(∑X_subj*)²/k − (∑X_Ti)²/N_T

            = 111122/3 − (1372)²/54

            = 2181.7
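The computational formula lends itself to a quick machine check (a sketch; the variable names are mine, and the subject sums are taken from the table above):

```python
# SS_subj via the computational formula: Σ(subject sum)²/k minus the
# correction term (ΣX_T)²/N_T.
sums = [106, 98, 93, 93, 90, 88, 87, 83, 82, 79, 77, 76,
        72, 68, 51, 49, 41, 39]
k, n_t = 3, 54
total = sum(sums)                                        # ΣX_T = 1372
ss_subj = sum(s ** 2 for s in sums) / k - total ** 2 / n_t
print(total, round(ss_subj, 1))                          # 1372 2181.7
```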
As always, one must be careful not to get so lost in the calculations as to lose sight of the concepts. What we have just calculated with SS_subj is the amount of variability within the total array of data that derives from individual differences among the subjects. And now that this amount has been identified, it can be removed. The process of removal is one of simple subtraction. The total amount of within-groups variability is SS_wg = 2285.0. Of this amount, SS_subj = 2181.7 is attributable not to sheer, cussed random variability, but to extraneous pre-existing individual differences among the 18 subjects. Remove the latter from the former, and what remains is

    SS_error = SS_wg − SS_subj = 2285.0 − 2181.7 = 103.3
Here is one of the same diagrams you saw in Part 1, except now the specific numerical values have been included for the several values of SS. The bottom line is that a huge chunk of SS_wg is attributable to individual differences among the subjects, and that chunk is now removed. From this point on, it is all a smooth ride across familiar terrain.
Calculation of MS_bg, MS_error, and F
We begin by sorting out the several components of degrees of freedom. The first three are the same as we would have with a one-way independent-samples ANOVA:

    df_T = N_T − 1 = 54 − 1 = 53

    df_bg = k − 1 = 3 − 1 = 2

and

    df_wg = N_T − k = 54 − 3 = 51

(Note that df_T = df_bg + df_wg.)
The only difference is that now we break df_wg into two further components, corresponding respectively to the two components of within-groups variability, SS_subj and SS_error. For df_subj the basic concept is once again "the number of items minus one." In this case the number of items is the number of subjects, which we will designate as N_subj. Thus

    df_subj = N_subj − 1 = 18 − 1 = 17

Remove this component from df_wg, and what remains is

    df_error = df_wg − df_subj = 51 − 17 = 34
With these, we can then calculate the two relevant MS values as

    MS_bg = SS_bg/df_bg = 120.0/2 = 60.0

and

    MS_error = SS_error/df_error = 103.3/34 = 3.0

which in turn give us

    F = MS_bg/MS_error = 60.0/3.0 = 20.0, with df = 2, 34
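The whole chain from SS_error to F can be traced in a few lines (a sketch; note that the text rounds MS_error to 3.0 before forming the F ratio, and the code follows suit):

```python
# From the SS and df components to F, following the rounding used in the text.
ss_bg, ss_wg, ss_subj = 120.0, 2285.0, 2181.7
df_bg, df_wg, df_subj = 2, 51, 17

ss_error = ss_wg - ss_subj                 # 103.3
df_error = df_wg - df_subj                 # 34

ms_bg = ss_bg / df_bg                      # 60.0
ms_error = round(ss_error / df_error, 1)   # 103.3/34 = 3.04, shown as 3.0
f_ratio = round(ms_bg / ms_error, 1)
print(f_ratio)                             # 20.0
```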
Figure 15.2 shows the sampling distribution of F for df = 2, 34, and the adjacent table shows the corresponding portion of Appendix D. As indicated, F = 3.28 and F = 5.29 mark the points in this distribution beyond which fall 5% and 1%, respectively, of all possible mere-chance outcomes, assuming the null hypothesis to be true.
Figure 15.2. Sampling Distribution of F for df=2,34
df denominator |          df numerator
               |      1      |      2      |      3
---------------|-------------|-------------|-------------
      34       | 4.13   7.44 | 3.28   5.29 | 2.88   4.42

(In each cell, the first value marks the 5% point and the second the 1% point.)
As the observed value of F=20.0 falls far to the right of F=5.29, the aggregate differences among the means of the three groups of measures, A|B|C, can be adjudged significant well beyond the .01 level. Our investigator can accordingly reject the null hypothesis, concluding with a high degree of confidence that the aggregate mean differences among the three groups of measures stem from something more than mere chance coincidence.
ANOVA Summary Table

Source                    |    SS   | df |  MS  |  F   |  P
--------------------------|---------|----|------|------|------
between groups ("effect") |  120.0  |  2 | 60.0 | 20.0 | <.01
within groups             | 2285.0  | 51 |      |      |
  · error                 |  103.3  | 34 |  3.0 |      |
  · subjects              | 2181.7  | 17 |      |      |
TOTAL                     | 2405.0  | 53 |      |      |
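As a final check, every SS entry in the summary table can be recomputed from the raw scores (again a sketch, using the same data as the tables above; the variable names are mine):

```python
# Recompute the summary-table SS values from the raw scores.
a = [35, 32, 33, 32, 31, 29, 29, 27, 27, 28, 27, 27, 24, 24, 17, 17, 14, 13]
b = [39, 35, 32, 32, 33, 30, 31, 29, 31, 27, 27, 26, 29, 25, 16, 15, 15, 13]
c = [32, 31, 28, 29, 26, 29, 27, 27, 24, 24, 23, 23, 19, 19, 18, 17, 12, 13]

scores = a + b + c
k, n_t = 3, len(scores)                                  # k = 3, N_T = 54
correction = sum(scores) ** 2 / n_t                      # (ΣX_T)²/N_T

ss_total = sum(x ** 2 for x in scores) - correction
ss_bg = sum(sum(g) ** 2 / len(g) for g in (a, b, c)) - correction
ss_wg = ss_total - ss_bg
ss_subj = sum((x + y + z) ** 2 for x, y, z in zip(a, b, c)) / k - correction
ss_error = ss_wg - ss_subj

print([round(v, 1) for v in (ss_bg, ss_wg, ss_subj, ss_error, ss_total)])
# [120.0, 2285.0, 2181.7, 103.3, 2405.0]
```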
The assumptions and step-by-step computational procedures for the one-way correlated-samples ANOVA will be outlined in Part 3 of this chapter. It is also in Part 3 that we will see how the Tukey HSD test can be adapted to perform pairwise comparisons among the individual group means in a correlated-samples ANOVA.