©Richard Lowry, 1999-
All rights reserved.
Example 1. Comparative Effects of Two Methods of Hypnotic Induction
X = the score on the index of primary suggestibility
Y = the score on the index of hypnotic induction

| Method A | | | Method B | | |
| Subject | X_{a} | Y_{a} | Subject | X_{b} | Y_{b} |
| a1 | 5 | 20 | b1 | 7 | 19 |
| a2 | 10 | 23 | b2 | 12 | 26 |
| a3 | 12 | 30 | b3 | 27 | 33 |
| a4 | 9 | 25 | b4 | 24 | 35 |
| a5 | 23 | 34 | b5 | 18 | 30 |
| a6 | 21 | 40 | b6 | 22 | 31 |
| a7 | 14 | 27 | b7 | 26 | 34 |
| a8 | 18 | 38 | b8 | 21 | 28 |
| a9 | 6 | 24 | b9 | 14 | 23 |
| a10 | 13 | 31 | b10 | 9 | 22 |
| Means | 13.1 | 29.2 | Means | 18.0 | 28.1 |
Items to be calculated:
SS_{T(Y)}, SS_{wg(Y)}, SS_{bg(Y)}
SS_{T(X)}, SS_{wg(X)}
SC_{T}, SC_{wg}
1. Calculations for the Dependent Variable Y
The following table shows the values of Y along with the several summary statistics required for the calculation of SS_{T(Y)}, SS_{wg(Y)}, and SS_{bg(Y)}:
| | Y_{a} | Y_{b} | Total array |
| Scores | 20 23 30 25 34 40 27 38 24 31 | 19 26 33 35 30 31 34 28 23 22 | |
| N | 10 | 10 | 20 |
| ∑Y_{i} | 292 | 281 | 573 |
| ∑Y_{i}^{2} | 8920 | 8165 | 17085 |
| SS | 393.6 | 268.9 | |
| Mean | 29.2 | 28.1 | 28.7 |

SS_{T(Y)} = 668.5
SS_{wg(Y)} = 662.5
SS_{bg(Y)} = 6.0
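As a check on the arithmetic, the three sums of squares can be sketched in a few lines of Python using the raw-score formula SS = ∑Y² − (∑Y)²/N. The helper name `ss` is my own, not part of the text:

```python
# Y scores for groups A and B, as given in the table above.
Ya = [20, 23, 30, 25, 34, 40, 27, 38, 24, 31]
Yb = [19, 26, 33, 35, 30, 31, 34, 28, 23, 22]

def ss(values):
    """Sum of squared deviations via the raw-score formula:
    sum(Y^2) - (sum(Y))^2 / N."""
    n = len(values)
    return sum(v * v for v in values) - sum(values) ** 2 / n

ss_t = ss(Ya + Yb)        # total: 668.55, shown as 668.5 in the text
ss_wg = ss(Ya) + ss(Yb)   # within-groups: 393.6 + 268.9 = 662.5
ss_bg = ss_t - ss_wg      # between-groups: 6.05, shown as 6.0
```

The same `ss` helper applies unchanged to the X scores in the next table.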
Just as an aside, note the discrepancy between SS_{bg(Y)} = 6.0 [0.9% of the total variability] and SS_{wg(Y)} = 662.5 [99.1% of the total variability]. This reflects the fact mentioned earlier: the mean difference between the groups is quite small in comparison with the variability that exists inside the groups.
The next table has the same structure as the one above, but now what we show are the values of X along with the several summary statistics required for the calculation of SS_{T(X)} and SS_{wg(X)}. (We need not bother with SS_{bg(X)}, since it does not enter into any subsequent calculations.)
| | X_{a} | X_{b} | Total array |
| Scores | 5 10 12 9 23 21 14 18 6 13 | 7 12 27 24 18 22 26 21 14 9 | |
| N | 10 | 10 | 20 |
| ∑X_{i} | 131 | 180 | 311 |
| ∑X_{i}^{2} | 2045 | 3700 | 5745 |
| SS | 328.9 | 460.0 | |
| Mean | 13.1 | 18.0 | 15.6 |

SS_{T(X)} = 908.9
SS_{wg(X)} = 788.9
You will recall from Chapter 3 that the raw measure of the covariance between two variables, X and Y, is a quantity known as the sum of co-deviates:
SC = ∑(X_{i} − M_{X})(Y_{i} − M_{Y})

where (X_{i} − M_{X}) = deviate_{X}, (Y_{i} − M_{Y}) = deviate_{Y}, and their product (X_{i} − M_{X})(Y_{i} − M_{Y}) = co-deviate_{XY}. The equivalent computational formula is

SC = ∑(X_{i}Y_{i}) − (∑X_{i})(∑Y_{i}) / N
We will be calculating two separate values of SC: one for the covariance of X and Y within the total array of data, and another for the covariance within the two groups.
The next table shows the cross-products of X_{i} and Y_{i} for each subject in each of the two groups. (E.g., the first subject in group A had X = 5 and Y = 20, hence a cross-product of 5 x 20 = 100.)
| | Group A: X_{a}Y_{a} | Group B: X_{b}Y_{b} | Total array |
| Cross-products | 100 230 360 225 782 840 378 684 144 403 | 133 312 891 840 540 682 884 588 322 198 | |
| Sums | 4146 | 5390 | 9536 |
| | ∑(X_{ai}Y_{ai}) | ∑(X_{bi}Y_{bi}) | ∑(X_{Ti}Y_{Ti}) |

For group A: ∑X_{ai} = 131, ∑Y_{ai} = 292
For group B: ∑X_{bi} = 180, ∑Y_{bi} = 281
For the total array: ∑X_{Ti} = 311, ∑Y_{Ti} = 573
For the total array:
SC_{T} = ∑(X_{Ti}Y_{Ti}) − (∑X_{Ti})(∑Y_{Ti}) / N_{T}
       = 9536 − (311)(573)/20
       = 625.9
For Group A:
SC_{wg(a)} = ∑(X_{ai}Y_{ai}) − (∑X_{ai})(∑Y_{ai}) / N_{a}
           = 4146 − (131)(292)/10
           = 320.8

For Group B:
SC_{wg(b)} = ∑(X_{bi}Y_{bi}) − (∑X_{bi})(∑Y_{bi}) / N_{b}
           = 5390 − (180)(281)/10
           = 332.0
the sum of which will then yield the within-groups covariance measure of
SC_{wg} = SC_{wg(a)} + SC_{wg(b)}
        = 320.8 + 332.0
        = 652.8
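These co-deviate sums are easy to verify in Python. A minimal sketch, using the computational formula SC = ∑(XY) − (∑X)(∑Y)/N; the helper name `sc` is my own:

```python
# Paired X and Y scores for each group, as given in the data table.
Xa = [5, 10, 12, 9, 23, 21, 14, 18, 6, 13]
Ya = [20, 23, 30, 25, 34, 40, 27, 38, 24, 31]
Xb = [7, 12, 27, 24, 18, 22, 26, 21, 14, 9]
Yb = [19, 26, 33, 35, 30, 31, 34, 28, 23, 22]

def sc(x, y):
    """Sum of co-deviates via the computational formula:
    sum(x*y) - (sum x)(sum y) / N."""
    n = len(x)
    return sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

sc_t = sc(Xa + Xb, Ya + Yb)      # total array: 625.85, shown as 625.9
sc_wg = sc(Xa, Ya) + sc(Xb, Yb)  # within-groups: 320.8 + 332.0 = 652.8
```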
We begin with a summary of the values of SS and SC obtained so far, as these will be needed in the calculations that follow. Recall that Y is the variable in which we are chiefly interested, and X is the covariate whose effects we are seeking to remove.
| X | Y | Covariance |
| SS_{T(X)} = 908.9 | SS_{T(Y)} = 668.5 | SC_{T} = 625.9 |
| SS_{wg(X)} = 788.9 | SS_{wg(Y)} = 662.5 | SC_{wg} = 652.8 |
| | SS_{bg(Y)} = 6.0 | |
From Chapter 3 you know that the overall correlation between X and Y (both groups combined) can be calculated as
r_{T} = SC_{T} / sqrt[SS_{T(X)} x SS_{T(Y)}]
      = 625.9 / sqrt[908.9 x 668.5]
      = +.803
The proportion of the total variability of Y attributable to its covariance with X is accordingly
(r_{T})^{2} = (+.803)^{2} = .645 |
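The correlation and its square follow directly from the summary values carried forward above. A quick sketch:

```python
import math

# Summary values from the preceding tables.
SC_T = 625.9
SS_TX = 908.9   # SS_T(X)
SS_TY = 668.5   # SS_T(Y)

# Overall correlation between X and Y, both groups combined.
r_t = SC_T / math.sqrt(SS_TX * SS_TY)   # about +.803

# Proportion of the total variability of Y attributable
# to its covariance with X.
r_t_sq = r_t ** 2                       # about .645
```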
In the first step of this series, we adjust SS_{T(Y)} by removing from it this proportion of covariance. Given

668.5 x .645 = 431.2

this would tentatively leave

668.5 − 431.2 = 237.3 [tentative]

The computational formula sidesteps the rounding of r_{T} and yields the slightly more precise value:

[adj]SS_{T(Y)} = SS_{T(Y)} − (SC_{T})^{2} / SS_{T(X)}    ["adj" = "adjusted"]
              = 668.5 − (625.9)^{2} / 908.9
              = 237.5
By analogy, the aggregate correlation between X and Y within the two groups can be calculated as
r_{wg} = SC_{wg} / sqrt[SS_{wg(X)} x SS_{wg(Y)}]
       = 652.8 / sqrt[788.9 x 662.5]
       = +.903
The proportion of the within-groups variability of Y attributable to covariance with X is therefore
(r_{wg})^{2} = (+.903)^{2} = .815 |
662.5 x .815 = 539.9
662.5 − 539.9 = 122.6 [tentative]

[adj]SS_{wg(Y)} = SS_{wg(Y)} − (SC_{wg})^{2} / SS_{wg(X)}
               = 662.5 − (652.8)^{2} / 788.9
               = 122.3
The adjusted value of SS_{bg(Y)} can then be obtained through simple subtraction as
[adj]SS_{bg(Y)} = [adj]SS_{T(Y)} − [adj]SS_{wg(Y)}
               = 237.5 − 122.3
               = 115.2
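The whole adjusted partition can be reproduced from the summary values alone. A minimal sketch (variable names are my own):

```python
# Summary values carried forward from the SS and SC tables.
SS_TY, SS_wgY = 668.5, 662.5
SS_TX, SS_wgX = 908.9, 788.9
SC_T, SC_wg = 625.9, 652.8

# Remove from each SS of Y the portion attributable to covariance with X.
adj_ss_t = SS_TY - SC_T ** 2 / SS_TX       # adjusted total: about 237.5
adj_ss_wg = SS_wgY - SC_wg ** 2 / SS_wgX   # adjusted within-groups: about 122.3

# The adjusted between-groups SS falls out by subtraction.
adj_ss_bg = adj_ss_t - adj_ss_wg           # about 115.2
```

Note how the adjustment reverses the earlier picture: the between-groups portion grows from 6.0 to roughly 115.2 once the covariate is removed.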
The adjustment of the group means of Y rests on the pooled within-groups regression slope:

b_{wg} = SC_{wg} / SS_{wg(X)}
       = 652.8 / 788.9
       = +.83
| | M_{X} | M_{Y} |
| group A | 13.1 | 29.2 |
| group B | 18.0 | 28.1 |
| combined | 15.55 | 28.65 |
The mean of X for group A falls 2.45 points below the combined mean (15.55 − 13.1 = 2.45), while the mean for group B falls 2.45 points above it (18.0 − 15.55 = 2.45). Moving each group's mean of Y along the within-groups slope accordingly gives

[adj]M_{Ya} = 29.2 + (2.45 x .83) = 31.23

and

[adj]M_{Yb} = 28.1 − (2.45 x .83) = 26.07
This is the conceptual structure for the adjustment of the means of Y for the groups. Once you have the concept, the mechanics of the process can be accomplished somewhat more expeditiously via the following computational formula:
For Group A:
[adj]M_{Ya} = M_{Ya} − b_{wg}(M_{Xa} − M_{XT})
            = 29.2 − .83(13.1 − 15.55)
            = 31.23

For Group B:
[adj]M_{Yb} = M_{Yb} − b_{wg}(M_{Xb} − M_{XT})
            = 28.1 − .83(18.0 − 15.55)
            = 26.07
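The computational formula translates directly into code. A short sketch (variable names are my own), which keeps the full precision of b_{wg} rather than rounding it to .83:

```python
# Within-groups slope and the group means from the table above.
b_wg = 652.8 / 788.9          # about +.83
M_Ya, M_Xa = 29.2, 13.1       # group A means of Y and X
M_Yb, M_Xb = 28.1, 18.0       # group B means of Y and X
M_XT = 15.55                  # combined mean of X

# Each group's mean of Y is moved along the within-groups slope
# from its own mean of X to the combined mean of X.
adj_M_Ya = M_Ya - b_wg * (M_Xa - M_XT)   # about 31.23
adj_M_Yb = M_Yb - b_wg * (M_Xb - M_XT)   # about 26.07
```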
As with the corresponding one-way ANOVA, the final step in a one-way analysis of covariance involves the calculation of an F-ratio of the general form

F = MS_{effect} / MS_{error} = MS_{bg} / MS_{wg} = [SS_{bg}/df_{bg}] / [SS_{wg}/df_{wg}]
[adj]df_{wg(Y)} = N_{T} − k − 1    [for the present example: 20 − 2 − 1 = 17]
df_{bg(Y)} = k − 1    [for the present example: 2 − 1 = 1]
F = [adj]SS_{bg(Y)}/df_{bg(Y)} / [adj]SS_{wg(Y)}/[adj]df_{wg(Y)}
  = [115.2/1] / [122.3/17]
  = 16.01 [with df = 1, 17]
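A one-line check of the F-ratio from the adjusted sums of squares and their degrees of freedom:

```python
# Adjusted sums of squares and degrees of freedom from above.
adj_ss_bg, adj_ss_wg = 115.2, 122.3
df_bg, df_wg = 1, 17

# F = MS_bg / MS_wg, each MS being SS divided by its df.
F = (adj_ss_bg / df_bg) / (adj_ss_wg / df_wg)   # about 16.01
```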
| df denominator | df numerator = 1 | df numerator = 2 | df numerator = 3 |
| 17 | 4.45 / 8.40 | 3.59 / 6.11 | 3.20 / 5.19 |

[In each cell, the first entry is the critical value of F at the .05 level and the second at the .01 level.] Since the observed F of 16.01 comfortably exceeds 8.40, the difference between the adjusted group means is significant beyond the .01 level.
Recall the contrast between the unadjusted and adjusted means of Y:

M_{Ya} = 29.2 versus M_{Yb} = 28.1
[adj]M_{Ya} = 31.23 versus [adj]M_{Yb} = 26.07
The analysis of covariance has the same underlying assumptions as its parent, the analysis of variance. It also has the same robustness with respect to the non-satisfaction of these assumptions, providing that all groups have the same number of subjects. There is, however, one assumption that the analysis of covariance has in addition, by virtue of its co-descent from correlation and regression—namely, that the slopes of the regression lines for each of the groups considered separately are all approximately the same.
The operative word here is "approximately." Because of random variability, it would rarely happen that two or more samples of bivariate XY values would all end up with precisely the same slope, even though the samples might be drawn from the very same population. And so it is for the slopes of the separate regression lines for our two present samples. They are clearly not precisely the same. The question is, are they close enough to be regarded as reflecting the same underlying relationship between X and Y?