Chapter 16.
Two-Way Analysis of Variance for Independent Samples
Part 2

• Example 1.

 raw data
                  B = 0 units                        B = 1 unit
 A = 0 units   20.4  17.4  20.0  18.4  24.5     20.5  26.3  26.6  19.8  25.4
               21.0  19.7  22.3  17.3  23.3     28.2  22.6  23.7  22.5  22.6
 A = 1 unit    22.4  19.1  22.4  25.4  26.2     34.1  21.9  32.6  28.5  29.0
               25.1  28.8  21.8  26.3  25.2     25.8  29.0  27.1  25.7  24.4
In order to test the separate and mutual effects of two drugs, A and B, on physiological arousal, researchers randomly and independently sorted 40 laboratory rats into four groups of 10 subjects each. Each group received a certain dosage of drug A (zero units or 1 unit) and a certain dosage of drug B (zero units or 1 unit). The dependent variable was a standard measure of physiological arousal. As in the earlier two-drug illustration, one of the groups served as a control, receiving only an inert placebo containing zero units of A and zero units of B. The adjacent table shows the consequent measures of physiological arousal for each subject in each of the four groups.

Performing the requisite number-crunching on this array yields the summary values shown in the next table. The values pertaining to the four groups of measures in the original table of raw data are subscripted as g1, g2, g3, and g4; those pertaining to the rows are subscripted as r1 and r2; and those pertaining to the columns are subscripted as c1 and c2. The values deriving from the entire array of data (all groups combined) are as usual subscripted "T." It is worth a moment of your time to look back and forth between the table above and the one below to make sure you have a clear sense of where the summary values are coming from.

 summary data
               B = 0 units           B = 1 unit            rows
 A = 0 units   Ng1 = 10              Ng2 = 10              Nr1 = 20
               ∑Xg1 = 204.3          ∑Xg2 = 238.2          ∑Xr1 = 442.5
               ∑X²g1 = 4226.3        ∑X²g2 = 5741.4
 A = 1 unit    Ng3 = 10              Ng4 = 10              Nr2 = 20
               ∑Xg3 = 242.7          ∑Xg4 = 278.1          ∑Xr2 = 520.8
               ∑X²g3 = 5961.34       ∑X²g4 = 7855.3
 columns       Nc1 = 20              Nc2 = 20              NT = 40
               ∑Xc1 = 447.0          ∑Xc2 = 516.3          ∑XT = 963.3
                                                           ∑X²T = 23784.4
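If you would like to check these tallies, the per-group values of N, ∑X, and ∑X² can be reproduced in a few lines of Python. This is only a sketch; the scores and the group labels g1 through g4 are transcribed from the raw-data table above.

```python
# Raw arousal scores transcribed from the table of raw data above.
groups = {
    "g1": [20.4, 17.4, 20.0, 18.4, 24.5, 21.0, 19.7, 22.3, 17.3, 23.3],  # A=0, B=0
    "g2": [20.5, 26.3, 26.6, 19.8, 25.4, 28.2, 22.6, 23.7, 22.5, 22.6],  # A=0, B=1
    "g3": [22.4, 19.1, 22.4, 25.4, 26.2, 25.1, 28.8, 21.8, 26.3, 25.2],  # A=1, B=0
    "g4": [34.1, 21.9, 32.6, 28.5, 29.0, 25.8, 29.0, 27.1, 25.7, 24.4],  # A=1, B=1
}

# For each group, tally N, the sum of the scores, and the sum of the squared scores.
for name, scores in groups.items():
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    print(f"{name}: N = {n}  sum X = {sum_x:.1f}  sum X^2 = {sum_x2:.2f}")
```

Running this reproduces, to rounding, the group entries of the summary table; the row, column, and total entries are just sums of the appropriate pairs of groups.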

Up to this point it is all a tangle of numbers with no immediately discernible structure. It is only when you calculate the various means—for groups, rows, columns, and total array of data—that the pattern becomes visible. These means are given in the next table, along with a plot of the group means analogous to the ones shown in the earlier illustrations.

 means
               B = 0 units      B = 1 unit       rows
 A = 0 units   Mg1 = 20.43      Mg2 = 23.82      Mr1 = 22.13
 A = 1 unit    Mg3 = 24.27      Mg4 = 27.81      Mr2 = 26.04
 columns       Mc1 = 22.35      Mc2 = 25.82      MT = 24.08

 [plot of the group means not reproduced]

The resemblance between the above graph and the plot of our earlier Scenario 2 is no accident. When doing actual research, you must of course take your data as they come. The nice thing about generating illustrative data for a textbook is that you can shape them in any way you want. For this example I have deliberately arranged things so that there will be main effects for the row and column variables, but no interaction effect. The rationale is that understanding the presence of an interaction effect is best arrived at by first understanding its absence.

The next table shows the sums of squared deviates within each of the four groups, as well as for all four groups combined. These values are calculated exactly as they are in the corresponding one-way ANOVA, according to the general computational formula
 SS = ∑X² − (∑X)²/N

 preliminary SS values
               B = 0 units       B = 1 unit
 A = 0 units   SSg1 = 52.44      SSg2 = 67.48
 A = 1 unit    SSg3 = 71.02      SSg4 = 121.37
                                 SST = 585.70
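The computational formula translates directly into code. The sketch below defines a small helper and applies it to each group and to the combined array; the group scores are again transcribed from the raw-data table.

```python
# SS = sum(X^2) - (sum(X))^2 / N, applied to each group and to the total array.
def ss(scores):
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

g1 = [20.4, 17.4, 20.0, 18.4, 24.5, 21.0, 19.7, 22.3, 17.3, 23.3]
g2 = [20.5, 26.3, 26.6, 19.8, 25.4, 28.2, 22.6, 23.7, 22.5, 22.6]
g3 = [22.4, 19.1, 22.4, 25.4, 26.2, 25.1, 28.8, 21.8, 26.3, 25.2]
g4 = [34.1, 21.9, 32.6, 28.5, 29.0, 25.8, 29.0, 27.1, 25.7, 24.4]

for name, g in [("g1", g1), ("g2", g2), ("g3", g3), ("g4", g4)]:
    print(f"SS{name} = {ss(g):.2f}")
print(f"SST = {ss(g1 + g2 + g3 + g4):.2f}")  # SS of all 40 scores combined
```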

Also as in the one-way ANOVA, SSwg is given by the sum of the SS measures within the several groups. In the present case there are four groups; hence

 SSwg = SSg1 + SSg2 + SSg3 + SSg4 = 52.44 + 67.48 + 71.02 + 121.37 = 312.31

The similarity continues for one more step, and then we shift gears. Once you have SST and SSwg, the measure of between-groups SS can be reached through simple subtraction:

 SSbg = SST − SSwg = 585.70 − 312.31 = 273.39

Here again I recommend performing a computational check by calculating SSbg from scratch. The structure is the same as before. The only difference is in the new subscripts: g1, g2, etc. Please pay particularly close attention to this structure, for we will soon be applying it twice again with different casts of characters.

 SSbg = (∑Xg1)²/Ng1 + (∑Xg2)²/Ng2 + (∑Xg3)²/Ng3 + (∑Xg4)²/Ng4 − (∑XT)²/NT
      = (204.3)²/10 + (238.2)²/10 + (242.7)²/10 + (278.1)²/10 − (963.3)²/40
      = 273.39
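Once the group sums are in hand, this from-scratch check is only a few lines of Python (a sketch, with the sums taken from the summary table):

```python
# SSbg from scratch: sum over groups of (sum X)^2 / N, minus (sum XT)^2 / NT.
group_sums = [204.3, 238.2, 242.7, 278.1]   # sum X for g1..g4 (N = 10 each)
total_sum = sum(group_sums)                  # 963.3, with NT = 40

ss_bg = sum(s ** 2 / 10 for s in group_sums) - total_sum ** 2 / 40
print(f"SSbg = {ss_bg:.2f}")  # agrees with the subtraction SST - SSwg = 273.39
```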

• Breaking SSbg into its Component Parts

As indicated earlier, SSbg has three complementary components: one, SSrows, measures the differences among the means of the two or more rows; another, SScols, measures the mean differences among the two or more columns; and the third, SSinteraction, is a measure of the degree to which the row and column variables interact. To save space, we will henceforth refer to the interaction component as SSrxc, the subscription "rxc" being an abbreviation of the conventional shorthand expression "rows by columns."

For practical computational purposes, the simplest way to proceed is to calculate SSrows and SScols, and then subtract those two from SSbg to get the third component, SSrxc. Conceptually, the calculation of SSrows and SScols is exactly like the calculation of SSbg when you are doing it from scratch.

SSrows ~ conceptual

Consider the means of the two rows in our example: Mr1=22.13 and Mr2=26.04. The null hypothesis expects these means to be the same, and that can happen only if they are both equal to MT=24.08. The following conceptual procedure will be a familiar sight by now, so it can be offered without commentary.

                                        row 1     row 2
 observed row mean                      22.13     26.04
 expected row mean                      24.08     24.08
 deviate                                −1.95     +1.96
 squared deviate                         3.80      3.84
 squared deviate weighted
   by number in row (20)                76.0      76.8

Take the sum of these two weighted squared deviates and you have

SSrows = 76.0 + 76.8 = 152.8 [tentative]

SScols ~ conceptual

It is the same logic and procedure for SScols. The null hypothesis expects the means of the columns to be the same, and that can happen only if they are both equal to MT=24.08.

                                        col 1     col 2
 observed column mean                   22.35     25.82
 expected column mean                   24.08     24.08
 deviate                                −1.73     +1.74
 squared deviate                         2.99      3.03
 squared deviate weighted
   by number in column (20)             59.8      60.6

Hence

SScols = 59.8 + 60.6 = 120.4 [tentative]

I have marked both of the above values as "tentative" because calculations that start out with rounded numbers are at risk of accumulating substantial rounding errors. For practical purposes it is better to use the following computational formulas. Both follow the same basic pattern as when you are calculating SSbg from scratch

 SSbg = (∑Xg1)²/Ng1 + (∑Xg2)²/Ng2 + (∑Xg3)²/Ng3 + (∑Xg4)²/Ng4 − (∑XT)²/NT

only now the items to the left of the minus sign pertain not to the individual groups of measures, but to the rows or the columns.

SSrows ~ computational

 SSrows = (∑Xr1)²/Nr1 + (∑Xr2)²/Nr2 − (∑XT)²/NT
        = (442.5)²/20 + (520.8)²/20 − (963.3)²/40
        = 153.27

SScols ~ computational

 SScols = (∑Xc1)²/Nc1 + (∑Xc2)²/Nc2 − (∑XT)²/NT
        = (447.0)²/20 + (516.3)²/20 − (963.3)²/40
        = 120.06

SSrxc ~ computational

Once you have these two components of SSbg, the SS measure of interaction can then be reached through simple subtraction:

 SSrxc = SSbg − SSrows − SScols = 273.39 − 153.27 − 120.06 = 0.06
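All three components of SSbg can be computed together. Here is a self-contained Python sketch using the group, row, column, and total sums from the summary table:

```python
# Row and column SS via the computational formulas, then SSrxc by subtraction.
# Sums are taken from the summary table (N = 10 per group, 20 per row/column).
sum_g = [204.3, 238.2, 242.7, 278.1]   # group sums g1..g4
sum_r = [442.5, 520.8]                  # row sums r1, r2
sum_c = [447.0, 516.3]                  # column sums c1, c2
sum_t, n_t = 963.3, 40

correction = sum_t ** 2 / n_t           # (sum XT)^2 / NT
ss_bg = sum(s ** 2 / 10 for s in sum_g) - correction
ss_rows = sum(s ** 2 / 20 for s in sum_r) - correction
ss_cols = sum(s ** 2 / 20 for s in sum_c) - correction
ss_rxc = ss_bg - ss_rows - ss_cols      # interaction component, by subtraction

print(f"SSrows = {ss_rows:.2f}")
print(f"SScols = {ss_cols:.2f}")
print(f"SSrxc = {ss_rxc:.2f}")
```

Because the sums are used unrounded until the final step, this route avoids the rounding drift that made the conceptual values tentative.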

As promised, the interaction effect in this example is essentially zero. The advantage of the simple subtractive procedure by which we have arrived at this conclusion is that it is quick and easy. The disadvantage is that it does not give the slightest clue of the underlying logic of the process. The following procedure is more cumbersome and more prone to rounding errors, though at the same time more revealing of the inner workings of SSrxc.

SSrxc ~ conceptual

It begins once again with the concept of the null hypothesis. To streamline things a bit, we will first lay out some items of symbolic notation:

 Mg* = the mean of any particular one of the individual groups of measures
 Mr* = the mean of the row to which that group belongs
 Mc* = the mean of the column to which that group belongs

If there is zero interaction between the row and column variables, then the mean of any particular one of the individual groups, Mg*, should be a simple additive combination of Mr* and Mc*. The specific form of the combination is
 [null]Mg* = Mr* + Mc* − MT

(The logic: under the null hypothesis each group mean is the grand mean plus a row effect plus a column effect, MT + (Mr* − MT) + (Mc* − MT); collecting terms gives the formula above.)

Thus, for group 1, which falls in row 1 and column 1:

 [null]Mg1 = Mr1 + Mc1 − MT = 22.13 + 22.35 − 24.08 = 20.40

For group 2, which falls in row 1 and column 2:

 [null]Mg2 = Mr1 + Mc2 − MT = 22.13 + 25.82 − 24.08 = 23.87
And so forth.

Here is the same table of means you saw earlier, except now the result of the calculation of [null]Mg* for each of the four groups appears in brackets beside the observed group mean (20.43, 23.82, etc.).

 means
               B = 0 units          B = 1 unit           rows
 A = 0 units   20.43 [20.40]        23.82 [23.87]        Mr1 = 22.13
 A = 1 unit    24.27 [24.31]        27.81 [27.78]        Mr2 = 26.04
 columns       Mc1 = 22.35          Mc2 = 25.82          MT = 24.08

As you can see, there is only the tiniest bit of difference between the observed group means and the means that would be expected if there were no rows-by-columns interaction.

Here again is that familiar conceptual structure by which you can convert the differences between observed and expected mean values into a meaningful measure of SS:

                                     g1        g2        g3        g4
 observed group mean                20.43     23.82     24.27     27.81
 expected group mean                20.40     23.87     24.31     27.78
 deviate                            +0.03     −0.05     −0.04     +0.03
 squared deviate                    0.0009    0.0025    0.0016    0.0009
 squared deviate weighted
   by number in group (10)          0.009     0.025     0.016     0.009

The sum of these weighted squared deviates comes out to the same SSrxc=0.06 calculated earlier with the simple subtractive procedure.
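The conceptual route can also be scripted. A minimal sketch, using the group, row, column, and total means from the table above:

```python
# Expected group means under the null hypothesis: Mg* = Mr* + Mc* - MT,
# then SSrxc as the sum of the weighted squared deviates.
m_t = 24.08
m_rows = [22.13, 26.04]            # Mr1, Mr2
m_cols = [22.35, 25.82]            # Mc1, Mc2
m_obs = {(0, 0): 20.43, (0, 1): 23.82,   # observed group means, keyed by
         (1, 0): 24.27, (1, 1): 27.81}   # (row index, column index)
n_per_group = 10

ss_rxc = 0.0
for (r, c), observed in m_obs.items():
    expected = m_rows[r] + m_cols[c] - m_t   # [null]Mg* for this cell
    ss_rxc += n_per_group * (observed - expected) ** 2
print(f"SSrxc = {ss_rxc:.2f}")
```

Because it starts from rounded means, this version carries a little rounding error, but it lands on the same SSrxc of about 0.06.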

For those whose memories, like mine, fall short of being photographic, here is a summary of the several SS values we have now calculated for this example:
 Total:            SST = 585.70
 within groups:    SSwg = 312.31
 between groups:   SSbg = 273.39
   rows:           SSrows = 153.27
   columns:        SScols = 120.06
   interaction:    SSrxc = 0.06

df

The following table lists the respective degrees of freedom that are associated with these values of SS. Note that dfrxc, the degrees of freedom for rows-by-columns interaction, is calculated in the same way as for a two-dimensional (rows-by-columns) chi-square test; namely,
 dfrxc = (r−1)(c−1)    where r = number of rows and c = number of columns

All the other df structures are much as you would expect on the basis of previously examined versions of ANOVA. Note that the number of individual groups, or cells, in a rows-by-columns matrix is always equal to the product of r and c, rendered here as "rc." Thus, for the present example, rc = 2×2 = 4.

                             in general            for the present example
 Total                       dfT = NT − 1          40 − 1 = 39
 within groups (error)       dfwg = NT − rc        40 − (2)(2) = 36
 between groups              dfbg = rc − 1         (2)(2) − 1 = 3
   rows                      dfrows = r − 1        2 − 1 = 1
   columns                   dfcols = c − 1        2 − 1 = 1
   interaction               dfrxc = (r−1)(c−1)    (2−1)(2−1) = 1

 Note that dfT = dfwg + dfbg, and that dfbg = dfrows + dfcols + dfrxc.

MS

As in previous versions of ANOVA, the relevant values of MS are in each case given by the ratio SS/df. Thus, for rows, columns, and interaction:

 MSrows = SSrows/dfrows = 153.27/1 = 153.27
 MScols = SScols/dfcols = 120.06/1 = 120.06
 MSrxc = SSrxc/dfrxc = 0.06/1 = 0.06
These are the values of MS that will appear in the numerators of the three F-ratios that will complete the analysis. The denominator in each case will be the error term,
 MSerror = SSwg/dfwg = 312.31/36 = 8.68

F

And here are the three bottom lines of the analysis:

 Frows = MSrows/MSerror = 153.27/8.68 = 17.67    with df = 1,36
 Fcols = MScols/MSerror = 120.06/8.68 = 13.84    with df = 1,36
 Frxc = MSrxc/MSerror = 0.06/8.68 = 0.01         with df = 1,36
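Putting the last two steps together in code (SS and df values carried over from the summaries above):

```python
# MS = SS/df for each source; each F-ratio is MS_effect / MS_error.
ss = {"rows": 153.27, "cols": 120.06, "rxc": 0.06, "error": 312.31}
df = {"rows": 1, "cols": 1, "rxc": 1, "error": 36}

ms = {k: ss[k] / df[k] for k in ss}       # mean squares
for effect in ("rows", "cols", "rxc"):
    f_ratio = ms[effect] / ms["error"]
    print(f"F{effect} = {f_ratio:.2f}  with df = {df[effect]},{df['error']}")
```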

Figure 16.1 shows the sampling distribution of F for df=1,36, and the adjacent table shows the corresponding portion of Appendix D. As indicated, F=4.11 and F=7.40 mark the points in this distribution beyond which fall 5% and 1%, respectively, of all possible mere-chance outcomes, assuming the null hypothesis to be true.

 Figure 16.1. Sampling Distribution of F for df = 1,36
 [graph not reproduced]

 Portion of Appendix D:
 df                          df numerator
 denominator           1              2              3
 36                4.11 / 7.40    3.26 / 5.25    2.87 / 4.38
 (first value in each pair: 5% critical value; second: 1% critical value)

Clearly our minuscule value of Frxc=0.01 for rows-by-columns interaction falls nowhere near what would be needed for significance even at the basic .05 level. The values for the two main effects, however (Frows=17.67 and Fcols=13.84), are both significant well beyond the .01 level.

The fundamental meaning of the significant row and column effects is that the difference between the two row means (22.13 vs 26.04) and the difference between the two column means (22.35 vs 25.82) each reflect something more than mere random variability. In the present example, where there is essentially zero interaction between the row and column variables, the interpretation of these two main effects would be entirely straightforward: 1 unit of A produces greater arousal than zero units of A; and 1 unit of B produces greater arousal than zero units of B.

                  col 1 [B=0]    col 2 [B=1]    row means
 row 1 [A=0]                                    22.13
 row 2 [A=1]                                    26.04
 column means     22.35          25.82
However, do keep in mind that the row means for the two levels of drug A are measured across the two levels of drug B, and that the column means for the two levels of drug B are measured across the two levels of drug A. As we will see in Example 2, this rows-by-columns complexity can make the interpretation of the main effects considerably less obvious when the two independent variables are interacting.

But for the present example it is all plain and simple. Each of the two drugs appears to increase arousal, and there is no indication that they interact with each other. When presented in combination, their effects are merely additive.

• ANOVA Summary Table
 Source                   SS        df      MS        F        P
 between groups          273.39      3
   rows                  153.27      1    153.27    17.67    <.01
   columns               120.06      1    120.06    13.84    <.01
   interaction             0.06      1      0.06     0.01     ns
 within groups (error)   312.31     36      8.68
 TOTAL                   585.70     39
 ("ns" = non-significant)
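As a final check, the entire analysis can be run end-to-end from the raw scores. The sketch below is plain Python rather than a library routine, written for a balanced 2×2 design with the four groups laid out as in the raw-data table; the function name is of course just an illustration.

```python
def two_way_anova_2x2(g1, g2, g3, g4):
    """Balanced 2x2 ANOVA. Layout: g1=(row1,col1), g2=(row1,col2),
    g3=(row2,col1), g4=(row2,col2), with equal N per group."""
    n = len(g1)                                  # per-group N
    cells = [g1, g2, g3, g4]
    all_x = [x for g in cells for x in g]
    corr = sum(all_x) ** 2 / len(all_x)          # correction term (sum XT)^2 / NT

    ss_t = sum(x * x for x in all_x) - corr
    ss_bg = sum(sum(g) ** 2 / n for g in cells) - corr
    ss_wg = ss_t - ss_bg
    ss_rows = (sum(g1 + g2) ** 2 + sum(g3 + g4) ** 2) / (2 * n) - corr
    ss_cols = (sum(g1 + g3) ** 2 + sum(g2 + g4) ** 2) / (2 * n) - corr
    ss_rxc = ss_bg - ss_rows - ss_cols

    ms_error = ss_wg / (4 * n - 4)               # dfwg = NT - rc
    return {                                     # each effect has df = 1 here
        "F_rows": ss_rows / ms_error,
        "F_cols": ss_cols / ms_error,
        "F_rxc": ss_rxc / ms_error,
    }

result = two_way_anova_2x2(
    [20.4, 17.4, 20.0, 18.4, 24.5, 21.0, 19.7, 22.3, 17.3, 23.3],
    [20.5, 26.3, 26.6, 19.8, 25.4, 28.2, 22.6, 23.7, 22.5, 22.6],
    [22.4, 19.1, 22.4, 25.4, 26.2, 25.1, 28.8, 21.8, 26.3, 25.2],
    [34.1, 21.9, 32.6, 28.5, 29.0, 25.8, 29.0, 27.1, 25.7, 24.4],
)
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```

Because nothing is rounded until the printout, the F-ratios agree with the summary table above to two decimal places.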

• End of Chapter 16, Part 2.