©Richard Lowry, 1999-
All rights reserved.
Among the states near the top of the list in 1993 (verbal and math SAT averages combined) were Iowa, weighing in at 1103; North Dakota, at 1101; South Dakota, at 1060; and Kansas, at 1042. And down near the bottom were the oft-maligned "rust belt" states of the northeast: Connecticut, at 904; Massachusetts, at 903; New Jersey, at 892; and New York, more than 200 points below Iowa, at 887. You can easily imagine the joy in Des Moines and Topeka that day, and the despair in Trenton and Albany. For surely the implication is clear: The state educational systems in Iowa, North Dakota, South Dakota, and Kansas must have been doing something right, while those in Connecticut, Massachusetts, New Jersey, and New York must have been doing something not so right.
Powell, B., & Steelman, L. C. "Bewitched, bothered, and bewildering: The uses and abuses of SAT and ACT scores." Harvard Educational Review, 66(1), 27-54.
See also Powell, B., & Steelman, L. C. "Variations in state SAT performance: Meaningful or misleading?" Harvard Educational Review, 54(4), 389-412.
State | Percentage taking SAT | Average SAT score |
Iowa | 5 | 1103 |
North Dakota | 6 | 1101 |
South Dakota | 6 | 1060 |
Kansas | 9 | 1042 |
Connecticut | 88 | 904 |
Massachusetts | 81 | 903 |
New Jersey | 76 | 892 |
New York | 74 | 887 |
State | X_{i}: Percentage taking SAT | Y_{i}: Average SAT score |
1 | X_{1} | Y_{1} |
2 | X_{2} | Y_{2} |
... | ... | ... |
49 | X_{49} | Y_{49} |
50 | X_{50} | Y_{50} |
The next step in bivariate coordinate plotting is to lay out two axes at right angles to each other. By convention, the horizontal axis is assigned to the X variable and the vertical axis to the Y variable, with values of X increasing from left to right and values of Y increasing from bottom to top.
In Figure 3.1b the X axis does begin at zero, because any value much larger than that would lop off the lower end of the distribution of X_{i} values; whereas the Y axis begins at 800, because the lowest observed value of Y_{i} is 838.
Toggle!
Actually, in this particular example there are two somewhat different patterns that the 50 state data points could be construed as fitting. The first is the pattern delineated by the solid, downward-slanting straight line, and the second is the one marked out by the dotted and mostly downward-sloping curved line that you will see if you click the line labeled "Toggle!" [Click "Toggle!" again to return to the straight line.]
Pair | X_{i} | Y_{i} |
a | 1 | 6 |
b | 2 | 2 |
c | 3 | 4 |
d | 4 | 10 |
e | 5 | 12 |
f | 6 | 8 |
As it happens, this line of best fit will in every instance pass through the point at which the mean of X and the mean of Y intersect on the graph. In the present example, the mean of X is 3.5 and the mean of Y is 7.0. Their point of intersection occurs at the convergence of the two dotted gray lines.
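The claim about the means can be checked numerically for the six pairs a through f. The following is only a sketch, and it rests on an assumption not spelled out at this point in the text: that "line of best fit" refers to the ordinary least-squares line. The slope and intercept are obtained from the standard normal-equation formulas, and the variable names are purely illustrative.

```python
# Six XY pairs from the example (pairs a through f).
xs = [1, 2, 3, 4, 5, 6]
ys = [6, 2, 4, 10, 12, 8]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Assumption: the line of best fit is the ordinary least-squares line,
# solved here directly from the normal equations.
denominator = n * sum_x2 - sum_x ** 2
slope = (n * sum_xy - sum_x * sum_y) / denominator
intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denominator

mean_x = sum_x / n
mean_y = sum_y / n

# Height of the fitted line at the mean of X; it should match the mean of Y.
predicted_at_mean_x = intercept + slope * mean_x

print("mean of X:", mean_x)
print("mean of Y:", mean_y)
print("line height at mean of X:", predicted_at_mean_x)
```

Neither the slope nor the intercept is computed from the means here, so the agreement between the last two printed values is a genuine check rather than something built in by construction.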
r = (observed covariance) / (maximum possible positive covariance)
For any n numerical values, a, b, c, etc., the geometric mean is the n^{th} root of the product of those values. Thus, the geometric mean of a and b would be the square root of a x b; the geometric mean of a, b and c would be the cube root of a x b x c; and so on.
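As a small illustration of this definition, the sketch below computes a geometric mean in Python. The function name `geometric_mean` and the sample values are hypothetical, chosen only for the example, and the sketch assumes all values are positive (as variances are).

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    product = math.prod(values)          # a x b x c x ...
    return product ** (1 / len(values))  # n-th root of the product


# Two values: the square root of 4 x 9.
print(geometric_mean([4, 9]))

# Three values: the cube root of 2 x 4 x 8 (may differ from 4 by
# a tiny floating-point rounding error).
print(geometric_mean([2, 4, 8]))
```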
r = (observed covariance) / sqrt[(variance_{X}) x (variance_{Y})]
Recall that "sqrt" means "the square root of."
In order to get from the formula above to the one below, you will need to recall that the variance (s^{2}) of a set of values is simply the average of the squared deviates: SS/N.
r = SC_{XY} / sqrt[SS_{X} x SS_{Y}]
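The step from the variance form to the SS form can be written out as follows. This is a sketch that assumes, by analogy with variance = SS/N, that the observed covariance is the average of the co-deviates, SC_{XY}/N; under that assumption the N's cancel.

```latex
\begin{aligned}
r &= \frac{\text{covariance}_{XY}}{\sqrt{\text{variance}_{X} \times \text{variance}_{Y}}}
   = \frac{SC_{XY}/N}{\sqrt{(SS_{X}/N) \times (SS_{Y}/N)}} \\[4pt]
  &= \frac{SC_{XY}/N}{\sqrt{SS_{X} \times SS_{Y}}\,/\,N}
   = \frac{SC_{XY}}{\sqrt{SS_{X} \times SS_{Y}}}
\end{aligned}
```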
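To understand this kinship, recall from Chapter 2 precisely what is meant by the term "deviate."
For any particular item in a set of measures of the variable X, deviate_{X} = X_{i} - M_{X}, where M_{X} is the mean of the X values. Similarly, for any particular item in a set of measures of the variable Y, deviate_{Y} = Y_{i} - M_{Y}, where M_{Y} is the mean of the Y values.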
As you have probably already guessed, a co-deviate pertaining to a particular pair of XY values involves the deviate_{X} of the X_{i} member of the pair and the deviate_{Y} of the Y_{i} member of the pair. The specific way in which these two are joined to form the co-deviate is
co-deviate_{XY} = (deviate_{X}) x (deviate_{Y})
And finally, the analogy between a co-deviate and a squared deviate:
For a value of X_{i}, the squared deviate is (deviate_{X}) x (deviate_{X}). For a value of Y_{i} it is (deviate_{Y}) x (deviate_{Y}). And for a pair of X_{i} and Y_{i} values, the co-deviate is (deviate_{X}) x (deviate_{Y}).
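To make the analogy concrete, the sketch below forms the deviates, squared deviates, and co-deviates for the six example pairs a through f. It assumes that SS_{X} and SS_{Y} are the sums of the squared deviates and that SC_{XY} is the corresponding sum of the co-deviates; the variable names are illustrative only.

```python
# Six XY pairs from the example (pairs a through f).
xs = [1, 2, 3, 4, 5, 6]
ys = [6, 2, 4, 10, 12, 8]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# A deviate is an item's distance from the mean of its own variable.
deviates_x = [x - mean_x for x in xs]
deviates_y = [y - mean_y for y in ys]

# Squared deviate: a deviate multiplied by itself.
# Co-deviate: the X deviate multiplied by the Y deviate of the same pair.
squared_x = [dx * dx for dx in deviates_x]
squared_y = [dy * dy for dy in deviates_y]
co_deviates = [dx * dy for dx, dy in zip(deviates_x, deviates_y)]

# Assumption: SS and SC are the sums of these quantities.
ss_x = sum(squared_x)
ss_y = sum(squared_y)
sc_xy = sum(co_deviates)

print("co-deviates:", co_deviates)
print("SS_X =", ss_x, " SS_Y =", ss_y, " SC_XY =", sc_xy)
```

Note that a co-deviate, unlike a squared deviate, can be negative: it is positive when both members of a pair fall on the same side of their respective means, and negative when they fall on opposite sides.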
r = (observed covariance) / (maximum possible positive covariance)
r = SC_{XY} / sqrt[SS_{X} x SS_{Y}]
Pair | X_{i} | Y_{i} | X_{i}^{2} | Y_{i}^{2} | X_{i}Y_{i} |
a | 1 | 6 | 1 | 36 | 6 |
b | 2 | 2 | 4 | 4 | 4 |
c | 3 | 4 | 9 | 16 | 12 |
d | 4 | 10 | 16 | 100 | 40 |
e | 5 | 12 | 25 | 144 | 60 |
f | 6 | 8 | 36 | 64 | 48 |
sums | 21 | 42 | 91 | 364 | 170 |
r = SC_{XY} / sqrt[SS_{X} x SS_{Y}] = 23.0 / sqrt[17.5 x 70.0] = +0.66
r^{2} = (+0.66)^{2} = 0.44
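The arithmetic can be reproduced from the column sums of the table. The sketch below assumes the usual computational shortcuts, which are not quoted in this passage: SS = (sum of squares) - (sum)^{2}/N, and SC_{XY} = (sum of XY products) - (sum of X)(sum of Y)/N.

```python
import math

# Column sums taken from the table above (N = 6 pairs).
n = 6
sum_x, sum_y = 21, 42
sum_x2, sum_y2 = 91, 364
sum_xy = 170

# Assumed computational formulas for the sums of squared deviates
# and the sum of co-deviates.
ss_x = sum_x2 - sum_x ** 2 / n       # 17.5
ss_y = sum_y2 - sum_y ** 2 / n       # 70.0
sc_xy = sum_xy - sum_x * sum_y / n   # 23.0

r = sc_xy / math.sqrt(ss_x * ss_y)
r_squared = r ** 2

print("r   =", round(r, 2))
print("r^2 =", round(r_squared, 3))
```

Carried at full precision, r is 23/35, or about 0.657, so squaring it before rounding gives about 0.432; the figure of 0.44 results from squaring the already rounded value of 0.66.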