67 The contrast with the Asian experience on the SATs is striking. The Asian Math mean rose from 509 to 535. Of this increase, none of it was due to decreases in students scoring less than 200 (compared to 22 percent for blacks), while a remarkable 54 percent was due to gains in the 700 and up group (compared to 3 percent for blacks). Meanwhile, on the Verbal test, the Asian mean rose from 396 to 415 from 1980 to 1993. Of this, only 17 percent occurred because of reductions in Asians scoring in the 200s (compared to 51 percent for blacks), while 9 percent occurred because of increases in Asians scoring in the 700s (compared to 0.4 percent for blacks). The Asian increase in test scores has been driven by improvements among the best students, while the black increase has been driven by improvements among the worst students. We are unable to find any artifacts in the changing nature of the black and Asian SAT pools that would explain these results. The continued Asian improvement makes it difficult to blame the slowdown in black improvement in the last decade on events that somehow made it impossible for any American students to make progress. Explanations could be advanced based on events specific to blacks.
68 Snyderman and Rothman 1988. The sample was based on random selections from the Members and Fellows of the American Educational Research Association, National Council on Measurement in Education, six divisions of the American Psychological Association (Developmental Psychology, Educational Psychology, Evaluation and Measurement, School Psychology, Counseling Psychology, and Industrial and Organizational Psychology), the Behavior Genetics Association, the Cognitive Science Society, and the education division of the American Sociological Association.
69 Brody 1992, p. 309.
70 Gould 1984, pp. 26-27.
71 Gould 1984, p. 32. See Lewontin, Rose, and Kamin 1984, p. 127, for a similar argument.
72 Gould 1984, p. 33.
73 The ramifications for public policy are dealt with in detail in Chapters 19 and 20, concerning affirmative action.
74 We do not include in the text any discussion of Philippe Rushton’s intensely controversial writings on the differences among Asian, white, and black populations. For a brief account, see Appendix 5.
75 A similar example can be found in Lewontin 1970, one of the most outspoken critics of the IQ enterprise in all its manifestations.
76 The calculation proceeds as follows: The standard deviation of IQ being 15, the variance is therefore 225. We are stipulating that environment accounts for .4 of the variance, which equals 90. The standard deviation of the distribution of the environmental component of IQ is the square root of 90, or 9.49. The difference between group environments necessary to produce a fifteen-point difference in group means is 15/9.49, or 1.58, and the difference necessary to produce a three-point difference is 3/9.49, or .32. The comparable figures if heritability is assigned the lower-bound value of .4 are 1.28 and .26. If heritability is assigned the upper-bound value of .8, then the comparable figures are 2.24 and .45.
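The arithmetic in this note can be reproduced with a minimal sketch. The function name `required_env_gap` and its parameters are illustrative assumptions, not anything given in the source. The sketch also assumes, as the note does, that the environmental share of variance is one minus heritability. Results may differ from the printed figures in the last digit, depending on how intermediate values are rounded.

```python
import math

IQ_SD = 15.0  # standard deviation of IQ, as stipulated in the note


def required_env_gap(heritability, iq_gap, iq_sd=IQ_SD):
    """Gap between group environments, in standard deviations of the
    environmental component, needed to produce an IQ gap of iq_gap points.

    Hypothetical helper: assumes environment accounts for
    (1 - heritability) of the total IQ variance.
    """
    env_variance = (1.0 - heritability) * iq_sd ** 2  # e.g. .4 * 225 = 90
    env_sd = math.sqrt(env_variance)                  # e.g. sqrt(90), about 9.49
    return iq_gap / env_sd


if __name__ == "__main__":
    # Heritability values discussed in the note: lower bound, stipulated, upper bound.
    for h in (0.4, 0.6, 0.8):
        fifteen = required_env_gap(h, 15)
        three = required_env_gap(h, 3)
        print(f"heritability {h:.1f}: 15-point gap -> {fifteen:.2f} SD, "
              f"3-point gap -> {three:.2f} SD")
```

Writing the gap as a function of heritability makes it easy to see how quickly the required environmental difference grows as the environmental share of variance shrinks.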
77 Stevenson et al. 1985.
78 Lynn 1987a.
79 Frydman and Lynn 1989.
80 Iwawaki and Vernon 1988; McShane and Berry 1988.
81 Vernon 1982, p. 28. It has been argued that the 110 figure is too high, but a verbal-visuospatial difference among Asian Americans is not disputed (Flynn 1989).
82 Supplemental evidence has been found among Chinese students living in China who were given the SAT. Several hundred Chinese students in Shanghai between the ages of 11 and 14 scored extremely high on the Math SAT, despite an almost total lack of familiarity with American cognitive ability testing. As a proportion of the total population, this represented a far greater density of high math scorers in Shanghai than in the United States. Further attempts to find high scorers in Chinese schools confirmed the original results in Shanghai (Stanley, Feng, and Zhu 1989).
83 The SAT data actually provide even more of a hint about genetic origins for the test-score pattern, though a speculative one. The College Board reports scores for persons whose first language learned is English and for those whose first language is “English and another.” It is plausible to assume that Asian students whose only “first language” was English contain a disproportionate number of children of mixed parentage, usually Asian and white, compared to those in whose homes both English and an Asian language were spoken from birth. With that hypothesis in mind, consider that the discrepancy between the Verbal and Math SATs was (in IQ points) only 1.7 points for the “English only” Asians and 5.3 points for the “English and another” first-language Asians. Nongenetic explanations are available. For example, one may hypothesize that although English and another language were both “first languages,” English wasn’t learned as well in those homes; hence the Verbal scores for the “English and another” homes were lower. But then one must also explain why the Math scores of the “English and another” Asians were twenty-one SAT points higher than the “English-only” homes. Here one could hypothesize that the “English-only” Asians were second- and third-generation Americans, more assimilated, and therefore didn’t study math as hard as their less assimilated friends (although somehow they did quite well in the Verbal test). But while alternative hypotheses are available, the consistency with a genetic explanation suggests that it would be instructive to examine the scores of children of full and mixed Asian parentage.
84 A related topic that we do not review here is the comparison of blacks and whites on Level I and Level II abilities, using Jensen’s two-level theory of mental abilities (Jensen and Figueroa 1975; Jensen and Inouye 1980). The findings are consistent with those presented under the discussion of WISC-R profiles and Spearman’s hypothesis.
85 “Spearman’s hypothesis” is named after an observation made by Charles Spearman in 1927. Noting that the black-white difference varied systematically for different kinds of tests, Spearman wrote that the mean difference “was most marked in just those [tests] which are known to be most saturated with g” (Spearman 1927, p. 379). Spearman himself never tried to develop his comment into a formal hypothesis or to test it.
86 Jensen and Reynolds 1982.
87 Jensen and Reynolds actually compared large sets of IQ scores with the full-scale IQ score held constant statistically.
88 Jensen and Reynolds 1982, p. 427; Reynolds and Jensen 1983.
89 Jensen and Reynolds 1982, pp. 428-429.
90 Jensen 1985, 1987a.
91 Jensen 1993b.
92 Braden 1989.
93 Jensen 1993b.
94 The correlations between g loading and black-white difference are typically in the .5 to .8 range.
95 A concrete example is provided by the Kaufman Assessment Battery for Children (K-ABC), a test that attained some visibility in part because the separation between black and white children on it is smaller than on more standard intelligence tests. It was later found that K-ABC is a less valid measure of g than the standard tests (Jensen 1984a; Kaufman and Kaufman 1983; Naglieri and Bardos 1987).
96 E.g., Pedersen et al. 1992. Jensen limits himself to discussing Spearman’s hypothesis on the phenotypic level.
97 Jensen 1977.
98 Some other studies suggest a systematic sibling difference for national populations, but it goes the other way: Elder siblings outscore younger siblings in some data sets. However, this “birth-order” effect, when it occurs at all, is much smaller than the effect Jensen observed.
99 Jensen 1985, 1987a.
100 Various technical arguments were advanced against Jensen’s claim that blacks and whites differ the most on tests that are the most highly loaded on g. Many of these were effectively resolved within the forum. One critic hypothesized that Jensen’s findings resulted from an artifact of varying reliabilities (Baron 1985). Jensen was able to demonstrate that corrections for unreliability did not wash out the evidence for Spearman’s hypothesis and that some of the tests with low g loadings had high reliabilities to begin with, contrary to the critic’s assumption. Another commentator suggested that Jensen had inadvertently built into his own analysis the very correlation between g loading and black-white difference that he purported to discover (Schonemann 1985; see also Wilson 1985). In the next round (the forum occupied two issues of the journal), after being apprised of a response by physicist William Shockley (Shockley 1987), this commentator withdrew his argument. A less serious criticism suggested that black-white differences did indeed correlate with some general factor that turns up to varying degrees in different intelligence tests but that the factor may not be g (Borkowski and Maxwell 1985). In reply, Jensen was able to demonstrate that the g factor accounted for so large a fraction of the total variance in test scores that no other general factor could possibly be comparably correlated with black-white differences. A still less serious criticism (indeed, barely a criticism at all), made by several commentators, was that the g that turns up in one battery of tests is likely to differ from the g that turns up in another (e.g., Kline 1985). Jensen accepted this point, noting, however, that the various g’s are themselves intercorrelated.
A number of critics took a nontechnical tack. One set argued that Jensen’s analysis was conceptually circular. For example, if g is defined as intelligence, then tests that are loaded on g will be considered tests of intelligence. If these happen, coincidentally, to be the tests that blacks and whites differ on, then Spearman’s hypothesis will seem to be confirmed, though the link between the tests and intelligence was simply postulated, not proved (Brody 1987). For a related argument, see Macphail 1985. Jensen acknowledged that he had not tried to discuss the relationship of g to intelligence in this particular article. Another set of critics made what could be called meta-critical comments, wondering why Jensen should want to uncover relationships that are not very interesting (Das 1985), hurtful to blacks (Das 1985), inimical to world peace (Bardis 1985), and likely to distract attention from the possibility of raising people’s g by educational means (Whimbey 1985). None of these commentaries disputed that the data show what Jensen said they show.
A few years later, the last paper written by the noted psychometrician, Louis Guttman, before his death, attempted to demonstrate a mathematical circularity in Jensen’s argument, concluding that Spearman’s hypothesis is true by mathematical necessity (Guttman 1992). He argued that the factor analytic procedures that are used to extract an estimate of g cannot fail to produce a correlation between g and the B/W difference. If the correlation is present by necessity, concluded Guttman, it can’t be telling us anything about nature. The gist of Guttman’s case is that if g is the only source of correlation across tests, then the varying B/W differences across tests must be correlated with g. Jensen and others were quick to point out that no one now believes that g is the only source of correlation between tests, just the largest one. We will not try to reproduce Guttman’s mathematical argument, not just because it would get us deep into algebra but because it was decisively refuted by other psychometricians who commented on it and seems to have found no other support since its publication. See Jensen 1992; Loehlin 1992; Roskam and Ellis 1992.
101 Gustafsson 1992.
102 Mercer 1984, pp. 297-310.
103 Mercer 1988.
104 Mercer 1988, p. 209.
105 It would be useful for the reader if we could present Mercer’s results so that they parallel the method we have been using, in which the socio-cultural variables and ethnicity are treated as independent variables predicting IQ, but her presentation does not include that analysis.
106 Mercer 1988, p. 208.
107 The critique of Mercer’s position has been highly technical. Readers who have the patience will find an extended exchange between Mercer, Jensen, and Robert Gordon in Reynolds and Brown 1984.
108 Mercer 1984, Tables 6, 9; Jensen 1984b, pp. 580-582.
109 Boykin 1986, p. 61.
110 For review, see Boykin 1986.
111 Ogbu 1986.
112 Flynn 1984, 1987a, 1987b.
113 Merrill 1938.
114 Flynn 1984, 1987b; Lynn and Hampson 1986c.
115 Flynn 1987a, 1987b.
116 Lynn and Hampson 1986a.
117 Teasdale and Owen 1989.
118 For evidence that this is what has happened in the United States, see Murray and Herrnstein 1992.
119 If the mean IQ in 1776 had been 30 and the standard deviation was what it is today, then America in the Revolutionary period had only five men and women with IQs above 100.
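The order of magnitude in this note can be checked with a short sketch, assuming IQ is normally distributed with the stipulated mean of 30 and a standard deviation of 15. The population figure below is a placeholder assumption for illustration only (the note does not state the population it used), and the function name `upper_tail_fraction` is likewise hypothetical.

```python
import math


def upper_tail_fraction(mean, sd, threshold):
    """Fraction of a normal(mean, sd) population scoring above threshold."""
    z = (threshold - mean) / sd
    # Upper-tail probability of the standard normal, via the complementary
    # error function from the standard library.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Stipulated in the note: mean IQ of 30, today's standard deviation.
fraction_above_100 = upper_tail_fraction(mean=30.0, sd=15.0, threshold=100.0)

# ASSUMPTION: a colonial population of roughly 2.5 million, used here only
# to show the scale of the result; it is not a figure from the source.
ASSUMED_POPULATION = 2_500_000
expected_count = ASSUMED_POPULATION * fraction_above_100

if __name__ == "__main__":
    print(f"fraction above 100: {fraction_above_100:.3e}")
    print(f"expected count in a population of {ASSUMED_POPULATION:,}: "
          f"{expected_count:.1f}")
```

Under these assumptions the expected count comes out in the single digits; the exact number depends on the population figure supplied.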
120 Lynn and Hampson 1986a.
121 Consider the analogy of height. The average stature of Americans has risen several inches since the Pilgrims landed at Plymouth, but height has run in families nevertheless.
122 A shifting link between IQ and intelligence is not only possible but probable under certain conditions. For example, when the literacy level of a country rises rapidly, scores on conventional intelligence tests will also rise because more people will be better able to read the test. This rise is unlikely to be fully reflected in a rising intelligence level, at least with equal rapidity. Flynn 1987b discusses this general measurement issue.
123 Scarr and Weinberg 1976, 1978, 1983; Weinberg, Scarr, and Waldman 1992.
124 Weinberg, Scarr, and Waldman 1992, Table 2. The progression of the IQ means from two black parents to one black/one white to two white parents is not as neatly supportive of a genetic hypothesis as might first appear, because there is reason to suspect that the mixed-race biological parents of the adopted children were disproportionately drawn from college students, which in turn would imply that the IQ of the black parent was well above the black mean.
125 Weinberg, Scarr, and Waldman 1992. For the technical debate, see Levin in press; Lynn in press, with a response by Scarr and Weinberg in Waldman, Weinberg, and Scarr in press.
126 Weinberg, Scarr, and Waldman 1992, Table 2. The overall decline in scores for all groups was because a new test norm had been imposed in the interim, vitiating the Flynn effect for this group.
127 Waldman, Weinberg, and Scarr in press.
128 Eyferth 1961. For accounts in English, see Loehlin, Lindzey, and Spuhler 1975; Flynn 1980.
129 Loehlin, Lindzey, and Spuhler 1975, Chap. 5.
130 An earlier study showed no significant association between the amount of white ancestry in a sample of American blacks and their intelligence test scores (Scarr et al. 1977). If the whites who contributed this ancestry were a random sample of all whites, then this would be strong evidence of no genetic influence on black-white differences. There is no evidence one way or another about the nature of the white ancestors.
131 Lewontin, Rose, and Kamin 1984.
132 Scarr and Weinberg 1976, Table 12.
Chapter 14
1 U.S. Department of Labor 1993, Table 3.
2 U.S. Bureau of the Census 1993, Table 1.
3 The NLSY sample does not include GEDs. Nationally, the 1991 high school completion rate (signifying twelve years of school) was 87.0 percent for whites, 72.5 percent for blacks, and 55.4 percent for Latinos (National Center for Education Statistics 1993, p. 58).
4 These results refer to a logistic analysis in which the dependent variable was a binary variable representing obtaining a normal high school diploma. The independent variables were age and IQ.
5 For persons ages 25 to 29 in 1992, the proportions with bachelor’s degrees were 26.7 percent for whites, 10.6 percent for blacks, and 11.4 percent for Latinos (National Center for Education Statistics 1993, p. 62).
6 Welch 1973.
7 For example, given the mean years of education for people entering the high-IQ occupations defined in Chapter 3 (16.6) and holding age constant at the mean, the probability that whites would be in a high-IQ occupation was 14.4 percent compared to 12.8 percent for blacks and 18.1 percent for Latinos.
8 Gottfredson 1986.
9 Gottfredson 1986 leaves room for the possibility that blacks at the upper end of the IQ distribution were disproportionately choosing medicine, engineering, or the other professions she happened to examine. Perhaps if she had examined other high-IQ occupations (one may hypothesize), she would have found blacks represented at or below expectations. Our analysis, incorporating a broad range of high-IQ occupations, makes this hypothesis highly unlikely. The extension of the analysis in Chapter 20 rules it out altogether.
10 The proportions in high-IQ occupations were 5.8 percent for whites, 3.1 percent for blacks, and 3.7 percent for Latinos.
11 After controlling for IQ, the unrounded proportions in high-IQ occupations were 10.4 percent for whites, 24.5 percent for blacks, and 16.2 percent for Latinos.
12 “Year round” is defined as people who reported being employed for fifty-two weeks in calendar 1989 and reported wage income greater than 0 (excluding a small number who apparently were self-employed and did not pay themselves a wage).
13 This result is based on a regression analysis when the wage is the dependent variable, age is the independent variable, and the analysis is run separately for each race. The figures reported reflect the mean for a black and white of average age in the NLSY sample.
14 For a more detailed technical analysis of the NLSY experience, reaching the same conclusions, see O’Neill 1990. O’Neill’s collateral findings about the joint role of education and IQ are taken up in Chapter 19.
15 U.S. Bureau of the Census 1993, Table 29.
16 Precisely, 64.4 percent higher, computed using unrounded poverty rates.
17 For various approaches, see Bianchi and Farley 1980; Jargowsky 1993; Massey and Eggers 1990; Smith and Welch 1987; Eggebeen and Lichter 1991. For a summary of the literature, see Jaynes and Williams 1989.
18 U.S. Department of Labor 1993, Table 3.
19 For civilian males not in school and not prevented from working by health problems.
20 Wilson 1987; Lemann 1991; Holzer 1986; Kasarda 1989; Topel 1993; Jaynes and Williams 1989.
The Bell Curve: Intelligence and Class Structure in American Life Page 86