The Bell Curve: Intelligence and Class Structure in American Life

by Richard J. Herrnstein


  52 “Project Rush-Rush” was what Head Start was called by those in Washington who thought that it was plunging ahead with more speed than deliberation (quoted in Caruso, Taylor, and Detterman 1982, p. 52).

  53 Zigler and Muenchow 1992, reporting the conclusions of Leon Eisenberg and C. Keith Connors after the first summer program. Only slightly less grandiose were the claims of raising IQ scores “a point a month” that were often cited by enthusiasts.

  54 Sargent Shriver, brother-in-law of the late president, John Kennedy, and former head of the Peace Corps.

  55 The first comprehensive evaluation was the so-called Westinghouse study, which the Office of Economic Opportunity sponsored. Its conclusion was that there were few or no cognitive benefits of Head Start within three years after the child completed it (Cicarelli, Evans, and Schiller 1969). Soon there was a mini-industry picking over the Westinghouse study, in addition to the one picking over Head Start. The consensus is now clear: Cognitive gains vanish before the end of primary school, e.g., Haskins 1989; McKey 1985; Spitz 1986; Zigler and Muenchow 1992. The new consensus has recently surfaced in the popular media (e.g., J. DeParle, “Sharp criticism for Head Start, even by friends,” New York Times, Mar. 19, 1993, p. A1).

  56 For a range of views, see Gamble and Zigler 1989; McKey 1985; Zigler and Muenchow 1992.

  57 E.g. Haskins 1989.

  58 Zigler and Muenchow 1992. Edward Zigler, one of the early research directors of Head Start and a professor at Yale, argues in his book that it was a mistake from the beginning to promise gains in intelligence to the public. The more general shift away from making increases in IQ the target of preschool programs is discussed in Garber and Hodge 1991; Locurto 1991; Schweinhart and Weikart 1991, pro and con.

  59 Among the people promising gains in the 300 percent range is the president of the United States, as reported by Jason DeParle (“Sharp criticism for Head Start, even by friends,” New York Times, Mar. 19, 1993). Even more of an optimist is economist Alan Blinder, who once promised a return of $4.75 for every dollar spent on preschool education (Blinder 1987).

  60 For a review of such benefits from Head Start programs, see Haskins 1989, who concludes that the results “call for humility” (p. 280). The Head Start literature, he says, “will not support the claim that a program of national scope would yield lasting impacts on children’s school performance nor substantial returns on the investment of public dollars” (p. 280). In short, there are no sleeper effects from Head Start. Even the evidence of cost-effective returns in the more intensive educational programs is highly restricted. For a literature review, see Barnett and Escobar 1987.

  61 Most of the children were 3 years old and spent two years in the program; the 22 percent who were 4 spent only one year in it (Barnett 1985; Berrueta-Clement et al. 1984).

  62 Half a school day, or about two and a half hours.

  63 The lack of effect was indirectly confirmed in a subsequent study by the same group of workers. They failed to find any differential effect on IQ of three different forms of preschool: their own cognitive enrichment program, a language-enhancing program, and a conventional nursery school program (Weikart et al. 1978). There was no control group in this follow-up, so we cannot say how much, if at all, preschool per se influenced IQ.

  64 For a critical reading of just how minimal these other effects of preschool may have been, see Spitz 1986.

  65 Lazar and Darlington 1982.

  66 Similar estimates can be found in a study of the early effects of Head Start and the consortium sample (Lee et al. 1990).

  67 Lazar and Darlington 1982, p. 47. The people who do these studies often argue that other positive effects are not being picked up in the formal measurements (e.g., Ramey, MacPhee, and Yeates 1982).

  68 Many publications have flowed from the project; useful summaries are in Ramey 1992; Ramey, MacPhee, and Yeates 1982.

  69 Personal communication from Ron Haskins.

  70 Ramey 1992.

  71 These differences are clearer in the critical accounts of the project in Spitz 1986 and 1992 than in the report by Ramey, MacPhee, and Yeates 1982.

  72 Herrnstein 1982; Sommer and Sommer 1983.

  73 Page 1972; Page and Grandon 1981.

  74 Garber 1988; Garber and Hodge 1991.

  75 Jensen 1989; Locurto 1991. The problem of “teaching to the test” recurs in educational interventions. It is based on the test’s being less than a perfect measure of intelligence (or g), so that it is possible to change the score without changing the underlying trait (see further discussion in Jensen 1993a).

  76 Our topic here is the effect of adoption on raising IQ, not the implications of adoption data for estimating the heritability of IQ. For reviews of the adoption literature, see Herrnstein 1973; Locurto 1990; Munsinger 1975; Plomin and DeFries 1985. A comprehensive theoretical analysis of adoption studies of intelligence is in Turkheimer 1991.

  77 Brown 1958, Chap. 5; Lane 1976; Lane and Pillard 1978.

  78 Among others inspired by this evidence from “wild children” of the power over the mind of the human environment was an Italian physician trained at the end of the nineteenth century whose approach to education has survived the twentieth, Maria Montessori.

  79 Locurto 1990; Plomin and DeFries 1985. In a refinement of this observation, it has been found that adopted children also score lower than the children in other homes that are socioeconomically the same as those of their adoptive parents but have no adopted children (thereby controlling for possible ways in which adoptive parents might be distinctive from non-adoptive parents).

  80 Locurto 1990.

  81 Dumaret and Stewart 1985; Schiff et al. 1982; Schiff and Lewontin 1986.

  82 We will disregard in our analysis a number of considerations that would reduce estimates of the impact of home environment, such as that the IQ of the schoolmates of the nonadopted half-siblings (who presumably share comparable lower-class surroundings) averaged only seven points less than the adopted children, not twelve. This difference raises the possibility that the adopted-away child seemed brighter in infancy or had better intellectual prospects than the half-sibling who stayed at home because of the parent they did not share, or that the shift in home environments was even more extreme than the estimates below assume it was, as if the adopted child’s biological family home was atypically poor, even for the poor neighborhoods they were in. This, as we explain below, would reduce the over-all estimate of the impact of home environment.

  83 The cell sizes in the 2 × 2 table of high- and low-SES adopting and biological parent families were only ten children or fewer.

  84 Capron and Duyme 1989. This study showed an even larger benefit—equivalent to sixteen IQ points—of having high-SES biological parents, even when the child was not reared by them, which again points to a heritability greater than .5.

  85 This, it should be remembered, is for childhood IQ, which is more subject to the influence of home environment than adult IQ. Recent work has also indicated that how a parent treats a child (presumably also an adopted child) is in part determined by the child’s inherited characteristics. To that extent, speaking of home environment as if it were purely an environmental source of variation is incorrect (see Plomin and Bergeman 1991).

  86 A twenty-point swing is easily reconciled with a heritability of .6 for IQ. Suppose the high- and low-SES homes in the French studies represent the 90th and 10th centile of environmental quality, as the text says. A twenty-point swing in IQ from the 2d to the 98th centile of environmental quality would then imply that the standard deviation of home environment effects on IQ is 4.69. Squared, this means a variance of 22 attributable to home environment. But as we noted in note 1, a heritability of .6 implies that there is a variance of 225 - 135, or 90, attributable to environmental sources. The French adoption studies, in short, are consistent with the conclusion that about a quarter of environmental variance is the variance across homes (if our guesses about the adopting and biological home environments are not way off). Three-quarters of the environmental influence on intelligence must be uncorrelated with the family SES, according to the present analysis. Note again that the balance tips toward environmental factors outside families as being more relevant than those provided by families in affecting IQ, as mentioned in Chapter 4.
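  The variance arithmetic in note 86 can be retraced in a few lines. The sketch below is only an illustration: it takes the note's figures as given (a home-effect standard deviation of 4.69 and a heritability of .6) and assumes the conventional IQ standard deviation of 15, which is where the total variance of 225 comes from. The variable names are ours, not the authors'.

```python
# Retraces the arithmetic of note 86 using the note's own figures.
IQ_SD = 15.0            # assumption: conventional IQ standard deviation (15**2 = 225)
HERITABILITY = 0.6      # value used in the note
HOME_EFFECT_SD = 4.69   # taken from the note as given, not re-derived here

total_variance = IQ_SD ** 2                                  # 225
genetic_variance = HERITABILITY * total_variance             # 135
environmental_variance = total_variance - genetic_variance   # 90
home_variance = HOME_EFFECT_SD ** 2                          # about 22

# Share of environmental variance that lies between homes.
home_share = home_variance / environmental_variance

print(f"home variance: {home_variance:.1f}")
print(f"environmental variance: {environmental_variance:.1f}")
print(f"share across homes: {home_share:.2f}")
```

  Under these assumptions the share comes out a little under one quarter, which matches the note's "about a quarter."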

  87 For a discussion of cost-benefit considerations, see Haskins 1989.

  Chapter 18

  1 “Sharpen your pencil, and begin now,” Wall Street Journal, June 9, 1992, p. A16.

  2 National Commission on Excellence in Education 1983, p. 5.

  3 National Commission on Excellence in Education 1984, p. 58.

  4 For an example of an alarmist view and a discussion of the various estimates, see Kozol 1985.

  5 National Center for Education Statistics 1992, Table 12-4.

  6 DES 1992, Table 95.

  7 Ravitch and Finn 1987, p. 49.

  8 Congressional Budget Office 1987, p. 16.

  9 Congressional Budget Office 1987, p. 16.

  10 Quoted in Kozol 1985, p. 9.

  11 Four of the studies were conducted by the International Association for the Evaluation of Educational Achievement, known as the IEA. They were the First International Mathematics Study (FIMS), mid-1960s; the First International Science Study (FISS), 1966-1973; the Second International Mathematics Study (SIMS), 1981-1982; and the Second International Science Study (SISS), 1981-1982. The fifth study was initiated by the United States as a spin-off from NAEP. It was conducted in 1988 and is known as the First International Assessment of Educational Progress (IAEP-I) (Medrich and Griffith 1992).

  12 Medrich and Griffith 1992, Appendix B.

  13 National Center for Education Statistics 1992, pp. 208-215.

  14 The best single source for understanding the complexities of international comparisons is the summary and synthesis produced by the National Center for Education Statistics (Medrich and Griffith 1992). Other basic sources in this literature are Walker 1976; McKnight et al. 1989; Keeves 1991. There are cultural factors too. In his vigorous defense of American education, Gerald Bracey tells of the scene in a Korean classroom during one such international test: “As each Korean student’s name was called to come to the testing area, that child stood and exited the classroom to loud applause. What a personal honor to be chosen to perform for the honor of the nation!” American children seldom react that way, Bracey observes (Bracey 1991, p. 113).

  15 Bishop 1993b; National Center for Education Statistics 1992a, pp. 60-61.

  16 In addition to Bishop 1989, reviewed below, see especially Carlson, Huelskamp, and Woodall 1993; Bracey 1991.

  17 Bishop 1989.

  18 The Flynn effect refers to gradually rising scores over time on cognitive ability tests, discussed in Chapter 13.

  19 NAEP periodically tests representative samples of students at different age levels in mathematics, reading, science, and, more recently, in writing and in history and literature.

  20 National Center for Education Statistics, 1991, Fig. 1. The tests were designed to have a mean of 250 and a standard deviation of 50 when taken across all three age groups. The exception to flat trend lines was science performance among 17-year-olds, which shows a fifteen-point decline from 1969 to 1990, somewhat more than .3 SD (we do not know the specific standard deviations for 17-year-olds on the science test; probably it is less than 50). Note also that science among 17-year-olds reflects disproportionately the performance of the above-average students who tend to take high school science—consistent with our broader theme that educational performance deteriorated primarily among the gifted.

  21 Two large questions about the table on page 422 immediately present themselves. First, are the five studies accurate representations of the national samples that they purported to select, and are the five tests comparable with each other? The answer to the first half of the question is a qualified yes. The studies were not perfect, but all appear to have been well designed and executed. The qualification is that the data exclude youngsters who did not reach the junior year in high school. The answer to the second half of the question is cloudier, if only because sets of tests administered at different times to different samples always introduce incomparabilities with effects that cannot be assessed precisely. The prudent conclusion regarding the math scores is to discount the modest fall and rise from 1955 to 1983 and assume instead that math aptitude over that period was steady. Regarding the Verbal scores, it seems likely that they rose from 1955 to 1966 and dropped from sometime after 1966 to sometime between 1974 and 1983, with the magnitude and precise timing of those shifts still open to question. Before leaving the norm studies, we must add a proviso: the SAT scales got easier during 1963 to 1973 by about eight to thirteen points on the Verbal and perhaps ten to seventeen points on the Math. They seem to have been stable before and following this period (Modu and Stern 1975, 1977). The same person would, in other words, have earned a higher score on the later SATs than the earlier ones, owing purely to changes in the test scales themselves. Whether the PSAT, a much shorter test, experienced the same degree of drift is unknown, but it is a good idea to adjust mentally the 1974 and 1983 scores downward a bit, though this does not change the overall interpretation of the results.

  22 Grades 10 and 11 show a similar pattern. Grade 12 remained slightly under its high (1965-1967) as of 1992, but it is likely that the deficit is explained by increases in the proportion of 17-year-olds retained in school. The possibility remains open, however, that education in the post-slump period improved more in the lower grades than in the higher ones.

  23 Congressional Budget Office 1986.

  24 Medrich and Griffith 1992.

  25 The College Board added a new method of reporting test scores in 1967 based on seniors instead of all tests administered, and continued to report the means for both types of samples through 1977. During the years when both scores were available, the trends were visually almost indistinguishable. In the year when we employed the new measure in the graph on page 425, 1970, the scores for the two methods were identical.

  26 Based on the 1963 standard deviations, .49 and .32 SD reductions respectively.

  27 For a technical statement of this argument, see Carlson, Huelskamp, and Woodall 1993.

  28 Readers can follow the journey through the numbers in Murray and Herrnstein 1992.

  29 It is possible that the SAT pool was not getting democratized in the usual socioeconomic sense but was nevertheless beginning to dig deeper into the cognitive distribution. Responses in the SAT student questionnaire indicate that somewhat more students from the bottom of the class were taking the test in 1992 than in 1976, but this effect was extremely small for whites. In 1980, 72.2 percent of whites reported that they were in the top two-fifths of their high school class, compared to 71.5 percent in 1992. We nonetheless explored the possibility that the pool had become cognitively democratized, by looking at the scores of students who reported that they were in the top tenth, the second tenth, and the second fifth of their classes. If their scores went up while those for the entire SAT sample went down, that would be suggestive evidence (if we make certain assumptions about the consistency with which students reported their true class rank) that the pool was drawing from a cognitively broader segment of the population. Using 1980 (the end of the decline) to 1992 as the period of comparison, the Verbal scores of whites who reported they were in the top tenth, 2d tenth, and 2d fifth went up by five, seven, and eight points respectively, while that of the entire white SAT pool remained flat. In Math, the scores of the top tenth, 2d tenth, and 2d fifth went up by nine, thirteen, and fourteen points, respectively, while that of the pool rose by nine points. At first glance, this would seem to be evidence for a strong effect of cognitive democratization. But then we looked at what happened to the scores of white students reporting that they were in the 3d, 4th, and lowest fifths of their classes. Their scores went up by much more: nine, eleven, and ten points, respectively, in the Verbal; seventeen, seventeen, and nine in the Math. We are aware of Simpson’s paradox, which shows how scores in each interval can go up when scores in the aggregated group go down, but in this case the explanation appears to lie in changes either in the way that students report their class rank, the meaning of class rank, or both. We give “cognitive democratization” credit for two points each in the Verbal and Math, but it is not certain that even that much is warranted.

  30 For an argument that the test score decline does in fact represent falling intelligence, see Itzkoff 1993.

  31 For a broader discussion of falling SAT scores in the high-scoring segment of the pool, see Singal 1991.

  32 From 1967, scores were reported for all test takers; from 1972 through 1976, ETS reported scores for all test takers and for college-bound seniors. To estimate college-bound seniors for 1967-1972, we computed the ratio of college-bound seniors to total test takers for the overlapping years of 1972-1976. For Verbal, the mean ratio was .82, with a high of .88 and a low of .77. For Math, the mean ratio was .78, with a high of .85 and a low of .71. The mean ratios were applied to the data from 1967 to 1972 to obtain an estimate of the number of college-bound seniors.
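  The ratio method described in note 32 can be sketched as follows. This is a minimal illustration, not the authors' actual computation: the function names are ours, and the yearly counts are hypothetical placeholders, not ETS figures, chosen only so that the mean ratio lands on the .82 reported for Verbal.

```python
from statistics import mean

def mean_ratio(seniors_by_year, takers_by_year):
    """Average the yearly ratio of college-bound seniors to all test takers."""
    return mean(seniors_by_year[y] / takers_by_year[y] for y in seniors_by_year)

def estimate_seniors(takers_by_year, ratio):
    """Apply a fixed ratio to years that report only all test takers."""
    return {year: count * ratio for year, count in takers_by_year.items()}

# Hypothetical counts for overlapping years (NOT the actual ETS data).
overlap_takers = {1972: 1000, 1973: 1000, 1974: 1000}
overlap_seniors = {1972: 880, 1973: 810, 1974: 770}

ratio = mean_ratio(overlap_seniors, overlap_takers)

# Hypothetical counts for earlier years, when only all-taker figures exist.
earlier_takers = {1967: 500, 1968: 600}
estimates = estimate_seniors(earlier_takers, ratio)

for year, value in sorted(estimates.items()):
    print(year, round(value))
```

  Averaging the yearly ratios, rather than dividing pooled totals, follows the note's description of a "mean ratio" with a stated high and low.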

  33 ETS keeps careful watch on changes in item difficulty, which are called “scale drift.” It finds that scores of 650 and above were little affected by scale drift (Modu and Stern 1975; 1977).

  34 The remaining possibility is that the increase in the SAT pool during the 1980s brought students into the pool who could score 700 but had not been taking the test before. This possibility is not subject to examination. It must be set against the evidence that extremely high proportions of the top students have been going to college since the early 1960s and that the best-of-the-best, represented by those who score more than 700 on the SAT, have been avidly seeking, and being sought by, elite colleges since the 1950s, which means that they have been taking the SAT. Note also that the proportion of SAT students who identify themselves as being in the top tenth of their high school class—where 700 scorers are almost certain to be—was virtually unchanged from 1981 to 1992. Finally, if highly talented new students were being drawn from some mysterious source, why did we see no improvement on the SAT-Verbal? It seems unlikely that the increase in the overall proportion of high school students taking the SAT can account for more than a small proportion, if any, of the remarkable improvement in Math scores among the most gifted during the 1980s.

 
