The Bell Curve: Intelligence and Class Structure in American Life


by Richard J. Herrnstein and Charles Murray


  Mother’s education and father’s education were based on years of education, converted to standardized scores.

  Family income was based on the averaged total net family income for 1978 and 1979, in constant dollars, when figures for both years were available. If income for only one of the two years was reported, that year was used. Family income was excluded if the subject was a Schedule C interviewee (the reported income for the year in question referred to his or her own income, not to the parental household’s income). The dollar figure was expressed as a logarithm before being standardized. This procedure, customary when working with income data, has the effect of discounting extremely high values of income and permitting greater discrimination among lower incomes. A minimum standardized value of −4 was set for incomes of less than $1,000 (all figures are in 1990 dollars).
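
  As a minimal sketch of this transformation (Python with NumPy; variable names are ours, and the assignment of −4 to sub-$1,000 incomes is our reading of the rule just described):

```python
import numpy as np

def income_z(income_1990_dollars):
    # Log-transform to damp extreme high incomes, then standardize.
    income = np.asarray(income_1990_dollars, dtype=float)
    logged = np.log(np.maximum(income, 1.0))    # guard against log(0)
    z = (logged - logged.mean()) / logged.std()
    z[income < 1000] = -4.0                     # minimum value per the text
    return z
```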

  Parental occupation was coded with a modified version of the Duncan socioeconomic index, grouping the Duncan values (which go from 1 to 100) into deciles. A value of −1 was assigned to persons out of the labor force altogether. It was assumed that the family’s socioeconomic status is predominantly determined by the higher of the two occupations held by the two parents; thus the occupational variable was based on the higher of the two parents’ ratings. The increment in socioeconomic status represented by both parents holding high-status occupations is indirectly reflected in the higher income and in the two educational variables. The eleven values in the modified Duncan scale (−1 plus the ten deciles) were standardized.
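
  A minimal sketch of the occupation coding (Python; None stands in for a parent out of the labor force, and the names are illustrative):

```python
def duncan_decile(duncan):
    # Duncan index values 1-100 grouped into deciles 1..10;
    # -1 for a person out of the labor force (None here).
    if duncan is None:
        return -1
    return (duncan - 1) // 10 + 1

def parental_occupation(duncan_mother, duncan_father):
    # Use the higher-status of the two parents, as described above.
    return max(duncan_decile(duncan_mother), duncan_decile(duncan_father))
```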

  The reliability of the four-indicator index (Cronbach’s α) is .76. The correlations among the components of the index are shown in the table. The four variables were summed and averaged. If only a subset of the variables had valid scores, that subset was summed and averaged. By far the most common missing variable was family income, since many of the NLSY youths were already living in independent households as of the beginning of the survey, and hence were reporting their own income, not parental income. Overall, data were available on all four indicators for 7,447 subjects, on three for an additional 3,612, on two for 679, and on one for 138. Two subjects with valid scores on the AFQT had no information available on any of the four indicators. For use in the regression analyses, the SES index scores were set to a mean of 0 and a standard deviation of 1.

  Correlations of Indicators in the Socioeconomic Status Index

                          Mother’s     Father’s     Parental
                          Education    Education    Occupation
  Father’s education        .63          —
  Parental occupation       .47          .55          —
  Family income             .36          .40          .47
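
  As a minimal sketch of how the four standardized indicators combine into the index (Python with NumPy; illustrative, not the authors’ actual code):

```python
import numpy as np

def ses_index(indicators_z):
    # indicators_z: (n_subjects, 4) array of the standardized
    # indicators, with NaN marking missing values. Average whichever
    # indicators are present, then rescale to mean 0, SD 1.
    # Subjects with no valid indicators come out as NaN.
    raw = np.nanmean(np.asarray(indicators_z, dtype=float), axis=1)
    return (raw - np.nanmean(raw)) / np.nanstd(raw)
```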

  EDUCATIONAL ATTAINMENT

  Highest Grade Completed

  The NLSY creates a variable each year for “highest grade completed,” incorporating information from several questions.7 For analyses based on the occurrence of an event (e.g., the birth of a child), the value of “highest grade completed” for the contemporaneous survey year is used. For all other analyses, the 1990 value for “highest grade completed” is used. Values run from 0 through 20.

  Highest Degree Ever Received

  In the 1988-1990 surveys, the NLSY asked respondents to report the highest degree they had ever received. The possible responses were: high school diploma, associate degree, bachelor of arts, bachelor of science, master’s, Ph.D., professional degree (law, medicine, dentistry), and “other.” These self-reported degrees were sometimes questionable, especially when the degree did not correspond to the number of years of education (e.g., a bachelor’s degree for someone who also reported only fourteen years of education). To eliminate the most egregiously suspicious cases, we made adjustments. For those who reported their highest degree as a high school diploma, we required at least eleven reported years of completed education. For degrees beyond the high school diploma, we required that the reported highest grade completed be within one year of the normal number of years needed to obtain that degree. Specifically, the minimum numbers of completed years of education required to accept a reported degree were thirteen for an associate degree, fifteen for a bachelor’s degree, sixteen for a master’s degree, and eighteen for a Ph.D., law degree, or medical degree.
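
  A minimal sketch of this consistency screen (Python; the degree labels are ours, not NLSY codes):

```python
# Minimum completed years of education required to accept a
# self-reported degree, per the rules above. Labels are illustrative.
MIN_YEARS_FOR_DEGREE = {
    "high school diploma": 11,
    "associate": 13,
    "bachelor": 15,
    "master": 16,
    "phd_law_medicine": 18,
}

def degree_is_plausible(degree, highest_grade_completed):
    # Keep the reported degree only if the reported highest grade
    # completed meets the minimum for that degree.
    return highest_grade_completed >= MIN_YEARS_FOR_DEGREE[degree]
```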

  We also employed the NLSY’s variables to discriminate between respondents whose terminal degree was a high school diploma and those who held a GED. We excluded the 190 persons whose degree was listed as “other,” after trying fruitlessly to come up with a satisfactory means of estimating what “other” meant from collateral educational data.

  The “high school” and “college graduate” samples used throughout Part II are designed to isolate populations with homogeneous educational experiences as of the 1990 survey year. The high school sample is defined as those who reported twelve years of completed education and a high school diploma received through the normal process (i.e., excluding GEDs) as the highest attained degree. The college graduate sample is defined as all those who reported sixteen years of completed education and a B.A. or B.S. as the highest attained degree.

  Transition to College

  In Chapter 1, we used the NLSY to determine the percentage of students in various IQ groupings who went directly to college. We limited the analysis to students who obtained a high school diploma between January 1980 and July 1982, meaning that all subjects had taken the AFQT prior to attending college. The analysis thus also reflects the experience of those who obtained their high school diploma via the normal route (comparable to the analyses from the 1920s and 1960s, which are also reported in the same figure). A subject is classified as attending college in the year following graduation if he reported having enrolled in college at any point in the calendar year following the date of graduation.

  MARITAL AND FERTILITY VARIABLES

  All variables relating to marital history and childbearing employed the NLSY’s synthesis as contained in the 1990 Fertility File of the NLSY.

  BIRTH WEIGHT

  The most commonly reported measure of a problematic birth weight is “low birth weight,” defined as no more than 5.5 pounds. In its raw form, however, low birth weight is limited as a measure because it is confounded with prematurity. A baby born five weeks prematurely will probably weigh less than 5.5 pounds and yet be a fully developed, healthy child for gestational age, with excellent prospects. Conversely, a child carried to term but weighing only slightly more than the cutoff of 5.5 pounds is (given parents of average stature) small for its gestational age. We therefore created a variable expressing the baby’s birth weight as a ratio to the 50th-centile weight for that gestational age, using the Colorado Intrauterine Growth Charts as the basis for the computation. If a baby weighed less than 5.5 pounds but the ratio was equal to or greater than 1, that case was excluded from the analysis. All uses of this variable in Chapters 10 and 13 are based on a sample that is exclusively white (Latino or non-Latino) or black, thereby sidestepping the complications that would be introduced by populations of smaller stature, such as East Asians. We further excluded cases reporting gestational ages of less than twenty-six weeks, reports of pregnancies that lasted more than forty-four weeks or birth weights in excess of thirteen pounds, and one remarkable case in which a mother reported a gestation of twenty-six weeks and a birth weight of more than twelve pounds.
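
  A minimal sketch of this screening logic (Python), assuming a hypothetical lookup function for the 50th-centile weight; we do not reproduce the Colorado growth-chart values themselves:

```python
def birth_weight_ratio(weight_lb, gestation_wk, median_weight_lb):
    # median_weight_lb: hypothetical lookup returning the 50th-centile
    # weight (pounds) for a gestational age, standing in for the
    # Colorado Intrauterine Growth Charts.
    if gestation_wk < 26 or gestation_wk > 44 or weight_lb > 13:
        return None                      # implausible report: exclude
    ratio = weight_lb / median_weight_lb(gestation_wk)
    if weight_lb < 5.5 and ratio >= 1.0:
        return None                      # premature but normal size for age
    return ratio
```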

  Appendix 3

  Technical Issues Regarding the Armed Forces Qualification Test as a Measure of IQ

  Throughout The Bell Curve, we use the Armed Forces Qualification Test (AFQT) as a measure of IQ. This appendix discusses a variety of related issues that may help readers interpret the meaning of the analyses presented in the full text.

  DOES THE AFQT MEASURE THE SAME THING THAT IQ TESTS MEASURE?

  The AFQT is a paper-and-pencil test designed for youths who have reached their late teens. In effect, it assumes exposure to an ordinary high school education (or the opportunity to get one). This kind of restriction is shared by every IQ test, all of which are designed for particular populations.

  The AFQT as scored by the armed forces is not age referenced. The armed forces have no need to age reference it, because the overwhelming majority of recruits taking the test are 18 and 19 years old. In contrast, the NLSY subjects ranged from 14 to 23 years old when they took the test. Therefore, as discussed in Appendix 2, all analyses in the book take age into account through one of two methods: entering age as an independent variable in the multivariate analyses and, for all descriptive statistics, age referencing the AFQT score by expressing it in terms of the mean and standard deviation for each year’s birth cohort. In this appendix, we uniformly use the age-referenced version for analyses based on the NLSY.
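
  A minimal sketch of the cohort referencing (Python with NumPy; variable names are ours, not the NLSY’s):

```python
import numpy as np

def age_referenced_afqt(scores, birth_years):
    # Express each AFQT score as a z score within the subject's
    # birth-year cohort (mean 0, SD 1 per cohort).
    scores = np.asarray(scores, dtype=float)
    birth_years = np.asarray(birth_years)
    referenced = np.empty_like(scores)
    for year in np.unique(birth_years):
        cohort = birth_years == year
        mean, sd = scores[cohort].mean(), scores[cohort].std()
        referenced[cohort] = (scores[cohort] - mean) / sd
    return referenced
```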

  Is a set of age-referenced AFQT scores appropriately treated as IQ scores? We approach this issue from two perspectives. First, we examine the internal psychometric properties of the AFQT and show that the AFQT is one of the most highly g-loaded mental tests in current use. It seems to do what a good IQ test is supposed to do—tap into a general factor rather than specific bits of learning or skill—as well as or better than its competitors. Second, we examine the correlation between the AFQT and other IQ tests, and show that the AFQT is more highly correlated with a wide range of other mental tests than those other mental tests are with each other. On both counts, the AFQT qualifies not just as an IQ test but as one of the better ones psychometrically.

  Psychometric Characteristics of the ASVAB

  Let us begin by considering the larger test from which the AFQT is computed, the ASVAB (Armed Services Vocational Aptitude Battery), taken every year by between a half million and a million young adults who are applying for entry into one of the armed services. The ASVAB has ten subtests, spanning a range from test items that could appear equally well on standard tests of intelligence to items testing knowledge of automobile repair and electronics.1 Scores on the subtests determine whether the applicant will be accepted by his chosen branch of service; for those accepted, the scores are later used for the placement of enlisted personnel into military occupations. How well or poorly a person performs in military occupational training schools, and also how well he does on the job, can therefore be evaluated against the scores earned on a battery of standardized tests.

  The ten subtests of the ASVAB can be paired off into forty-five correlations. Of the forty-five, the three highest correlations in a large study of enlisted personnel were between Word Knowledge and General Science, Word Knowledge and Paragraph Comprehension, and, highest of all, between Mathematics Knowledge and Arithmetic Reasoning.2 Correlations above .8, as these were, are in the range observed between different IQ tests, which are frankly constructed to measure the same attribute. To see them arising between tests of such different subject matter should alert us to some deeper level of mental functioning. The three lowest correlations, none lower than .22, were between Coding Speed and Mechanical Comprehension, Numerical Operations and Automobile/Shop Information, and, lowest of all, between Coding Speed and Automobile/Shop Information. Between those extremes, there were rather large correlations between Paragraph Comprehension and General Science and between Word Knowledge and Electronics Information but only moderate correlations between Electronics Information and Coding Speed and between Mathematics Knowledge and Automobile/Shop Information. Thirty-six of the forty-five correlations were above .5.

  Psychometrics approaches a table of correlations with one or another of its methods of factor analysis. Factor analysis (or other mathematical procedures that go under other names) extracts the factors3 that account for the observed pattern of subtest scores. The basic idea is that scores on any pair of tests are correlated to the extent that the tests measure something in common: If they test traits in common, they are correlated, and if not, not. Factor analysis tells how many different underlying factors are necessary to account for the observed correlations among the subtests. If, for example, the subtest scores were totally uncorrelated, it would take ten independent and equally significant factors, one for each subtest by itself. With each test drawing on its own unique factor, the forty-five correlations would all be zeros. At the other extreme, if the subtests measured precisely the same thing down to the very smallest detail, then all the correlations among scores on the subtests could be explained by a single factor—that thing which all the subtests precisely measured—and the correlations would all be ones. Neither extreme describes the actuality, but for measures of intellectual performance, one large factor comes closer than many small ones. This is not the place to dwell on mathematical details except to note that, contrary to claims in nontechnical works,4 the conclusions we draw about general intelligence do not depend on the particular method of analysis used.5

  For the ASVAB, 64 percent of the variance among the ten subtest scores is accounted for by a single factor, g. A second factor accounts for another 13 percent. With three inferred factors, 82 percent of the variance is accounted for.6 The intercorrelations indicate that people do vary importantly in some single, underlying trait and that those variations affect how they do on every test. Nor is the predominance of g a fortuitous result of the particular subtests in the ASVAB. The air force’s aptitude test for prospective officers, the AFOQT (Air Force Officer Qualifying Test), similarly has g as its major source of individual variation.7 Indeed, all broad-gauged test batteries of cognitive ability have g as their major source of variation among the scores people get.8
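
  To make the idea concrete, here is a minimal sketch (Python with NumPy) of one way to gauge how much variance a single dominant factor captures, using the leading eigenvalue of the correlation matrix (a principal-components approximation); the percentages cited above come from the psychometric literature’s factor methods, which differ in detail:

```python
import numpy as np

def first_factor_share(corr_matrix):
    # Fraction of total variance carried by the largest eigenvalue of
    # a k x k subtest correlation matrix. For a correlation matrix the
    # eigenvalues sum to k, so this is eigenvalue_max / k.
    eigenvalues = np.linalg.eigvalsh(np.asarray(corr_matrix, dtype=float))
    return eigenvalues[-1] / eigenvalues.sum()  # eigvalsh sorts ascending
```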

  The naive theory assumes that when scores on two subtests are correlated, it is because of overlapping content. But it is impossible to make sense of the varying correlations between the subtests in terms of overlapping content. Consider again the correlation between Arithmetic Reasoning and Mathematics Knowledge, which is the highest of all. It may seem to rest simply on a knowledge of mathematics and arithmetic. However, the score on Numerical Operations is less correlated with either of those two tests than the two are with each other. Content provides no clue as to why: Arithmetic Reasoning consists only of word problems; Mathematics Knowledge applies the basic methods of algebra and geometry; and Numerical Operations is an arithmetic test. Why are scores on algebra and geometry more similar to those on word problems than to those on arithmetic? Such variations in the correlations between the subtests arise, in fact, less from common content than from how much they draw on the underlying ability we call g. The varying correlations between the subtests preclude explaining g away as, for example, simply a matter of test-taking ability or test-taking experience, which should affect all tests more or less equally. We try to make some of these ideas visible in the figure below.

  The relation of the ASVAB subtests to each other and to g

  For each subtest on the ASVAB, we averaged the nine correlations with each of the other subtests, and that average correlation defines the horizontal axis. The vertical axis is a measure, for each subtest, of the correlation between the score and g.9 The two-letter codes identify the subtests. At the top is General Science (GS), closely followed by Word Knowledge (WK) and Arithmetic Reasoning (AR), for which the scores are highly correlated with g and have the highest average correlations with all the subtests. Another three subtests—Mathematics Knowledge (MK), Paragraph Comprehension (PC), and Electronics Information (EI)—are just slightly below the top cluster in both respects. At the bottom are Coding Speed (CS), Automobile/Shop Information (AS), Numerical Operations (NO), and Mechanical Comprehension (MC), subtests that correlate, on the average, the least with other subtests and are also the least correlated with g (although still substantially correlated in their own right). The bottom group includes the two speeded subtests, CS and NO, thereby refuting another common misunderstanding about g, which is that it refers to mental speed and little more. Virtually without exception, the more dependent a subtest score is on g, the higher is its average correlation with the other subtests. This is the pattern that betrays what g means—a broad mental capacity that permeates performance on anything that challenges people cognitively. A rough rule of thumb is that items or tests that require mental complexity draw more on g than items that do not—the difference, for example, between simply repeating a string of numbers after hearing them once, which does not much test g, and repeating them in reverse order, which does.10
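
  The horizontal axis of that figure can be computed directly from the correlation table; a sketch in Python, assuming a hypothetical 10 x 10 matrix of subtest correlations:

```python
import numpy as np

def mean_correlation_with_others(corr_matrix):
    # For each subtest, average its correlations with the other nine
    # subtests (excluding the 1.0 on the diagonal).
    corr = np.asarray(corr_matrix, dtype=float)
    k = corr.shape[0]
    return (corr.sum(axis=1) - 1.0) / (k - 1)
```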

  The four subtests used in the 1989 scoring version of the AFQT (the one used throughout the text) and their g loadings are Word Knowledge (.87), Paragraph Comprehension (.81), Arithmetic Reasoning (.87), and Mathematics Knowledge (.82).11 The AFQT is thus one of the most highly g-loaded tests in use. By way of comparison, the factor loadings for the eleven subtests of the Wechsler Adult Intelligence Scale (WAIS) range from .63 to .83, with a median of .69.12 Whereas the first factor, g, accounts for over 70 percent of the variance in the AFQT, it accounts for only 53 percent in the WAIS.

  Correlations of the AFQT with Other IQ Tests

  Our second approach to the question, Is the AFQT an IQ test? is to ask how the AFQT correlates with other well-known standardized mental tests (see the table below). We can do so by making use of the high school transcript survey conducted by the NLSY in 1979. In addition to gathering information about grades, the survey picked up any other IQ test that the student had taken within the school system. The data usually included both the test score and the percentile rank, based on national norms. In accordance with the recommendation of the NLSY User’s Manual, we use percentiles throughout.13

 
