THE NLSY
In Part I, we occasionally made use of the National Longitudinal Survey of Youth, the NLSY. In the chapters that follow, it will play the central role in the analysis, with other studies called in as available and appropriate.
Until a few years ago, there were no answers to many of the questions we will ask, or only very murky answers. No one knew what the relationship of cognitive ability to illegitimacy might be, or even the relationship of cognitive ability to poverty. Despite the millions of mental tests that have been given, very few of the systematic surveys, and sometimes none, gave the analyst a way to conclude with any confidence that this is how IQ interacts with behavior X for a representative sample of Americans.
Several modern sources of data have begun to answer such questions. The TALENT database, the huge national sample of high school students taken in 1961, is the most venerable of the sources, but its follow-up surveys have been limited in the range and continuity of their data. The Panel Study of Income Dynamics, begun in 1968 and the nation’s longest-running longitudinal database, administered a brief vocabulary test in 1972 to part of its sample, but the scores allow only rough discriminations among people in the lower portions of the distribution of intelligence. The National Longitudinal Survey begun by the Department of Education in 1972 (not to be confused with the NLSY) provides answers to many questions associated with educational outcomes. The department’s more ambitious study, High School and Beyond, conducted in the early 1980s, is also useful.
But the mother lode for scholars who wish to understand the relationship of cognitive ability to social and economic outcomes is the NLSY, whose official name is the National Longitudinal Survey of Labor Market Experience of Youth. When the study began in 1979, its participants were aged 14 to 22.[3] There were originally 12,686 of them, chosen to provide adequate sample sizes for analyzing crucial groups (for example, by oversampling blacks, Latinos, and low-income whites), and also incorporating a weighting system so that analysts could determine the correct estimates for nationally representative samples of their age group. Sample attrition has been kept low and the quality of the data, gathered by the National Opinion Research Center under the supervision of the Center for Human Resources Research at Ohio State University, has been excellent.
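How a weighting system undoes the effect of oversampling can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the group names, population shares, sample counts, and outcome values are invented and are not NLSY figures, and the survey's actual weights are more elaborate than a single ratio per group.

```python
# Illustrative only: all numbers are invented, not NLSY data.
# Each group: (population share, number sampled, mean outcome in the sample).
groups = {
    "oversampled_group": (0.10, 400, 20.0),
    "remaining_population": (0.90, 600, 30.0),
}

total_sampled = sum(n for _, n, _ in groups.values())

# Unweighted mean: every respondent counts equally, so the oversampled
# group supplies 40 percent of the answer instead of its true 10 percent.
unweighted_mean = (
    sum(n * mean for _, n, mean in groups.values()) / total_sampled
)

# Weight per respondent = population share / sample share.
weights = {
    name: share / (n / total_sampled)
    for name, (share, n, _) in groups.items()
}

# Weighted mean: each respondent counts in proportion to the number of
# people in the population that he or she stands for.
weighted_mean = sum(
    weights[name] * n * mean for name, (_, n, mean) in groups.items()
) / sum(weights[name] * n for name, (_, n, _) in groups.items())

print(f"unweighted: {unweighted_mean:.1f}")  # 26.0
print(f"weighted:   {weighted_mean:.1f}")    # 29.0
```

The weighted figure matches the population value implied by the invented shares (0.10 x 20 + 0.90 x 30 = 29), while the unweighted figure is pulled toward the oversampled group.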
The NLSY is unique because it combines in one database all the elements that hitherto had to be studied piecemeal. Only the NLSY combines detailed information on the childhood environment and parental socioeconomic status and subsequent educational and occupational achievement and work history and family formation and, crucially for our interests, detailed psychometric measures of cognitive skills.
The NLSY acquired its cognitive measures by a lucky coincidence. In 1980, a year after the first wave of data collection, the Department of Defense decided to update the national norms for its battery of enlistment tests. At the time, it was still using test scores from World War II recruits as the reference population. Because the NLSY had just gone through the technically difficult and tedious task of selecting a nationally representative sample, the Department of Defense proposed to piggyback its study on the NLSY sample.4 And so the NLSY became the beneficiary of an expensive, well-designed set of cognitive and aptitude tests that were given under carefully controlled conditions to almost 94 percent of the 12,686 young men and women in the NLSY sample.5
The measure of cognitive ability extracted from this test battery was the Armed Forces Qualification Test, the AFQT. It is what the psychometricians call “highly g-loaded,” meaning that it is a good measure of general cognitive ability.6 The AFQT’s most significant shortcoming is that it is truncated at the high end; about one person in a thousand gets a perfect score, which means both that the test does not discriminate among the very highest levels of intelligence and that the variance in the population is somewhat understated. Otherwise the AFQT is an excellent test, with psychometric reliability and validity that compare well with those of the other major tests of intelligence. Because the raw scores on the AFQT mean nothing to the average reader, we express them in the IQ metric (with a mean of 100 and a standard deviation of 15) or in centiles. Also, we will subsequently refer to them as “IQ scores,” in keeping with our policy of using IQ as a generic term for intelligence test scores. When we use centiles, they are age equated. A centile score of 45, for example, means that the subject would rank in the 45th percentile of everyone born in the same year, if everyone took the AFQT.7 A final point about the presentation of NLSY results is that all results are based on weighted analyses, which means that all may be interpreted in terms of a nationally representative sample of Americans in the NLSY age group. We use data collected through the 1990 interview wave.
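The two ways of expressing scores described above can be sketched in a few lines. This is a simplified illustration, not the procedure actually applied to the AFQT: the raw scores are hypothetical, the rescaling is a plain linear standardization, sampling weights are ignored, and the function names `to_iq_metric` and `age_equated_centile` are invented for the example.

```python
from statistics import mean, pstdev


def to_iq_metric(raw_scores):
    """Rescale raw scores linearly to mean 100, standard deviation 15."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [100 + 15 * (x - m) / sd for x in raw_scores]


def age_equated_centile(raw_score, same_birth_year_scores):
    """Percentage of people born in the same year who scored lower."""
    below = sum(1 for s in same_birth_year_scores if s < raw_score)
    return 100 * below / len(same_birth_year_scores)


# Hypothetical raw scores grouped by birth year (not real AFQT data).
raw_by_birth_year = {
    1960: [40, 55, 60, 70, 85],
    1964: [30, 45, 50, 60, 75],
}

all_raw = [s for scores in raw_by_birth_year.values() for s in scores]
iq_scores = to_iq_metric(all_raw)

# The same raw score of 60 ranks differently within each birth cohort.
centile_1960 = age_equated_centile(60, raw_by_birth_year[1960])  # 2 of 5 below
centile_1964 = age_equated_centile(60, raw_by_birth_year[1964])  # 3 of 5 below

print(centile_1960, centile_1964)  # 40.0 60.0
```

Age equating matters because the NLSY subjects spanned several birth years when they were tested, so an identical raw score can represent a different standing among older and younger test takers.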
THE DEFINITION OF COGNITIVE CLASSES
To this point, we have been referring to cognitive classes without being specific. In these chapters, we divide the world into cognitive classes—five of them, because that has been the most common number among sociologists who have broken down socioeconomic status into classes and because five allows the natural groupings of “very high,” “high,” “mid,” “low,” and “very low.” We have chosen to break the intervals at the 5th, 25th, 75th, and 95th percentiles of the distribution. The figure shows how this looks for a normally distributed population.
Break points are arbitrary, but we did have some reasons for these. Mainly, we wanted to focus on the extremes; hence, we avoided a simple breakdown into quintiles (i.e., into equal cuts of 20 percent). A great deal of interest goes on within the top 20 percent and bottom 20 percent of the population. Indeed, if the sample sizes had been large enough, we would have defined the top cognitive class as consisting of the top 1 or 2 percent of the population. Important gradations in social behavior occasionally separate the top 2 percent from the next 2 percent. This is in line with another of the themes that we keep reiterating because they are so easily forgotten: You (meaning the self-selected person who has read this far into this book) live in a world that probably looks nothing like the figure. In all likelihood, almost all of your friends and professional associates belong in that top Class I slice. Your friends and associates whom you consider to be unusually slow are probably somewhere in Class II. Those whom you consider to be unusually bright are probably somewhere in the upper fraction of the 99th centile, a very thin slice of the overall distribution. In defining Class I, which we will use as an operational definition of the more amorphous group called the “cognitive elite,” as being the top 5 percent, we are being quite inclusive. It does, after all, embrace some 12 1/2 million people. Class III, the normals, comprises half of the population. Classes II and IV each comprise 20 percent, and Class V, like Class I, comprises 5 percent.
[Figure: Defining the cognitive classes]
The labels for the classes are the best we could do. It is impossible to devise neutral terms for people in the lowest classes or the highest ones. Our choice of “very dull” for Class V sounds to us less damning than the standard “retarded” (which is generally defined as below an IQ of 70, with “borderline retarded” referring to IQs between 70 and 80). “Very bright” seems more focused than “superior,” which is the standard term for people with IQs of 120 to 130 (those with IQs above 130 are called “very superior” in that nomenclature).8
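For readers who want the break points as IQ scores, the following sketch converts the 5th, 25th, 75th, and 95th percentiles into the IQ metric for an idealized normal distribution with a mean of 100 and a standard deviation of 15. The function name `cognitive_class` and the population figure of 250 million (roughly the U.S. population around 1990) are assumptions introduced for the example.

```python
from statistics import NormalDist

# Idealized IQ distribution: normal, mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

break_percentiles = [0.05, 0.25, 0.75, 0.95]
cutoffs = [iq.inv_cdf(p) for p in break_percentiles]
# Roughly 75.3, 89.9, 110.1, and 124.7.


def cognitive_class(score):
    """Return 'I' (top 5 percent) through 'V' (bottom 5 percent)."""
    labels = ["V", "IV", "III", "II", "I"]
    # Count how many break points the score meets or exceeds.
    index = sum(score >= c for c in cutoffs)
    return labels[index]


# Share of the population in each class, from Class V up to Class I.
edges = [0.0] + break_percentiles + [1.0]
shares = [round(hi - lo, 2) for lo, hi in zip(edges, edges[1:])]
# [0.05, 0.2, 0.5, 0.2, 0.05]

# Assumption: a population of about 250 million.
assumed_population = 250_000_000
class_one_headcount = assumed_population * 5 // 100  # 12,500,000

for score in (70, 80, 100, 115, 130):
    print(score, cognitive_class(score))
```

Under these assumptions an IQ of 100 falls in Class III, an IQ of about 125 or higher falls in Class I, and an IQ of about 75 or lower falls in Class V.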
PRESENTING STATISTICAL RESULTS
The basic tool for multivariate analysis in the social sciences is known as regression analysis.9 The many forms of regression analysis have a common structure. There is a result to explain, the dependent variable. There are some things that might be the causes, the independent variables. Regression analysis tells how much each cause actually affects the result, taking the role of all the other hypothesized causes into account—an enormously useful thing for a statistical procedure to do, hence its widespread use.
In most of the chapters of Part II, we will be looking at a variety of social behaviors, ranging from crime to childbearing to unemployment to citizenship. In each instance, we will look first at the direct relationship of cognitive ability to that behavior. After observing a statistical connection, the next question to come to mind is, What else might be another source of the relationship?
In the case of IQ, the obvious answer is socioeconomic status. To what extent is this relationship really founded on the social background and economic resources that shaped the environment in which the person grew up—the parents’ socioeconomic status (SES)—rather than intelligence? Our measure of SES is an index combining indicators of parental education, income, and occupational prestige (details may be found in Appendix 2). Our basic procedure has been to run regression analyses in which the independent variables include IQ and parental SES.10 The result is a statement of the form: “Here is the relationship of IQ to social behavior X after the effects of socioeconomic background have been extracted,” or vice versa. Usually this takes the analysis most of the distance it can sensibly be pushed. If the independent relationship of IQ to social behavior X is small, there is no point in looking further. If the role of IQ remains large independent of SES, then it is worth thinking about, for it may cast social behavior and public policy in a new light.
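The logic of entering IQ and parental SES together can be illustrated with a small simulation. This is a sketch under loudly stated assumptions: the data are synthetic, the "true" effects (0.5 for IQ, 0.1 for SES) are invented, ordinary least squares stands in for whatever form of regression a given analysis calls for, and the helper `ols` is written here for the example rather than taken from any statistical package.

```python
import random


def ols(columns, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    columns: list of predictor lists; an intercept is added automatically.
    Returns [intercept, b1, b2, ...].
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


random.seed(0)
n = 2000
# Synthetic standardized scores; IQ is built to correlate with SES.
ses = [random.gauss(0, 1) for _ in range(n)]
iq = [0.4 * s + random.gauss(0, 1) for s in ses]
# Invented "true" effects: IQ 0.5, SES 0.1.
outcome = [0.5 * q + 0.1 * s + random.gauss(0, 1) for q, s in zip(iq, ses)]

ses_only = ols([ses], outcome)   # SES alone also picks up part of IQ's role
both = ols([iq, ses], outcome)   # each coefficient nets out the other

print(f"SES alone:         {ses_only[1]:.2f}")
print(f"SES, IQ included:  {both[2]:.2f}")
print(f"IQ, SES included:  {both[1]:.2f}")
```

In this simulation the coefficient on SES shrinks once IQ enters the equation, because SES alone was partly standing in for IQ. That is the sense in which a regression states the relationship of one variable to the outcome after the effects of the other have been extracted.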
What Is a Variable?
The word variable confuses some people who are new to statistics, because it sounds as if a variable is something that keeps changing. In fact, it is something that has different values among the members of a population. Consider weight as a variable. For any given observation, weight is a single number: the number of pounds that an object weighed at the time the observation was taken. But over all the members of the sample, weight has different values: It varies, hence it is a variable. A mnemonic for keeping “independent” and “dependent” straight is that the dependent variable is thought to “depend on” the values of the independent variables.
But What About Other Explanations?
We do not have the choice of leaving the issue of causation at that, however. Because intelligence has been such a taboo explanation for social behavior, we assume that our conclusions will often be resisted, if not condemned. We can already hear critics saying, “If only they had added this other variable to the analysis, they would have seen that intelligence has nothing to do with X.” A major part of our analysis accordingly has been to anticipate what other variables might be invoked and to see whether they do in fact attenuate the relationship of IQ to any given social behavior. This was not a scattershot effort. For each relationship, we asked ourselves if evidence, theory, or common sense suggests another major causal story. Sometimes it did. When looking at whether a new mother went on welfare, for example, it clearly was not enough to know the general socioeconomic background of the woman’s parents. It was also essential to examine her own economic situation at the time she had the baby: Whatever her IQ is, would she go on welfare if she had economic resources to draw on?
At this point, however, statistical analysis can become a bottomless pit. It is not uncommon in technical journals to read articles built around the estimated effects of a dozen or more independent variables. Sometimes the entire set of variables is loaded into a single regression equation. Sometimes sets of equations are used—modeling even more complex relationships, in which all the variables can exert mutual effects on one another.
Why should we not press forward? Why not also ask if religious background has an effect on the decision to go on welfare, for example? It is an interesting question, as are fifty others that might come to mind. Our principle was to explore additional dynamics when there was another factor that was not only conceivably important but for clear logical reasons might be important because of dynamics having little or nothing to do with IQ. This last proviso is crucial, for one of the most common misuses of regression analysis is to introduce an additional variable that in reality is mostly another expression of variables that are already in the equation.
The Special Case of Education
Education posed a special and continuing problem. On the one hand, education can be important independent of cognitive ability. For example, education tends to delay marriage and childbirth because the time and commitment involved in being in school competes with the time and commitment it takes to be married or have a baby. Education shapes tastes and values in ways that are independent of the cognitive ability of the student. At the same time, however, the role of education versus IQ as calculated by a regression equation is tricky to interpret, for four reasons.
First, the number of years of education that a youth gets is caused to an important degree by both the parents’ SES and the youth’s own academic ability. In the NLSY, for example, the correlations of years of education with parental SES and with the youth’s IQ are +.50 and +.64, respectively. This means that when years of education is used as an independent variable, it is to some extent expressing the effects of SES and IQ in another form.
Second, any role that education plays independent of intelligence is likely to be discontinuous. For example, it may make a big difference to many outcomes that a person has a college degree. But how is one to interpret the substantive difference between one year of college and two? Between one year of graduate school and two? They are unlikely to be nearly as important as the difference between “a college degree” and “no college degree.”
Third, variables that are closely related can in some circumstances produce a technical problem known as multicollinearity, whereby the solutions produced by regression equations are unstable and often misleading.
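One standard way to gauge how much a correlation between predictors destabilizes the estimates is the variance inflation factor (VIF). The sketch below covers only the simplest case, a regression with exactly two predictors, where the VIF for either one is 1 / (1 - r^2) and r is the correlation between them. The function name is invented for the example, and the values .90 and .99 are hypothetical, included only to show how quickly the problem grows.

```python
def variance_inflation_factor(r):
    """VIF for either of two predictors whose correlation is r.

    The sampling variance of each coefficient is multiplied by this
    factor relative to the case of uncorrelated predictors, so the
    standard error is multiplied by its square root.
    """
    return 1.0 / (1.0 - r ** 2)


# .50 and .64 are the correlations of years of education with parental
# SES and with IQ reported for the NLSY; .90 and .99 are hypothetical.
for r in (0.50, 0.64, 0.90, 0.99):
    vif = variance_inflation_factor(r)
    print(f"r = {r:.2f}: VIF = {vif:.2f}, standard error x {vif ** 0.5:.2f}")
```

At a correlation of .64 the variance of each coefficient is inflated by a factor of about 1.7. As the correlation approaches 1, the factor grows without bound, which is the instability the term multicollinearity refers to.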
Fourth and finally, to take education’s regression coefficient seriously tacitly assumes that intelligence and education could vary independently and produce similar results. No one can believe this to be true in general: indisputably, giving nineteen years of education to a person with an IQ of 75 is not going to have the same impact on life as it would for a person with an IQ of 125. The effects of education, whatever they may be, depend on the coexistence of suitable cognitive ability in ways that often require complex and extensive modeling of interaction effects. Once again, these are problems that we hope others will take up but that would push us far beyond the purposes of this book.
Our solution to this situation is to report the role of cognitive ability for two subpopulations of the NLSY that each have the same level of education: a high school diploma, no more and no less in one group; a bachelor’s degree, no more and no less, in the other. This is a simple, but we believe reasonable, way of bounding the degree to which cognitive ability makes a difference independent of education.
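Holding education constant by subsample, rather than by adding it to the regression equation, amounts to a simple filter followed by the usual comparison. The records, field names, and poverty flags below are hypothetical and exist only to show the mechanics.

```python
from collections import namedtuple

Person = namedtuple("Person", "highest_degree iq in_poverty")

# Hypothetical records, not NLSY data.
people = [
    Person("high school diploma", 88, True),
    Person("high school diploma", 95, False),
    Person("high school diploma", 104, False),
    Person("high school diploma", 112, False),
    Person("bachelor's degree", 101, False),
    Person("bachelor's degree", 118, False),
    Person("some college", 99, True),       # excluded: more than a diploma
    Person("master's degree", 125, False),  # excluded: more than a B.A.
]


def subsample(records, degree):
    """Keep only people whose highest credential is exactly `degree`."""
    return [p for p in records if p.highest_degree == degree]


def poverty_rate(records):
    """Fraction of the given records flagged as in poverty."""
    return sum(p.in_poverty for p in records) / len(records)


high_school_only = subsample(people, "high school diploma")
college_only = subsample(people, "bachelor's degree")

# Within one education level, any remaining difference by IQ cannot be
# attributed to differences in the credential attained.
below_100 = [p for p in high_school_only if p.iq < 100]
at_or_above_100 = [p for p in high_school_only if p.iq >= 100]

print(poverty_rate(below_100), poverty_rate(at_or_above_100))  # 0.5 0.0
```

The design choice is to trade sample size for interpretability: each subsample is smaller than the full sample, but the comparison within it does not rest on a regression coefficient for education.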
We walk through all three of these basics (the NLSY, the five cognitive classes, and the format for the statistical analysis) step by step in the next chapter, where we use poverty to set the stage for the social behaviors to follow. Chapter 6 returns to education, this time discussing not just how far people got but also the comparative roles of IQ and SES in determining how far someone gets in school. Then, seriatim, we take up unemployment and labor force dropout (Chapter 7), single-parent families and illegitimacy (Chapter 8), welfare dependency (Chapter 9), parenting (Chapter 10), crime (Chapter 11), and civic behavior (Chapter 12).
In these eight chapters, we limit the analysis to whites, and more specifically to non-Latino whites.11 This is, we think, the best way to make yet another central point: Cognitive ability affects social behavior without regard to race or ethnicity. The influence of race and ethnicity is deferred to Part III.
Chapter 5
Poverty
Who becomes poor? One familiar answer is that people who are unlucky enough to be born to poor parents become poor. There is some truth to this. Whites, the focus of our analyses in the chapters of Part II, who grew up in the worst 5 percent of socioeconomic circumstances are eight times more likely to fall below the poverty line than those growing up in the top 5 percent of socioeconomic circumstances. But low intelligence is a stronger precursor of poverty than low socioeconomic background. Whites with IQs in the bottom 5 percent of the distribution of cognitive ability are fifteen times more likely to be poor than those with IQs in the top 5 percent.
How does each of these causes of poverty look when the other is held constant? Or to put it another way: If you have to choose, is it better to be born smart or rich? The answer is unequivocally “smart.” A white youth reared in a home in which the parent or parents were chronically unemployed, worked at only the most menial of jobs, and had not gotten past ninth grade, but of just average intelligence (an IQ of 100), has nearly a 90 percent chance of being out of poverty by his or her early 30s. Conversely, a white youth born to a solid middle-class family but with an IQ equivalently below average faces a much higher risk of poverty, despite his or her more fortunate background.
When the picture is complicated by adding the effects of sex, marital status, and years of education, intelligence remains more important than any of them, with marital status running a close second. Among people who are both smart and well educated, the risk of poverty approaches zero. But it should also be noted that young white adults who marry are seldom in poverty, even if they are below average in intelligence or education. Even in these more complicated analyses, low IQ continues to be a much stronger precursor of poverty than the socioeconomic circumstances in which people grow up.
We begin with poverty because it has been so much at the center of concern about social problems. We will be asking, “What causes poverty?” focusing on the role that cognitive ability might play. Our point of departure is a quick look at the history of poverty in the next figure, which scholars from the Institute for Research on Poverty have now enabled us to take back to the 1930s.1
The Bell Curve: Intelligence and Class Structure in American Life Page 16