The Neuroscience of Intelligence
Most theories about factors of intelligence start with the empirical observation that all tests of mental abilities are positively correlated with each other. This is called the “positive manifold” and Charles Spearman first described it over 100 years ago (Spearman, 1904). Spearman worked out statistical procedures for identifying the relationships among tests based on their correlations with one another. The basic method is called factor analysis. It works essentially by analyzing correlations among tests. You probably already know about correlations, but see the brief review in Textbox 1.1.
Textbox 1.1: Correlations
Many of you know about correlations. Because they are ubiquitous throughout this book, here is a brief explanation so everyone starts with an understanding of the concept. Let’s say we measure height and weight in many people. We can graph each person by locating the height and weight as a single point with height ranges on the y-axis and weight ranges on the x-axis. When we add points on the graph for each person, we begin to see an association. Taller people tend to weigh more. You can see this in Figure 1.2. This association is obvious without needing to plot the points, but associations between other variables are not so obvious. Moreover, correlations quantify the strength of association.
If height and weight were perfectly related, the points would all fall on a straight line and we could predict one from the other without error. A correlation has a value of plus 1 if a high value on one variable goes perfectly with a high value on the other variable. A strong but not perfect positive correlation is shown in Figure 1.2. A perfect negative correlation is where a high value on one variable predicts a low value on the other without error. A strong but not perfect negative correlation (also called an inverse correlation) is also shown in Figure 1.2. A perfect negative correlation has a value of minus 1. In the Figure 1.2 example, the higher the family income, the lower the rate of infant mortality. Finally, in Figure 1.2, the bottom panel shows no relationship at all (zero correlation) between height and hours of video game playing.
Figure 1.2 Example of a positive correlation is on the top left, showing that as height increases weight also increases. A negative correlation is shown on the top right, showing that as family income goes up infant mortality goes down (simulated data). No correlation between height and hours spent playing video games is shown on the bottom. For all of these scatterplots, each circle is a data point. The solid line shows a perfect correlation; the amount that points scatter above and below this line is used to calculate the correlation (courtesy Richard Haier).
Correlations between two variables are calculated based on how much each point deviates from the perfect line. The higher the correlation, positive or negative, the stronger the relationship and the better one variable predicts the other. Correlations always fall between plus and minus 1. Here is a critical point. A correlation between two variables does not mean one causes the other. The correlation only means there is a relationship such that as one goes up, the other tends to go up (a positive correlation) or down (a negative correlation). To repeat, correlation does not mean causality. Two variables may be correlated with each other even though neither causes the other. For example, salt consumption and cholesterol level in the blood may be somewhat correlated, but that does not mean one causes the other. The correlation could be caused by a third factor common to both, like poor diet.
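As a rough illustration of how a correlation is calculated, the following Python sketch uses the standard Pearson formula, which is one common way to quantify the association. The height and weight values are invented for illustration and are not the data in Figure 1.2; the function name `pearson_r` is also just an assumption for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean (how the variables vary together).
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (how much each variable varies on its own).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(var_x * var_y)

# Invented data: heights in cm and weights in kg for six people.
heights = [152, 160, 165, 171, 178, 185]
weights = [50, 58, 61, 70, 74, 88]

r = pearson_r(heights, weights)
print(round(r, 2))
```

With these made-up numbers the result is a strong positive correlation, close to but below plus 1. Reversing the order of one list would produce a negative value of the same size.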
Factor analysis is based on the pattern of correlations among multiple variables. In our case we are interested in the correlations among different tests of mental abilities. So the point of factor analysis is to identify what tests go with other tests, based not on content, but rather on correlations of scores irrespective of content. The set of tests that go with each other define a factor because they have something in common that causes the correlation. Studies in this field typically apply factor analysis to data sets where hundreds or thousands of people have completed dozens of tests.
There are many forms of factor analysis, but this is the basic concept and it is the basis for models of the structure of mental abilities like the pyramid described in Figure 1.1. Going back to that figure, note the correlation values show how strong the associations are among tests, factors, and g. Note that all the correlations are positive, consistent with Spearman’s positive manifold.
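To give a feel for how a common factor can be pulled out of a pattern of correlations, here is a minimal Python sketch. It is an assumption-laden simplification: it extracts the first principal component of a correlation matrix by power iteration, which is a simpler cousin of factor analysis and not the specific method used by Spearman or for Figure 1.1. The four test names and all correlation values are invented.

```python
from math import sqrt

# Invented correlation matrix for four hypothetical tests (not from Figure 1.1).
# All off-diagonal values are positive, mimicking the positive manifold.
TESTS = ["reasoning", "spatial", "vocabulary", "memory"]
R = [
    [1.00, 0.60, 0.50, 0.45],
    [0.60, 1.00, 0.40, 0.35],
    [0.50, 0.40, 1.00, 0.30],
    [0.45, 0.35, 0.30, 1.00],
]

def first_component_loadings(corr, iterations=200):
    """Approximate loadings on the first principal component by power iteration."""
    n = len(corr)
    v = [1.0 / sqrt(n)] * n  # start from a unit-length vector with equal weights
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply the correlation matrix by the current vector.
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Because v has unit length, the length of w estimates the largest eigenvalue.
        eigenvalue = sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    # A loading is the eigenvector entry scaled by the square root of the eigenvalue.
    return [x * sqrt(eigenvalue) for x in v]

loadings = first_component_loadings(R)
for name, loading in zip(TESTS, loadings):
    print(f"{name}: {loading:.2f}")
```

Each printed loading can be read as the correlation between a test and the common component. In this invented matrix the reasoning test correlates most strongly with the other tests, so it ends up with the highest loading.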
Let’s look at some details of this example in Figure 1.1. The reasoning factor is related to g with the strongest correlation of .96. This indicates that the reasoning factor is the strongest factor related to g, so tests of reasoning are regarded as among the best estimates of g. Another way of saying this is that reasoning tests have high g-loadings. Note that test #1 has the single highest loading of .93 on the reasoning factor, so it might provide the single best estimate of g if only one test is used rather than a battery of tests. The second strongest correlation is between the spatial ability factor and g. It turns out that spatial ability tests also are good estimates of g. The vocabulary factor is fairly strong at .74, followed by the other factors, including memory. In this example, memory tests are good but not the best estimators of g with a correlation of .80, although other research shows much stronger correlations between working memory and g (see Section 6.2).
1.4 Alternative Models
Other statisticians and researchers worked out alternative factor analysis methods. The details don’t concern us, but different factor analysis models of intelligence were derived using these various methods. Each identified a different factor structure for intelligence. These various factors emphasize that the g-factor alone is not the whole story about intelligence; no intelligence researcher ever asserted otherwise or claimed that a single score captures all aspects of intelligence. The other broad factors and specific mental abilities are important. Depending on how researchers derive factors from a battery of tests, a different number of factors secondary to g emerge. In the pyramid structure diagram example there are five broad factors. Another widely used model is based on only two core factors: crystallized intelligence and fluid intelligence (Cattell, 1971, 1987). Crystallized intelligence refers to the ability to learn facts and absorb information based on knowledge and experience. This is the kind of intelligence shown by some savants. Fluid intelligence refers to inductive and deductive reasoning for novel problem-solving. This is the kind of intelligence we associate with Einstein or Newton. Measures of fluid intelligence typically are highly correlated to measures of g, and the two are often used synonymously. Crystallized intelligence is relatively stable over the life span with little deterioration with age, whereas fluid intelligence decreases slowly with age. The distinction between fluid and crystallized intelligence is widely recognized as an important evolution in the definition of intelligence. Both are related, so they are not in conflict with the g-factor. They represent factors just below g in the pyramid structure of mental abilities.
Another factor analysis model focuses on three core factors, verbal, perceptual, and spatial rotation, in addition to g (Johnson & Bouchard, 2005). There are also models with less empirical evidence like those of Robert Sternberg (Gottfredson, 2003a; Sternberg, 2000, 2003) and Howard Gardner (Gardner, 1987; Gardner & Moran, 2006; Waterhouse, 2006) that de-emphasize or ignore the g-factor. Virtually all of the neuroscience studies of intelligence, however, use various measures with high g-loadings. We will focus on these, but also include several neuroscience studies that investigate factors and specific abilities other than g.
1.5 Focus on the g-Factor
g is the basis of most intelligence assessment used in research today. It is not the same as IQ, but IQ scores are good estimates of g because most IQ tests are based on a battery of tests that sample many mental factors, an important aspect of g. Many of the controversies about intelligence have their origins in confusion about how we use words like mental abilities, intelligence, the g-factor, and IQ. Figure 1.3 shows a diagram that will help clarify how I use these words throughout this book.
Figure 1.3 Conceptual relationships among mental abilities, intelligence, IQ, and the g-factor (The Intelligent Brain, © 2013 The Teaching Company, LLC. Reproduced with permission of The Teaching Company, LLC, www.thegreatcourses.com).
We have many mental abilities: all the things you can think of, from multiplying in your head to picking stocks to naming state capitals. The large circle in Figure 1.3 represents all mental abilities. Intelligence is a catch-all word that means the mental abilities most related to responding to everyday problems and navigating the environment as per the APA and the Gottfredson definitions. The circle labeled intelligence is smaller than all mental abilities. IQ is a test score based on a subset of the mental abilities that relate to everyday intelligence. The IQ circle is a fairly large part of the intelligence circle because IQ is a good predictor of everyday intelligence. This circle also includes broad factors such as in the diagram of the pyramid structure in Figure 1.1. We will detail more about IQ in the next section. Finally, the g-factor is what is common to all mental abilities. The g-factor is a fairly large part of IQ. Whereas everyday intelligence and IQ test scores can be influenced by many factors, including social and cultural ones, the g-factor is thought to be more biological and genetic, as we will see in the next chapters.
The savant examples described earlier speak to the level of very specific abilities with little if any g in many cases, like Kim and Derek. They show that powerful independent abilities can exist, but they also show the problems when g is lacking. The IBM computer Watson demonstrates a specific ability to analyze verbal information and solve problems based on the meaning of words. This is an amazing accomplishment, but, in my view, Watson does not show the g-factor. Watson is more like Kim Peek than Albert Einstein, … at least for now.
The savant examples are exceedingly rare cases. Most people have g and independent factors to varying degrees, and two people with the same level of g can have different patterns of mental strengths and weaknesses across different mental abilities. Can we ever hope to learn how savants do amazing mental feats, and why we can’t? Is it possible that we all have the potential to memorize 22,514 digits or the potential for musical or artistic genius? And, why are some people just smarter than others? Does everyone have equal potential for learning all subjects? There are many questions and, as in every scientific field, answers depend entirely on measurement.
1.6 Measuring Intelligence and IQ
IQ is what most people associate with measuring intelligence. Criticism of IQ and all mental tests is widespread and has been so for decades (Lerner, 1980). It is worth remembering that the concept of testing mental ability arose to help children get special education. It is also worth stating that intelligence tests are regarded as one of the great achievements of psychology despite many concerns. Let’s briefly discuss both these points. Informative, detailed discussions about IQ testing are also found in two recent textbooks (Hunt, 2011; Mackintosh, 2011).
In the early part of the twentieth century, the Minister of Education in France was concerned about identifying children with low school achievement who needed special attention. The problem was how to distinguish children who were “mentally defective” from other children who were low achievers due to behavioral or other reasons. They wanted the distinction to be made objectively by means of testing so a teacher could not assign a child with discipline issues to a special school as a punishment, as apparently was somewhat common at the time.
In this context, Alfred Binet and his collaborator Theodore Simon devised the first IQ test to identify children who mentally could not benefit from ordinary school instruction. So the IQ test was born as an objective means for identifying low mental ability in children so they could get special attention and to identify children erroneously sent to special schools not because of low mental ability but as a punishment for bad behavior. Both goals were admirable.
The test constructed by Binet and Simon consisted of several subtests that sampled different mental abilities, with an emphasis on tests of judgment because Binet felt that judgment was a key aspect of intelligence. He gave each test to many children and developed average scores for each age and sex. He then was able to say at what age level any individual child scored. This was called the child’s mental age. A German psychologist named William Stern took the concept of mental age a step further. He divided mental age by chronological age. This resulted in an IQ score that was the ratio of a child’s mental age (averaged across all the subtests) divided by the child’s chronological age. Multiplying this ratio by 100 avoided fractions.
For example, if a child was reading at the level of an average 9-year-old, the child’s mental age was nine. If this child actually had a chronological age of 9, the IQ would be (9 ÷ 9) × 100 = 100. If a child had a mental age of 10, but was only 9 years old, the IQ would be (10 ÷ 9) × 100 ≈ 111. A 9-year-old with a mental age of 8 would have an IQ of (8 ÷ 9) × 100 ≈ 89.
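The ratio calculation in these examples can be written as a short Python sketch. The function name `ratio_iq` is an assumption for illustration, and rounding to the nearest whole number is assumed because the examples report whole-number IQs.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, multiplied by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    # Round to a whole number, as in the worked examples.
    return round(mental_age / chronological_age * 100)

print(ratio_iq(9, 9))   # 100
print(ratio_iq(10, 9))  # 111
print(ratio_iq(8, 9))   # 89
```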
The point of these early tests was to find children who were not doing so well in school relative to their peers, and get them special attention. The Binet–Simon test actually worked reasonably well for this purpose. However, one problem with the concept of mental age is that it is hard to assess after about age 16. Can we really see a mental age difference between a 19-year-old and a 21-year-old? We’re not talking about maturity here. The mental age of a 30-year-old really isn’t much different than that of a 40-year-old, so the Binet–Simon test was not really useful or intended for adults.
However, there is a much more important measurement problem to keep in mind. Note that the IQ score is a measure of a child relative to their peers. Even today, newer IQ tests based on a different calculation, discussed below, show how an individual scores relative to his or her peers. IQ scores are not absolute measures of a quantity, like pints of water or kilometers of distance. IQ scores are meaningful only relative to other people. Note that intelligence differences among people are quite real, but our methods of measuring these differences depend on test scores that are interpretable only in a relative way. We will elaborate this key point shortly and return to it throughout this book.
Nonetheless, the Binet–Simon test was an important advance for assessing the abilities of children in an objective way. The Binet–Simon test was translated to English and redone at Stanford University in the 1920s by Professor Lewis Terman and is now known as the Stanford–Binet test. Professor Terman used very high IQ scores from this test to identify a sample for a longitudinal study of “genius” and we will discuss it in Section 1.10.4.
The Wechsler Adult Intelligence Scale, or the WAIS, was designed with subtests like the Stanford–Binet, but as its name states, it was designed for adults. It is the most widely used intelligence test today. The current version consists of a battery of 10 core subtests and another five supplemental subtests. Together, they sample a broad range of mental abilities. One key change is in the way IQ is calculated in both the WAIS and the Stanford–Binet tests. Mental age is no longer used. IQ is now based on the statistical properties of the normal distribution and deviation scores. The concept is simple: How far from the norm does an individual’s score deviate?
Here’s how deviation scores work. Let’s start with the properties of a normal distribution (also called a bell curve because of its shape), as shown in Figure 1.4.
Figure 1.4 The normal distribution of IQ scores and the percent of people within each level (courtesy Richard Haier).
Many variables and characteristics such as height or income or IQ scores are normally distributed in large populations of randomly selected individuals. Most people have middle values and the number of individuals decreases toward the low and high extremes of the distribution. Any normal distribution has specific statistical properties in that any individual score can be expressed as a percentile relative to other people. This is shown in the illustration of IQ scores where the mean score is 100 and the standard deviation is 15 points. Standard deviations show the degree of spread around the mean and are calculated as a function of how much each person deviates from the group mean. In a normal distribution, 50% of people score below 100. Sixty-eight percent of individuals fall between plus one and minus one standard deviations, so scores between 85 and 115 are regarded as the range of average IQ. A score of 130, two standard deviations above the mean, would be at about the 98th percentile, which is the top 2%. A score of 70 would be two standard deviations below the mean and represent about the 2nd percentile. A score of 145 represents the top one-tenth of 1%. Scores over 145 are often considered to be in the genius range, although few tests are accurate at this extreme high end of the distribution.
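These percentile figures follow from the mathematics of the normal distribution, and they can be checked with a few lines of Python. This sketch assumes a perfectly normal distribution with a mean of 100 and a standard deviation of 15, as stated in the text; the helper name `percentile` is an assumption for illustration.

```python
from statistics import NormalDist

# Mean of 100 and standard deviation of 15, as stated in the text.
iq_scale = NormalDist(mu=100, sigma=15)

def percentile(iq):
    """Percent of the population expected to score below the given IQ."""
    return iq_scale.cdf(iq) * 100

for score in (70, 85, 100, 115, 130, 145):
    print(score, round(percentile(score), 1))

# Share of people between one standard deviation below and above the mean.
within_one_sd = iq_scale.cdf(115) - iq_scale.cdf(85)
print(round(within_one_sd * 100, 1))
```

Under these assumptions, a score of 130 falls near the 98th percentile and about 68% of scores lie between 85 and 115, in line with the figures above. Real test data only approximate the ideal curve, especially at the extremes.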
IQ tests were developed so scores would be normally distributed. Each subtest has been taken by a large number of males and females of different ages. These are the norm groups. Each norm group has an average score called the mean, and the spread of scores around the mean is measured by a statistic called the standard deviation (sd).
Let’s say a subtest has a perfect possible score of 20 points. Each norm group may have a different average score on this test depending, say, on age. Younger test takers may average 8 points if they are 10 years old, and older people taking the same test, say at age 12, may average a score of 14 points. This is why it’s important to have norm groups for each age. If a new 12-year-old takes the subtest and scores 14, he is scoring at the average for his age. If he scores above or below 14, the deviation from the norm average can be calculated and his score can be expressed by how much it deviates from the mean. The average deviation across all the subtests is used to calculate the deviation IQ for the full battery. As illustrated, deviation scores are easily convertible into percentiles.
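The logic of deviation scores can be sketched in Python as follows. This is a simplified illustration, not the scoring procedure of the WAIS or any published test, which rely on their own norm tables and scaled scores. The norm-group mean of 14 comes from the example above; the standard deviations, the other subtest values, and the function names are all invented.

```python
from statistics import mean

def z_score(raw, norm_mean, norm_sd):
    """How many standard deviations a raw score lies from the norm-group mean."""
    return (raw - norm_mean) / norm_sd

def deviation_iq(subtest_results):
    """Average the deviations across subtests, then rescale to mean 100, sd 15.

    Each item is a tuple of (raw score, norm-group mean, norm-group sd).
    """
    avg_z = mean(z_score(raw, m, sd) for raw, m, sd in subtest_results)
    return 100 + 15 * avg_z

# Hypothetical 12-year-old taking three subtests (all sd values are invented).
results = [
    (14, 14, 3),  # exactly at the age-group average
    (17, 14, 3),  # one standard deviation above the average
    (12, 10, 2),  # one standard deviation above the average on another subtest
]
print(round(deviation_iq(results)))
```

A test taker who scores exactly at the norm-group mean on every subtest comes out at 100 in this sketch, and scores above or below the norm shift the result up or down accordingly.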