4.3 Predicting IQ From Brain Images
Imagine if colleges and universities gave applicants for admission a choice between submitting either SAT scores or a brain image. As discussed in Chapter 1, SAT scores are a good estimate of general intelligence and that is an important reason they are good predictors of academic success. Can a better estimate of intelligence or predictor of academic success be extracted from a brain image? This is an empirical question, and a positive answer is probably far less scary than you might think. In fact, brain images are likely to be more objective, especially structural images, and not sensitive to a host of factors that potentially can influence psychometric test scores, like motivation or anxiety. Whether you are a good test-taker or not is irrelevant for getting a scan. Brain images generally are less expensive than SAT preparation courses or formal IQ testing and getting a brain image is far less time-consuming. There is no preparation, you spend about 20 minutes in the scanner, and you can have a nap during structural image acquisition. Still not interested in this possibility?
Whether or not there are any practical applications for predicting IQ from brain images, the ability to do so would signal a more advanced understanding of brain/intelligence relationships than we have currently. In fact, predicting IQ from neuroscience measures like those obtained from neuroimaging is one of two major goals of intelligence research. The other one is the ability to manipulate brain variables to enhance IQ, and that one will be tackled in the next chapter.
There is a long history of trying to predict IQ from brain measurements that dates back to the early EEG research of the 1950s and 1960s. At least one patent to do so was issued in 1974 (US 3,809,069). In 2004, a group from the University of New Mexico, including my colleague Rex Jung, obtained a patent (US 6,708,053 B1) to measure IQ based on neurochemical signatures in the brain assessed by MRI spectroscopy. This claim was derived from their research correlating IQ to N-acetylaspartate in a single brain area (see Chapter 3) (Jung et al., 1999a, 1999b). In 2006, a group from South Korea filed a patent application to measure IQ from a combination of structural and functional MRI assessments and that patent eventually was issued in 2012 (US 8,301,223 B2). Their patent is supported by previous research, including our MRI work (Haier et al., 2004), and by research from the South Korean group that reports correlations between predicted IQ scores and actual IQ scores for different samples (Choi et al., 2008; Yang et al., 2013). Let me note that no commercial potential for these patents is apparent to me at this time. In my view, none of these patents represents an immediate threat to the publishers of the SAT or the WAIS IQ test because their validity has yet to be established in any large-scale, independent replication trials. I am doubtful that such studies will be positive for these specific methods. This is because predicting an individual’s IQ based on group average data is quite difficult. Nonetheless, I am optimistic that neuroimaging-based IQ prediction may be possible. My skepticism and my optimism both are derived from my view about the importance of individual differences. Let me explain.
Conceptually, predicting IQ or any intelligence factor from neuroimaging is straightforward. Success depends on how strongly brain variables, individually or in combination, correlate to intelligence test scores. Recall from Chapter 1 that IQ scores are good estimates of the g-factor because they are a combination of scores, age- and sex-corrected, on several subtests that tap different cognitive domains. Presumably, different cognitive domains require different brain networks so various brain measures for different domains might be combined to predict IQ. As we noted in Chapter 3, whole brain size has a modest correlation with IQ. It is not strong enough for brain size alone to substitute for IQ, but certainly the correlation is a base to build on.
There are a number of statistical approaches for combining measures to make a prediction. The most basic one applied for predicting IQ scores is the multiple regression equation. This method and related versions of it use the correlation between IQ and each measure after removing common relationships between measures. For example, if variable A correlates to IQ and so do variables B and C, they cannot be simply combined without first statistically removing the common variance between A and B, A and C, and B and C. The remaining correlations to IQ for each variable are called partial correlations. Regression equations combine partial correlations between each variable and IQ along with computing weights for each variable that maximize the IQ prediction. In the ABC example, A might be weighted more than B and B might be weighted more than C in order to make the strongest IQ prediction. The resulting equation can then be applied to a new person’s data and an IQ score predicted. The correlation between a predicted IQ score and the actual IQ score for a large group of individuals must be nearly perfect for the equation to be acceptable as a substitute for an actual IQ score. It is not sufficient if an equation produces a statistically significant correlation between the predicted IQ and the actual IQ. Whenever a regression equation is calculated in a research sample, the exact same equation must be applied to an independent sample so the predicted and actual IQ score correlation can be replicated. This is cross-validation of the equation and this step is required because the original equation may produce a spuriously high correlation by incorporating chance effects, especially in small samples. In our research, we have tried the regression approach on several occasions, but each regression equation failed on cross-validation so neither publication nor patent was attempted.
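The regression and cross-validation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis used in any of the studies cited: the data are simulated, the 15 "brain measures," their weights, the shared component, and the sample sizes are all hypothetical, and ordinary least squares stands in for whatever specific regression variant a given study might use. The point is simply that an equation fitted in a small sample should be applied unchanged to an independent sample before its predicted/actual correlation is trusted.

```python
import numpy as np

def fit_regression(X, y):
    """Ordinary least squares with an intercept; returns the weight vector."""
    design = np.column_stack([np.ones(len(X)), X])
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights

def predict(weights, X):
    """Apply a previously fitted equation, unchanged, to a set of people."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ weights

def correlation(a, b):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(a, b)[0, 1])

def simulate_sample(rng, n_people, true_weights, noise_sd=12.0):
    """Hypothetical data: correlated brain measures plus noisy IQ scores."""
    n_measures = len(true_weights)
    shared = rng.standard_normal((n_people, 1))  # variance common to all measures
    X = 0.6 * shared + rng.standard_normal((n_people, n_measures))
    iq = 100.0 + X @ true_weights + rng.normal(0.0, noise_sd, n_people)
    return X, iq

rng = np.random.default_rng(0)
# Assumption: only the first three measures (A, B, C) truly relate to IQ.
true_weights = np.array([5.0, 3.0, 2.0] + [0.0] * 12)

X_research, iq_research = simulate_sample(rng, 40, true_weights)  # small research sample
X_new, iq_new = simulate_sample(rng, 400, true_weights)           # independent sample

weights = fit_regression(X_research, iq_research)
r_original = correlation(predict(weights, X_research), iq_research)
r_cross_validated = correlation(predict(weights, X_new), iq_new)

print(f"predicted vs. actual IQ, research sample:    r = {r_original:.2f}")
print(f"predicted vs. actual IQ, independent sample: r = {r_cross_validated:.2f}")
```

With many measures and few people, the first correlation tends to be inflated by chance effects, and the second is the more honest estimate of how well the equation predicts.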
So far, to the best of my knowledge, none of the patented methods of predicting IQ from brain measurements has achieved this crucial step. One recent paper attempted to predict IQ scores from structural MRIs collected from different sites (Wang et al., 2015). They reported good correlations with two different regression models that incorporated gray and white matter, but there was no cross-validation in independent samples, participants (N = 164) ranged in age from 6 to 15 years old, sex effects were not investigated, and the IQ testing was not clearly described. They identified 15 brain areas that were included in the prediction, but there was no attempt to integrate these areas with the PFIT or any other intelligence framework, and the areas are not generally found in other imaging studies of intelligence. At best, without independent replication the findings are tenuous. The regression models are of interest, but it is too early to judge whether the analyses will have predictive validity, as the authors hope. Here is their final sentence: “It should be emphasized again that our work paves a new way for research on predicting an infant’s future IQ score by using neuroimaging data, which can be a potential indicator for parents to prepare their child’s education if needed.” It’s an optimistic view of a potential commercial market, but significantly more caution here would have been wise.
An important contribution to this literature comes from the continuing longitudinal study of children from Scotland described in Chapter 1. These researchers collected structural MRIs on 672 individuals with an average age of 73 years and representing the full range of intelligence (Ritchie et al., 2015). They used structural equation modeling, a form of regression equation, to compare four models for combining several different MRI assessments to determine which structural brain features were most related to individual differences in intelligence based on a g-factor extracted from a battery of cognitive tests. They found that the best model accounted for about 20% of the variance in the g-factor. Total brain volume was the single measure that contributed most of the predicted variance in this model. White matter along with cortical and subcortical thickness contributed some additional variance, but iron deposits and micro-bleeds did not. The main issue for future study was whether additional measurements of other brain variables like corpus callosum thickness or functional variables might add predictive variance beyond 20%. This project had a large sample and multiple cognitive measures so it is a solid study of older men and women. How the results might differ for children or younger adults is an open question for replication and cross-validation studies.
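To put "about 20% of the variance" in perspective, a brief calculation may help. The sketch below converts variance explained into the corresponding correlation and into a rough typical prediction error. It assumes, purely for illustration, that scores are expressed on the conventional IQ scale with a standard deviation of 15 points; the study itself used a g-factor extracted from a test battery, so the point values here are hypothetical. The function names are also just illustrative.

```python
import math

def correlation_from_variance_explained(r_squared):
    """A model explaining r_squared of the variance implies r = sqrt(r_squared)."""
    return math.sqrt(r_squared)

def prediction_error_sd(r_squared, score_sd=15.0):
    """Approximate standard error of estimate: SD * sqrt(1 - R^2)."""
    return score_sd * math.sqrt(1.0 - r_squared)

r = correlation_from_variance_explained(0.20)
error_sd = prediction_error_sd(0.20)  # assumes IQ-style scale, SD = 15

print(f"correlation: about {r:.2f}")                       # about 0.45
print(f"typical prediction error: about {error_sd:.1f}")   # about 13.4 points
```

Under these assumptions, a model that accounts for 20% of the variance still leaves a typical error only modestly smaller than the 15-point spread of scores in the population, which is why a statistically significant prediction is not the same as a usable substitute for a test score.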
Why would straightforward prediction approaches like the ones described fail to cross-validate? Correlations between any two variables are based on individual differences for each variable. That is, there must be variance among individuals for correlations to exist. Regression equations generally work on group data where there is variance on all the variables. In the case of intelligence, there may be many combinations of the same set of variables that predict any specific IQ equally well. For example, one set of brain variables might characterize a person with an IQ score of 130, but another person with the identical IQ score of 130 might be characterized by a different set of brain variables. In a group of 100 people all with IQs of 130, how many different sets of brain variables related to intelligence might there be? Compounding the problem, two individuals both with WAIS IQs of 130 may have very different subtest scores indicating different cognitive strengths and weaknesses despite the same overall IQ (Johnson et al., 2008a). The same problem may exist independently at several IQ levels so the brain variables that predict high IQ might be different from those that predict average or low IQ, even though the relevant genes may be the same across the entire IQ range as discussed in Chapter 2. Age and sex also could be important factors for identifying optimal sets of variables for predicting IQ.
There is another major source of difficulty. No two brains are the same structurally or functionally, even in identical twins. Virtually all brain image analysis starts with morphing each brain into a standard size and shape, called a template. This step artificially reduces individual differences in brain anatomy by creating an “average” brain. To account for the imprecision of “average” brain anatomy, analyses typically add a step of smoothing in an effort to minimize the imprecision. Nonetheless, forcing all brains into a standard space introduces error into efforts to predict IQ from images. Some template methods create more error than others. Consider a neuroimaging study of male/female differences. Should the males be standardized to a male template and the females to a female template, or should everyone be standardized to the same template? Many neuroimaging studies use a standard template supplied by the analysis software and other studies create a template from only the participants in the study. No one way is always correct, but the issue presents a problem for efforts to predict IQ. A study of 100 postmortem brains highlights the issue (Witelson et al., 2006). Their strongest finding was that 36% of differences in verbal ability scores was predicted by cerebral volume. However, age, sex, and handedness influenced regression analyses between other anatomical features and different cognitive abilities in complex ways. The authors cautioned neuroimaging researchers to take these factors into account.
When all these issues are considered, the prediction of IQ using regression methods becomes less straightforward. How many different regression equations may be necessary? This is why I am skeptical of this approach. One alternative approach may involve the use of profile analysis. This is common in personality testing where a profile of scores on different personality scales is used to characterize an individual. Profiles are used extensively to interpret personality tests like the Minnesota Multiphasic Personality Inventory (MMPI), for example. MMPI scores from the different subscales can be used in regression analyses, but the analysis of individual profiles allows comparisons of groups of people defined by similar profiles across the subscales to determine variables related to the profile type. We illustrated this approach for predicting intelligence by creating profiles of individuals based on the amount of gray matter in several PFIT areas and tried to relate the profile to IQ score (Haier, 2009b). As shown in Figure 4.10, this demonstration did not work so well. The two individuals shown with equally high IQ scores had different gray matter profiles. Nonetheless, we are now trying this profile approach with the MEG data as shown in Textbox 4.2. It is a promising approach for future study in large samples. In fact, an encouraging report used patterns of fMRI activation (including in some PFIT areas) to predict profiles of cognitive performance, while a small sample of 26 participants solved deductive reasoning problems (Reverberi et al., 2012). The analyses show the complexity of the problem, but the results illustrate my optimism that individual differences can be a solution to the complexity and not just a nuisance.
Figure 4.10 Brain profiles of two individuals both with an IQ of 132. The graphs show the amount of gray matter in eight PFIT areas identified by numbered Brodmann area (L, left; R, right). The y-axis is based on standardized gray matter scores so positive numbers show values greater than the group mean; negative numbers show values less than the group mean. Although the profile shapes are similar for these two individuals, one has substantially more gray matter in the eight areas than the other (courtesy Richard Haier).
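The kind of standardized profile plotted in Figure 4.10 can be illustrated with a short sketch. Everything here is hypothetical: the gray matter values are simulated in arbitrary units, the eight columns are generic stand-ins for brain areas rather than the specific Brodmann areas in the figure, and the two summary numbers (profile shape similarity and overall elevation) are one simple way to compare profiles, not the method used in the original demonstration.

```python
import numpy as np

def standardize(group):
    """Convert raw values to z-scores within each brain area (column),
    so positive numbers are above the group mean for that area."""
    return (group - group.mean(axis=0)) / group.std(axis=0, ddof=1)

def compare_profiles(z_a, z_b):
    """Return (shape similarity, elevation gap) for two standardized profiles.

    Shape similarity is the correlation across areas; elevation gap is the
    difference in mean z-score (how much more gray matter person A has overall).
    """
    shape_similarity = float(np.corrcoef(z_a, z_b)[0, 1])
    elevation_gap = float(z_a.mean() - z_b.mean())
    return shape_similarity, elevation_gap

rng = np.random.default_rng(1)
# Hypothetical group: 60 people by 8 areas, arbitrary gray matter units.
raw = rng.normal(loc=50.0, scale=5.0, size=(60, 8))
# Make person 1 resemble person 0 in shape but with less gray matter everywhere.
raw[1] = raw[0] - 4.0

z = standardize(raw)
shape, gap = compare_profiles(z[0], z[1])
print(f"profile shape similarity: {shape:.2f}")
print(f"elevation gap (z units):  {gap:.2f}")
```

Separating shape from elevation in this way makes explicit how two people with the same IQ can look alike on one aspect of a brain profile and quite different on another.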
There is another potential application for predicting IQ. As we discussed in Chapter 1, the g-factor definition of intelligence is sufficient for many empirical research questions, but what if we could define intelligence based on quantifiable brain measures instead of psychometric scores? If brain parameters can predict IQ, then why not define IQ in terms of brain parameters? We do not know if twice the amount of gray matter in a particular part of the cortex, for example, makes one twice as smart. We are now able to explore redefining intelligence in ways that incorporate neurometric assessments. In the next chapter, we will explore this idea further when we discuss future research possibilities.
So can intelligence be predicted from neuroimaging? The short answer is, no. The longer answer is, not yet. So far, the weight of evidence is promising but not compelling. Look at the MEG movies of brain activation patterns for different individuals correctly solving the same single problem (Textbox 4.2 links). How can these patterns be understood? Is there one particular pattern of brain variables, a unitary neuro-g, which correlates with the psychometric g-factor, or are there multiple brain patterns that imply many neuro-g factors (Haier et al., 2009)? So far, we do not know. It is my speculation, however, that should a cross-validated method become available to predict IQ or SAT scores accurately from brain images, many parents of high school students will be eager to use it and lobby institutions of higher education to do so as well. Imagine that.
Stop imagining! Just as I was finishing the final draft of this book, a remarkable new study reports that the pattern of connectivity among brain areas based on fMRI is stable within a person and unique enough to identify that person like a fingerprint (Finn et al., 2015). And, these brain fingerprints predict intelligence. This study comes from a large collaborative project that aims to map all the connections in the human brain. I have added a more detailed description of this study at the end of Section 6.4, but at this point I am ready to change the answer about predicting intelligence from neuroimaging from “not yet” to “looking good.” Very good – see more at the end of Section 6.4.
4.4 Are “Intelligence” and “Reasoning” Synonyms?
This may seem an odd question, but there is an anomaly in the research literature that deserves some consideration at this point. There is a specialization within the field of cognitive psychology that studies reasoning. Relational reasoning, inductive reasoning, deductive reasoning, analogical reasoning, and other kinds of reasoning are subjects in a variety of studies, including ones that use neuroimaging to identify brain characteristics and networks related to reasoning. The anomaly is that more than a few of these cognitive neuroscience studies of reasoning do not use the word intelligence and they often fail to cite relevant neuroimaging studies of intelligence. This is problematic because tests of reasoning are highly correlated to the g-factor (Jensen, 1998). In fact, analogy tests have some of the highest g-loadings of any mental ability tests. Obviously, this means that findings from intelligence studies are quite relevant to reasoning research and vice versa.
In my view, the artificial preference of “reasoning” over “intelligence” made by some researchers has its origins in a long-held view within cognitive psychology that the word “intelligence” is too loaded with controversy and therefore must be avoided completely. It is not unusual to find that books in the field of cognitive psychology and cognitive neuroscience do not include “intelligence” in the index. Language counts. No one is fooled by substituting “reasoning” for “intelligence,” although some granting agencies may think so.
Generally, neuroimaging studies of reasoning show network results consistent with intelligence studies, although reasoning studies tend to differentiate more components of information processing and accompanying subnetworks. This is an important difference and a positive one for identifying the salient brain components for different cognitive processes involved in intelligence factors. An excellent example is a sophisticated fMRI study that compared groups of high school students (N = 40) defined by high and average fluid intelligence scores while they performed problems of different difficulty that required geometric analogical reasoning (Preusse et al., 2011). Hypotheses were based in part on the PFIT and on brain efficiency. The authors concluded that the high-IQ students “… display stronger task-related recruitment of parietal brain regions on the one hand and a negative brain activation–intelligence relationship in frontal brain regions on the other hand … We showed that the relationship between brain activation and fluid intelligence is not mono-directional, but rather, frontal and parietal brain regions are differentially modulated by fluid intelligence when participants carry out the geometric analogical reasoning task.” The integration of reasoning and intelligence findings in this work demonstrates the richness of interpretation possibilities and helps advance the field.
Two other interesting and well-done fMRI papers investigated analogical reasoning although neither one mentioned intelligence. They appeared in a special section on “The neural substrate of analogical reasoning and metaphor comprehension” in the Journal of Experimental Psychology: Learning, Memory, and Cognition (only one of the other six papers in this section mentioned intelligence). The first example used an analogy generation task in a sample of 23 male college students and found corresponding brain activity in a region of the left frontal-polar cortex, as hypothesized (Green et al., 2012). Exploratory analyses revealed more distributed activations (their figure 3), seemingly consistent with the PFIT framework, but the discussion linked the findings to creativity, not intelligence. The left frontal-polar cortex had also been linked to g in the lesion study we described earlier (Glascher et al., 2010). Similarly, the second example reported a systematic investigation of analogical mapping during metaphor comprehension in 24 Carnegie Mellon University undergraduates (males plus females combined) (Prat et al., 2012). The findings showed activations consistent with the PFIT and brain efficiency, although neither PFIT nor intelligence/efficiency findings from other studies were referenced. These two studies are solid contributions to the reasoning research literature, but they were mostly overlooked in the intelligence literature.