Once I leave behind a minimal case (equal means and a VR of only 1.09), the disproportions produced by normal distributions increase rapidly. For equal means and a VR favoring males of 1.15, there will be 54 percent more males in the 99th percentile and more than twice as many males in the 99.9th percentile. Start to combine a VR with an effect size favoring males, and the sex imbalance increases even more. Given a VR of 1.15 and an effect size of just d = –0.10 favoring males (“very small” by Cohen’s definition), you can expect twice as many males as females in the 99th percentile and almost three times as many in the 99.9th percentile.
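The ratios quoted above can be approximated with a short calculation. The Python sketch below is illustrative only and rests on assumptions that the text does not spell out: VR is treated as male variance divided by female variance, the sexes are equally numerous, the percentile cutoff is taken from the combined population, and the effect size is expressed in female standard deviations. The function name `tail_ratio` and its parameters are hypothetical.

```python
from statistics import NormalDist


def tail_ratio(vr, male_advantage, top_fraction):
    """Male-to-female ratio above the cutoff that defines the top
    `top_fraction` of the combined (50/50) population.

    Assumptions (not from the text): females ~ N(0, 1) and males ~
    N(male_advantage, sqrt(vr)). The text writes a male advantage as a
    negative d; here it is passed as a positive number for readability.
    """
    female = NormalDist(0.0, 1.0)
    male = NormalDist(male_advantage, vr ** 0.5)

    def combined_tail(cutoff):
        # Share of the whole population scoring above the cutoff.
        return 0.5 * ((1 - female.cdf(cutoff)) + (1 - male.cdf(cutoff)))

    # Bisection search: combined_tail decreases as the cutoff rises.
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if combined_tail(mid) > top_fraction:
            lo = mid
        else:
            hi = mid
    cutoff = (lo + hi) / 2
    return (1 - male.cdf(cutoff)) / (1 - female.cdf(cutoff))


if __name__ == "__main__":
    for vr, advantage in [(1.15, 0.0), (1.15, 0.10)]:
        for fraction in (0.01, 0.001):
            print(vr, advantage, fraction,
                  round(tail_ratio(vr, advantage, fraction), 2))
```

Under these assumptions the equal-means case comes out at roughly one and a half males per female in the top 1 percent, in line with the 54 percent figure. Other conventions (for example, expressing d in pooled standard deviations) would shift the results slightly.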
Do Statistical Expectations for the Tails Correspond to Actual Distributions?
The key issue here is the assumption that the distribution of the trait is perfectly normal all the way out through three standard deviations. The only way to be sure what the male-to-female ratios are at the extremes is to have such a large and representative sample that you can see the actual numbers, not the theoretically predicted ones. But this means extremely large samples. For example, if you are interested in knowing the male-to-female ratio of people with IQs of 145 and higher, you are talking about the ratio for the 99.865th percentile. With a mean of 100 and a standard deviation of 15, even a sample of 10,000 people can be expected to produce only about 13 people with IQs that high, far short of the number you need to have any confidence in the male-female ratio. Many national assessments such as the NAEP in America and the Cognitive Abilities Test in Great Britain have samples with hundreds of thousands or even millions of scores, but to my knowledge they have not published breakdowns by gender within the top five percentiles. We do have two solid pieces of evidence bearing on the question, however.
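The sample-size arithmetic in this paragraph can be checked directly. The short Python sketch below assumes only what the paragraph states (a normal distribution with mean 100 and standard deviation 15); the variable names are illustrative.

```python
from statistics import NormalDist

# IQ modeled as normal with mean 100 and SD 15, as in the paragraph above.
iq = NormalDist(mu=100, sigma=15)

# Share of the population at or above IQ 145 (three SDs above the mean).
share_at_or_above_145 = 1 - iq.cdf(145)

# Expected number of such people in a sample of 10,000.
expected_in_sample = 10_000 * share_at_or_above_145

print(f"percentile of IQ 145: {100 * iq.cdf(145):.3f}")        # 99.865
print(f"expected count in 10,000: {expected_in_sample:.1f}")   # 13.5
```

Splitting about 13 people into males and females leaves only a handful in each group, which is why the observed ratio from a sample of that size would be unreliable.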
The Early Childhood Longitudinal Study. The first comes from a test of the common assertion that more people are in the gifted range of intelligence than the statistics of the normal distribution would predict. The study of 10 large, nationally representative samples (first author was Russell Warne) indicated that the numbers of people in the top percentiles are generally about what they should be—somewhat more for some tests, somewhat less for others, but overall close to expectations.63
For the question I’m asking (does greater male variance really predict disproportionate numbers of males at the extremes?), the Warne study was valuable because one of the studies it used, the Early Childhood Longitudinal Study, had a nationally representative sample that was large enough (18,000) to assess disparities up to the top half of the 99th percentile, with information broken down by sex. The children were given standardized cognitive tests at six points during their observation from kindergarten through 8th grade.
The effect sizes were typical: small female advantages on the reading tests, small male advantages on the math and science tests, and trivial effect sizes on the general knowledge test. The largest effect size was –0.24 on the science test; none of the rest reached an absolute value of 0.17. But VRs for all of the tests were greater than 1.0, ranging from 1.01 on the science test to 1.32 on one of the math tests. First, here are the actual results—the ratios of boys to girls in various high-end categories ranging from the top five percentiles to the top half of the top percentile.
Test                    Top 5%    Top 2%    Top 1%    Top 0.5%
C1 General Knowledge    1.44      1.66      1.69      2.83
C2 General Knowledge    1.39      1.78      2.27      5.25
C3 General Knowledge    1.62      2.42      3.63      3.10
C4 Reading              0.93      0.97      0.95      1.07
C4 Math                 1.84      2.22      2.91      2.72
C5 Reading              0.95      1.10      1.16
C5 Math                 2.09      2.79      4.35      3.90
C5 Science              2.10      2.68      2.64      4.38
C6 Science              2.28      2.55      1.77
Average, all tests      1.63      2.02      2.37      3.32
Source: Data from Warne, Godwin, and Smith (2013), provided by Russell Warne, personal communication.
The numbering of the tests reflects the increasing ages at which they were administered. To interpret the table, look at the top row, left-hand column—“1.44” indicates that there were 44 percent more boys than girls in the top five percentiles for the initial test of general knowledge. The two blank cells for the right-hand column indicate that no more than a few students scored in that range.
The table shows two broad trends: Throughout elementary and middle school, more boys than girls were represented in the top percentiles in general knowledge, math, and science, with roughly equal proportions for the reading tests; and the ratios favoring boys tended to increase as the criteria got more restrictive. Taking the mean for all the tests, the ratios increased from 1.63 for the top five percentiles to 3.32 for the top half percentile.
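The “Average, all tests” row can be recomputed from the individual test rows. The Python sketch below transcribes the table and assumes that the two blank cells are simply left out of the mean for the top 0.5 percent column; the variable names are illustrative.

```python
from statistics import mean

# Boy-to-girl ratios transcribed from the table.
# None marks the two blank cells in the right-hand column.
ratios = {
    "C1 General Knowledge": (1.44, 1.66, 1.69, 2.83),
    "C2 General Knowledge": (1.39, 1.78, 2.27, 5.25),
    "C3 General Knowledge": (1.62, 2.42, 3.63, 3.10),
    "C4 Reading": (0.93, 0.97, 0.95, 1.07),
    "C4 Math": (1.84, 2.22, 2.91, 2.72),
    "C5 Reading": (0.95, 1.10, 1.16, None),
    "C5 Math": (2.09, 2.79, 4.35, 3.90),
    "C5 Science": (2.10, 2.68, 2.64, 4.38),
    "C6 Science": (2.28, 2.55, 1.77, None),
}
columns = ("Top 5%", "Top 2%", "Top 1%", "Top 0.5%")

averages = {}
for i, label in enumerate(columns):
    # Assumption: blank cells are excluded rather than treated as zero.
    values = [row[i] for row in ratios.values() if row[i] is not None]
    averages[label] = round(mean(values), 2)

print(averages)
# {'Top 5%': 1.63, 'Top 2%': 2.02, 'Top 1%': 2.37, 'Top 0.5%': 3.32}
```

The recomputed means match the table's average row, which suggests the blank cells were indeed omitted from the average.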
Next, how did the predictions for those categories based on the assumption of a normal distribution work out? The graph below shows a scatter plot of the predicted ratios and the actual ratios.
Source: Data provided by Russell Warne, personal communication.
Dots falling above the diagonal represent underprediction of the actual ratio; dots falling below the diagonal represent overprediction of the actual ratio. What it comes down to is that for this test battery administered to a large representative sample of children, the disproportion of males at the right tail of the distribution was usually even larger than the VRs would have predicted.64
The Scottish Mental Surveys of 1932 and 1947. The second piece of evidence comes from the Scottish Mental Surveys of 1932 and 1947, which tested 87,498 and 75,211 Scottish 11-year-olds respectively, representing 95 percent and 93 percent of the populations. Psychologists Wendy Johnson, Andrew Carothers, and Ian Deary exhaustively analyzed the actual distributions of both tests. Males outnumbered females at both the low and high ends of the IQ distribution for both cohorts. At IQs of 60 and 140 (the low and high points of the range), the male-female ratios in both cohorts were concentrated in a narrow range, from 2.0 to 2.3.65 At IQs of 132 (about the 98th percentile), the ratio of boys to girls had dropped to about 1.4 in both cohorts. These results are roughly consistent with recent ratios of boys to girls for the samples of gifted children of the TIP that I reported in chapter 3.
How well did expectations based on a perfectly normal distribution match up with the actual frequency distributions? For the 1932 survey, the prediction for an IQ at exactly 140 was 1.7 males per female, an underprediction of the actual 2.3 males per female. For IQ at exactly 132, the prediction was 1.4 males per female, the same as the actual result of 1.4. For the 1947 survey, the prediction at IQ 140 was also 1.4 males per female, an underprediction of the actual 2.0. At IQ 132, the prediction was of 1.2 males per female compared to an actual 1.4.66 Overall, taking the Warne and Johnson studies together, the consequences of greater male variance at the tails of actual distributions are at least as great as the assumption of a normal distribution would lead one to expect.
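A prediction “at exactly” a given IQ is a ratio of frequencies at a single score, not the ratio above a cutoff. The Python sketch below shows that kind of calculation under assumptions that are not taken from the Johnson study: equal male and female means of 100, equal group sizes, a hypothetical VR of 1.09, and group standard deviations scaled so that the combined SD is 15. Because the actual survey means and VRs are not given here, the output is illustrative and is not expected to reproduce the predicted ratios quoted above. The function name `density_ratio` is hypothetical.

```python
from statistics import NormalDist


def density_ratio(iq, vr=1.09, mean_iq=100.0, combined_sd=15.0):
    """Male-to-female ratio of normal densities at a single IQ score.

    Hypothetical inputs: equal means, equal group sizes, and a variance
    ratio `vr` (male variance / female variance).
    """
    # With equal means and sizes, the combined variance is the average
    # of the male and female variances.
    female_sd = combined_sd * (2.0 / (1.0 + vr)) ** 0.5
    male_sd = female_sd * vr ** 0.5
    male = NormalDist(mean_iq, male_sd)
    female = NormalDist(mean_iq, female_sd)
    return male.pdf(iq) / female.pdf(iq)


if __name__ == "__main__":
    for score in (60, 100, 132, 140):
        print(score, round(density_ratio(score), 2))
```

With equal means, the ratio falls below 1 at the mean and rises above 1 symmetrically toward both tails, which mirrors the finding that males outnumbered females at both the low and high ends of the distribution.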
Another lesson of the Scottish surveys is that small variance ratios can make a difference. The VRs for both surveys were quite small, less than 1.1.67 The frequency distributions for the tests were left-skewed, and in other respects fell short of a perfectly normal distribution. And yet both cohorts produced male-to-female ratios at both tails that represent differences easily big enough to have real-world consequences. As for the greater male variance hypothesis, the authors concluded as follows:
In this article, we reviewed the history of the hypothesis that general intelligence is more biologically variable in males than in females and presented data from two samples consisting of almost entire populations that test this hypothesis. These data, which in many ways are the most complete that have ever been compiled, substantially support the hypothesis.68
Generally greater male variance is no longer a hypothesis but a proven phenomenon. But “generally greater” does not mean universally greater. How close to universal it is depends on the characteristic in question. In brain volumes and surface areas, it applied to 134 of the 136 measures. In the physiological traits I reviewed, it appeared in 83 percent of the measures in the Army’s Anthropometric Survey. For cognitive tests, all of the results for the NAEP and the SAT, covering decades in both cases, showed greater male variability in both reading and math. Internationally, the 2003 PISA administration included 41 countries, meaning there were 82 measures of variability for the reading and math tests. Seventy-eight of them showed greater male variability (95 percent). In the 2015 PISA administration with 67 countries and three tests (reading, math, and science), 96 percent showed greater male variability. The weakest evidence for greater male variability I have found is the cross-national data on personality. Of the 255 measures (51 countries, five personality factors), greater male variability was found in just 63 percent of them.
Notes
Many of the books I used for Human Diversity were e-book editions, few of which let the reader know the page numbers of the print version. Most of the technical articles, magazine articles, reports, and databases I cite were found on the Internet. In both cases, standards for citations are still evolving. I have followed the Chicago style with a few simplifying adaptations. For e-books, I give the chapter from which my material was drawn, and the figure or table number when appropriate. For sources taken from the Internet, I give the website’s name and the URL for the home page. I do not give the specific web page because websites change their indexes so frequently, nor do I include the date when I accessed the website.
The full citations of articles in newspapers, magazines, and websites are given in the notes. The references section is reserved for books, journal articles, and other scholarly works.
Introduction
1. Trivers (2011): chapter 13. For a full-scale exposition of the proposition that social science must rest on biology, see Rosenberg (2017). Throughout the book, I use an inclusive definition of social sciences, treating psychology as a social science along with anthropology, sociology, economics, and political science.
2. Trivers (2011): chapter 13.
3. For the roots of the orthodoxy, see Tooby and Cosmides (1992) and Pinker (2002): Part I. I give a fuller account of this story in chapter 15.
4. Berger and Luckmann (1966): 2.
5. For a sympathetic account of the subsequent development of social constructionism, see Lock and Strong (2010).
6. This case was memorably made first in Bloom (1987) and brought up to date in Lukianoff and Haidt (2018).
7. Two classics that introduced the field now called evolutionary psychology are E. O. Wilson’s Sociobiology (1975) and Richard Dawkins’s The Selfish Gene (1976). For those who are prepared for deep dives, there’s David Geary’s The Origin of Mind: Evolution of Brain, Cognition, and General Intelligence (2004) and the sixth edition of David Buss’s classic Evolutionary Psychology: The New Science of the Mind (2019). A shorter and breezier account (but still scientifically serious) is Steve Stewart-Williams’s The Ape That Understood the Universe: How the Mind and Culture Evolve (2018).
8. Gardner (1983) and Gardner (2008). Gardner has nine intelligences in the most recent version: visual-spatial, verbal-linguistic, logical-mathematical, interpersonal, intrapersonal, musical, bodily-kinesthetic, naturalistic, and existential. The labels for the last two are not intuitively understandable. Naturalistic refers to relating to one’s natural surroundings and making accurate judgments, as in hunting, farming, and Charles Darwin’s genius. Existential encompasses what others have called spiritual intelligence. Bodily-kinesthetic intelligence is outside the purview of this book. It has a significant cognitive component, involving a sense of timing, understanding the goal of the physical action, and the mental qualities that go into training the physical abilities. But the core of bodily-kinesthetic ability involves capabilities below the neck. Otherwise, I discuss all of the intelligences (or talents; call them what you will) represented by Gardner’s theory insofar as they have entered the technical literature.
Part I: “Gender Is a Social Construct”
1. For the universality of male social and political dominance, see Goldberg (1993b). The original version, titled The Inevitability of Patriarchy: Why the Biological Difference Between Men and Women Always Produces Male Domination, inspired a variety of attacks claiming to identify tribes or societies in which men did not rule. Goldberg addressed all of them in the 1993 revision, making an empirical case for the universality of male domination that has not subsequently been challenged with data.
2. Locke’s case was limited to arguments that a woman is not the property of the husband, she retains power over her children in the absence of the father, and a woman has as much right as a man to dissolve the marriage compact. Locke (1960).
3. Astell (1700): Preface to the third edition.
4. Kate Austin, “Woman,” unpublished MS, 1901. en.wikisource.org.
5. Swanwick (1913): 28.
6. Mill (1869).
7. Shaw (1891): 42. The advent of evolutionary theory in the mid-nineteenth century provided ammunition for those who saw biologically grounded differences. In 1875, Antoinette Blackwell published The Sexes Throughout Nature, based on Darwin, concluding that men and women were importantly different emotionally and intellectually, but equal. Blackwell (1875). The Evolution of Sex, by Patrick Geddes and Arthur Thomson (1889), concluded that men and women were primordially different: “We have seen that a deep difference in constitution expresses itself in the distinctions between male and female, whether these be physical or mental. The differences may be exaggerated or lessened, but to obliterate them it would be necessary to have all the evolution over again on a new basis. What was decided among the prehistoric Protozoa cannot be annulled by Act of Parliament.” Geddes and Thomson (1889): 247.
8. “On ne naît pas femme: on le devient.” de Beauvoir (2009): 283. The sentence comes at the opening of the first chapter of volume 2. The 2009 translation by Constance Borde and Sheila Malovany-Chevallier reads, “One is not born, but rather becomes, woman,” which I prefer. I have given the customary translation in the text because it is so widely accepted.
9. Martin, Ruble, and Szkrybalo (2002) has a nice account of the different strands of social role theory.
10. Endendijk, Groeneveld, Bakermans-Kranenburg et al. (2016): 1 of 33.
11. The meta-analyses that combine the studies of socialization have found only modest evidence for differential treatment from the second half of the twentieth century onward. The first assessment of the literature was Maccoby and Jacklin (1974). They documented the reality of what are called sex-typed activities (e.g., little girls are given dolls, little boys are encouraged to play sports), but found that otherwise “the reinforcement contingencies for the two sexes appear to be remarkably similar.” (p. 342). A spirited debate ensued, led by Jeanne Block, who argued in a series of articles that Maccoby and Jacklin had underestimated the role of differential socialization (e.g., Block (1978), Block (1983)).
Lytton and Romney (1991), a subsequent meta-analysis of 172 studies of gender socialization, examined socialization regarding eight topics: amount of interaction, achievement encouragement, warmth, nurturance, responsiveness (including praise), encouragement of dependency, restrictiveness/low encouragement of independence, disciplinary strictness, encouragement of sex-typed activities, sex-typed perception, and clarity of communication/use of reasoning. The meta-analysis found few differences on any of them. “The effect sizes for most socialization areas are nonsignificant and generally very small, fluctuating in direction across studies,” the authors wrote in their conclusion. In the North American studies, the one exception was moderately more encouragement of sex-typed activities in boys. In the studies from other Western countries, parents use moderately more physical punishment for boys than for girls. Lytton and Romney (1991): 286 and Tables 4 and 5. The term effect size is explained in chapter 1. The effect size for sex-typed activities was –0.43 (boys got more encouragement) and the effect size for physical punishment was –0.37 (boys got more physical punishment).
Seven years later, another meta-analysis, Leaper, Anderson, and Sanders (1998), focused specifically on studies that observed the language that parents use with their children. There were sex differences in the types of speech that parents employed (e.g., supportive speech, directive speech, informative speech). Not surprisingly, mothers tend to talk more to their children than fathers do. But fathers used the same patterns of language with daughters and sons alike. So did mothers, with just two exceptions: Mothers talked somewhat more to daughters than to sons and used somewhat more supportive speech with daughters than with sons. The effect sizes were +0.29 and +0.22 respectively. This was not the stuff of pervasive socialization through language.
Since 1998, one of the major areas in which researchers continue to look for socialization effects has been differences in the ways parents exert parental control over daughters and sons. In the theoretical literature there is a good strategy (autonomy-supportive) and a bad one (controlling). Autonomy-supportive strategies combine an appropriate amount of control with an appropriate amount of choice, take the child’s perspective into account, and explain the reasons when the parent’s decision overrules the child’s preference. Endendijk, Groeneveld, Bakermans-Kranenburg et al. (2016): 2 of 33. An extensive literature provides evidence that autonomy-supportive strategies are associated with lower levels of oppositional, aggressive, and hyperactive child behaviors. E.g., Kawabata, Alink, Tseng et al. (2011); Karreman, van Tuijl, van Aken et al. (2006); Stormshak, Bierman, McMahon et al. (2000).