Inferior
The fact that research is replicated is crucial. A lot of work in the field of psychology, even the most widely reported on in the press, hasn’t been. If a number of independent scientists come to the same conclusions based on different studies across a broad range of people, then it’s far easier to be confident about the results. “A lot of research findings never get replicated and probably are false,” she admits. “It’s just the way science works. You can’t study the whole world, so you have to take a sample, and your sample may or may not be representative.” This is so important to Hines that, when I meet her, she goes so far as to warn me that she isn’t even sure about the reliability of some of her own research because it hasn’t yet been replicated elsewhere.
On toy preferences, now, she has little doubt left. “One of the first studies I did in this area was bringing children into the playroom with all the toys and just recording how much time they spend playing with each toy,” she describes. “I was really surprised by the results because, at the time, the thought was that toy choices are completely socially determined. And you can see why, because there is so much social pressure for children to play with the gender-appropriate toy.” She and others found in study after study that boys on average really do prefer to play with trucks and cars, while girls on average prefer dolls. “The main toys are vehicles and dolls. Those are the most gendered type of toys,” she says.
A study that Hines and her colleagues carried out on infants in 2010, watching for how long children look at one toy over another, suggested that these preferences might start to emerge close to the age of two. “Between twelve and twenty-four months, children were already showing preferences for sex-typed toys. So, the girls were looking longer at the dolls than at the car, and the boys were looking longer at the car than at the doll,” she says. But at twelve months, both boys and girls spent longer looking at the doll than the car.
Statistically, this difference in how young children play is significant. “Toy preferences, I like to compare to height,” she explains. “We know that men are taller than women but not all men are taller than all women. So the size of that sex difference is two standard deviations. The sex difference in time playing with dolls versus trucks is about the same as the sex difference in height.” A standard deviation is a measure of how spread out data are. The spread of height looks like a bell-shaped curve. The average height of men is around sixty-nine inches and the standard deviation is three inches. This means that, in a large group of men, more than two-thirds will be within one standard deviation of the average, making them between sixty-six and seventy-two inches tall. The farther you get from the average, toward the thin ends of the bell curve, the fewer men there are. Two standard deviations away will be men who are six inches taller or six inches shorter than the average man (less than 5 percent of men are two or more standard deviations away from the average). A difference in behavior of two standard deviations between men and women would therefore be like a difference of six inches between their average heights. In everyday life, it’s a noticeable gap.
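The arithmetic behind that height example can be checked with a short sketch. It assumes that men's heights follow an idealized normal (bell-shaped) distribution with the mean of sixty-nine inches and standard deviation of three inches quoted above. The function names are illustrative only and do not come from Hines's research.

```python
import math

# Figures quoted in the text; the normal-distribution model is an assumption.
MEAN_HEIGHT_IN = 69.0
SD_HEIGHT_IN = 3.0


def normal_cdf(x, mean, sd):
    """Share of a normal population with a value at or below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def share_within(k, mean=MEAN_HEIGHT_IN, sd=SD_HEIGHT_IN):
    """Share of the population within k standard deviations of the mean."""
    low = mean - k * sd
    high = mean + k * sd
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)


within_one = share_within(1)    # men between 66 and 72 inches, about 0.68
within_two = share_within(2)    # men between 63 and 75 inches, about 0.95
beyond_two = 1.0 - within_two   # men two or more SDs from the mean, under 0.05

print(f"Within one SD:  {within_one:.1%}")
print(f"Within two SDs: {within_two:.1%}")
print(f"Beyond two SDs: {beyond_two:.1%}")
```

Under this model, a little over two-thirds of men fall within one standard deviation of the average and fewer than 5 percent fall two or more away, matching the proportions described above.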
In studying girls with congenital adrenal hyperplasia, Hines’s team was keen to test whether they might be getting some unconscious encouragement to play with boys’ toys, perhaps because their families knew of their intersex condition. “So we thought, let’s bring parents in with them and see how they react. Are they encouraging the girls to play in that way or not in the playroom?” she says. “But we found what they actually did was try to get them to play with female typical toys. More so than with their other daughters, they would introduce female toys. If she was playing with a female toy they would say, ‘That’s nice,’ and give them a hug.” It’s more evidence, she implies, that the differences they’ve seen in toy preferences aren’t purely due to social conditioning but have a biological element, too.
This difference in toy choices, however, is a far leap from the theory that the brains of men and women are deeply structurally different because of how much testosterone they’ve been exposed to. It’s also a considerable distance from Baron-Cohen’s claim that there’s such a thing as a typical male brain and a typical female brain—one that prefers mathematics and another that likes coffee mornings. For him to be right, there would have to be noticeable gaps in lots of other behaviors as well. Those with female brains would have to clearly behave on average like empathizers and those with male brains like systemizers.
According to Hines, this isn’t what we see. Tallying all the scientific data she has seen across all ages, Hines believes that the “sex difference in empathizing and systemizing is about half a standard deviation.” This would be equivalent to a gap of about an inch between the average heights of men and women. It’s small. “That’s typical,” she adds. “Most sex differences are in that range. And for a lot of things, we don’t show any sex differences.”
Researchers have known this for a long time. In their 1974 book The Psychology of Sex Differences, American researchers Eleanor Maccoby and Carol Nagy Jacklin picked through an enormous mass of studies looking at similarities and differences between boys and girls. They concluded that the psychological gaps between women and men were far smaller than the differences that existed in society among women and among men. In 2010, Hines repeated this exercise using more recent research. She found that only the tiniest gaps, if any, existed between boys’ and girls’ fine motor skills, ability to perform mental rotations, spatial visualization, mathematics ability, verbal fluency, and vocabulary. On all these measures, boys and girls performed almost the same.
Teodora Gliga from the Birkbeck baby lab agrees that when it comes to children raised under normal conditions, without unusual medical conditions, large gaps between girls and boys haven’t been found. “It’s quite rare to find differences in typical development.” The overlap between the sexes is so huge, she explains, that scientists have struggled to find and replicate results that suggest that there is a real gap between the sexes. “For the time being, the baby science is not convincingly showing any consistent differences.”
Even studying the tiny minority of girls who have been exposed to higher than usual levels of androgens, adds Hines, while it does tell us something about sex differences, doesn’t tell us that these differences are particularly big. “If genetically I am a girl fetus that produces a bit more androgen, maybe I’ll play a bit more with boys than if I had a bit less. Then maybe I’ll have two friends who are boys, instead of one.” Beyond gender identity and toy preference, on pretty much every other behavioral and cognitive measure that scientists have investigated (in a field that has left few stones unturned), girls and boys overlap hugely. Indeed, almost entirely. In a study by Hines exploring color preferences, for example, she found infant girls also had no more of a love of pink than boys did.
In 2005 University of Wisconsin, Madison, psychologist Janet Shibley Hyde proposed a “gender similarities hypothesis” to demonstrate just how big this overlap is. In a table more than three pages long, she lists the statistical gaps that have been found between the sexes on all kinds of measures, from vocabulary and anxiety about mathematics to aggression and self-esteem. In every case, except for throwing distance and vertical jumping, females are less than one standard deviation apart from males. On many measures, they are less than a tenth of a standard deviation apart, which is indistinguishable in everyday life.
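One way to picture what these gaps mean is to ask how much two bell curves overlap when their averages sit a given number of standard deviations apart. The sketch below is an illustration rather than Hyde's own method. It assumes both groups are normally distributed with the same spread, and it uses the standard overlapping coefficient for that case, 2Φ(−d/2), where d is the gap in standard deviations. The function names are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """Share of a standard normal population at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def overlap(d):
    """Overlapping area of two equal-spread normal curves whose means
    are d standard deviations apart (assumed model, not from the source)."""
    return 2.0 * standard_normal_cdf(-abs(d) / 2.0)


# Gaps mentioned in the text: two SDs (toy preference, height),
# one SD, half an SD, and a tenth of an SD.
for gap in (2.0, 1.0, 0.5, 0.1):
    print(f"gap of {gap} SD -> curves overlap by {overlap(gap):.0%}")
```

Under these assumptions, a gap of two standard deviations leaves the curves overlapping by roughly a third, half a standard deviation by about four-fifths, and a tenth of a standard deviation by around 96 percent.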
When it comes to intelligence, too, it has been convincingly established that there are no differences between the average woman and man. Psychologist Roberto Colom at the Autonomous University of Madrid, Spain, found negligible differences in “general intelligence” (a measure that takes into account intelligence, cognitive ability, and mental ability) when he tested more than ten thousand adults who were applying to a private university between 1989 and 1995. His paper, published in the journal Intelligence in 2000, confirms what earlier studies have repeatedly shown.
Some have argued that there is statistically more variation among men than among women, which means that even though the average man is no more intelligent than the average woman, there are more men of extremely low intelligence and more men of extremely high intelligence. At the far ends of the bell curve where the overlap ends, they say, the difference becomes clear. This may have been the basis for the controversial point made by Harvard president Lawrence Summers in 2005 when he was hunting for explanations for why there are so many more male than female science professors at top universities.
Studies haven’t fully supported this explanation. In 2008, using populationwide surveys of general intelligence among eleven-year-olds in Scotland, a team of researchers based at the University of Edinburgh confirmed that males did show more variability in their test results. These differences aren’t as extreme as some in the past have suggested they are, they note, but they are substantial. At the same time, the authors point out that the biggest effect is seen at the bottom end of the scale. Those with the very lowest intelligence scores tend to be male. This is partly genetic. X-linked mental retardation, for instance, affects far more men than women.
“Mainly it’s at the bottom extreme because they have more developmental disorders,” explains Melissa Hines. “At the upper extreme, it’s not that big a difference.” The authors of the Scottish study showed that the smaller differences they saw at the top end certainly weren’t enough to account for the gaps between women and men taking up mathematics and science. In their particular set of data, around two boys for every girl achieved the very highest intelligence test scores. At universities, gaps in the numbers of male and female science professors are usually far bigger.
Hines adds that this difference in Scottish test results could also be due to social factors. “Even though on the average there is no sex difference in IQ, I think still boys get encouraged at the top. I think in some social environments, they don’t get encouraged at all, but I think in affluent, educated social environments, there is still a tendency to expect more from boys, to invest more in boys,” she tells me.
This observation is backed up by recent research into how people often think of genius as being a male feature. A 2015 study published in the journal Science explored whether this expectation of raw brilliance in men might affect the gender balance in certain subjects. Led by the Princeton University philosophy professor Sarah-Jane Leslie and University of Illinois psychologist Andrei Cimpian, the researchers asked academics from thirty disciplines across the United States if they believed being a top scholar in their field required “a special aptitude that just can’t be taught.” They found that in those disciplines in which people thought you did need to have an innate gift or talent to succeed, there were fewer female PhDs.
The subjects that instead valued hard work tended to have more women.
“It’s hard to separate our opinion from the data.”
Perhaps naively, Jennifer Connellan didn’t expect the backlash. But then, no one could have expected that, when it came, it would be so huge.
Not long after her and Simon Baron-Cohen’s study on newborns preferring faces or mobiles was published in 2000, people began to question their research. Could it be true that there was such a deep sex difference in the behavior of newborn babies? Were girls really preprogrammed to be empathizers while boys were born systemizers? Flickers of doubt were raised about her methods and the reliability of the results.
The skepticism came to a head in 2007 when New York psychologists Alison Nash and Giordana Grossi dissected the experiment in forensic detail and catalogued a string of problems, big and small. For one thing, the paper’s grand claim that the experiment’s conclusions were “beyond reasonable doubt” seemed an uncomfortable stretch when, in fact, not even half the boys in the study preferred to stare at the mobile and an even smaller percentage of the girls preferred to stare at the face.
But their most damning criticism was that Connellan knew the sex of at least some of the babies she was testing. This could have caused any number of subtle biases. For instance, consciously or not, she may have moved her head to make the girls look at her longer, Nash and Grossi pointed out. The need to avoid this sort of problem is exactly why scientists are advised to carry out these studies blind, without knowing the sex of their subjects. Without this safety measure, it’s hard to take the results seriously.
Psychologist and author Cordelia Fine, who in 2010 published Delusions of Gender, a book about the problems with brain research that includes Nash and Grossi’s findings, adds that, even if their findings were right, Connellan, Baron-Cohen, and their colleagues made too big a leap when speculating about what they might mean. “One assumption is that these visual preferences predict a child’s later empathizing versus systemizing interests, for which there is no evidence either way,” she tells me.
When I put these criticisms to Connellan herself, now fifteen years since her paper was published, she accepts them humbly. At the time, her paper was out before she had been awarded her doctorate, and the flood of criticism came to bite when she turned up to defend her work in front of a panel of reviewers. She was told she had failed. “To have the defense go as poorly as it did was really surprising,” she says. She attributed it to “lots of politics in there with the reviewers. . . . We appealed it and got some more neutral people.” Only then, with a new set of reviewers, did she finally pass.
The experiment did have its problems, she admits. She found it impossible to prevent herself from being aware of the sex of some babies, mostly because she was in a maternity ward surrounded by newborn paraphernalia, including pink and blue balloons, and sometimes even their names. “We were testing the babies in a neutral zone, where there were no balloons or anything like that, and the blankets were all neutral. That was actually where we did the experimentation,” she says. But in getting permission to test the babies, they had to go see the mothers first, in an environment that was far from neutral.
“We did the best we could with the results that we had,” she admits. “Are they perfect? No.” In writing the paper, too, she says that she may have become overexcited by the results. “I was very inexperienced, and I think that inexperience caused more of the problems than anything else.”
When I ask Simon Baron-Cohen to give me his own thoughts on the experiment, he tells me by e-mail, “It was designed thoroughly and was scrutinised through peer review and as such it met the bar for good science. No study is above criticism in the sense that one can always think of ways to improve the study, and I hope when a replication is attempted, it will also be improved.”
In fact, replication has been one of the biggest problems for the experiment. To date, nobody has attempted to copy it to check if the findings were reliable. “Studies have to be replicated,” comments Teodora Gliga, “especially if it’s a new idea. It needs to be replicated, otherwise it’s not believable. It’s an interesting idea, but not a fact.” Subsequent studies with slightly older children have shown no sex differences. And, as Melissa Hines’s work has revealed, there appear to be no toy preferences among children until at least the age of one, and possibly closer to two years old.
Baron-Cohen, however, tells me that “the fact that the study hasn’t yet been replicated does not invalidate it at all. It simply means we are still awaiting replication.” One explanation he gives for why no other researchers have tried to copy it is that babies are difficult to test, which means you need large groups to get a reliable result. “Second, it appears that testing for psychological sex differences in neonates still attracts a fair amount of controversy. So some researchers may have been deterred by not wanting to walk into a potential political minefield,” he adds.
Jennifer Connellan has since abandoned the minefield altogether. Her career in Simon Baron-Cohen’s lab turned out to be brief. After getting her degree, she left Cambridge to join Pepperdine University. Today, she runs a tutoring company in California. She’s also mother to a girl and a boy. She tells me that she remains intrigued by the idea of empathizing and systemizing brain types, but believes that it’s only at the extremes where researchers seem to find any discrepancies. “It’s all a bell curve . . . and for the kids in the middle there’s almost no sex difference there at all,” she says.
Baron-Cohen, meanwhile, presses on in trying to establish links between levels of testosterone before birth and sex differences in the brain. In 2002 he and another postgraduate student, Svetlana Lutchmaya, claimed that twelve-month-old girls they observed in experiments made more eye contact than boys of the same age did. This study has been cited by other researchers more than two hundred times.
Then in 2014 Baron-Cohen and his colleagues published the results of a study looking at one of the biggest sources of data in the world: more than nineteen thousand amniotic fluid samples in Denmark, taken from pregnant women for medical reasons between 1993 and 1999. If ever a set of data could reliably prove his hypothesis that high fetal testosterone levels are linked to autism, leading to the “extreme male brain,” it was this one. His team managed to measure hormone levels in these fluid samples to find out how much testosterone the babies would have been exposed to. They could then crosscheck all this with the medical and psychiatric records of the same set of children when they were older. It was an amazingly large and thorough set of patient information.
The database included 128 males who were diagnosed with a condition on the autism spectrum. But Melissa Hines tells me that Baron-Cohen’s results didn’t show a direct link between them and high fetal testosterone levels. “That was like the ultimate test, and there was no correlation between testosterone and getting an autism spectrum diagnosis,” she says. “That’s just one study, but it doesn’t support it.”