by Lee McIntyre
Like medicine, social science is subjective. And it is also normative. We have a stake not just in knowing how things are but also in using this knowledge to make things the way we think they should be. We study voting behavior in the interest of preserving democratic values. We study the relationship between inflation and unemployment in order to mitigate the next recession. Yet unlike medicine, social science has so far not found an effective way to wall off positive inquiry from normative expectations, which raises the worry that instead of acquiring objective knowledge we may only be indulging in confirmation bias and wishful thinking. This is the real barrier to a better social science. It is not just that we have ineffective tools or a recalcitrant subject matter; it is that at some level we do not yet have enough respect for our own ignorance to keep ourselves honest by comparing our ideas relentlessly against the data. The challenge in social science is to find a way to preserve our values without letting them interfere with empirical investigation. We need to understand the world before we can change it. In medicine, the answer was controlled experimentation. What might it be in social science?
Examples of Good and Bad Social Science
Even when social scientists do “research,” it is often not experimental. This means that a good deal of what passes for social scientific “evidence” is based on extrapolations from surveys and other data sets that may have been gathered by other researchers for other purposes. But this can lead to various methodological problems, such as confusing correlation with causation, relying on fuzzy concepts, and some of the other weaknesses we spoke about earlier in this chapter. It is one thing to say that “bad” social science is all theory and no evidence, infected with ideology, not reliant enough on actual experimentation, not replicable, and so on; it is another to see this in action.
One example of poorly conducted social scientific research can be found in a 2014 article by Susan Fiske and Cydney Dupree entitled “Gaining Trust as Well as Respect in Communicating to Motivated Audiences about Science Topics,” which was published in the Perspectives section of the Proceedings of the National Academy of Sciences.15 In this study, the researchers set out to examine an issue that has great importance for the defense of science: whether the allegedly low trustworthiness of scientists may be undermining their persuasiveness on factual questions such as climate change. Does it come as a surprise that scientists are seen as untrustworthy? Fiske and Dupree purport to have empirical evidence for this.
In their study, the researchers first conducted an online poll of American adults, asking them to list typical American jobs. The researchers then chose the forty-two most commonly mentioned jobs, which included scientists, researchers, professors, and teachers.16 In the next step, they polled a new sample to ask about the “warmth” versus “competence” of practitioners of these professions. Here it was found that scientists rated highly on expertise (competence) but relatively low on warmth (trustworthiness). What does warmth have to do with trustworthiness? Their hypothesis was that trustworthiness is positively correlated with warmth and friendliness. In short, if someone is judged to be “on my side,” then that person is more likely to be trusted. But whereas there is empirical work to show that if someone is judged to be “like us” we are more likely to trust that person,17 it is a great leap to then start using “warmth” and “trustworthiness” as interchangeable proxies for one another.
First, one should pay attention to the leap from saying (1) “if X is on my side, then X is more trustworthy” to saying (2) “if X is not on my side, then X is less trustworthy.” By elementary logic, statement (2) is not implied by statement (1), nor vice versa. Indeed, the leap from (1) to (2) is the classic logical error of denying the antecedent. This means that even if there were empirical evidence in support of the truth of statement (1), the truth of statement (2) would still be in question. Nowhere in their article do Fiske and Dupree cite any evidence in support of statement (2), yet the biconditional link between “being on my side” and “being trustworthy” is the crux of their conclusion that it is methodologically sound to use “warmth” as a proxy for measuring “trustworthiness.”18 Isn’t it conceivable that scientists could be judged as not warm yet nonetheless trustworthy? Indeed, wouldn’t it have been more direct if the researchers had simply asked their subjects to rate the trustworthiness of various professions? One wonders what the result might have been. For whatever reason, however, the researchers chose not to take this route and instead skip blithely back and forth between measurements of warmth and conclusions about trust throughout their article:
[Scientists] earn respect but not trust. Being seen as competent but cold might not seem problematic until one recalls that communicator credibility requires not just status and expertise (competence) but also trustworthiness (warmth). … Even if scientists are respected as competent, they may not be trusted as warm.19
This is a classic example of the use of fuzzy concepts in social scientific research, where dissimilar concepts are treated as interchangeable, presumably because one of them is easier to measure than the other. In this case, I am not convinced of that, because “trust” is hardly an esoteric concept that would be unreportable by research subjects, but we nonetheless find in this article a conclusion that scientists have a “trust” problem rather than a “warmth” problem, based on zero direct measurement of the concept of trust itself.20
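To make the logical gap explicit, here is a minimal formalization of the (1)-to-(2) leap discussed above (my own gloss, not notation that appears in Fiske and Dupree’s article). Let P stand for “X is on my side” and Q for “X is trustworthy”:

```latex
% Statement (1) is $P \to Q$; statement (2) is $\neg P \to \neg Q$.
% One counterexample suffices to show that (1) does not entail (2):
\[
  (P \to Q) \not\models (\neg P \to \neg Q)
\]
\[
  \text{Take } P = \text{false},\; Q = \text{true}:\quad
  P \to Q \text{ holds (vacuously), yet } \neg P \to \neg Q \text{ fails.}
\]
```

That counterexample is exactly the case the authors never rule out: a scientist who is not judged to be “on my side” (not warm) but is trustworthy all the same.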
This is unfortunate, because the researchers’ own study would seem to give reason to doubt their own conclusions. In a follow-up, Fiske and Dupree report that as a final step they singled out climate scientists for further review, polling a fresh sample of subjects with a slightly different methodology for measuring trust. Instead of allegedly measuring “trust,” they sought to measure “distrust” through a seven-item scale that included perceptions of a “motive to lie with statistics, complicate a simple story, show superiority, gain research money, pursue a liberal agenda, provoke the public, and hurt big corporations.”21 The researchers were surprised to find that climate scientists were judged to be more trustworthy than scientists in general (measured against their previous poll). What might be the reason for this? They offer the hypothesis that the scale was different (which raises the question of why they decided to use a different scale), but also float the idea that climate scientists perhaps had a more “constructive approach to the public, balancing expertise (competence) with trustworthiness (warmth), together facilitating communicator credibility.”22 I find this to be a questionable conclusion, for in the final part of the study there was no measurement at all of the “warmth” of climate scientists, yet the researchers once again feel comfortable drawing parallels between trustworthiness and warmth.23
By way of contrast, I will now explore an example of good social scientific work that is based firmly in the scientific attitude, uses empirical evidence to challenge an intuitive theoretical hypothesis, and employs experimental methods to measure human motivation directly through human action. In Sheena Iyengar’s work on the paradox of choice, we face a classic social scientific dilemma: how can something as amorphous as human motivation be measured through empirical evidence? According to neoclassical economics, we measure consumer desire directly through marketplace behavior. People will buy what they want, and the price is a reflection of how much the good is valued. To work out the mathematical details, however, a few “simplifying assumptions” are required. First, we assume that our preferences are rational, in the sense of being transitive: if I like cherry pie more than apple, and apple more than blueberry, it is assumed that I like cherry more than blueberry.24 Second, we assume that consumers have perfect information about prices. Although this is widely known to be untrue in individual cases, it is a core assumption of neoclassical economics, for it is needed to explain how the market as a whole performs the magical task of ordering preferences through prices.25 Although it is acknowledged that actual consumers may make “mistakes” in the marketplace (for instance, they did not know that cherry pie was on sale at a nearby market), the model purports to work because if they had known this, they would have changed their behavior. Finally, the neoclassical model assumes that “more is better.” This is not to say that there is no such thing as diminishing marginal utility (that last bite of cherry pie probably does not taste as good as the first one), but it is to say that for consumers it is better to have more choices in the marketplace, for this is how one’s preferences can be maximized.
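As a concrete illustration of that first assumption, here is a minimal sketch (mine, not from the text; the preference data are hypothetical) of what checking transitivity amounts to:

```python
from itertools import permutations

# Hypothetical pairwise preferences: (a, b) present means "a is liked more than b."
prefers = {("cherry", "apple"), ("apple", "blueberry"), ("cherry", "blueberry")}

def is_transitive(prefers, items):
    """Neoclassical rationality assumption: if a > b and b > c, then a > c."""
    return all(
        not ((a, b) in prefers and (b, c) in prefers and (a, c) not in prefers)
        for a, b, c in permutations(items, 3)
    )

print(is_transitive(prefers, ["cherry", "apple", "blueberry"]))  # True
# Removing ("cherry", "blueberry") would make the preferences intransitive -> False
```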
In Sheena Iyengar’s work, she sought to test this last assumption directly through experiment. The stakes were high, for if she could show that this simplifying assumption was wrong, then, together with Herbert Simon’s earlier work undermining “perfect information,” the neoclassical model might be in jeopardy. Iyengar and her colleague Mark Lepper set up a controlled consumer choice experiment in a grocery store where shoppers were offered the chance to taste different kinds of jam. In the control condition, shoppers were offered twenty-four different choices. In the experimental condition, this was decreased to six options. To ensure that different shoppers were present for the two conditions, the displays were rotated every two hours and other scientific controls were put in place. Iyengar and Lepper sought to measure two things: (1) how many different flavors of jam the shoppers chose to taste and (2) how much jam they actually bought when they checked out of the store. To measure the latter, everyone who stopped by to taste was given a coded coupon, so that the experimenters could track whether the number of jams in the display affected later purchasing behavior. And did it ever. Even though the initial display of twenty-four jams attracted slightly more customer interest, those shoppers’ later purchasing was markedly lower than that of the shoppers who had visited the booth with only six jams. Although shoppers in the two conditions tasted the same number of jams on average (thus removing tasting itself as a causal variable to explain the difference), those who had visited the display with twenty-four jams used their coupons only 3 percent of the time, whereas those who visited the display with only six jams used theirs 30 percent of the time.
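To see how stark that 3 percent versus 30 percent gap is, here is a minimal sketch of a two-proportion z-test (my illustration, not part of the study; the sample sizes are invented, since the text reports only the redemption rates):

```python
import math

# Hypothetical counts: the text reports only the redemption rates (3% vs. 30%),
# so these sample sizes are invented purely for illustration.
n_large, bought_large = 100, 3    # visitors to the 24-jam display who redeemed
n_small, bought_small = 100, 30   # visitors to the 6-jam display who redeemed

p1 = bought_large / n_large
p2 = bought_small / n_small
p_pool = (bought_large + bought_small) / (n_large + n_small)

# Two-proportion z-test for the difference in redemption rates
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_large + 1 / n_small))
z = (p2 - p1) / se
print(f"z = {z:.2f}")  # approx. 5.1; |z| > 1.96 rejects equal rates at the 5% level
```

Even with these modest hypothetical samples, the gap is overwhelming by conventional standards of statistical significance.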
What might account for this? In their analysis, Iyengar and Lepper speculated that the shoppers might have been overwhelmed in the first condition.26 Even when they tasted a few jams, this was such a small percentage of the total display that they perhaps felt they could not be sure they had chosen the best one, so they chose not to buy any at all. In the second condition, however, shoppers might have been better able to rationalize making a choice based on a proportionally larger sampling. As it turned out, people wanted fewer choices. Although they might not have realized it, their own behavior revealed a surprising fact about human motivation.27
Although this may sound like a trivial experiment, the implications are far-reaching. One of the most important direct applications of Iyengar and Lepper’s finding was to the problem of undersaving in 401(k) plans, where new employees are routinely overwhelmed by the number of options for investing their money and so choose to put off the decision, which effectively means choosing not to invest any money at all. In Respecting Truth, I have explored a number of other implications of this research, ranging from automatic enrollment in retirement plans to the introduction of “target date” retirement funds.28 Not only is this good social science, but its positive impact on human lives has been considerable.
For present purposes, the point is this. Even in a situation where we may feel most in touch with our subject matter—human preference and desire—we can be wrong about what influences our behavior. If you ask people whether they want more or fewer choices, most will say they want more. But their actual behavior belies this. The results of experimental evidence in the study of human action can surprise us. Even concepts as seemingly qualitative as desire, motivation, and human choice can be measured by experimentation rather than mere intuition, theory, or verbal report.
Here again we are reminded of Semmelweis. How do we know, before we have conducted an experiment, what is true? Our intuitions may feel solid, but experiment shows that they can fail us. And this is as true in social science as it is in medicine. Having the facts about human behavior can be just as useful in public policy as having the facts about disease is in its diagnosis and treatment. Thus the scientific attitude is to be recommended just as heartily in social science as in any other empirical subject. If we care about evidence and are willing to change our minds about a theory based on evidence, what better example might we have before us than the success of Iyengar and Lepper’s experiment? Just as the elegance of Pasteur’s experimental model allowed him to overthrow the outdated idea of spontaneous generation, could economics now move forward owing to a recognition of the impact of cognitive bias and irrationality on human choice?
And perhaps this same approach might work throughout the social sciences. All of the recent work on cognitive bias, for instance, might help us to develop a more effective approach to science education and to the correction of public misperceptions about climate change. If the researchers whom Fiske and Dupree cite as the foundation for their work are right (a question independent of any purported connection between warmth and trustworthiness), then attitude is as much a part of making up our mind as evidence:
First, scientists may misunderstand the sources of lay beliefs. People are no idiots. The public’s issue with science is not necessarily ignorance. The public increasingly knows more than before about climate change’s causes. … Potential divides between scientists and the public are not merely about sheer knowledge in any simple way.
The second, often-neglected factor is the other side of attitudes. Attitudes are evaluations that include both cognition (beliefs) and affect (feelings, emotions). Acting on attitudes involves both cognitive capacity and motivation. Attitudes show an intrinsic pressure for consistency between cognition and affect, so for most attitudes, both are relevant. When attitudes do tilt toward emphasizing either cognition or affect, persuasion is more effective when it matches the type of attitude. In the domain of climate change, for example, affect and values together motivate climate cognition. If public attitudes have two sides—belief and affect—what is their role in scientific communication?29
If this is true, what breakthroughs might be possible once we gain better experimental evidence for how the human mind really works? Actual human beings do not have perfect information; neither are they perfectly rational. We know that our reason is buffeted by built-in cognitive biases, which allow a sea of emotions, misperceptions, and desires to cloud our reasoning. If we seek to do a better job of convincing people to accept the scientific consensus on a topic like climate change, this provides a valuable incentive for social scientists to get their own house in order. Once they have found a way to make their own discipline more scientific, perhaps they can play a more active role in helping us to defend the enterprise of science as a whole.