Rationality: From AI to Zombies

by Eliezer Yudkowsky


  Yet we change our minds less often than we think. Genetic accusations have a force among humans that they would not have among ideal Bayesians.

  Clearing your mind is a powerful heuristic when you’re faced with new suspicion that many of your ideas may have come from a flawed source.

  Once an idea gets into our heads, it’s not always easy for evidence to root it out. Consider all the people out there who grew up believing in the Bible; later came to reject (on a deliberate level) the idea that the Bible was written by the hand of God; and who nonetheless think that the Bible contains indispensable ethical wisdom. They have failed to clear their minds; they could do significantly better by doubting anything the Bible said because the Bible said it.

  At the same time, they would have to bear firmly in mind the principle that reversed stupidity is not intelligence; the goal is to genuinely shake your mind loose and do independent thinking, not to negate the Bible and let that be your algorithm.

  Once an idea gets into your head, you tend to find support for it everywhere you look—and so when the original source is suddenly cast into suspicion, you would be very wise indeed to suspect all the leaves that originally grew on that branch . . .

  If you can! It’s not easy to clear your mind. It takes a convulsive effort to actually reconsider, instead of letting your mind fall into the pattern of rehearsing cached arguments. “It ain’t a true crisis of faith unless things could just as easily go either way,” said Thor Shenkel.

  You should be extremely suspicious if you have many ideas suggested by a source that you now know to be untrustworthy, but by golly, it seems that all the ideas still ended up being right—the Bible being the obvious archetypal example.

  On the other hand . . . there’s such a thing as sufficiently clear-cut evidence, that it no longer significantly matters where the idea originally came from. Accumulating that kind of clear-cut evidence is what Science is all about. It doesn’t matter any more that Kekulé first saw the ring structure of benzene in a dream—it wouldn’t matter if we’d found the hypothesis to test by generating random computer images, or from a spiritualist revealed as a fraud, or even from the Bible. The ring structure of benzene is pinned down by enough experimental evidence to make the source of the suggestion irrelevant.

  In the absence of such clear-cut evidence, then you do need to pay attention to the original sources of ideas—to give experts more credence than layfolk, if their field has earned respect—to suspect ideas you originally got from suspicious sources—to distrust those whose motives are untrustworthy, if they cannot present arguments independent of their own authority.

  The genetic fallacy is a fallacy when there exist justifications beyond the genetic fact asserted, but the genetic accusation is presented as if it settled the issue. Hal Finney suggests that we call correctly appealing to a claim’s origins “the genetic heuristic.”

  Some good rules of thumb (for humans):

  Be suspicious of genetic accusations against beliefs that you dislike, especially if the proponent claims justifications beyond the simple authority of a speaker. “Flight is a religious idea, so the Wright Brothers must be liars” is one of the classically given examples.

  By the same token, don’t think you can get good information about a technical issue just by sagely psychoanalyzing the personalities involved and their flawed motives. If technical arguments exist, they get priority.

  When new suspicion is cast on one of your fundamental sources, you really should doubt all the branches and leaves that grew from that root. You are not licensed to reject them outright as conclusions, because reversed stupidity is not intelligence, but . . .

  Be extremely suspicious if you find that you still believe the early suggestions of a source you later rejected.

  *

  Part J

  Death Spirals

  100

  The Affect Heuristic

  The affect heuristic is when subjective impressions of goodness/badness act as a heuristic—a source of fast, perceptual judgments. Pleasant and unpleasant feelings are central to human reasoning, and the affect heuristic comes with lovely biases—some of my favorites.

  Let’s start with one of the relatively less crazy biases. You’re about to move to a new city, and you have to ship an antique grandfather clock. In the first case, the grandfather clock was a gift from your grandparents on your fifth birthday. In the second case, the clock was a gift from a remote relative and you have no special feelings for it. How much would you pay for an insurance policy that paid out $100 if the clock were lost in shipping? According to Hsee and Kunreuther, subjects stated willingness to pay more than twice as much in the first condition.1 This may sound rational—why not pay more to protect the more valuable object?—until you realize that the insurance doesn’t protect the clock, it just pays if the clock is lost, and pays exactly the same amount for either clock. (And yes, it was stated that the insurance was with an outside company, so it gives no special motive to the movers.)

  All right, but that doesn’t sound too insane. Maybe you could get away with claiming the subjects were insuring affective outcomes, not financial outcomes—purchase of consolation.

  Then how about this? Yamagishi showed that subjects judged a disease as more dangerous when it was described as killing 1,286 people out of every 10,000, versus a disease that was 24.14% likely to be fatal.2 Apparently the mental image of a thousand dead bodies is much more alarming, compared to a single person who’s more likely to survive than not.
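
  To see how lopsided that judgment is, it helps to put both descriptions on the same scale. The following is only an illustrative sketch; the variable names are mine, and the two figures are the ones quoted above.

```python
from fractions import Fraction

# Frequency framing: 1,286 deaths out of every 10,000 people.
frequency_risk = Fraction(1286, 10000)
# Probability framing: 24.14% likely to be fatal.
probability_risk = Fraction(2414, 10000)

print(f"frequency framing:   {float(frequency_risk):.2%}")
print(f"probability framing: {float(probability_risk):.2%}")

# How many times deadlier the "less alarming" disease actually is.
risk_ratio = probability_risk / frequency_risk
print(f"ratio: {float(risk_ratio):.2f}x")
```

  The disease judged less dangerous is nearly twice as likely to kill you.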

  But wait, it gets worse.

  Suppose an airport must decide whether to spend money to purchase some new equipment, while critics argue that the money should be spent on other aspects of airport safety. Slovic et al. presented two groups of subjects with the arguments for and against purchasing the equipment, with a response scale ranging from 0 (would not support at all) to 20 (very strong support).3 One group saw the measure described as saving 150 lives. The other group saw the measure described as saving 98% of 150 lives. The hypothesis motivating the experiment was that saving 150 lives sounds vaguely good—is that a lot? a little?—while saving 98% of something is clearly very good because 98% is so close to the upper bound of the percentage scale. Lo and behold, saving 150 lives had mean support of 10.4, while saving 98% of 150 lives had mean support of 13.6.
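
  The arithmetic is worth spelling out. This is a minimal sketch with names of my own choosing; the support scores are the means reported above.

```python
from fractions import Fraction

# Lives saved under each description of the same measure.
lives_plain = 150
lives_percent = Fraction(98, 100) * 150  # "98% of 150 lives"

# Mean support on the 0-to-20 scale, as reported in the text.
mean_support = {"150 lives": 10.4, "98% of 150 lives": 13.6}

print(f"98% of 150 lives = {lives_percent} lives")
print(f"fewer lives saved: {lives_percent < lives_plain}")
print(f"yet more support:  {mean_support['98% of 150 lives'] > mean_support['150 lives']}")
```

  The version that drew more support saves three fewer lives.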

  Or consider the report of Denes-Raj and Epstein:4 Subjects offered an opportunity to win $1 each time they randomly drew a red jelly bean from a bowl, often preferred to draw from a bowl with more red beans and a smaller proportion of red beans. E.g., 7 in 100 was preferred to 1 in 10.

  According to Denes-Raj and Epstein, these subjects reported afterward that even though they knew the probabilities were against them, they felt they had a better chance when there were more red beans. This may sound crazy to you, oh Statistically Sophisticated Reader, but if you think more carefully you’ll realize that it makes perfect sense. A 7% probability versus 10% probability may be bad news, but it’s more than made up for by the increased number of red beans. It’s a worse probability, yes, but you’re still more likely to win, you see. You should meditate upon this thought until you attain enlightenment as to how the rest of the planet thinks about probability.
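
  For readers who would rather not meditate, here is the comparison the subjects were overriding. It is a small sketch under the assumption of a single random draw with a $1 prize; the names are illustrative.

```python
from fractions import Fraction

# Chance of drawing a red bean from each bowl.
big_bowl = Fraction(7, 100)   # more red beans, smaller proportion
small_bowl = Fraction(1, 10)  # fewer red beans, larger proportion

prize = 1  # dollars won per red bean drawn

# Expected winnings for one draw from each bowl.
expected_big = big_bowl * prize
expected_small = small_bowl * prize

print(f"big bowl:   {float(expected_big):.2f} dollars per draw")
print(f"small bowl: {float(expected_small):.2f} dollars per draw")
print(f"small bowl is better: {expected_small > expected_big}")
```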

  Finucane et al. found that for nuclear reactors, natural gas, and food preservatives, presenting information about high benefits made people perceive lower risks; presenting information about higher risks made people perceive lower benefits; and so on across the quadrants.5 People conflate their judgments about particular good/bad aspects of something into an overall good or bad feeling about that thing.

  Finucane et al. also found that time pressure greatly increased the inverse relationship between perceived risk and perceived benefit, consistent with the general finding that time pressure, poor information, or distraction all increase the dominance of perceptual heuristics over analytic deliberation.

  Ganzach found the same effect in the realm of finance.6 According to ordinary economic theory, return and risk should correlate positively. To put it another way, people pay a premium price for safe investments, which lowers the return; stocks deliver higher returns than bonds, but have correspondingly greater risk. When judging familiar stocks, analysts’ judgments of risks and returns were positively correlated, as conventionally predicted. But when judging unfamiliar stocks, analysts tended to judge the stocks as if they were generally good or generally bad: low risk and high returns, or high risk and low returns.

  For further reading I recommend Slovic’s fine summary article, “Rational Actors or Rational Fools: Implications of the Affect Heuristic for Behavioral Economics.”7

  *

  1. Christopher K. Hsee and Howard C. Kunreuther, “The Affection Effect in Insurance Decisions,” Journal of Risk and Uncertainty 20 (2 2000): 141–159, doi:10.1023/A:1007876907268.

  2. Kimihiko Yamagishi, “When a 12.86% Mortality Is More Dangerous than 24.14%: Implications for Risk Communication,” Applied Cognitive Psychology 11 (6 1997): 461–554.

  3. Paul Slovic et al., “Rational Actors or Rational Fools: Implications of the Affect Heuristic for Behavioral Economics,” Journal of Socio-Economics 31, no. 4 (2002): 329–342, doi:10.1016/S1053-5357(02)00174-9.

  4. Veronika Denes-Raj and Seymour Epstein, “Conflict between Intuitive and Rational Processing: When People Behave against Their Better Judgment,” Journal of Personality and Social Psychology 66 (5 1994): 819–829, doi:10.1037/0022-3514.66.5.819.

  5. Finucane et al., “The Affect Heuristic in Judgments of Risks and Benefits.”

  6. Yoav Ganzach, “Judging Risk and Return of Financial Assets,” Organizational Behavior and Human Decision Processes 83, no. 2 (2000): 353–370, doi:10.1006/obhd.2000.2914.

  7. Slovic et al., “Rational Actors or Rational Fools.”

  101

  Evaluability (and Cheap Holiday Shopping)

  With the expensive part of the Hallowthankmas season now approaching, a question must be looming large in our readers’ minds:

  “Dear Overcoming Bias, are there biases I can exploit to be seen as generous without actually spending lots of money?”

  I’m glad to report the answer is yes! According to Hsee—in a paper entitled “Less is better: When low-value options are valued more highly than high-value options”—if you buy someone a $45 scarf, you are more likely to be seen as generous than if you buy them a $55 coat.1

  This is a special case of a more general phenomenon. In an earlier experiment, Hsee asked subjects how much they would be willing to pay for a second-hand music dictionary:2

  Dictionary A, from 1993, with 10,000 entries, in like-new condition.

  Dictionary B, from 1993, with 20,000 entries, with a torn cover and otherwise in like-new condition.

  The gotcha was that some subjects saw both dictionaries side-by-side, while other subjects only saw one dictionary . . .

  Subjects who saw only one of these options were willing to pay an average of $24 for Dictionary A and an average of $20 for Dictionary B. Subjects who saw both options, side-by-side, were willing to pay $27 for Dictionary B and $19 for Dictionary A.

  Of course, the number of entries in a dictionary is more important than whether it has a torn cover, at least if you ever plan on using it for anything. But if you’re only presented with a single dictionary, and it has 20,000 entries, the number 20,000 doesn’t mean very much. Is it a little? A lot? Who knows? It’s non-evaluable. The torn cover, on the other hand—that stands out. That has a definite affective valence: namely, bad.

  Seen side-by-side, though, the number of entries goes from non-evaluable to evaluable, because there are two compatible quantities to be compared. And, once the number of entries becomes evaluable, that facet swamps the importance of the torn cover.
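
  The reversal can be checked mechanically. The sketch below uses the average prices quoted above; the data layout and the helper name `preferred` are my own, not anything from Hsee's paper.

```python
# Mean willingness to pay in dollars, as reported in the text.
willingness_to_pay = {
    "separate": {"A": 24, "B": 20},  # each subject saw one dictionary
    "joint": {"A": 19, "B": 27},     # each subject saw both side by side
}

def preferred(prices):
    """Return the option that attracts the highest price."""
    return max(prices, key=prices.get)

separate_choice = preferred(willingness_to_pay["separate"])
joint_choice = preferred(willingness_to_pay["joint"])

# A preference reversal means the favorite changes with the viewing mode.
reversal = separate_choice != joint_choice
print(separate_choice, joint_choice, reversal)
```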

  From Slovic et al.: Which would you prefer?3

  A 29/36 chance to win $2.

  A 7/36 chance to win $9.

  While the average prices (equivalence values) placed on these options were $1.25 and $2.11 respectively, their mean attractiveness ratings were 13.2 and 7.5. Both the prices and the attractiveness ratings were elicited in a context where subjects were told that two gambles would be randomly selected from those rated, and they would play the gamble with the higher price or higher attractiveness rating. (Subjects had a motive to rate gambles as more attractive, or price them higher, that they would actually prefer to play.)

  The gamble worth more money seemed less attractive, a classic preference reversal. The researchers hypothesized that the dollar values were more compatible with the pricing task, but the probability of payoff was more compatible with attractiveness. So (the researchers thought) why not try to make the gamble’s payoff more emotionally salient—more affectively evaluable—more attractive?

  And how did they do this? By adding a very small loss to the gamble. The old gamble had a 7/36 chance of winning $9. The new gamble had a 7/36 chance of winning $9 and a 29/36 chance of losing 5 cents. In the old gamble, you implicitly evaluate the attractiveness of $9. The new gamble gets you to evaluate the attractiveness of winning $9 versus losing 5 cents.

  “The results,” said Slovic et al., “exceeded our expectations.” In a new experiment, the simple gamble with a 7/36 chance of winning $9 had a mean attractiveness rating of 9.4, while the complex gamble that included a 29/36 chance of losing 5 cents had a mean attractiveness rating of 14.9.

  A follow-up experiment tested whether subjects preferred the old gamble to a certain gain of $2. Only 33% of students preferred the old gamble. Among another group asked to choose between a certain $2 and the new gamble (with the added possibility of a 5-cent loss), fully 60.8% preferred the gamble. After all, $9 isn’t a very attractive amount of money, but $9 / 5 cents is an amazingly attractive win/loss ratio.
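
  Expected value is my framing here rather than the researchers', but it makes the oddity concrete. The sketch below computes it for the gambles described above; the variable names are illustrative.

```python
from fractions import Fraction

# The 29/36 chance to win $2.
ev_two_dollar = Fraction(29, 36) * 2

# Old gamble: 7/36 chance to win $9, otherwise nothing.
ev_old = Fraction(7, 36) * 9

# New gamble: same win, plus a 29/36 chance of losing 5 cents.
ev_new = Fraction(7, 36) * 9 - Fraction(29, 36) * Fraction(5, 100)

for name, ev in [("$2 gamble", ev_two_dollar), ("old $9 gamble", ev_old), ("new $9 gamble", ev_new)]:
    print(f"{name}: {float(ev):.4f} dollars")

# The added loss can only lower the expected value.
print(f"new gamble is strictly worse: {ev_new < ev_old}")
```

  The new gamble is worth about four cents less than the old one, and it is the one subjects found more attractive.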

  You can make a gamble more attractive by adding a strict loss! Isn’t psychology fun? This is why no one who truly appreciates the wondrous intricacy of human intelligence wants to design a human-like AI.

  Of course, it only works if the subjects don’t see the two gambles side-by-side.

  Similarly, which of these two ice creams do you think subjects in Hsee’s 1998 study preferred?

  [Figure: the ice cream servings offered by Vendor H and Vendor L. From Hsee, © 1998 John Wiley & Sons, Ltd.]

  Naturally, the answer depends on whether the subjects saw a single ice cream, or the two side-by-side. Subjects who saw a single ice cream were willing to pay $1.66 to Vendor H and $2.26 to Vendor L. Subjects who saw both ice creams were willing to pay $1.85 to Vendor H and $1.56 to Vendor L.

  What does this suggest for your holiday shopping? That if you spend $400 on a 16GB iPod Touch, your recipient sees the most expensive MP3 player. If you spend $400 on a Nintendo Wii, your recipient sees the least expensive game machine. Which is better value for the money? Ah, but that question only makes sense if you see the two side-by-side. You’ll think about them side-by-side while you’re shopping, but the recipient will only see what they get.

  If you have a fixed amount of money to spend—and your goal is to display your friendship, rather than to actually help the recipient—you’ll be better off deliberately not shopping for value. Decide how much money you want to spend on impressing the recipient, then find the most worthless object which costs that amount. The cheaper the class of objects, the more expensive a particular object will appear, given that you spend a fixed amount. Which is more memorable, a $25 shirt or a $25 candle?

  Gives a whole new meaning to the Japanese custom of buying $50 melons, doesn’t it? You look at that and shake your head and say “What is it with the Japanese?” And yet they get to be perceived as incredibly generous, spendthrift even, while spending only $50. You could spend $200 on a fancy dinner and not appear as wealthy as you can by spending $50 on a melon. If only there were a custom of gifting $25 toothpicks or $10 dust specks; they could get away with spending even less.

  PS: If you actually use this trick, I want to know what you bought.

  *

  1. Christopher K. Hsee, “Less Is Better: When Low-Value Options Are Valued More Highly than High-Value Options,” Behavioral Decision Making 11 (2 1998): 107–121.

  2. Christopher K. Hsee, “The Evaluability Hypothesis: An Explanation for Preference Reversals between Joint and Separate Evaluations of Alternatives,” Organizational Behavior and Human Decision Processes 67 (3 1996): 247–257, doi:10.1006/obhd.1996.0077.

  3. Slovic et al., “Rational Actors or Rational Fools.”

  102

  Unbounded Scales, Huge Jury Awards, and Futurism

  “Psychophysics,” despite the name, is the respectable field that links physical effects to sensory effects. If you dump acoustic energy into air—make noise—then how loud does that sound to a person, as a function of acoustic energy? How much more acoustic energy do you have to pump into the air, before the noise sounds twice as loud to a human listener? It’s not twice as much; more like eight times as much.
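
  If we assume, as an illustration only, that perceived loudness follows a power law in acoustic energy, the "eight times" figure pins down the exponent. The text does not state this model; the power-law form and the function name below are assumptions of the sketch.

```python
import math

# From the text: about 8 times the energy sounds twice as loud.
energy_ratio = 8.0
loudness_ratio = 2.0

# Assumed model: loudness is proportional to energy ** exponent.
# Solving 2 = 8 ** exponent gives the exponent below (about one third).
exponent = math.log(loudness_ratio) / math.log(energy_ratio)

def perceived_ratio(energy_multiplier, k=exponent):
    """Perceived loudness ratio for a given energy multiplier (assumed power law)."""
    return energy_multiplier ** k

print(f"exponent: {exponent:.3f}")
print(f"doubling the energy sounds {perceived_ratio(2.0):.2f} times as loud")
```

  Under that assumption, doubling the energy buys only about a quarter more perceived loudness.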

  Acoustic energy and photons are straightforward to measure. When you want to find out how loud an acoustic stimulus sounds, how bright a light source appears, you usually ask the listener or watcher. This can be done using a bounded scale from “very quiet” to “very loud,” or “very dim” to “very bright.” You can also use an unbounded scale, whose zero is “not audible at all” or “not visible at all,” but which increases from there without limit. When you use an unbounded scale, the observer is typically presented with a constant stimulus, the modulus, which is given a fixed rating. For example, a sound that is assigned a loudness of 10. Then the observer can indicate a sound twice as loud as the modulus by writing 20.

  And this has proven to be a fairly reliable technique. But what happens if you give subjects an unbounded scale, but no modulus? Zero to infinity, with no reference point for a fixed value? Then they make up their own modulus, of course. The ratios between stimuli will continue to correlate reliably between subjects. Subject A says that sound X has a loudness of 10 and sound Y has a loudness of 15. If subject B says that sound X has a loudness of 100, then it’s a good guess that subject B will assign loudness in the vicinity of 150 to sound Y. But if you don’t know what subject C is using as their modulus—their scaling factor—then there’s no way to guess what subject C will say for sound X. It could be 1. It could be 1,000.
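
  The point about ratios and scaling factors can be sketched in a few lines. The function name `predict_rating` is hypothetical; the numbers for subjects A and B are the ones in the paragraph above, and the two moduli tried for subject C are arbitrary.

```python
def predict_rating(ref_x, ref_y, other_x):
    """Predict another subject's rating of Y from their rating of X.

    Assumes both subjects preserve the same ratio between stimuli
    and differ only by a scaling factor (their private modulus).
    """
    scale = other_x / ref_x
    return ref_y * scale

# Subject A rates X at 10 and Y at 15; subject B rates X at 100.
predicted_b_y = predict_rating(ref_x=10, ref_y=15, other_x=100)
print(f"predicted rating of Y by subject B: {predicted_b_y}")

# Subject C's scaling factor is unknown, so C's rating of X is unconstrained.
for c_scale in (0.1, 100):
    print(f"if C's scale is {c_scale}, C rates X at {10 * c_scale}")
```

  The ratio between X and Y survives any choice of modulus; the absolute numbers do not.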

 
