
Rationality: From AI to Zombies


by Eliezer Yudkowsky


  This is the signature style I want to convey from all those essays that entangled cognitive science experiments and probability theory and epistemology with the practical advice—that practical advice actually becomes practically more powerful if you go out and read up on cognitive science experiments, or probability theory, or even materialist epistemology, and realize what you’re seeing. This is the brand that can distinguish Less Wrong from ten thousand other blogs purporting to offer advice.

  I could tell you, “You know, how much you’re satisfied with your food probably depends more on the quality of the food than on how much of it you eat.” And you would read it and forget about it, and the impulse to finish off a whole plate would still feel just as strong. But if I tell you about scope insensitivity, and duration neglect and the Peak/End rule, you are suddenly aware in a very concrete way, looking at your plate, that you will form almost exactly the same retrospective memory whether your portion size is large or small; you now possess a deep theory about the rules governing your memory, and you know that this is what the rules say. (You also know to save the dessert for last.)
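
  To see how concretely that theory cashes out, here is a minimal sketch of the Peak/End rule in Python; the per-bite enjoyment ratings and the simple peak-plus-end average are illustrative assumptions, not data from any experiment:

```python
# Toy model of the Peak/End rule: remembered quality is approximated by the
# average of the most intense moment and the final moment, so duration (and
# hence portion size) is largely neglected.

def peak_end_memory(moment_ratings):
    """Retrospective rating under the Peak/End heuristic."""
    return (max(moment_ratings) + moment_ratings[-1]) / 2

# Hypothetical per-bite enjoyment on a 0-10 scale, with dessert saved for last.
small_portion = [8, 7, 6, 9]
large_portion = [8, 7, 6, 5, 4, 4, 3, 9]  # twice the food, same peak and end

print(peak_end_memory(small_portion))  # 9.0
print(peak_end_memory(large_portion))  # 9.0 -- the same remembered meal
```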

  I want to hear how I can overcome akrasia—how I can have more willpower, or get more done with less mental pain. But there are ten thousand people purporting to give advice on this, and for the most part, it is on the level of that alternate Seth Roberts who just tells people about the amazing effects of drinking fruit juice. Or actually, somewhat worse than that—it’s people trying to describe internal mental levers that they pulled, for which there are no standard words, and which they do not actually know how to point to. See also the illusion of transparency, inferential distance, and double illusion of transparency. (Notice how “You overestimate how much you’re explaining and your listeners overestimate how much they’re hearing” becomes much more forceful as advice, after I back it up with a cognitive science experiment and some evolutionary psychology?)

  I think that the advice I need is from someone who reads up on a whole lot of experimental psychology dealing with willpower, mental conflicts, ego depletion, preference reversals, hyperbolic discounting, the breakdown of the self, picoeconomics, et cetera, and who, in the process of overcoming their own akrasia, manages to understand what they did in truly general terms—thanks to experiments that give them a vocabulary of cognitive phenomena that actually exist, as opposed to phenomena they just made up. And moreover, someone who can explain what they did to someone else, thanks again to the experimental and theoretical vocabulary that lets them point to replicable experiments that ground the ideas in very concrete results, or mathematically clear ideas.
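
  As one concrete instance of that vocabulary, here is a short sketch of the preference reversal that hyperbolic discounting predicts; the reward sizes, dates, and discount constant k are illustrative assumptions, not estimates from the literature:

```python
# Hyperbolic discounting: the subjective value of a reward A at delay D is
# roughly A / (1 + k*D). Unlike exponential discounting, this can flip
# which of two rewards looks better as the nearer one becomes imminent.

def hyperbolic_value(amount, delay, k=1.0):
    """Subjective present value of a delayed reward."""
    return amount / (1 + k * delay)

small_reward, small_day = 50, 10    # smaller reward, available on day 10
large_reward, large_day = 100, 12   # larger reward, available on day 12

for today in (0, 10):
    v_small = hyperbolic_value(small_reward, small_day - today)
    v_large = hyperbolic_value(large_reward, large_day - today)
    choice = "small-soon" if v_small > v_large else "large-late"
    print(f"day {today}: small={v_small:.1f}, large={v_large:.1f} -> prefer {choice}")

# day 0:  small=4.5, large=7.7   -> prefer large-late
# day 10: small=50.0, large=33.3 -> prefer small-soon
```

  Viewed from a distance, the patient choice wins; with the smaller reward imminent, the preference flips, which is one way akrasia can look from the inside.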

  Note the grade of increasing difficulty in citing:

  Concrete experimental results (for which one need merely consult a paper, hopefully one that reported p < 0.01 because p < 0.05 may fail to replicate);

  Causal accounts that are actually true (which may be most reliably obtained by looking for the theories that are used by a majority within a given science);

  Math validly interpreted (on which I have trouble offering useful advice because so much of my own math talent is intuition that kicks in before I get a chance to deliberate).

  If you don’t know who to trust, or you don’t trust yourself, you should concentrate on experimental results to start with, move on to thinking in terms of causal theories that are widely used within a science, and dip your toes into math and epistemology with extreme caution.

  But practical advice really, really does become a lot more powerful when it’s backed up by concrete experimental results, causal accounts that are actually true, and math validly interpreted.

  *

  332

  The Sin of Underconfidence

  There are three great besetting sins of rationalists in particular, and the third of these is underconfidence. Michael Vassar regularly accuses me of this sin, which makes him unique among the entire population of the Earth.

  But he’s actually quite right to worry, and I worry too, and any adept rationalist will probably spend a fair amount of time worrying about it. When subjects know about a bias or are warned about a bias, overcorrection is not unheard of as an experimental result. That’s what makes a lot of cognitive subtasks so troublesome—you know you’re biased but you’re not sure how much, and you don’t know if you’re correcting enough—and so perhaps you ought to correct a little more, and then a little more, but is that enough? Or have you, perhaps, far overshot? Are you now perhaps worse off than if you hadn’t tried any correction?

  You contemplate the matter, feeling more and more lost, and the very task of estimation begins to feel increasingly futile . . .

  And when it comes to the particular questions of confidence, overconfidence, and underconfidence—being interpreted now in the broader sense, not just calibrated confidence intervals—then there is a natural tendency to cast overconfidence as the sin of pride, out of that other list which never warned against the improper use of humility or the abuse of doubt. To place yourself too high—to overreach your proper place—to think too much of yourself—to put yourself forward—to put down your fellows by implicit comparison—and the consequences of humiliation and being cast down, perhaps publicly—are these not loathsome and fearsome things?

  To be too modest—seems lighter by comparison; it wouldn’t be so humiliating to be called on it publicly. Indeed, finding out that you’re better than you imagined might come as a warm surprise; and to put yourself down, and others implicitly above, has a positive tinge of niceness about it. It’s the sort of thing that Gandalf would do.

  So if you have learned a thousand ways that humans fall into error and read a hundred experimental results in which anonymous subjects are humiliated by their overconfidence—heck, even if you’ve just read a couple of dozen—and you don’t know exactly how overconfident you are—then yes, you might genuinely be in danger of nudging yourself a step too far down.

  I have no perfect formula to give you that will counteract this. But I have an item or two of advice.

  What is the danger of underconfidence?

  Passing up opportunities. Not doing things you could have done, but didn’t try (hard enough).

  So here’s a first item of advice: If there’s a way to find out how good you are, the thing to do is test it. A hypothesis affords testing; hypotheses about your own abilities likewise. Once upon a time it seemed to me that I ought to be able to win at the AI-Box Experiment; and it seemed like a very doubtful and hubristic thought; so I tested it. Then later it seemed to me that I might be able to win even with large sums of money at stake, and I tested that, but I only won one time out of three. So that was the limit of my ability at that time, and it was not necessary to argue myself upward or downward, because I could just test it.

  One of the chief ways that smart people end up stupid is by getting so used to winning that they stick to places where they know they can win—meaning that they never stretch their abilities, they never try anything difficult.

  It is said that this is linked to defining yourself in terms of your “intelligence” rather than “effort,” because then winning easily is a sign of your “intelligence,” whereas failing on a hard problem could have been interpreted in terms of a good effort.

  Now, I am not quite sure this is how an adept rationalist should think about these things: rationality is systematized winning and trying to try seems like a path to failure. I would put it this way: A hypothesis affords testing! If you don’t know whether you’ll win on a hard problem—then challenge your rationality to discover your current level. I don’t usually hold with congratulating yourself on having tried—it seems like a bad mental habit to me—but surely not trying is even worse. If you have cultivated a general habit of confronting challenges, and won on at least some of them, then you may, perhaps, think to yourself, “I did keep up my habit of confronting challenges, and will do so next time as well.” You may also think to yourself “I have gained valuable information about my current level and where I need improvement,” so long as you properly complete the thought, “I shall try not to gain this same valuable information again next time.”

  If you win every time, it means you aren’t stretching yourself enough. But you should seriously try to win every time. And if you console yourself too much for failure, you lose your winning spirit and become a scrub.

  When I try to imagine what a fictional master of the Competitive Conspiracy would say about this, it comes out something like: “It’s not okay to lose. But the hurt of losing is not something so scary that you should flee the challenge for fear of it. It’s not so scary that you have to carefully avoid feeling it, or refuse to admit that you lost and lost hard. Losing is supposed to hurt. If it didn’t hurt you wouldn’t be a Competitor. And there’s no Competitor who never knows the pain of losing. Now get out there and win.”

  Cultivate a habit of confronting challenges—not the ones that can kill you outright, perhaps, but perhaps ones that can potentially humiliate you. I recently read of a certain theist that he had defeated Christopher Hitchens in a debate (severely so; this was said by atheists). And so I wrote at once to the Bloggingheads folks and asked if they could arrange a debate. This seemed like someone I wanted to test myself against. Also, it was said by them that Christopher Hitchens should have watched the theist’s earlier debates and been prepared, so I decided not to do that, because I think I should be able to handle damn near anything on the fly, and I desire to learn whether this thought is correct; and I am willing to risk public humiliation to find out. Note that this is not self-handicapping in the classic sense—if the debate is indeed arranged (I haven’t yet heard back), and I do not prepare, and I fail, then I do lose those stakes of myself that I have put up; I gain information about my limits; I have not given myself anything I consider an excuse for losing.

  Of course this is only a way to think when you really are confronting a challenge just to test yourself, and not because you have to win at any cost. In that case you make everything as easy for yourself as possible. To do otherwise would be spectacular overconfidence, even if you’re playing tic-tac-toe against a three-year-old.

  A subtler form of underconfidence is losing your forward momentum—amid all the things you realize that humans are doing wrong, that you used to be doing wrong, some of which you are probably still doing. You become timid; you question yourself but don’t answer the self-questions and move on; when you hypothesize your own inability you do not put that hypothesis to the test.

  Perhaps without there ever being a watershed moment when you deliberately, self-visibly decide not to try at some particular test . . . you just . . . . slow . . . . . down . . . . . . .

  It doesn’t seem worthwhile any more, to go on trying to fix one thing when there are a dozen other things that will still be wrong . . .

  There’s not enough hope of triumph to inspire you to try hard . . .

  When you consider doing any new thing, a dozen questions about your ability at once leap into your mind, and it does not occur to you that you could answer the questions by testing yourself . . .

  And having read so much wisdom of human flaws, it seems that the course of wisdom is ever doubting (never resolving doubts), ever the humility of refusal (never the humility of preparation), and just generally, that it is wise to say worse and worse things about human abilities, to pass into feel-good feel-bad cynicism.

  And so my last piece of advice is another perspective from which to view the problem—by which to judge any potential habit of thought you might adopt—and that is to ask:

  Does this way of thinking make me stronger, or weaker? Really truly?

  I have previously spoken of the danger of reasonableness—the reasonable-sounding argument that we should two-box on Newcomb’s problem, the reasonable-sounding argument that we can’t know anything due to the problem of induction, the reasonable-sounding argument that we will be better off on average if we always adopt the majority belief, and other such impediments to the Way. “Does it win?” is one question you could ask to get an alternate perspective. Another, slightly different perspective is to ask, “Does this way of thinking make me stronger, or weaker?” Does constantly reminding yourself to doubt everything make you stronger, or weaker? Does never resolving or decreasing those doubts make you stronger, or weaker? Does undergoing a deliberate crisis of faith in the face of uncertainty make you stronger, or weaker? Does answering every objection with a humble confession of your fallibility make you stronger, or weaker?

  Are your current attempts to compensate for possible overconfidence making you stronger, or weaker? Hint: If you are taking more precautions, more scrupulously trying to test yourself, asking friends for advice, working your way up to big things incrementally, or still failing sometimes but less often than you used to, you are probably getting stronger. If you are never failing, avoiding challenges, and feeling generally hopeless and dispirited, you are probably getting weaker.

  I learned the first form of this rule at a very early age, when I was practicing for a certain math test, and found that my score was going down with each practice test I took, and noticed going over the answer sheet that I had been pencilling in the correct answers and erasing them. So I said to myself, “All right, this time I’m going to use the Force and act on instinct,” and my score shot up to above what it had been in the beginning, and on the real test it was higher still. So that was how I learned that doubting yourself does not always make you stronger—especially if it interferes with your ability to be moved by good information, such as your math intuitions. (But I did need the test to tell me this!)

  Underconfidence is not a unique sin of rationalists alone. But it is a particular danger into which the attempt to be rational can lead you. And it is a stopping mistake—an error that prevents you from gaining that further experience that would correct the error.

  Because underconfidence actually does seem quite common among aspiring rationalists whom I meet—though rather less common among rationalists who have become famous role models—I would indeed name it third among the three besetting sins of rationalists.

  *

  333

  Go Forth and Create the Art!

  I have said a thing or two about rationality, these past months. I have said a thing or two about how to untangle questions that have become confused, and how to tell the difference between real reasoning and fake reasoning, and the will to become stronger that leads you to try before you flee; I have said something about doing the impossible.

  And these are all techniques that I developed in the course of my own projects—which is why there is so much about cognitive reductionism, say—and it is possible that your mileage may vary in trying to apply them yourself. Still, those wandering about asking “But what good is it?” might consider rereading some of the earlier essays; knowing about e.g. the conjunction fallacy, and how to spot it in an argument, hardly seems esoteric. Understanding why motivated skepticism is bad for you can constitute the whole difference, I suspect, between a smart person who ends up smart and a smart person who ends up stupid. Affective death spirals consume many among the unwary . . .

  Yet there is, I think, more absent than present in this “art of rationality”—defeating akrasia and coordinating groups are two of the deficits I feel most keenly. I’ve concentrated more heavily on epistemic rationality than instrumental rationality, in general. And then there’s training, teaching, verification, and becoming a proper experimental science based on that. And if you generalize a bit further, then building the Art could also be taken to include issues like developing better introductory literature, developing better slogans for public relations, establishing common cause with other Enlightenment subtasks, analyzing and addressing the gender imbalance problem . . .

  But those small pieces of rationality that I’ve set out . . . I hope . . . just maybe . . .

  I suspect—you could even call it a guess—that there is a barrier to getting started, in this matter of rationality. Where by default, in the beginning, you don’t have enough to build on. Indeed so little that you don’t have a clue that more exists, that there is an Art to be found. And if you do begin to sense that more is possible—then you may just instantaneously go wrong. As David Stove observes, most “great thinkers” in philosophy, e.g., Hegel, are properly objects of pity.1 That’s what happens by default to anyone who sets out to develop the art of thinking; they develop fake answers.

  When you try to develop part of the human art of thinking . . . then you are doing something not too dissimilar to what I was doing over in Artificial Intelligence. You will be tempted by fake explanations of the mind, fake accounts of causality, mysterious holy words, and the amazing idea that solves everything.

  It’s not that the particular, epistemic, fake-detecting methods that I use are so good for every particular problem; but they seem like they might be helpful for discriminating good and bad systems of thinking.

  I hope that someone who learns the part of the Art that I’ve set down here will not instantaneously and automatically go wrong if they start asking themselves, “How should people think, in order to solve new problem X that I’m working on?” They will not immediately run away; they will not just make stuff up at random; they may be moved to consult the literature in experimental psychology; they will not automatically go into an affective death spiral around their Brilliant Idea; they will have some idea of what distinguishes a fake explanation from a real one. They will get a saving throw.

  It’s this sort of barrier, perhaps, that prevents people from beginning to develop an art of rationality, if they are not already rational.

  And so instead they . . . go off and invent Freudian psychoanalysis. Or a new religion. Or something. That’s what happens by default, when people start thinking about thinking.

 
