
Rationality: From AI to Zombies


by Eliezer Yudkowsky


  Tsuyoku naritai is the driving force behind my essay The Proper Use of Humility, in which I contrast the student who humbly double-checks their math test, and the student who modestly says, “But how can we ever really know? No matter how many times I check, I can never be absolutely certain.” The student who double-checks their answers wants to become stronger; they react to a possible inner flaw by doing what they can to repair the flaw, not with resignation.

  Each year on Yom Kippur, an Orthodox Jew recites a litany which begins Ashamnu, bagadnu, gazalnu, dibarnu dofi, and goes on through the entire Hebrew alphabet: We have acted shamefully, we have betrayed, we have stolen, we have slandered . . .

  As you pronounce each word, you strike yourself over the heart in penitence. There’s no exemption whereby, if you manage to go without stealing all year long, you can skip the word gazalnu and strike yourself one less time. That would violate the community spirit of Yom Kippur, which is about confessing sins—not avoiding sins so that you have less to confess.

  By the same token, the Ashamnu does not end, “But that was this year, and next year I will do better.”

  The Ashamnu bears a remarkable resemblance to the notion that the way of rationality is to beat your fist against your heart and say, “We are all biased, we are all irrational, we are not fully informed, we are overconfident, we are poorly calibrated . . .”

  Fine. Now tell me how you plan to become less biased, less irrational, more informed, less overconfident, better calibrated.

  There is an old Jewish joke: During Yom Kippur, the rabbi is seized by a sudden wave of guilt, and prostrates himself and cries, “God, I am nothing before you!” The cantor is likewise seized by guilt, and cries, “God, I am nothing before you!” Seeing this, the janitor at the back of the synagogue prostrates himself and cries, “God, I am nothing before you!” And the rabbi nudges the cantor and whispers, “Look who thinks he’s nothing.”

  Take no pride in your confession that you too are biased; do not glory in your self-awareness of your flaws. This is akin to the principle of not taking pride in confessing your ignorance; for if your ignorance is a source of pride to you, you may become loath to relinquish your ignorance when evidence comes knocking. Likewise with our flaws—we should not gloat over how self-aware we are for confessing them; the occasion for rejoicing is when we have a little less to confess.

  Otherwise, when the one comes to us with a plan for correcting the bias, we will snarl, “Do you think to set yourself above us?” We will shake our heads sadly and say, “You must not be very self-aware.”

  Never confess to me that you are just as flawed as I am unless you can tell me what you plan to do about it. Afterward you will still have plenty of flaws left, but that’s not the point; the important thing is to do better, to keep moving ahead, to take one more step forward. Tsuyoku naritai!

  *

  305

  Tsuyoku vs. the Egalitarian Instinct

  Hunter-gatherer tribes are usually highly egalitarian (at least if you’re male)—the all-powerful tribal chieftain is found mostly in agricultural societies, rarely in the ancestral environment. Among most hunter-gatherer tribes, a hunter who brings in a spectacular kill will carefully downplay the accomplishment to avoid envy.

  Maybe, if you start out below average, you can improve yourself without daring to pull ahead of the crowd. But sooner or later, if you aim to do the best you can, you will set your aim above the average.

  If you can’t admit to yourself that you’ve done better than others—or if you’re ashamed of wanting to do better than others—then the median will forever be your concrete wall, the place where you stop moving forward. And what about people who are below average? Do you dare say you intend to do better than them? How prideful of you!

  Maybe it’s not healthy to pride yourself on doing better than someone else. Personally I’ve found it to be a useful motivator, despite my principles, and I’ll take all the useful motivation I can get. Maybe that kind of competition is a zero-sum game, but then so is Go; it doesn’t mean we should abolish that human activity, if people find it fun and it leads somewhere interesting.

  But in any case, surely it isn’t healthy to be ashamed of doing better.

  And besides, life is not graded on a curve. The will to transcendence has no point beyond which it ceases and becomes the will to do worse; and the race that has no finish line also has no gold or silver medals. Just run as fast as you can, without worrying that you might pull ahead of other runners. (But be warned: If you refuse to worry about that possibility, someday you may pull ahead. If you ignore the consequences, they may happen to you.)

  Sooner or later, if your path leads true, you will set out to mitigate a flaw that most people have not mitigated. Sooner or later, if your efforts bring forth any fruit, you will find yourself with fewer sins to confess.

  Perhaps you will find it the course of wisdom to downplay the accomplishment, even if you succeed. People may forgive a touchdown, but not dancing in the end zone. You will certainly find it quicker, easier, more convenient to publicly disclaim your worthiness, to pretend that you are just as much a sinner as everyone else. Just so long, of course, as everyone knows it isn’t true. It can be fun to proudly display your modesty, so long as everyone knows how very much you have to be modest about.

  But do not let that be the endpoint of your journeys. Even if you only whisper it to yourself, whisper it still: Tsuyoku, tsuyoku! Stronger, stronger!

  And then set yourself a higher target. That’s the true meaning of the realization that you are still flawed (though a little less so). It means always reaching higher, without shame.

  Tsuyoku naritai! I’ll always run as fast as I can, even if I pull ahead, I’ll keep on running; and someone, someday, will surpass me; but even though I fall behind, I’ll always run as fast as I can.

  *

  306

  Trying to Try

  No! Try not! Do, or do not. There is no try.

  —Yoda

  Years ago, I thought this was yet another example of Deep Wisdom that is actually quite stupid. SUCCEED is not a primitive action. You can’t just decide to win by choosing hard enough. There is never a plan that works with probability 1.

  But Yoda was wiser than I first realized.

  The first elementary technique of epistemology—it’s not deep, but it’s cheap—is to distinguish the quotation from the referent. Talking about snow is not the same as talking about “snow.” When I use the word “snow,” without quotes, I mean to talk about snow; and when I use the word ““snow,”” with quotes, I mean to talk about the word “snow.” You have to enter a special mode, the quotation mode, to talk about your beliefs. By default, we just talk about reality.

  If someone says, “I’m going to flip that switch,” then by default, they mean they’re going to try to flip the switch. They’re going to build a plan that promises to lead, by the consequences of its actions, to the goal-state of a flipped switch; and then execute that plan.

  No plan succeeds with infinite certainty. So by default, when you talk about setting out to achieve a goal, you do not imply that your plan exactly and perfectly leads to only that possibility. But when you say, “I’m going to flip that switch,” you are trying only to flip the switch—not trying to achieve a 97.2% probability of flipping the switch.

  So what does it mean when someone says, “I’m going to try to flip that switch?”

  Well, colloquially, “I’m going to flip the switch” and “I’m going to try to flip the switch” mean more or less the same thing, except that the latter expresses the possibility of failure. This is why I originally took offense at Yoda for seeming to deny the possibility. But bear with me here.

  Much of life’s challenge consists of holding ourselves to a high enough standard. I may speak more on this principle later, because it’s a lens through which you can view many-but-not-all personal dilemmas—“What standard am I holding myself to? Is it high enough?”

  So if much of life’s failure consists in holding yourself to too low a standard, you should be wary of demanding too little from yourself—setting goals that are too easy to fulfill.

  Often where succeeding to do a thing is very hard, trying to do it is much easier.

  Which is easier—to build a successful startup, or to try to build a successful startup? To make a million dollars, or to try to make a million dollars?

  So if “I’m going to flip the switch” means by default that you’re going to try to flip the switch—that is, you’re going to set up a plan that promises to lead to switch-flipped state, maybe not with probability 1, but with the highest probability you can manage—

  —then “I’m going to ‘try to flip’ the switch” means that you’re going to try to “try to flip the switch,” that is, you’re going to try to achieve the goal-state of “having a plan that might flip the switch.”

  Now, if this were a self-modifying AI we were talking about, the transformation we just performed ought to end up at a reflective equilibrium—the AI planning its planning operations.

  But when we deal with humans, being satisfied with having a plan is not at all like being satisfied with success. The part where the plan has to maximize your probability of succeeding gets lost along the way. It’s far easier to convince ourselves that we are “maximizing our probability of succeeding,” than it is to convince ourselves that we will succeed.

  Almost any effort will serve to convince us that we have “tried our hardest,” if trying our hardest is all we are trying to do.

  You have been asking what you could do in the great events that are now stirring, and have found that you could do nothing. But that is because your suffering has caused you to phrase the question in the wrong way . . . Instead of asking what you could do, you ought to have been asking what needs to be done.

  —Steven Brust, The Paths of the Dead¹

  When you ask, “What can I do?,” you’re trying to do your best. What is your best? It is whatever you can do without the slightest inconvenience. It is whatever you can do with the money in your pocket, minus whatever you need for your accustomed lunch. What you can do with those resources may not give you very good odds of winning. But it’s the “best you can do,” and so you’ve acted defensibly, right?

  But what needs to be done? Maybe what needs to be done requires three times your life savings, and you must produce it or fail.

  So trying to have “maximized your probability of success”—as opposed to trying to succeed—is a far lesser barrier. You can have “maximized your probability of success” using only the money in your pocket, so long as you don’t demand actually winning.

  Want to try to make a million dollars? Buy a lottery ticket. Your odds of winning may not be very good, but you did try, and trying was what you wanted. In fact, you tried your best, since you only had one dollar left after buying lunch. Maximizing the odds of goal achievement using available resources: is this not intelligence?

  It’s only when you want, above all else, to actually flip the switch—without quotation and without consolation prizes just for trying—that you will actually put in the effort to actually maximize the probability.

  But if all you want is to “maximize the probability of success using available resources,” then that’s the easiest thing in the world to convince yourself you’ve done. The very first plan you hit upon will serve quite well as “maximizing”—if necessary, you can generate an inferior alternative to prove its optimality. And any tiny resource that you care to put in will be what is “available.” Remember to congratulate yourself on putting in 100% of it!

  Don’t try your best. Win, or fail. There is no best.

  *

  1. Steven Brust, The Paths of the Dead, Vol. 1 of The Viscount of Adrilankha (Tor Books, 2002).

  307

  Use the Try Harder, Luke

  When there’s a will to fail, obstacles can be found.

  —John McCarthy

  I first watched Star Wars IV-VI when I was very young. Seven, maybe, or nine? So my memory was dim, but I recalled Luke Skywalker as being, you know, this cool Jedi guy.

  Imagine my horror and disappointment when I watched the saga again, years later, and discovered that Luke was a whiny teenager.

  I mention this because yesterday, I looked up, on YouTube, the source of the Yoda quote: “Do, or do not. There is no try.”

  Oh. My. Cthulhu.

  I present to you a little-known outtake from the scene, in which the director and writer, George Lucas, argues with Mark Hamill, who played Luke Skywalker:

  LUKE: “All right, I’ll give it a try.”

  YODA: “No! Try not. Do. Or do not. There is no try.”

  Luke raises his hand, and slowly, the X-wing begins to rise out of the water—Yoda’s eyes widen—but then the ship sinks again.

  Mark Hamill: “Um, George . . .”

  George Lucas: “What is it now?”

  Mark: “So . . . according to the script, next I say, ‘I can’t. It’s too big.’”

  George: “That’s right.”

  Mark: “Shouldn’t Luke maybe give it another shot?”

  George: “No. Luke gives up, and sits down next to Yoda—”

  Mark: “This is the hero who’s going to take down the Empire? Look, it was one thing when he was a whiny teenager at the beginning, but he’s in Jedi training now. Last movie he blew up the Death Star. Luke should be showing a little backbone.”

  George: “No. You give up. And then Yoda lectures you for a while, and you say, ‘You want the impossible.’ Can you remember that?”

  Mark: “Impossible? What did he do, run a formal calculation to arrive at a mathematical proof? The X-wing was already starting to rise out of the swamp! That’s the feasibility demonstration right there! Luke loses it for a second and the ship sinks back—and now he says it’s impossible? Not to mention that Yoda, who’s got literally eight hundred years of seniority in the field, just told him it should be doable—”

  George: “And then you walk away.”

  Mark: “It’s his friggin’ spaceship! If he leaves it in the swamp, he’s stuck on Dagobah for the rest of his miserable life! He’s not just going to walk away! Look, let’s just cut to the next scene with the words ‘one month later’ and Luke is still raggedly standing in front of the swamp, trying to raise his ship for the thousandth time—”

  George: “No.”

  Mark: “Fine! We’ll show a sunset and a sunrise, as he stands there with his arm out, straining, and then Luke says ‘It’s impossible.’ Though really, he ought to try again when he’s fully rested—”

  George: “No.”

  Mark: “Five goddamned minutes! Five goddamned minutes before he gives up!”

  George: “I am not halting the story for five minutes while the X-wing bobs in the swamp like a bathtub toy.”

  Mark: “For the love of sweet candied yams! If a pathetic loser like this could master the Force, everyone in the galaxy would be using it! People would become Jedi because it was easier than going to high school.”

  George: “Look, you’re the actor. Let me be the storyteller. Just say your lines and try to mean them.”

  Mark: “The audience isn’t going to buy it.”

  George: “Trust me, they will.”

  Mark: “They’re going to get up and walk out of the theater.”

  George: “They’re going to sit there and nod along and not notice anything out of the ordinary. Look, you don’t understand human nature. People wouldn’t try for five minutes before giving up if the fate of humanity were at stake.”

  *

  308

  On Doing the Impossible

  “Persevere.” It’s a piece of advice you’ll get from a whole lot of high achievers in a whole lot of disciplines. I didn’t understand it at all, at first.

  At first, I thought “perseverance” meant working 14-hour days. Apparently, there are people out there who can work for 10 hours at a technical job, and then, in their moments between eating and sleeping and going to the bathroom, seize that unfilled spare time to work on a book. I am not one of those people—it still hurts my pride even now to confess that. I’m working on something important; shouldn’t my brain be willing to put in 14 hours a day? But it’s not. When it gets too hard to keep working, I stop and go read or watch something. Because of that, I thought for years that I entirely lacked the virtue of “perseverance.”

  In accordance with human nature, Eliezer₁₉₉₈ would think things like: “What counts is output, not input.” Or, “Laziness is also a virtue—it leads us to back off from failing methods and think of better ways.” Or, “I’m doing better than other people who are working more hours. Maybe, for creative work, your momentary peak output is more important than working 16 hours a day.” Perhaps the famous scientists were seduced by the Deep Wisdom of saying that “hard work is a virtue,” because it would be too awful if that counted for less than intelligence?

  I didn’t understand the virtue of perseverance until I looked back on my journey through AI, and realized that I had overestimated the difficulty of almost every single important problem.

  Sounds crazy, right? But bear with me here.

  When I was first deciding to challenge AI, I thought in terms of 40-year timescales, Manhattan Projects, planetary computing networks, millions of programmers, and possibly augmented humans.

  This is a common failure mode in AI-futurism which I may write about later; it consists of the leap from “I don’t know how to solve this” to “I’ll imagine throwing something really big at it.” Something huge enough that, when you imagine it, that imagination creates a feeling of impressiveness strong enough to be commensurable with the problem. (There’s a fellow currently on the AI list who goes around saying that AI will cost a quadrillion dollars—we can’t get AI without spending a quadrillion dollars, but we could get AI at any time by spending a quadrillion dollars.) This, in turn, lets you imagine that you know how to solve AI, without trying to fill the obviously-impossible demand that you understand intelligence.

 
