Rationality: From AI to Zombies

by Eliezer Yudkowsky


  I reply: The pure, true Prisoner’s Dilemma is incredibly rare in real life. In real life you usually have knock-on effects—what you do affects your reputation. In real life most people care to some degree about what happens to other people. And in real life you have an opportunity to set up incentive mechanisms.

  And in real life, I do think that a community of human rationalists could manage to produce soldiers willing to die to defend the community. So long as children aren’t told in school that ideal rationalists are supposed to defect against each other in the Prisoner’s Dilemma. Let it be widely believed—and I do believe it, for exactly the same reason I one-box on Newcomb’s Problem—that if people decided as individuals not to be soldiers or if soldiers decided to run away, then that is the same as deciding for the Barbarians to win. By that same theory whereby, if an election is won by 100,000 votes to 99,998 votes, it does not make sense for every voter to say “my vote made no difference.” Let it be said (for it is true) that utility functions don’t need to be solipsistic, and that a rational agent can fight to the death if they care enough about what they’re protecting. Let them not be told that rationalists should expect to lose reasonably.
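
  To make the voting arithmetic concrete, here is a toy sketch in Python (the vote counts come from the paragraph above; the code framing is only my illustration). No single vote is pivotal, yet the shared policy “abstain, since my vote makes no difference” flips the outcome:

```python
# Election won 100,000 votes to 99,998.
winners, losers = 100_000, 99_998

# Any single winning voter who stays home leaves the result intact,
# so each can truthfully say "my vote made no difference."
print((winners - 1) <= losers)  # False: no one vote is pivotal

# But like-minded voters decide by the same policy. If every winning
# voter follows "abstain, I'm not pivotal," all 100,000 votes vanish
# together and the election is lost.
print(0 > losers)  # False: defeat
```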

  If this is the culture and the mores of the rationalist society, then, I think, ordinary human beings in that society would volunteer to be soldiers. That also seems to be built into human beings, after all. You only need to ensure that the cultural training does not get in the way.

  And if I’m wrong, and that doesn’t get you enough volunteers?

  Then so long as people still prefer, on the whole, fighting to surrender, they have an opportunity to set up incentive mechanisms, and avert the True Prisoner’s Dilemma.

  You can have lotteries for who gets elected as a warrior. Sort of like the example above with AIs changing their own code. Except that if “be reflectively consistent; do that which you would precommit to do” is not sufficient motivation for humans to obey the lottery, then . . .

  . . . well, in advance of the lottery actually running, we can perhaps all agree that it is a good idea to give the selectees drugs that will induce extra courage, and shoot them if they run away. Even considering that we ourselves might be selected in the lottery. Because in advance of the lottery, this is the general policy that gives us the highest expectation of survival.

  . . . like I said: Real wars = not fun, losing wars = less fun.
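
  As a worked illustration of “the general policy that gives us the highest expectation of survival,” here is a toy ex-ante calculation in Python. The population size and probabilities are hypothetical numbers of mine, not the essay’s:

```python
# Toy model: N citizens; the Barbarians are repelled only if the
# community fields k soldiers. All numbers below are hypothetical.
N, k = 1000, 100
p_soldier_dies = 0.30    # assumed risk of dying as a soldier
p_conquered_dies = 0.50  # assumed risk of dying under conquest

# Policy A: a binding lottery, enforced as described above.
# Each citizen is selected with probability k/N.
p_drafted = k / N
survival_lottery = p_drafted * (1 - p_soldier_dies) + (1 - p_drafted)

# Policy B: no enforcement; everyone defects and the Barbarians win.
survival_defection = 1 - p_conquered_dies

print(f"binding lottery: {survival_lottery:.2f}")   # 0.97
print(f"mass defection:  {survival_defection:.2f}")  # 0.50
```

  Under these made-up numbers, each citizen’s chance of survival is higher for having bound themselves to the lottery in advance than for keeping the freedom to run away; that is what the precommitment buys.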

  Let’s be clear, by the way, that I’m not endorsing the draft as practiced nowadays. Those drafts are not collective attempts by a populace to move from a Nash equilibrium to a Pareto optimum. Drafts are a tool of kings playing games in need of toy soldiers. The Vietnam draftees who fled to Canada, I hold to have been in the right. But a society that considers itself too smart for kings does not have to be too smart to survive. Even if the Barbarian hordes are invading, and the Barbarians do practice the draft.

  Will rational soldiers obey orders? What if the commanding officer makes a mistake?

  Soldiers march. Everyone’s feet hitting the ground in the same rhythm. Even, perhaps, against their own inclinations, since people left to themselves would walk all at separate paces. Lasers made out of people. That’s marching.

  If it’s possible to invent some method of group decisionmaking that is superior to the captain handing down orders, then a company of rational soldiers might implement that procedure. If there is no proven method better than a captain, then a company of rational soldiers may commit to obey the captain, even against their own separate inclinations. And if human beings aren’t that rational . . . then in advance of the lottery, the general policy that gives you the highest personal expectation of survival is to shoot soldiers who disobey orders. This is not to say that those who fragged their own officers in Vietnam were in the wrong; for they could have consistently held that they preferred no one to participate in the draft lottery.

  But an uncoordinated mob gets slaughtered, and so the soldiers need some way of all doing the same thing at the same time in the pursuit of the same goal, even though, left to their own devices, they might march off in all directions. The orders may not come from a captain like a superior tribal chief, but unified orders have to come from somewhere. A society whose soldiers are too clever to obey orders is a society that is too clever to survive. Just like a society whose people are too clever to be soldiers. That is why I say “clever,” which I often use as a term of opprobrium, rather than “rational.”

  (Though I do think it’s an important question as to whether you can come up with a small-group coordination method that really genuinely in practice works better than having a leader. The more people can trust the group decision method—the more they can believe that it really is superior to people going their own way—the more coherently they can behave even in the absence of enforceable penalties for disobedience.)

  I say all this, even though I certainly don’t expect rationalists to take over a country any time soon, because I think that what we believe about a society of “people like us” has some reflection on what we think of ourselves. If you believe that a society of people like you would be too reasonable to survive in the long run . . . that’s one sort of self-image. And it’s a different sort of self-image if you think that a society of people all like you could fight the vicious Evil Barbarians and win—not just by dint of superior technology, but because your people care about each other and about their collective society—and because they can face the realities of war without losing themselves—and because they would calculate the group-rational thing to do and make sure it got done—and because there’s nothing in the rules of probability theory or decision theory that says you can’t sacrifice yourself for a cause—and because if you really are smarter than the Enemy and not just flattering yourself about that, then you should be able to exploit the blind spots that the Enemy does not allow itself to think about—and because no matter how heavily the Enemy hypes itself up before battle, you think that just maybe a coherent mind, undivided within itself, and perhaps practicing something akin to meditation or self-hypnosis, can fight as hard in practice as someone who theoretically believes they’ve got seventy-two virgins waiting for them.

  Then you’ll expect more of yourself and people like you operating in groups; and then you can see yourself as something more than a cultural dead end.

  So look at it this way: Jeffreyssai probably wouldn’t give up against the Evil Barbarians if he were fighting alone. A whole army of beisutsukai masters ought to be a force that no one would mess with. That’s the motivating vision. The question is how, exactly, that works.

  *

  330

  Beware of Other-Optimizing

  I’ve noticed a serious problem in which aspiring rationalists vastly overestimate their ability to optimize other people’s lives. And I think I have some idea of how the problem arises.

  You read nineteen different webpages advising you about personal improvement—productivity, dieting, saving money. And the writers all sound bright and enthusiastic about Their Method, they tell tales of how it worked for them and promise amazing results . . .

  But most of the advice rings so false as to not even seem worth considering. So you sigh, mournfully pondering the wild, childish enthusiasm that people can seem to work up for just about anything, no matter how silly. Pieces of advice #4 and #15 sound interesting, and you try them, but . . . they don’t . . . quite . . . well, it fails miserably. The advice was wrong, or you couldn’t do it, and either way you’re not any better off.

  And then you read the twentieth piece of advice—or even more, you discover a twentieth method that wasn’t in any of the pages—and STARS ABOVE IT ACTUALLY WORKS THIS TIME.

  At long, long last you have discovered the real way, the right way, the way that actually works. And when someone else gets into the sort of trouble you used to have—well, this time you know how to help them. You can save them all the trouble of reading through nineteen useless pieces of advice and skip directly to the correct answer. As an aspiring rationalist you’ve already learned that most people don’t listen, and you usually don’t bother—but this person is a friend, someone you know, someone you trust and respect to listen.

  And so you put a comradely hand on their shoulder, look them straight in the eyes, and tell them how to do it.

  I, personally, get quite a lot of this. Because you see . . . when you’ve discovered the way that really works . . . well, you know better by now than to run out and tell your friends and family. But you’ve got to try telling Eliezer Yudkowsky. He needs it, and there’s a pretty good chance that he’ll understand.

  It actually did take me a while to understand. One of the critical events was when someone on the Board of the Machine Intelligence Research Institute told me that I didn’t need a salary increase to keep up with inflation—because I could be spending substantially less money on food if I used an online coupon service. And I believed this, because it was a friend I trusted, and it was delivered in a tone of such confidence. So my girlfriend started trying to use the service, and a couple of weeks later she gave up.

  Now here’s the thing: if I’d run across exactly the same advice about using coupons on some blog somewhere, I probably wouldn’t even have paid much attention, just read it and moved on. Even if it were written by Scott Aaronson or some similar person known to be intelligent, I still would have read it and moved on. But because it was delivered to me personally, by a friend who I knew, my brain processed it differently—as though I were being told the secret; and that indeed is the tone in which it was told to me. And it was something of a delayed reaction to realize that I’d simply been told, as personal advice, what otherwise would have been just a blog post somewhere; no more and no less likely to work for me, than a productivity blog post written by any other intelligent person.

  And because I have encountered a great many people trying to optimize me, I can attest that the advice I get is as wide-ranging as the productivity blogosphere. But others don’t see this plethora of productivity advice as indicating that people are diverse in which advice works for them. Instead they see a lot of obviously wrong advice. And then they finally discover the right way—the way that works, unlike all those other blog posts that don’t work—and then, quite often, they decide to use it to optimize Eliezer Yudkowsky.

  Don’t get me wrong. Sometimes the advice is helpful. Sometimes it works. “Stuck In The Middle With Bruce”—that resonated, for me. It may prove to be the most helpful thing I’ve read on the new Less Wrong so far, though that has yet to be determined.

  It’s just that your earnest personal advice, that amazing thing you’ve found to actually work by golly, is no more and no less likely to work for me than a random personal improvement blog post written by an intelligent author is likely to work for you.

  “Different things work for different people.” That sentence may give you a squicky feeling; I know it gives me one. Because this sentence is a tool wielded by Dark Side Epistemology to shield from criticism, used in a way closely akin to “Different things are true for different people” (which is simply false).

  But until you grasp the laws that are near-universal generalizations, sometimes you end up messing around with surface tricks that work for one person and not another, without your understanding why, because you don’t know the general laws that would dictate what works for whom. And the best you can do is remember that, and be willing to take “No” for an answer.

  You especially had better be willing to take “No” for an answer, if you have power over the Other. Power is, in general, a very dangerous thing, which is tremendously easy to abuse, without your being aware that you’re abusing it. There are things you can do to prevent yourself from abusing power, but you have to actually do them or they don’t work. There was a post on Overcoming Bias on how being in a position of power has been shown to decrease our ability to empathize with and understand the other, though I can’t seem to locate it now. I have seen a rationalist who did not think he had power, and so did not think he needed to be cautious, who was amazed to learn that he might be feared . . .

  It’s even worse when the method they’ve discovered requires a little willpower. Then if you say it doesn’t work for you, the answer is clear and obvious: you’re just being lazy, and they need to exert some pressure on you to get you to do the correct thing, the advice they’ve found that actually works.

  Sometimes—I suppose—people are being lazy. But be very, very, very careful before you assume that’s the case and wield power over others to “get them moving.” Bosses who can tell when something actually is within your capacity if you’re a little more motivated, without it burning you out or making your life incredibly painful—these are the bosses who are a pleasure to work under. That ability is extremely rare, and the bosses who have it are worth their weight in silver. It’s a high-level interpersonal technique that most people do not have. I surely don’t have it. Do not assume you have it because your intentions are good. Do not assume you have it because you’d never do anything to others that you didn’t want done to yourself. Do not assume you have it because no one has ever complained to you. Maybe they’re just scared. That rationalist of whom I spoke—who did not think he held power and threat, though it was certainly obvious enough to me—he did not realize that anyone could be scared of him.

  Be careful even when you hold leverage, when you hold an important decision in your hand, or a threat, or something that the other person needs, and all of a sudden the temptation to optimize them seems overwhelming.

  Consider, if you would, that Ayn Rand’s whole reign of terror over Objectivists can be seen in just this light—that she found herself with power and leverage, and could not resist the temptation to optimize.

  We underestimate the distance between ourselves and others. Not just inferential distance, but distances of temperament and ability, distances of situation and resource, distances of unspoken knowledge and unnoticed skills and luck, distances of interior landscape.

  Even I am often surprised to find that X, which worked so well for me, doesn’t work for someone else. But with so many others having tried to optimize me, I can at least recognize distance when I’m hit over the head with it.

  Maybe being pushed on does work . . . for you. Maybe you don’t get sick to your stomach when someone with power over you starts helpfully trying to reorganize your life the correct way. I don’t know what makes you tick. In the realm of willpower and akrasia and productivity, as in other realms, I don’t know the generalizations deep enough to hold almost always. I don’t possess the deep keys that would tell me when and why and for whom a technique works or doesn’t work. All I can do is be willing to accept it when someone tells me it doesn’t work . . . and go on looking for the deeper generalizations that will hold everywhere, the deeper laws governing both the rule and the exception, waiting to be found, someday.

  *

  331

  Practical Advice Backed by Deep Theories

  Once upon a time, Seth Roberts took a European vacation and found that he started losing weight while drinking unfamiliar-tasting caloric fruit juices.

  Now suppose Roberts had not known, and never did know, anything about metabolic set points or flavor-calorie associations—all this high-falutin’ scientific experimental research that had been done on rats and occasionally humans.

  He would have posted to his blog, “Gosh, everyone! You should try these amazing fruit juices that are making me lose weight!” And that would have been the end of it. Some people would have tried it; it would have worked temporarily for some of them (until the flavor-calorie association kicked in), and there never would have been a Shangri-La Diet per se.

  The existing Shangri-La Diet is visibly incomplete—for some people, like me, it doesn’t seem to work, and there is no apparent reason for this or any logic permitting it. But the reason why as many people have benefited as they have—the reason why there was more than just one more blog post describing a trick that seemed to work for one person and didn’t work for anyone else—is that Roberts knew the experimental science that let him interpret what he was seeing, in terms of deep factors that actually did exist.

  One of the pieces of advice on Overcoming Bias / Less Wrong that was frequently cited as the most important thing learned was the idea of “the bottom line”—that once a conclusion is written in your mind, it is already true or already false, already wise or already stupid, and no amount of later argument can change that except by changing the conclusion. And this ties directly into another oft-cited most important thing, which is the idea of “engines of cognition,” minds as mapping engines that require evidence as fuel.

  Suppose I had merely written one more blog post that said, “You know, you really should be more open to changing your mind—it’s pretty important—and oh yes, you should pay attention to the evidence too.” This would not have been as useful. Not just because it was less persuasive, but because the actual operations would have been much less clear without the explicit theory backing it up. What constitutes evidence, for example? Is it anything that seems like a forceful argument? Having an explicit probability theory and an explicit causal account of what makes reasoning effective makes a large difference in the forcefulness and implementational details of the old advice to “Keep an open mind and pay attention to the evidence.”
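
  To make “what constitutes evidence” concrete, here is a minimal sketch in Python of the odds form of Bayes’s theorem (the likelihood-ratio framing is standard probability theory; the code itself is only my illustration, not something from the original essay):

```python
def update_odds(prior_odds: float,
                p_obs_given_h: float,
                p_obs_given_alt: float) -> float:
    """Odds-form Bayes: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * (p_obs_given_h / p_obs_given_alt)

# Start at 50/50 (odds 1:1). Observe something the hypothesis
# predicts with probability 0.8 and the alternative with 0.2.
odds = update_odds(1.0, 0.8, 0.2)
print(odds, odds / (1 + odds))  # 4.0 0.8

# An observation is evidence only insofar as its likelihood ratio
# differs from 1; a "forceful argument" that both hypotheses predict
# equally well moves the conclusion not at all. And if the bottom
# line is written first, no likelihood ratio ever touches it.
```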

  It is also important to realize that causal theories are much more likely to be true when they are picked up from a science textbook than when invented on the fly—it is very easy to invent cognitive structures that look like causal theories but are not even anticipation-controlling, let alone true.

 
