Rationality: From AI to Zombies


by Eliezer Yudkowsky


  The belief that he shouldn’t define intelligence meant that the young Eliezer studied the problem for a long time before, years later, he started to propose systematizations.

  This is what I refer to when I say that this is one of my all-time best mistakes.

  Looking back, years afterward, I drew a very strong moral, to this effect:

  What you actually end up doing screens off the clever reason why you’re doing it.

  Contrast amazing clever reasoning that leads you to study many sciences, to amazing clever reasoning that says you don’t need to read all those books. Afterward, when your amazing clever reasoning turns out to have been stupid, you’ll have ended up in a much better position if your amazing clever reasoning was of the first type.

  When I look back upon my past, I am struck by the number of semi-accidental successes, the number of times I did something right for the wrong reason. From your perspective, you should chalk this up to the anthropic principle: if I’d fallen into a true dead end, you probably wouldn’t be hearing from me in this book. From my perspective it remains something of an embarrassment. My Traditional Rationalist upbringing provided a lot of directional bias to those “accidental successes”—biased me toward rationalizing reasons to study rather than not study, prevented me from getting completely lost, helped me recover from mistakes. Still, none of that was the right action for the right reason, and that’s a scary thing to see when you look back on your youthful history. One of my primary purposes in writing on Overcoming Bias is to leave a trail to where I ended up by accident—to obviate the role that luck played in my own forging as a rationalist.

  So what makes this one of my all-time worst mistakes? Because sometimes “informal” is another way of saying “held to low standards.” I had amazing clever reasons why it was okay for me not to precisely define “intelligence,” and certain of my other terms as well: namely, other people had gone astray by trying to define it. This was a gate through which sloppy reasoning could enter.

  So should I have jumped ahead and tried to forge an exact definition right away? No, all the reasons why I knew this was the wrong thing to do were correct; you can’t conjure the right definition out of thin air if your knowledge is not adequate.

  You can’t get to the definition of fire if you don’t know about atoms and molecules; you’re better off saying “that orangey-bright thing.” And you do have to be able to talk about that orangey-bright stuff, even if you can’t say exactly what it is, to investigate fire. But these days I would say that all reasoning on that level is something that can’t be trusted—rather it’s something you do on the way to knowing better, but you don’t trust it, you don’t put your weight down on it, you don’t draw firm conclusions from it, no matter how inescapable the informal reasoning seems.

  The young Eliezer put his weight down on the wrong floor tile—stepped onto a loaded trap.

  *

  294

  Raised in Technophilia

  My father used to say that if the present system had been in place a hundred years ago, automobiles would have been outlawed to protect the saddle industry.

  One of my major childhood influences was reading Jerry Pournelle’s A Step Farther Out, at the age of nine. It was Pournelle’s reply to Paul Ehrlich and the Club of Rome, who were saying, in the 1960s and 1970s, that the Earth was running out of resources and massive famines were only years away. It was a reply to Jeremy Rifkin’s so-called fourth law of thermodynamics; it was a reply to all the people scared of nuclear power and trying to regulate it into oblivion.

  I grew up in a world where the lines of demarcation between the Good Guys and the Bad Guys were pretty clear; not an apocalyptic final battle, but a battle that had to be fought over and over again, a battle where you could see the historical echoes going back to the Industrial Revolution, and where you could assemble the historical evidence about the actual outcomes.

  On one side were the scientists and engineers who’d driven all the standard-of-living increases since the Dark Ages, whose work supported luxuries like democracy, an educated populace, a middle class, the outlawing of slavery.

  On the other side, those who had once opposed smallpox vaccinations, anesthetics during childbirth, steam engines, and heliocentrism: The theologians calling for a return to a perfect age that never existed, the elderly white male politicians set in their ways, the special interest groups who stood to lose, and the many to whom science was a closed book, fearing what they couldn’t understand.

  And trying to play the middle, the pretenders to Deep Wisdom, uttering cached thoughts about how technology benefits humanity but only when it is properly regulated—claiming in defiance of brute historical fact that science of itself was neither good nor evil—setting up solemn-looking bureaucratic committees to make an ostentatious display of their caution—and waiting for their applause. As if the truth were always a compromise. And as if anyone could really see that far ahead. Would humanity have done better if there’d been a sincere, concerned, public debate on the adoption of fire, and committees set up to oversee its use?

  When I entered into the problem, I started out allergized against anything that pattern-matched “Ah, but technology has risks as well as benefits, little one.” The presumption-of-guilt was that you were either trying to collect some cheap applause, or covertly trying to regulate the technology into oblivion. And either way, ignoring the historical record, which weighed immensely in favor of technologies that people had once worried about.

  Robin Hanson raised the topic of slow FDA approval of drugs approved in other countries. Someone in the comments pointed out that Thalidomide was sold in 50 countries under 40 names, but that only a small amount was given away in the US, so that there were 10,000 malformed children born globally, but only 17 children in the US.

  But how many people have died because of the slow approval in the US, of drugs more quickly approved in other countries—all the drugs that didn’t go wrong? And I ask that question because it’s what you can try to collect statistics about—this says nothing about all the drugs that were never developed because the approval process is too long and costly. According to this source, the FDA’s longer approval process prevents 5,000 casualties per year by screening off medications found to be harmful, and causes at least 20,000–120,000 casualties per year just by delaying approval of those beneficial medications that are still developed and eventually approved.
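  To spell out the arithmetic implied by those figures (a rough comparison, taking the source’s numbers at face value and using its loose sense of “casualties”):

\[
\underbrace{(20{,}000 \text{ to } 120{,}000)}_{\text{caused by delayed approval}} \;-\; \underbrace{5{,}000}_{\text{prevented by screening}} \;\approx\; 15{,}000 \text{ to } 115{,}000 \text{ net casualties per year.}
\]

  Even on the lowest of those estimates, the delays cost roughly four times as many lives as the screening saves.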

  So there really is a reason to be allergic to people who go around saying, “Ah, but technology has risks as well as benefits.” There’s a historical record showing over-conservativeness, the many silent deaths of regulation being outweighed by a few visible deaths of nonregulation. If you’re really playing the middle, why not say, “Ah, but technology has benefits as well as risks”?

  Well, and this isn’t such a bad description of the Bad Guys. (Except that it ought to be emphasized a bit harder that these aren’t evil mutants but standard human beings acting under a different worldview-gestalt that puts them in the right; some of them will inevitably be more competent than others, and competence counts for a lot.) Even looking back, I don’t think my childhood technophilia was too wrong about what constituted a Bad Guy and what was the key mistake. But it’s always a lot easier to say what not to do, than to get it right. And one of my fundamental flaws, back then, was thinking that if you tried as hard as you could to avoid everything the Bad Guys were doing, that made you a Good Guy.

  Particularly damaging, I think, was the bad example set by the pretenders to Deep Wisdom trying to stake out a middle way; smiling condescendingly at technophiles and technophobes alike, and calling them both immature. Truly this is a wrong way; and in fact, the notion of trying to stake out a middle way generally, is usually wrong. The Right Way is not a compromise with anything; it is the clean manifestation of its own criteria.

  But that made it more difficult for the young Eliezer to depart from the charge-straight-ahead verdict, because any departure felt like joining the pretenders to Deep Wisdom.

  The first crack in my childhood technophilia appeared in, I think, 1997 or 1998, at the point where I noticed my fellow technophiles saying foolish things about how molecular nanotechnology would be an easy problem to manage. (As you may be noticing yet again, the young Eliezer was driven to a tremendous extent by his ability to find flaws—I even had a personal philosophy of why that sort of thing was a good idea.)

  There was a debate going on about molecular nanotechnology, and whether offense would be asymmetrically easier than defense. And there were people arguing that defense would be easy. In the domain of nanotech, for Ghu’s sake, programmable matter, when we can’t even seem to get the security problem solved for computer networks where we can observe and control every one and zero. People were talking about unassailable diamondoid walls. I observed that diamond doesn’t stand off a nuclear weapon, that offense has had defense beat since 1945, and that nanotech didn’t look likely to change that.

  And by the time that debate was over, it seems that the young Eliezer—caught up in the heat of argument—had managed to notice, for the first time, that the survival of Earth-originating intelligent life stood at risk.

  It seems so strange, looking back, to think that there was a time when I thought that only individual lives were at stake in the future. What a profoundly friendlier world that was to live in . . . though it’s not as if I were thinking that at the time. I didn’t reject the possibility so much as manage to never see it in the first place. Once the topic actually came up, I saw it. I don’t really remember how that trick worked. There’s a reason why I refer to my past self in the third person.

  It may sound like Eliezer1998 was a complete idiot, but that would be a comfortable out, in a way; the truth is scarier. Eliezer1998 was a sharp Traditional Rationalist, as such things went. I knew hypotheses had to be testable, I knew that rationalization was not a permitted mental operation, I knew how to play Rationalist’s Taboo, I was obsessed with self-awareness . . . I didn’t quite understand the concept of “mysterious answers” . . . and no Bayes or Kahneman at all. But a sharp Traditional Rationalist, far above average . . . So what? Nature isn’t grading us on a curve. One step of departure from the Way, one shove of undue influence on your thought processes, can repeal all other protections.

  One of the chief lessons I derive from looking back at my personal history is that it’s no wonder that, out there in the real world, a lot of people think that “intelligence isn’t everything,” or that rationalists don’t do better in real life. A little rationality, or even a lot of rationality, doesn’t pass the astronomically high barrier required for things to actually start working.

  Let not my misinterpretation of the Right Way be blamed on Jerry Pournelle, my father, or science fiction generally. I think the young Eliezer’s personality imposed quite a bit of selectivity on which parts of their teachings made it through. It’s not as if Pournelle didn’t say: The rules change once you leave Earth, the cradle; if you’re careless sealing your pressure suit just once, you die. He said it quite a bit. But the words didn’t really seem important, because that was something that happened to third-party characters in the novels—the main character didn’t usually die halfway through, for some reason.

  What was the lens through which I filtered these teachings? Hope. Optimism. Looking forward to a brighter future. That was the fundamental meaning of A Step Farther Out unto me, the lesson I took in contrast to the Sierra Club’s doom-and-gloom. On one side were rationality and hope; on the other, ignorance and despair.

  Some teenagers think they’re immortal and ride motorcycles. I was under no such illusion and quite reluctant to learn to drive, considering how unsafe those hurtling hunks of metal looked. But there was something more important to me than my own life: The Future. And I acted as if that were immortal. Lives could be lost, but not the Future.

  And when I noticed that nanotechnology really was going to be a potentially extinction-level challenge?

  The young Eliezer thought, explicitly, “Good heavens, how did I fail to notice this thing that should have been obvious? I must have been too emotionally attached to the benefits I expected from the technology; I must have flinched away from the thought of human extinction.”

  And then . . .

  I didn’t declare a Halt, Melt, and Catch Fire. I didn’t rethink all the conclusions that I’d developed with my prior attitude. I just managed to integrate it into my worldview, somehow, with a minimum of propagated changes. Old ideas and plans were challenged, but my mind found reasons to keep them. There was no systemic breakdown, unfortunately.

  Most notably, I decided that we had to run full steam ahead on AI, so as to develop it before nanotechnology. Just like I’d been originally planning to do, but now, with a different reason.

  I guess that’s what most human beings are like, isn’t it? Traditional Rationality wasn’t enough to change that.

  But there did come a time when I fully realized my mistake. It just took a stronger boot to the head.

  *

  295

  A Prodigy of Refutation

  My Childhood Death Spiral described the core momentum carrying me into my mistake, an affective death spiral around something that Eliezer1996 called “intelligence.” I was also a technophile, pre-allergized against fearing the future. And I’d read a lot of science fiction built around personhood ethics—in which fear of the Alien puts humanity-at-large in the position of the bad guys, mistreating aliens or sentient AIs because they “aren’t human.”

  That’s part of the ethos you acquire from science fiction—to define your in-group, your tribe, appropriately broadly. Hence my email address, sentience@pobox.com.

  So Eliezer1996 is out to build superintelligence, for the good of humanity and all sentient life.

  At first, I think, the question of whether a superintelligence will/could be good/evil didn’t really occur to me as a separate topic of discussion. Just the standard intuition of, “Surely no supermind would be stupid enough to turn the galaxy into paperclips; surely, being so intelligent, it will also know what’s right far better than a human being could.”

  Until I introduced myself and my quest to a transhumanist mailing list, and got back responses along the general lines of (from memory):

  Morality is arbitrary—if you say that something is good or bad, you can’t be right or wrong about that. A superintelligence would form its own morality.

  Everyone ultimately looks after their own self-interest. A superintelligence would be no different; it would just seize all the resources.

  Personally, I’m a human, so I’m in favor of humans, not Artificial Intelligences. I don’t think we should develop this technology. Instead we should develop the technology to upload humans first.

  No one should develop an AI without a control system that watches it and makes sure it can’t do anything bad.

  Well, that’s all obviously wrong, thinks Eliezer1996, and he proceeds to kick his opponents’ arguments to pieces. (I’ve mostly done this in other essays, and anything remaining is left as an exercise to the reader.)

  It’s not that Eliezer1996 explicitly reasoned, “The world’s stupidest man says the Sun is shining, therefore it is dark out.” But Eliezer1996 was a Traditional Rationalist; he had been inculcated with the metaphor of science as a fair fight between sides who take on different positions, stripped of mere violence and other such exercises of political muscle, so that, ideally, the side with the best arguments can win.

  It’s easier to say where someone else’s argument is wrong than to get the fact of the matter right; and Eliezer1996 was very skilled at finding flaws. (So am I. It’s not as if you can solve the danger of that power by refusing to care about flaws.) From Eliezer1996’s perspective, it seemed to him that his chosen side was winning the fight—that he was formulating better arguments than his opponents—so why would he switch sides?

  Therefore is it written: “Because this world contains many whose grasp of rationality is abysmal, beginning students of rationality win arguments and acquire an exaggerated view of their own abilities. But it is useless to be superior: Life is not graded on a curve. The best physicist in ancient Greece could not calculate the path of a falling apple. There is no guarantee that adequacy is possible given your hardest effort; therefore spare no thought for whether others are doing worse.”

  You cannot rely on anyone else to argue you out of your mistakes; you cannot rely on anyone else to save you; you and only you are obligated to find the flaws in your positions; if you put that burden down, don’t expect anyone else to pick it up. And I wonder if that advice will turn out not to help most people, until they’ve personally blown off their own foot, saying to themselves all the while, correctly, “Clearly I’m winning this argument.”

  Today I try not to take any human being as my opponent. That just leads to overconfidence. It is Nature that I am facing off against, who does not match Her problems to your skill, who is not obliged to offer you a fair chance to win in return for a diligent effort, who does not care if you are the best who ever lived, if you are not good enough.

  But return to 1996. Eliezer1996 is going with the basic intuition of “Surely a superintelligence will know better than we could what is right,” and offhandedly knocking down various arguments brought against his position. He was skillful in that way, you see. He even had a personal philosophy of why it was wise to look for flaws in things, and so on.

  I don’t mean to say it as an excuse, that no one who argued against Eliezer1996 actually presented him with the dissolution of the mystery—the full reduction of morality that analyzes all his cognitive processes debating “morality,” a step-by-step walkthrough of the algorithms that make morality feel to him like a fact. Consider it rather as an indictment, a measure of Eliezer1996’s level, that he would have needed the full solution given to him before anyone could present him with an argument he could not refute.

 
