Rationality: From AI to Zombies


by Eliezer Yudkowsky


  Because if this wasn’t just a coincidence—if you had some reach-into-the-bin function that pulled out a human-corresponding GLUT by design, not just chance—then that reach-into-the-bin function is probably conscious, and so the GLUT is again a cellphone, not a zombie. It’s connected to a human at two removes, instead of one, but it’s still a cellphone! Nice try at concealing the source of the improbability there!

  Now behold where Follow-The-Improbability has taken us: where is the source of this body’s tongue talking about an inner listener? The consciousness isn’t in the lookup table. The consciousness isn’t in the factory that manufactures lots of possible lookup tables. The consciousness was in whatever pointed to one particular already-manufactured lookup table, and said, “Use that one!”
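  (For concreteness, here is a minimal sketch of what a GLUT amounts to as a data structure: nothing but a prebuilt mapping from complete input histories to outputs, with no computation in between. The toy entries and the `respond` helper below are hypothetical illustrations, not part of the original thought experiment; a real GLUT would need an entry for every possible input history, which is what makes the bin so inconceivably vast.)

```python
# Illustrative sketch only: a Giant Lookup Table (GLUT) shrunk to toy size.
# The structure, not the scale, is the point: every reply is retrieved, never computed.

# Hypothetical table mapping a tuple of everything said so far to the next reply.
GLUT = {
    ("Hello.",): "Hi there.",
    ("Hello.", "Talk about your inner listener."):
        "There's a little person inside my head, listening to everything I think.",
}

def respond(history):
    """Pure retrieval: no state, no processing, just a table lookup."""
    return GLUT.get(tuple(history), "...")

print(respond(["Hello.", "Talk about your inner listener."]))
```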

  You can see why I introduced the game of Follow-The-Improbability. Ordinarily, when we’re talking to a person, we tend to think that whatever is inside the skull must be “where the consciousness is.” It’s only by playing Follow-The-Improbability that we can realize that the real source of the conversation we’re having is that-which-is-responsible-for the improbability of the conversation—however distant in time or space, as the Sun moves a wind-up toy.

  “No, no!” says the philosopher. “In the thought experiment, they aren’t randomly generating lots of GLUTs, and then using a conscious algorithm to pick out one GLUT that seems humanlike! I am specifying that, in this thought experiment, they reach into the inconceivably vast GLUT bin, and by pure chance pull out a GLUT that is identical to a human brain’s inputs and outputs! There! I’ve got you cornered now! You can’t play Follow-The-Improbability any further!”

  Oh. So your specification is the source of the improbability here.

  When we play Follow-The-Improbability again, we end up outside the thought experiment, looking at the philosopher.

  That which points to the one GLUT that talks about consciousness, out of all the vast space of possibilities, is now . . . the conscious person asking us to imagine this whole scenario. And our own brains, which will fill in the blank when we imagine, “What will this GLUT say in response to ‘Talk about your inner listener’?”

  The moral of this story is that when you follow back discourse about “consciousness,” you generally find consciousness. It’s not always right in front of you. Sometimes it’s very cleverly hidden. But it’s there. Hence the Generalized Anti-Zombie Principle.

  If there is a Zombie Master in the form of a chatbot that processes and remixes amateur human discourse about “consciousness,” the humans who generated the original text corpus are conscious.

  If someday you come to understand consciousness, and look back, and see that there’s a program you can write that will output confused philosophical discourse that sounds an awful lot like humans without itself being conscious—then when I ask “How did this program come to sound similar to humans?” the answer is that you wrote it to sound similar to conscious humans, rather than choosing on the criterion of similarity to something else. This doesn’t mean your little Zombie Master is conscious—but it does mean I can find consciousness somewhere in the universe by tracing back the chain of causality, which means we’re not entirely in the Zombie World.

  But suppose someone actually did reach into a GLUT-bin and by genuinely pure chance pulled out a GLUT that wrote philosophy papers?

  Well, then it wouldn’t be conscious. In my humble opinion.

  I mean, there’s got to be more to it than inputs and outputs.

  Otherwise even a GLUT would be conscious, right?

  Oh, and for those of you wondering how this sort of thing relates to my day job . . .

  In this line of business you meet an awful lot of people who think that an arbitrarily generated powerful AI will be “moral.” They can’t agree among themselves on why, or what they mean by the word “moral”; but they all agree that doing Friendly AI theory is unnecessary. And when you ask them how an arbitrarily generated AI ends up with moral outputs, they proffer elaborate rationalizations aimed at AIs of that which they deem “moral”; and there are all sorts of problems with this, but the number one problem is, “Are you sure the AI would follow the same line of thought you invented to argue human morals, when, unlike you, the AI doesn’t start out knowing what you want it to rationalize?” You could call the counter-principle Follow-The-Decision-Information, or something along those lines. You can account for an AI that does improbably nice things by telling me how you chose the AI’s design from a huge space of possibilities, but otherwise the improbability is being pulled out of nowhere—though more and more heavily disguised, as rationalized premises are rationalized in turn.

  So I’ve already done a whole series of essays which I myself generated using Follow-The-Improbability. But I didn’t spell out the rules explicitly at that time, because I hadn’t done the thermodynamics essays yet . . .

  Just thought I’d mention that. It’s amazing how many of my essays coincidentally turn out to include ideas surprisingly relevant to discussion of Friendly AI theory . . . if you believe in coincidence.

  *


  225

  Belief in the Implied Invisible

  One generalized lesson not to learn from the Anti-Zombie Argument is, “Anything you can’t see doesn’t exist.”

  It’s tempting to conclude the general rule. It would make the Anti-Zombie Argument much simpler, on future occasions, if we could take this as a premise. But unfortunately that’s just not Bayesian.

  Suppose I transmit a photon out toward infinity, not aimed at any stars, or any galaxies, pointing it toward one of the great voids between superclusters. Based on standard physics, in other words, I don’t expect this photon to intercept anything on its way out. The photon is moving at light speed, so I can’t chase after it and capture it again.

  If the expansion of the universe is accelerating, as current cosmology holds, there will come a future point where I don’t expect to be able to interact with the photon even in principle—a future time beyond which I don’t expect the photon’s future light cone to intercept my world-line. Even if an alien species captured the photon and rushed back to tell us, they couldn’t travel fast enough to make up for the accelerating expansion of the universe.

  Should I believe that, in the moment where I can no longer interact with it even in principle, the photon disappears?

  No.

  It would violate Conservation of Energy. And the Second Law of Thermodynamics. And just about every other law of physics. And probably the Three Laws of Robotics. It would imply the photon knows I care about it and knows exactly when to disappear.

  It’s a silly idea.

  But if you can believe in the continued existence of photons that have become experimentally undetectable to you, why doesn’t this imply a general license to believe in the invisible?

  (If you want to think about this question on your own, do so before reading on . . .)

  Though I failed to Google a source, I remember reading that when it was first proposed that the Milky Way was our galaxy—that the hazy river of light in the night sky was made up of millions (or even billions) of stars—that Occam’s Razor was invoked against the new hypothesis. Because, you see, the hypothesis vastly multiplied the number of “entities” in the believed universe. Or maybe it was the suggestion that “nebulae”—those hazy patches seen through a telescope—might be galaxies full of stars, that got the invocation of Occam’s Razor.

  Lex parsimoniae: Entia non sunt multiplicanda praeter necessitatem.

  That was Occam’s original formulation, the law of parsimony: Entities should not be multiplied beyond necessity.

  If you postulate billions of stars that no one has ever believed in before, you’re multiplying entities, aren’t you?

  No. There are two Bayesian formalizations of Occam’s Razor: Solomonoff induction, and Minimum Message Length. Neither penalizes galaxies for being big.

  Which they had better not do! One of the lessons of history is that what-we-call-reality keeps turning out to be bigger and bigger and huger yet. Remember when the Earth was at the center of the universe? Remember when no one had invented Avogadro’s number? If Occam’s Razor was weighing against the multiplication of entities every time, we’d have to start doubting Occam’s Razor, because it would have consistently turned out to be wrong.

  In Solomonoff induction, the complexity of your model is the amount of code in the computer program you have to write to simulate your model. The amount of code, not the amount of RAM it uses or the number of cycles it takes to compute. A model of the universe that contains billions of galaxies containing billions of stars, each star made of a billion trillion decillion quarks, will take a lot of RAM to run—but the code only has to describe the behavior of the quarks, and the stars and galaxies can be left to run themselves. I am speaking semi-metaphorically here—there are things in the universe besides quarks—but the point is, postulating an extra billion galaxies doesn’t count against the size of your code, if you’ve already described one galaxy. It just takes a bit more RAM, and Occam’s Razor doesn’t care about RAM.
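  (A toy way to see the code-versus-RAM distinction, offered only as a sketch: below, a one-dimensional cellular automaton stands in for “the laws,” which is of course a drastic simplification. The `step` function stays the same few lines whether it runs ten cells or a million; only the memory grows.)

```python
import sys

# The "laws": one update rule for an elementary cellular automaton (rule 110).
# Note that this code does not get longer when the simulated universe gets bigger.
ON_PATTERNS = {(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 1, 0), (0, 0, 1)}

def step(state):
    """Apply the same local rule to every cell (wrapping at the edges)."""
    n = len(state)
    return [1 if (state[i - 1], state[i], state[(i + 1) % n]) in ON_PATTERNS else 0
            for i in range(n)]

for size in (10, 10_000, 1_000_000):
    universe = [0] * size
    universe[size // 2] = 1               # one "particle" in the middle
    universe = step(universe)             # more cells cost more RAM to run...
    print(size, sys.getsizeof(universe))  # ...but the rule above never grew.
```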

  Why not? The Minimum Message Length formalism, which is nearly equivalent to Solomonoff induction, may make the principle clearer: If you have to tell someone how your model of the universe works, you don’t have to individually specify the location of each quark in each star in each galaxy. You just have to write down some equations. The amount of “stuff” that obeys the equation doesn’t affect how long it takes to write the equation down. If you encode the equation into a file, and the file is 100 bits long, then there are 2^100 other models that would be around the same file size, and you’ll need roughly 100 bits of supporting evidence. You’ve got a limited amount of probability mass; and a priori, you’ve got to divide that mass up among all the messages you could send; and so postulating a model from within a model space of 2^100 alternatives, means you’ve got to accept a 2^-100 prior probability penalty—but having more galaxies doesn’t add to this.
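  (Spelled out as a back-of-the-envelope calculation, a sketch using the essay’s own 100-bit figure and treating the prior as a uniform split over equally long messages:)

```latex
% Prior mass of one 100-bit model among ~2^100 equally short alternatives:
\[
  P(\text{model}) \approx 2^{-L} = 2^{-100}.
\]
% About 100 bits of evidence -- a likelihood ratio near 2^100 -- is what it
% takes to lift those prior odds back up to roughly even:
\[
  \underbrace{2^{-100}}_{\text{prior odds}}
  \times
  \underbrace{\frac{P(E \mid \text{model})}{P(E \mid \neg\text{model})}}_{\approx\, 2^{100}}
  \;\approx\; 1.
\]
```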

  Postulating billions of stars in billions of galaxies doesn’t affect the length of your message describing the overall behavior of all those galaxies. So you don’t take a probability hit from having the same equations describing more things. (So long as your model’s predictive successes aren’t sensitive to the exact initial conditions. If you’ve got to specify the exact positions of all the quarks for your model to predict as well as it does, the extra quarks do count as a hit.)

  If you suppose that the photon disappears when you are no longer looking at it, this is an additional law in your model of the universe. It’s the laws that are “entities,” costly under the laws of parsimony. Extra quarks are free.

  So does it boil down to, “I believe the photon goes on existing as it wings off to nowhere, because my priors say it’s simpler for it to go on existing than to disappear”?

  This is what I thought at first, but on reflection, it’s not quite right. (And not just because it opens the door to obvious abuses.)

  I would boil it down to a distinction between belief in the implied invisible, and belief in the additional invisible.

  When you believe that the photon goes on existing as it wings out to infinity, you’re not believing that as an additional fact.

  What you believe (assign probability to) is a set of simple equations; you believe these equations describe the universe. You believe these equations because they are the simplest equations you could find that describe the evidence. These equations are highly experimentally testable; they explain huge mounds of evidence visible in the past, and predict the results of many observations in the future.

  You believe these equations, and it is a logical implication of these equations that the photon goes on existing as it wings off to nowhere, so you believe that as well.

  Your priors, or even your probabilities, don’t directly talk about the photon. What you assign probability to is not the photon, but the general laws. When you assign probability to the laws of physics as we know them, you automatically contribute that same probability to the photon continuing to exist on its way to nowhere—if you believe the logical implications of what you believe.
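  (As a one-line formalization, a sketch in which “Laws” simply stands for the conjunction of the tested physical laws: since the laws logically imply that the photon persists, whatever probability you assign to the laws is a floor on the probability that the photon persists.)

```latex
% If Laws => PhotonPersists, then by monotonicity of probability:
\[
  P(\text{PhotonPersists}) \;\ge\; P(\text{Laws}),
\]
% so no separate, additional belief about the photon is required.
```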

  It’s not that you believe in the invisible as such, from reasoning about invisible things. Rather the experimental evidence supports certain laws, and belief in those laws logically implies the existence of certain entities that you can’t interact with. This is belief in the implied invisible.

  On the other hand, if you believe that the photon is eaten out of existence by the Flying Spaghetti Monster—maybe on just this one occasion—or even if you believed without reason that the photon hit a dust speck on its way out—then you would be believing in a specific extra invisible event, on its own. If you thought that this sort of thing happened in general, you would believe in a specific extra invisible law. This is belief in the additional invisible.

  To make it clear why you would sometimes want to think about implied invisibles, suppose you’re going to launch a spaceship, at nearly the speed of light, toward a faraway supercluster. By the time the spaceship gets there and sets up a colony, the universe’s expansion will have accelerated too much for them to ever send a message back. Do you deem it worth the purely altruistic effort to set up this colony, for the sake of all the people who will live there and be happy? Or do you think the spaceship blips out of existence before it gets there? This could be a very real question at some point.

  The whole matter would be a lot simpler, admittedly, if we could just rule out the existence of entities we can’t interact with, once and for all—have the universe stop existing at the edge of our telescopes. But this requires us to be very silly.

  Saying that you shouldn’t ever need a separate and additional belief about invisible things—that you only believe invisibles that are logical implications of general laws which are themselves testable, and even then, don’t have any further beliefs about them that are not logical implications of visibly testable general rules—actually does seem to rule out all abuses of belief in the invisible, when applied correctly.

  Perhaps I should say, “you should assign unaltered prior probability to additional invisibles,” rather than saying, “do not believe in them.” But if you think of a belief as something evidentially additional, something you bother to track, something where you bother to count up support for or against, then it’s questionable whether we should ever have additional beliefs about additional invisibles.

  There are exotic cases that break this in theory. (E.g.: The epiphenomenal demons are watching you, and will torture 3 ↑↑↑ 3 victims for a year, somewhere you can’t ever verify the event, if you ever say the word “Niblick.”) But I can’t think of a case where the principle fails in human practice.

  *

  226

  Zombies: The Movie

  FADE IN around a serious-looking group of uniformed military officers. At the head of the table, a senior, heavy-set man, GENERAL FRED, speaks.

  GENERAL FRED: The reports are confirmed. New York has been overrun . . . by zombies.

  COLONEL TODD: Again? But we just had a zombie invasion 28 days ago!

  GENERAL FRED: These zombies . . . are different. They’re . . . philosophical zombies.

  CAPTAIN MUDD: Are they filled with rage, causing them to bite people?

  COLONEL TODD: Do they lose all capacity for reason?

  GENERAL FRED: No. They behave . . . exactly like we do . . . except that they’re not conscious.

  (Silence grips the table.)

  COLONEL TODD: Dear God.

  GENERAL FRED moves over to a computerized display.

  GENERAL FRED: This is New York City, two weeks ago.

  The display shows crowds bustling through the streets, people eating in restaurants, a garbage truck hauling away trash.

  GENERAL FRED: This . . . is New York City . . . now.

  The display changes, showing a crowded subway train, a group of students laughing in a park, and a couple holding hands in the sunlight.

  COLONEL TODD: It’s worse than I imagined.

  CAPTAIN MUDD: How can you tell, exactly?

  COLONEL TODD: I’ve never seen anything so brutally ordinary.

  A lab-coated SCIENTIST stands up at the foot of the table.

  SCIENTIST: The zombie disease eliminates consciousness without changing the brain in any way. We’ve been trying to understand how the disease is transmitted. Our conclusion is that, since the disease attacks dual properties of ordinary matter, it must, itself, operate outside our universe. We’re dealing with an epiphenomenal virus.

  GENERAL FRED: Are you sure?

  SCIENTIST: As sure as we can be in the total absence of evidence.

  GENERAL FRED: All right. Compile a report on every epiphenomenon ever observed. What, where, and who. I want a list of everything that hasn’t happened in the last fifty years.

  CAPTAIN MUDD: If the virus is epiphenomenal, how do we know it exists?

  SCIENTIST: The same way we know we’re conscious.

  CAPTAIN MUDD: Oh, okay.

  GENERAL FRED: Have the doctors made any progress on finding an epiphenomenal cure?

  SCIENTIST: They’ve tried every placebo in the book. No dice. Everything they do has an effect.

  GENERAL FRED: Have you brought in a homeopath?

  SCIENTIST: I tried, sir! I couldn’t find any!

  GENERAL FRED: Excellent. And the Taoists?

  SCIENTIST: They refuse to do anything!

  GENERAL FRED: Then we may yet be saved.

 
