It makes more sense to talk about language learning than about language acquisition, argue Kirby and Christiansen.12 Their point is simply this: Children do, of course, readily learn language. Instead of beginning with the assumption that this is an impossible task that requires extra explanation, they simply begin by asking, how do they do it?
There is undeniably a human predisposition to language learning. “It’s absolutely true that there is an innate component to the process of language learning,” said Kirby. “It would be ludicrous to say otherwise. At the most basic level, not every species can speak the languages that we speak, so there must be something there. But in a more subtle sense, we know that we must have some biases. We can’t learn everything. There is no such thing as a general-purpose learner, a learner that can be exposed to any task and learn it. So yes, there’s linguistic innateness.”
The question remains: How much of this bias to learning language is actually language-specific? Said Kirby, “If you added up all of the influence of our learning bias, and all the things that give rise to our learning bias, then the number of things that aren’t specific to language but still affect the way we learn language vastly outweigh any language specifics within there.” It’s more accurate, explain Kirby and Christiansen, to talk of universal bias than of universal grammar.13
Another researcher takes up, almost literally, where Jean-Jacques Rousseau left off. Luc Steels heads the Sony Computer Science Laboratories in Paris, which is only a few blocks from the Panthéon, where Rousseau is buried. More than two hundred years after Rousseau wrote about the origin of language, Steels is spearheading a research program that may help us get closer to the answer. He asks: “What are the mental mechanisms and patterns of interaction that you need to get a communication system off the ground?”
Steels’s way of imagining the first language users is considerably more practical than his intellectual forebear’s. He manages a group of graduate and postdoctoral students, and together they are building creatures—not unlike the inhabitants of Rousseau’s primeval forest, the Adam and Eve of language.
In the beginning, Steels’s robots had only a single eye and a brain, and their primordial jungle was limited to some basic shapes and colors. Their eyes were black cameras sitting on top of large tripods. Their brains were computers, and their world was a small whiteboard, at which they stared.
Steels made his creatures look at shapes and think about what they saw, and then he encouraged them to talk to one another about it. He is trying to build a linguistic system from the bottom up, as it happened once before, sometime in the last six million years.
Embodiment is crucial. Steels is not modeling language, or a person, or a brain, or a world. His goal is to ground his experiments in hardware that is able to perceive the real physical world. If you go to the lab, you can watch Steels set up his robots and provoke a ricochet of signals between the bodies and the things they perceive; soon a cascade of meaning develops, and a linguistic system emerges before your eyes. Creating a linguistic animal means that, in this context, communication is not a separate, self-contained program, but instead is profoundly shaped by the development of the creature and its world. “These agents are as real as you can get,” said Steels. “They are artificial in the sense that they are built by us, but they operate for real in the real world, just like artificial light gives real light with which you can read a book in the dark.”
Steels’s fundamental motivation is to explore the design of an emerging communication system. “The approach I take,” he explained, “is a bit like the Wright brothers, who were trying to understand how flight was possible by building physical aircraft and experimenting with it. They did not try to model birds, nor did they run computer simulations (which would have been difficult at the time…). Once you have a theory of aerodynamics, you can take a fresh look at birds and better understand why the wings have a certain shape or why a particular size of bird has the wing span it does.” With such insight into the mental mechanisms underlying an emerging communication system, dialogue with anthropologists, archaeologists, neurobiologists, and historical linguists may contribute ideas to the puzzle of human language evolution.
In most of Steels’s “talking heads” experiments, the robots’ brains consisted of memory and the ability to produce wordlike sounds. The robots’ main way of sensing the world was through vision. Their eyes were directed at simple scenes and objects—a plastic horse, a wooden mannequin—and each robotic individual was forced to find a way to recognize color, segment images, and identify these specific objects. In simpler versions of the experiment, the world at which the robots gazed was a whiteboard on which a variety of colored geometric shapes were fastened. The basic idea is that there is a cycle of back-and-forth between perception of the world and production of language, as the robots adapt and respond to a changing environment in the same way that humans have to.
Steels distributed his robots’ bodies throughout the real world, with some going to Paris, London, Tokyo, and Amsterdam, among other cities. The virtual entities occupying the bodies, the agents, were able to teleport through the Internet into specific bodies set up in each lab. Only once they were established inside a body could they communicate about what they saw, and only agents that inhabited the same physical space were allowed to talk to one another. The agents were like strangers at an art gallery, not looking at one another but standing side by side, commenting on the painting before them. This ensured not only that the agents had something to talk about but that they talked about the same physical world.
Steels was inspired by the twentieth-century philosopher Ludwig Wittgenstein’s habit of using games to study language. A game captures language in its most basic form, Steels said. It is a simple interaction between individuals within a specific setting. Steels’s agents played a guessing game. One agent would pick an object in the world and generate a word for it. Its agent interlocutor had to guess what the word referred to. Each entity took turns at being a speaker or a listener. If one correctly guessed what the other was referring to, the game was successful.
Steels didn’t program any word lists or mental and perceptual categories into the agents. They had to segment the images they looked at into sensory data, such as color and position on the board, and then the speaker agent would pick an object based on these data (for example, the red circle in the upper-left part of the board). Then it would choose a word to tell the hearer about the object; that word—for example, “malewina” or “bozopite”—was selected at random. If the listener agent guessed the word’s meaning correctly, it might then go on to use it with other robots, and in this way a correspondence between a word and a meaning developed within the population.
Steels found that the game would never get off the ground unless the robots had another channel for communication and verification, so he enabled them to point at the board by moving their cameras and zooming in on an area (the other agent could sense the direction the camera was pointing). In one of the largest versions of the talking heads experiment, eight thousand words were generated for five thousand concepts, and a basic vocabulary of fundamental concepts emerged, including up, down, left, right, green, and large. There was no central dictionary or record defining each word; words existed only as tokens in the mind of each agent. Meaning was created when agents were able to make perceptually grounded distinctions, such as “left” or “right” and “green” or “red.” The distinctions arose when agents identified the object under discussion, separate from other objects in the context.
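The guessing-game dynamic just described can be caricatured in code. The sketch below is a minimal, hypothetical rendering, not Steels’s implementation: real vision is stripped out, the meanings are given in advance rather than constructed from perception, and “pointing” is reduced to letting the listener learn the true topic after each game. The agent design, the confidence scores, and the 0.1 reinforcement step are all illustrative assumptions.

```python
import random
import string

# Meanings the agents can talk about (fixed here; the real talking heads
# had to construct their own perceptual categories from camera input).
MEANINGS = ["up", "down", "left", "right", "green", "large"]

def invent_word():
    # Coin a random wordlike token, in the spirit of "malewina" or "bozopite".
    return "".join(random.choice(string.ascii_lowercase) for _ in range(8))

class Agent:
    def __init__(self):
        # lexicon: meaning -> {word: confidence score}
        self.lexicon = {m: {} for m in MEANINGS}

    def word_for(self, meaning):
        # Speak the highest-scoring word for this meaning, inventing one if needed.
        words = self.lexicon[meaning]
        if not words:
            words[invent_word()] = 0.5
        return max(words, key=words.get)

    def guess(self, word):
        # Guess the meaning whose entry for this word scores highest, if any.
        best, best_score = None, 0.0
        for meaning, words in self.lexicon.items():
            if words.get(word, 0.0) > best_score:
                best, best_score = meaning, words[word]
        return best

    def reinforce(self, meaning, word, success):
        score = self.lexicon[meaning].get(word, 0.5)
        delta = 0.1 if success else -0.1
        self.lexicon[meaning][word] = min(1.0, max(0.0, score + delta))

def play_game(agents):
    speaker, listener = random.sample(agents, 2)
    topic = random.choice(MEANINGS)        # the object the speaker picks
    word = speaker.word_for(topic)
    success = listener.guess(word) == topic
    speaker.reinforce(topic, word, success)
    # "Pointing" as verification: the listener ends up knowing the true
    # topic, so it adopts or strengthens the word either way.
    listener.reinforce(topic, word, True)
    return success

agents = [Agent() for _ in range(10)]
for _ in range(20000):
    play_game(agents)

# After many games, each meaning tends toward a single shared word.
for m in MEANINGS:
    print(m, {a.word_for(m) for a in agents})
```

Even in this stripped-down form, the population drifts toward one shared word per meaning with no central dictionary anywhere, which is the core of what the talking heads demonstrated.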
Since conducting the largest of the talking heads experiments in 1999, Steels and his co-workers have built more complexity into their experiments. One researcher has robots not playing games so much as communicating in order to feel emotion. In another, Steels has robots communicating with ears and vocal tracts to further increase their challenge. The lab is also looking at case marking, tense, open-ended semantics, language processing, and the different types of grammars that can emerge.
Steels is also interested in the way that structure spontaneously arises in biological systems where random behavior is reinforced by positive feedback. He was particularly inspired by Jean-Louis Deneubourg’s work on ants. Hundreds, sometimes thousands, of ants organize themselves into long chains when they are carrying material from a food source to their nest. The chains are adaptive: you can sweep away part of one, put objects in its way, remove individual ants or add new ones, and the chain will emerge again until the food source is depleted. There is no central coordinator instructing the ants on what to do and how to organize themselves in the face of disruption. Nevertheless, a greater intelligence—a design—emerges out of the local behavior of many relatively unintelligent individuals. Other systems where order emerges spontaneously from chaos include termite nest building, the growth of cell tissue, the way that cellular slime amoebae form an aggregate entity, and flocking in birds.14
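The positive-feedback logic behind those ant chains is simple enough to simulate. The sketch below is a toy model in the spirit of Deneubourg’s work, not his actual experiment: ants choose between two identical branches with probability weighted by the pheromone already deposited, and every crossing deposits more. The parameters K and H are illustrative assumptions.

```python
import random

# Two identical branches between nest and food. Each ant chooses a branch
# with probability proportional to (K + pheromone)**H, then reinforces it.
K, H = 20.0, 2.0
pheromone = [0.0, 0.0]
counts = [0, 0]

def choose_branch():
    w0 = (K + pheromone[0]) ** H
    w1 = (K + pheromone[1]) ** H
    return 0 if random.random() < w0 / (w0 + w1) else 1

for _ in range(2000):
    branch = choose_branch()
    pheromone[branch] += 1.0   # positive feedback: traffic begets traffic
    counts[branch] += 1

# No coordinator anywhere, yet random early fluctuations get amplified
# and one branch usually ends up carrying most of the traffic.
print(counts)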
The language that evolved in the guessing game had many of the same features as these systems, said Steels. It exhibited an absence of central planning, an adaptation to changing circumstances, and a resilience to the unexpected appearance and disappearance of elements (whether objects or individuals). Meaning and linguistic structure simply arose out of interaction between bodies in space.
Steels has recently taken embodiment to more complicated levels. In 2001 he started work with AIBO robots, which are among the most complex robots ever built.15 Each AIBO is an independent entity. Steels and his co-workers place the robots in various situations—on a floor with objects like boxes and colored balls, for example—and, like the talking heads, they must build both a conceptual system and a way of talking about it. The robots develop speaking and hearing processes while constantly trying to map their world (as they move about in it). They also have to work out where the other robots are in space; if one asks, “Where are you?” and the other answers, “To the left of the box,” the first AIBO has to decipher what “left” might mean. His group has also just finished a series of experiments in Tokyo with the QRIO humanoid robot. Working with the QRIO allowed them to implement many of the mechanisms humans use for joint attention, like pointing with a finger.
Because the robots engage in real image analysis (as opposed to being fitted with programs that dictate how to see the world), many errors arise in their interactions. But that’s the point, explained Steels. When successful communication does evolve, it shows how language is possible in difficult circumstances. “There is no reason,” Steels said, “to think that language processing is any less complicated than vision processing—which is very complicated.” He added: “The complexity of language is incredible, but we shouldn’t be afraid of that.”
As they grope their way through the world, Steels’s robots end up evolving rudimentary grammar as well as words and concepts. Syntax arises mainly from situations of ambiguity. In phrases such as “red ball next to green box,” it is not clear to agents whether “red” goes with “box” or “ball” (unless they already have grammar). When an ambiguity like this is detected, the agent will invent a grammatical pattern to make its intended meaning clear to the listener. This suggests to Steels that human language ability is an emergent adaptive system created by a basic cognitive mechanism rather than by a genetically endowed language module.
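That ambiguity is easy to make concrete. In the purely illustrative sketch below, a hearer who receives “red ball next to green box” as an unordered bag of words has two readings to choose from; a single invented convention, here that an adjective binds to the noun immediately after it, collapses the readings to one. The rule itself is a hypothetical stand-in for whatever pattern an agent might coin.

```python
from itertools import permutations

COLORS, OBJECTS = ["red", "green"], ["ball", "box"]

# Without grammar, the hearer gets an unordered bag of words and must
# entertain every way of binding the colors to the objects.
bag_readings = [list(zip(p, OBJECTS)) for p in permutations(COLORS)]
print(bag_readings)
# [[('red', 'ball'), ('green', 'box')], [('green', 'ball'), ('red', 'box')]]

# An invented pattern -- "an adjective binds to the noun right after it" --
# leaves exactly one reading.
utterance = ["red", "ball", "next-to", "green", "box"]
reading = [(utterance[i], utterance[i + 1])
           for i in range(len(utterance) - 1)
           if utterance[i] in COLORS and utterance[i + 1] in OBJECTS]
print(reading)  # [('red', 'ball'), ('green', 'box')]
```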
Neither robotic nor digital linguistic systems can tell us exactly how language evolved. Indeed, the communication systems that arise in Kirby’s modeling or Steels’s experiments may or may not have the characteristics of human languages. What each can do is show how language might have evolved, and this is invaluable data. We can’t think these concepts through with our brains alone; instead, we needed this stage of technological innovation, with computers fast enough to model such complicated processes and robots that can enact them. Kirby’s virtual linguistic creatures and Steels’s real ones suggest that in order to get to something that looks a lot like language, you may not need a language-specific mental device. Humans do a lot more with language than simple pointing and referring, but in order for language to become established, the ability to perform these steps is essential.
The most elusive part of the language evolution mystery is working out why all these things happened. Why did our species evolve in the way it did? Why does culture evolve the way it does? And even more complicated, how and why do they evolve together? The turnover of language change is thousands of times more rapid than biological evolution. We might find it difficult to talk with English speakers from a thousand years ago, but we wouldn’t have any trouble procreating with them. The final and greatest challenge for language evolution is discovering how the language suite and language itself evolved together.
14. Why things evolve
Genes mutate as a matter of course. If the carrier of a mutated gene is lucky, some effect of the new version will improve its chances of having offspring that survive, and then those offspring will have their own successful offspring, and so on and so forth. Every animal alive today stands at the end of a long line of lucky entities that begat lucky entities that begat lucky entities. They may not have been happy or fulfilled or at peace with their lives, but that’s not the point.
For a long time people have wondered why a particular trait has evolved. What was it about that trait and the environment in which it arose that meant it was a good thing to have? These considerations have been the most contentious part of the language evolution debate: Why did language evolve?
Part of the problem with posing this question in decades past was that even though scientists were using the same words, they were asking a fundamentally different question. At that time, language was still generally thought of as a single entity. Regarded as such, it left the question truly unanswerable, for different components of language have evolved in different stages in the history of life. If you ask, “Why did the whole thing evolve?” the implication is that it happened all at once, and no evolutionary pressure is up to the task of bringing forth everything from nothing.
The other problem with asking this question is that to some extent you have to imagine the answer. No one can ever know all the details of what happened when our distant ancestors began to talk. The only way to be completely sure is to travel back in time to witness the process, and we can’t do that. And there’s the problem of language fossils. There are none, at least none as definitive as the femur that Lucy left behind. As Chomsky has pointed out: “There is a rich record of the unhappy fate of highly plausible stories about what might have happened, once something was learned about what did happen—and in cases where far more is understood [than with language evolution].”1
However, the same objections could be raised about any attempt to explain the origins of the universe. In Fire in the Mind, George Johnson reminds us that the big bang scenario is still only a theory. Nevertheless, the intense layering of evidence and theoretical modifications that have accumulated since it was first proposed has given the theory the heft of unassailable truth. Today, says Johnson, the theory remains a work in progress that underpins the productive work of thousands of astronomers and physicists all over the world.
Cautions against employing “just-so” stories and fairy tales to trace language evolution had great resonance when less data were available about what happened and when it happened in the development of language in evolutionary time. Now the accumulation of evidence from genetics, comparative biology, behavioral studies, linguistics, and neuroscience makes such stories more feasible by placing powerful constraints on them.
With the information scientists now have about gesture, thought, and behavior both in humans and in close and distant species, they are better equipped to carve out the problem space and define the outlines of their story. They know more about where to look for clues and what paths not to take in a possible reconstruction of language evolution. It will never be possible to recover and rebuild every step of the way. But significant steps, major biological traits, and evolutionary landmarks can be identified. And while there are a number of ways in which the facts about humans and life and language evolution can be mapped onto the known evolutionary path that brought us to where we are today, data gathered over the next few years will further refine those conjectures.
In this context the prohibition against asking “why?” is starting to look as unscientific as the kind of fairy tale it once warned against. Indeed, there’s something a little disingenuous about the insistence that because you can’t prove it, you shouldn’t imagine it. Imagination is at the core of the scientific process. All the tests and experiments in the world mean nothing without the hunch or the story—the hypothesis—that kicks the process off. Now, instead of not venturing into the imagination or simply not declaring what they suspect, many scientists in the field of language evolution choose to propose a story and be up-front about how much their theory has been informed by data and how much is not yet verifiable.
Michael Arbib, one of the researchers who has investigated mirror neurons, has an idea about what he thinks might have happened and why it might have happened, based on the rigorous work he has carried out on the brain. Arbib’s approach is the opposite of the traditional Chomskyan one. Instead of emphasizing the fundamental sameness of language in a search for universals, he is interested in the different ways that people solve problems with language. As he explained: “Once you get beyond the fact that you’ve got to have words for actions, you’ve got to have words for objects and the agents that act upon them, then, I think, you get into the realm of what people have learned over the centuries to do, rather than something that must be in the brain. People advertising universal grammar focus on what is common. I’m just struck by how varied the approaches people in different communities have to solving communicative problems.” Instead of tracing the parameters of language back to genes, Arbib thinks that most of grammar and the way that structure relates to meaning are products of culture. “My feeling is that most of it is probably a tribute to human ingenuity. I mean, kids can surf the Web, and nobody says there’s a Web-surfing gene.”