The language of the mind
The fact that we can use language to describe the world around us suggested to philosophers and logicians that we translate what we hear or read into little more than a mental description of the things being talked about: the mental equivalent of a language, in fact. Logicians were keen on this approach because they found that they could think about the meaning of an utterance in terms of the mathematical languages that they worked with. Meaning could be talked about by analogy to mathematics and logic, and all that was required was a procedure for translating the sentences of, say, English, into the `sentences' of the mathematical language.
To say that what we hear or read is translated into another, albeit mental, language is simply passing the buck, however. If we had access to the mental language, we would still have to explain what any sentence expressed in that language meant. And we would still have to say how it came to have that meaning.
In the late 1970s and early 1980s, Phil Johnson-Laird, at the University of Sussex, proposed instead that what happens when we hear or listen to language (in effect, its meaning) has much in common with what would happen if we directly observed the situation that the language described. Alan Garnham, a student of his at the time, devised an ingeniously simple experiment to establish whether this was right. His aim was to distinguish between meaning as the mental equivalent of the language used to describe something, or as the mental equivalent of what happens when one observes that something directly. He gave people some short stories to read, one of which, for instance, contained the following information:
By the window was a man with a martini.
Subsequently, they read that this person waved at the hostess of the party he was at (recall that this experiment was devised in the late 1970s, when parties, martinis, and hostesses were still an important part of the social calendar). But in a memory test carried out later, people could not remember which of the following two sentences they had in fact read:
The man with the martini waved at the hostess.
The man by the window waved at the hostess.
The fact that people confused these two sentences suggested that the information that had originally been used to describe the man who did the waving was lost. If the meaning of the text (used to narrate the story) had been stored in terms of the mental equivalent of the sentences that made up that text, one would not expect this kind of information to disappear from memory. But if the meaning of the text had been stored in terms of the mental equivalent of something like a film of what had happened, it would not be at all surprising. If, in such a film, the man who waved at the hostess had been holding a drink, and had been standing by a window, it would be impossible to tell whether the script for the film (i.e. the text itself) had said that the man with the martini had done the waving, or that the man by the window had done the waving.
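The logic of the experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout and the helper names `find` and `encode` are hypothetical, and nothing here is a claim about how Garnham or Johnson-Laird formalised the idea. The point is simply that once an event is stored against the person in the model, rather than against the words used to pick that person out, the two test sentences leave the same trace.

```python
# A hypothetical model of the scene: one man, carrying both properties
# that the story supplied ("by the window", "with a martini").
model = [{"kind": "man", "location": "window", "holding": "martini"}]

def find(model, kind, **props):
    """Return the index of the first entity matching kind and all given properties."""
    for i, entity in enumerate(model):
        if entity["kind"] == kind and all(entity.get(k) == v for k, v in props.items()):
            return i
    return None

def encode(model, kind, verb, target, **props):
    """Store an event as (who, verb, target).

    The description used to pick out 'who' is used for the lookup and then discarded.
    """
    return (find(model, kind, **props), verb, target)

# "The man with the martini waved at the hostess."
trace_a = encode(model, "man", "waved at", "hostess", holding="martini")
# "The man by the window waved at the hostess."
trace_b = encode(model, "man", "waved at", "hostess", location="window")

print(trace_a == trace_b)
```

A memory that stored the sentences themselves would keep the two apart; a memory that stores only the resulting trace cannot.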
Of course, it is all well and fine to talk about a text as a film script, and about the meaning of that text as the film itself. But what does it mean to say that we store the meaning of something as the equivalent of a film in our head? Where is the cinema screen?
Wor(l)ds in the mind
Films are basically just memories on celluloid. They allow images to be projected in the absence of the original source of the image. Language is similar in some respects: it allows us to communicate ideas that relate to things which are not around us. In these cases, the language we use refers really to some kind of memory of the things we are talking about. But although `Whom did she mean by the man she saw last night?' can only be answered with reference to a memory of the individuals concerned, it can be understood without any such memory; it is enough to know what kind of memory would be required.
Of course, `memory' is not quite the right word. In the absence of a specific memory about the man she met last night, what is really evoked in response to this phrase is a kind of mental picture of the world, and the way that world would have to be for that sentence to be true. In effect, it is a model of the world, much like a model of a plane, or a medieval castle. The fact that these things are built of balsa wood, or Lego bricks, is immaterial if what we want to do is represent how many wings the plane has, or rooms the castle has. The beauty of these models is that they do not have to represent anything real:
The winged unicorn flew high across the fiery skies.
Despite the fact that there are no winged unicorns, we can construct a (mental) model of what the world would be like if unicorns did exist and if the sky was fiery. So one way to think about what we do when we hear a sentence is to imagine that we build a kind of mental model of whatever is being described. But if the information we need to represent in our minds is the mental equivalent of a Lego model (with moving parts), out of what mental material do we build the model?
The Lego model has the property that if an arch supports a rampart in the real castle, then an arch supports a rampart in the model castle. But whereas in the Lego model, an arch can physically support a rampart, in the brain, physical support of this kind is just not possible. There are no little arches in there busily supporting little ramparts. So should we abandon the idea that there is a mental equivalent of Legoland? No: seeing an arch has an effect on the neural circuitry of the brain. So does thinking an arch. The two effects will not be quite the same, but we can suppose that the neural activity that happens when we think the arch somehow reflects what is common to all the different patterns of neural activity that have been evoked when we have experienced arches (whether by walking under them, seeing them, reading about them, whatever). We supposed something very similar when thinking earlier about the meanings of single words. The difference, then, between mental models and Lego models is that in the Lego model, there is a one-to-one correspondence between things in the model and the things in the world that are being modelled. But in the mental model, there is instead a one-to-one correspondence between the neural activity that corresponds to the mental model and the neural activity that corresponds to actually experiencing (by whatever means) the things being represented. So the mental equivalent of a Lego model is not too far-fetched.
Building a mental world
How does a sentence cause the mental model to be added to and updated? How does it cause the appropriate patterns of neural activity? These questions can best be answered by considering some real examples. First, a simple case:
A balding linguist ate a very large fish.
This sentence is relatively straightforward. Grammatical convention tells us to add two entities into the model: one representing the linguist and another representing the fish. We also add that the linguist is balding, because the grammatical conventions of English tell us that the word immediately before `linguist' describes a property of the linguist (and similarly for `very large' and `fish'). The conventions also tell us the roles that the characters each play, the roles themselves being defined by the verb `ate'. In this case, the fish has the role of being eaten, while the linguist has the role of doing the eating. Of course, this is just metaphor-speak: mental models are just patterns of neural activity. So for anyone who prefers to read about these patterns, the following points ought to do the trick (and anyone who wants to stick to metaphor-speak should just skip them).
The pattern of neural activity associated with `a balding linguist' will have something in common, we can suppose, with the pattern that would be evoked when a balding linguist (as opposed to any other kind of linguist) was actually seen.
The sequence of patterns that would be evoked by the entire sentence would, likewise, have something in common with the sequence that would be evoked if a balding linguist were actually seen to eat a very large fish.
The pattern of activity evoked by the example sentence would differ from that elicited by `A very large fish ate a balding linguist', even though the two sentences share the same words, paralleling the fact that seeing the corresponding events would also elicit different patterns. And although the patterns of neural activity in response to `a balding linguist ate' and `a very large fish ate' would differ, they would also reflect a feature that is common to both, namely that the characters mentioned are the ones doing the eating. This commonality, combined with the different individual patterns evoked by `a balding linguist' and `a very large fish', would ensure that the difference in the overall sequence of neural activity reflected the difference in who did the eating and who was eaten. To put it crudely, the changing pattern of neural activity directly encodes who did what to whom, and it is this changing pattern that corresponds, in our model metaphor, to building the model.
Whether we use metaphor-speak, or neuro-speak, the final result is the same: a mental representation is built of what the example sentence describes. But there is one further point that is fundamental to the entire notion of mental models and their neural equivalents: the mental model must exist for more than just the lifetime of an individual sentence, otherwise the information conveyed by a new sentence could not be integrated within that model. The model is not simply what happens when one reads a sentence, but rather it is a memory of what happened, and it is this memory that gets updated as other sentences are read (or heard). Fortunately, the nature of human memory is outside the remit of this book, so we shall just take it for granted.
What should be clear by now is that an individual sentence, or an individual utterance, is really just a specification of what should be done to the mental model. In Lego terms, it is like the instruction leaflet that details which things to add into the model, in what order, and next to which other things. It sounds easy but, as with real Lego, figuring out which pieces the instruction leaflet is referring to is quite another matter.
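For readers who like to see the instruction-leaflet idea spelled out, here is a minimal sketch in Python. It stays firmly in metaphor-speak: the class and function names (`Entity`, `MentalModel`, `apply_sentence`, `describe`) are hypothetical, the sentence is assumed to arrive already divided into subject, verb, and object, and nothing about neural activity is being modelled. What it does show is that the same two pieces, assembled under different instructions, give different models.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    kind: str                                        # e.g. "linguist"
    properties: list = field(default_factory=list)   # e.g. ["balding"]

@dataclass
class MentalModel:
    entities: list = field(default_factory=list)
    events: list = field(default_factory=list)       # (verb, agent index, patient index)

    def add_entity(self, kind, *properties):
        self.entities.append(Entity(kind, list(properties)))
        return len(self.entities) - 1

    def add_event(self, verb, agent, patient):
        self.events.append((verb, agent, patient))

def apply_sentence(model, subject, verb, obj):
    """Follow one 'instruction leaflet': add both pieces, then record who did what to whom."""
    agent = model.add_entity(*subject)
    patient = model.add_entity(*obj)
    model.add_event(verb, agent, patient)

def describe(model):
    """Read the events back out of the model as short phrases."""
    def phrase(i):
        e = model.entities[i]
        return " ".join(e.properties + [e.kind])
    return [f"{phrase(a)} {verb} {phrase(p)}" for verb, a, p in model.events]

# "A balding linguist ate a very large fish."
model = MentalModel()
apply_sentence(model, ("linguist", "balding"), "ate", ("fish", "very large"))
print(describe(model))
```

Because the model object outlives the call that updated it, a second call to `apply_sentence` would add to the same model, which is the sense in which the model must last longer than any one sentence.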
Keeping track of all the pieces
Linguists eating fish are hardly inspiring, and a more extended example will give a better flavour of the range of operations that mental modelling normally requires.
A stonemason and his apprentice set down a block of stone by the side of the road. They were hungry. The stonemason had left their lunch under a nearby olive tree. It was a hot day but fortunately the beer was still cold. There was a large piece of nougat too, but when the apprentice tried to cut through it, the knife broke. They decided to eat it later. After lunch, the stonemason picked up his tools, and headed towards the tower. Another few weeks and it would be finished.
The one redeeming feature of this otherwise somewhat boring text is that it illustrates a range of model-building processes, including the most important of all: the ability to keep track of the things that are added to the model. This is reflected in the relationship between phrases such as `a stonemason', `the stonemason', and `he'. The difference between these is fundamental, as the following adaptation shows (one of the sentences is omitted as it is not relevant to this point):
A stonemason and an apprentice set down a slab of stone by a side of a road. A stonemason and an apprentice were hungry. A stonemason had left a stonemason and an apprentice's lunch under a nearby olive tree. There was a large piece of nougat, but when an apprentice tried to cut through a large piece of nougat, a knife broke. A stonemason and an apprentice decided to eat a large piece of nougat later. After lunch, a stonemason picked up a stonemason's tools, and headed towards a tower. Another few weeks and a tower would be finished.
Perhaps the first thing that comes to mind when reading this rather odd passage is that it is quite unclear whether or not the same people, or the same things, are being referred to at different points in the text. It is a little like a film in which, from one scene to the next, the same role is being assumed by different actors: keeping track of whether a new actor is playing the part of someone already established within the plot, or is playing the part of someone new, would be impossible. The reason this text gives rise to a similar problem is that, ordinarily, an expression like `a stonemason' introduces a new actor playing a new part. If what is needed is an actor to play an established part (i.e. one that has been introduced earlier), then either `he' or `the stonemason' would be used. Not surprisingly, there is an important distinction between these two as well. Words like `he' or `it' tend to refer to characters or things that are, in metaphor-speak, centre-stage. But if more than one character, or thing, is centre-stage, then a way is needed to distinguish between them, which is where `the' comes in (`They were hungry. The stonemason . . .'). Occasionally, one needs also to distinguish between characters or things that are no longer centre-stage, perhaps because they have not been mentioned for a while. Here again, `the' comes in handy; the sentence `It didn't fill him up' is probably harder to make sense of at this point than the sentence `The lunch which the stonemason had left under the olive tree didn't fill him up'.
So the differences between `a', `the', and `he/she/it' allow us to keep track of which pieces are being referred to at any one time. The reason we need to do this is that we tend to talk or write about things: the earlier part of each sentence tends to establish what is being talked about, and the later part tends to introduce new information about it.
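The division of labour between `a' and `the' can be sketched as a small Python function. This is a deliberately crude illustration: the function name `refer` and the list-of-nouns model are hypothetical, pronouns are left out, and the sketch assumes that `the' always points back at something already in the model, which, as the text goes on to show, is not always so.

```python
def refer(model, article, noun):
    """Return the index of the piece that the phrase picks out, adding one if needed."""
    if article == "a":
        # 'a' brings a new actor on stage, even if the part sounds familiar.
        model.append(noun)
        return len(model) - 1
    if article == "the":
        # 'the' looks for the most recent actor already playing this part.
        for i in range(len(model) - 1, -1, -1):
            if model[i] == noun:
                return i
        raise LookupError(f"no {noun} has been introduced yet")
    raise ValueError(f"unsupported article: {article}")

model = []
first = refer(model, "a", "stonemason")       # a new piece
again = refer(model, "the", "stonemason")     # the same piece
another = refer(model, "a", "stonemason")     # a second, different piece
print(first, again, another, len(model))
```

Run on the adapted passage, in which every phrase begins with `a', a procedure like this would keep adding new stonemasons, which is exactly why that passage is so hard to follow.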
It all sounds simple enough, really. But as usual, things are more complex than they may at first appear. Although the grammatical conventions we use do, for the most part, successfully dictate who or what is being referred to, it is a sad fact that often they do not go nearly far enough. For instance, what did `it' refer to each time it was used in the stonemason's story?
They were hungry. The stonemason had left their lunch under a nearby olive tree. It was a hot day but fortunately the beer was still cold. There was a large piece of nougat too, but when the apprentice tried to cut through it, the knife broke. They decided to eat it later.
Needless to say, `it' does not always refer to the same thing, and much hinges on being able to work out, each time we come across it, what it means. For instance the first `it' does not refer to anything already mentioned, but instead refers to the time at which this episode took place. But we can only work this out once we get to the word `day', as the sentence could just as easily have started `It [the lunch] was a hot pastrami sandwich'. The second occurrence of `it' is easier, because the last-mentioned thing was the nougat. But when we get to the third `it', the last-mentioned thing was the knife. In this case, we need to use general knowledge about what is eatable in order to rule out the possibility that the thing being referred to is the knife.
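One way to picture the last two cases is as a search backwards through the things mentioned so far, stopping at the first one that makes sense. The Python sketch below does just that. It is an assumption-laden illustration, not a theory of pronouns: the function name `resolve_it` is hypothetical, the little set `EDIBLE` stands in for general knowledge, and the first kind of `it' (as in `It was a hot day'), which refers back to nothing at all, is not handled.

```python
def resolve_it(mentioned, is_plausible):
    """Return the most recently mentioned thing that passes the plausibility check."""
    for thing in reversed(mentioned):
        if is_plausible(thing):
            return thing
    return None  # nothing fits; this 'it' may not refer back to anything

# Hypothetical stand-in for general knowledge about what can be eaten.
EDIBLE = {"lunch", "nougat"}

mentioned = ["lunch", "beer", "nougat"]
# "...tried to cut through it": the last-mentioned thing, the nougat, will do.
cut_target = resolve_it(mentioned, lambda thing: True)

mentioned.append("knife")
# "They decided to eat it later": the knife is more recent, but is ruled out.
eat_target = resolve_it(mentioned, lambda thing: thing in EDIBLE)

print(cut_target, eat_target)
```

The design choice worth noticing is that recency alone is not enough; the check supplied by general knowledge is what skips over the knife.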
So figuring out which pieces in the mental model are being referred to is not always straightforward. Another complication is that sometimes pieces are referred to which do not actually exist within the model; instead, their existence is assumed:
The stonemason had left their lunch under a nearby olive tree. It was a hot day but fortunately the beer was still cold.
Whereas the stonemason had already been mentioned, the beer had not. It can only be inferred on the basis of the likelihood that lunch might include beer. If `beer' were to be replaced by `bear', the sentence would seem very odd indeed, because no bears could be inferred to exist in the model on the basis of what had been read so far; it would be impossible to establish a coherent link between the bear and anything else.
What makes a text or a conversation understandable is that there is continuity between the different sentences regarding who or what is being talked about (the equivalent is true for films also). This continuity often relies, as we have just seen, on being able to infer links between the different things being talked about, as when the beer's relationship to the previously mentioned lunch had to be inferred. Another kind of inference happened in the following case:
When the apprentice tried to cut through the nougat, the knife broke.
In this case, the knife could be inferred because the meaning of the verb `cut' implies the use of something to do the cutting with. This is a different kind of inference from the lunch-beer one. That case was based on general knowledge, while this one is based on knowledge about the meaning of one of the items in the sentence.
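The two kinds of inference can be set side by side in a short Python sketch. Everything in it is an assumption made for illustration: the two lookup tables are tiny, hand-written stand-ins for general knowledge and for knowledge of verb meanings, and the function name `find_link` is hypothetical. The sketch simply asks whether a newly mentioned thing can be tied to something already in the model, and if so, by which route.

```python
# Hypothetical fragments of knowledge, written by hand for this example.
GENERAL_KNOWLEDGE = {"lunch": {"beer", "sandwich"}}   # things a lunch might include
VERB_INSTRUMENTS = {"cut": {"knife"}}                 # things a verb's meaning implies

def find_link(new_thing, model_things, recent_verbs):
    """Return (kind of inference, what the new thing links to), or None if no link exists."""
    for thing in model_things:
        if new_thing in GENERAL_KNOWLEDGE.get(thing, set()):
            return ("general knowledge", thing)
    for verb in recent_verbs:
        if new_thing in VERB_INSTRUMENTS.get(verb, set()):
            return ("verb meaning", verb)
    return None

print(find_link("beer", ["stonemason", "lunch"], []))
print(find_link("knife", ["lunch", "nougat"], ["cut"]))
print(find_link("bear", ["stonemason", "lunch"], ["cut"]))
```

The bear comes back with no link at all, which is the sketch's way of saying that the sentence would seem very odd.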
But why not just accept, at face value, the introduction of the beer, or the knife? Why not simply add the new piece into the model? Why bother with these inferences? The fact that the following adaptation of the stonemason text is so bizarre demonstrates that we care very much about linking things together.
A stonemason and his apprentice set down a block of stone by the side of the road. They were hungry. The stonemason had left their batteries under a nearby olive tree. It was a hot day but fortunately the bear was still cold. There was a large piece of nougat too, but when the juggler tried to cut through it, the ball broke ...
What is so special about trying to link everything to everything else?
The role of prediction in language understanding
It is not difficult to imagine a film in which no character remains onscreen for long, or in which the actors keep changing, or in which the storyline is fragmented and incomprehensible. If you want to make a decent film, the key ingredient, in the right dose, is predictability. Of course, if things are too predictable, nothing can happen in the film that is new or interesting. But if things are too unpredictable, then the film ends up being a series of unconnected and random events-perhaps a great artform, but not much else.
The Ascent of Babel: An Exploration of Language, Mind, and Understanding