There is a joke from the days of plenty in the former Soviet Union about a woman who goes to the butcher’s and asks, “Could you measure me out two hundred grams of salami, please?” “No problem, madam,” replies the butcher. “Just bring me the salami.” In our case, the salami may be there, but the measuring instrument is missing. I would be happy to measure for you the overall complexity of any language, but I have no idea where to find a scale, and neither does anyone else. As it happens, none of the linguists who profess the equal complexity dogma has ever tried to define what the overall complexity of a language might be.
“But wait,” I can hear you thinking. “Even if no one has bothered to define complexity so far, surely it can’t be too difficult to do it ourselves. Couldn’t we decide, for instance, that the complexity of a language is defined as the difficulty it poses for foreign learners?” But which learners exactly? The problem is that the difficulty of learning a foreign language crucially depends on the learner’s mother tongue. Swedish is a snap—if you happen to be Norwegian, and so is Spanish if you are Italian. But neither Swedish nor Spanish is easy if your native language is English. Still, both are incomparably easier for an English speaker than Arabic or Chinese. So does that mean that Chinese and Arabic are objectively more difficult? No, because if your mother tongue is Hebrew, then Arabic isn’t difficult at all, and if your mother tongue is Thai, then Chinese is less challenging than Swedish or Spanish. In short, there is no obvious way to generalize a measure of overall complexity based on the difficulty of learning, because—just like the effort required for traveling somewhere—it all depends on where you are starting from. (A proverbial Englishman learned this the hard way when he got desperately lost in the wilds of Ireland one day. After hours of driving round in circles through deserted country lanes, he finally spotted an elderly man walking by the side of the road, and asked him how to get back to Dublin. “If I were to go to Dublin,” came the reply, “I wouldn’t be starting from here.”)
I can sense that you are not ready to give up so easily. If the notion of difficulty will not do, you may now suggest, then what about basing the definition of complexity on a more objective measure, such as the number of parts in the language system? Just as a puzzle is more complex the more pieces it has, couldn’t we simply say that the complexity of language is determined by the number of distinct forms it has, or the number of distinctions it makes, or the number of rules in its grammar, or something along these lines? The problem here is that we will be comparing apples and oranges. Language has parts of very different kinds: sounds, words, grammatical elements such as endings, types of clauses, rules for word order. How do you compare such entities? Suppose language X has one more vowel than language Y, but Y has one more tense than X. Does this make X and Y equal in overall complexity? Or, if not, what is the exchange rate? How many vowels are worth one tense? Two? Seven? Thirteen for the price of twelve? It is even worse than apples and oranges; it is more like comparing apples and orangutans.
To make a long story short, there is no way to devise an objective and non-arbitrary measure for comparing the overall complexity of any two given languages. It’s not simply that no one has bothered to do it—it would be inherently impossible even if one tried. So where does all this leave the dogma of equal complexity? When Joe, Piers, and Tom claim that “primitive people speak primitive languages,” they are making a simple and eminently meaningful statement, which just happens to be factually incorrect. But the article of faith that linguists swear by is even worse than wrong—it is meaningless. The alleged central finding of the discipline is nothing more than a hollow mouthful of air, since in the absence of a definition for the overall complexity of a language, the statement that “all languages are equally complex” makes about as much sense as the assertion that “all languages are equally cornflakes.”
The campaign to convince the general public of the equality of all languages may be paved with the best of intentions, for it is undoubtedly a noble enterprise to disabuse people of the belief that primitive tribes speak primitive languages. But surely the road to enlightenment is not through countering factual errors with empty slogans.
While the pursuit of the overall complexity of language is a wild-goose chase, there is no need to give up on the notion of complexity altogether. In fact, we can considerably improve our chances of catching something meaty if we turn away from the phantom of overall complexity and instead aim for the complexity of particular areas of language. Suppose we decide to define complexity as the number of parts in a system. If we delineate specific areas of language carefully enough, it becomes eminently possible to measure the complexity of each of these areas individually. For example, we can measure the size of the sound system simply by counting the number of phonemes (distinct sounds) in a language’s inventory. Or we can look at the verbal system and measure how many tense distinctions are marked on the verb. When languages are compared in this way, it soon emerges that they vary greatly in the complexity of specific areas in their grammar. And whereas the existence of such variations is hardly stop-press news in itself, the more challenging question is whether the differences in the complexity of particular areas might reflect the culture of the speakers and the structure of their society.
There is one area of language whose complexity is generally acknowledged to depend on culture—this is the size of the vocabulary. The obvious dividing line here is between languages of illiterate societies and those with a written tradition. The aboriginal languages of Australia, for example, may have many more words than the two hundred that the Cairns radio presenter was granting them, but they still cannot begin to compete with the word hoard of European languages. Linguists who have described languages of small illiterate societies estimate that the average size of their lexicons is between three thousand and five thousand words. In contrast, small-size bilingual dictionaries of major European languages typically contain at least fifty thousand entries. Larger ones would contain seventy to eighty thousand. Decent-size monolingual dictionaries of English contain about a hundred thousand entries. And the full printed edition of the Oxford English Dictionary has around three times that many entries. Of course, the OED contains many obsolete words, and an average English speaker would recognize only a fraction of the entries. Some researchers have estimated the passive vocabulary of an average English-speaking university student at about forty thousand words—this is the number of words whose meaning is recognized, even if they are not actively used. Another source estimates the passive vocabulary of a university lecturer at seventy-three thousand words.
The reason for the great difference between languages with and without a written tradition is fairly obvious. In illiterate societies, the size of the vocabulary is severely restricted precisely because there is no such thing as “passive vocabulary”—or at least the passive vocabulary of one generation does not live to see the next: a word that is not actively used by one generation will not be heard by the next generation and will then be lost forever.
MORPHOLOGY
While the cultural dependence of the vocabulary is neither surprising nor controversial, we are entering more troubled waters when we try to ascertain whether the structure of society might affect the complexity of areas in the grammar of a language, for instance its morphology. Languages vary considerably in the amount of information they convey within words (rather than with a combination of independent words). In English, for example, verbs like “walked” or “wrote” express the pastness of the action within the verb itself, but they do not reveal the “person,” which is instead indicated with an independent word like “you” or “we.” In Arabic, both tense and person are contained within the verb itself, so that a form like katabna means “we wrote.” But in Chinese, neither the pastness of the action nor the person is conveyed on the verb itself.
There are also differences in the amount of information encapsulated within nouns. Hawaiian does not indicate the distinction between singular and plural on the noun itself and uses independent words for the purpose. Similarly, in spoken French, most nouns sound the same in the singular and plural (jour and jours are pronounced in the same way, and one needs independent words, such as the definite article le or les, to make the difference heard). In English, on the other hand, the distinction between singular and plural is audible on the noun itself (dog–dogs, man–men). Some languages make even finer distinctions of number and have special forms also for the dual. Sorbian, a Slavic language spoken in a little enclave in eastern Germany, distinguishes between hród, “a castle,” hródaj, “two castles,” and hródy, “[three or more] castles.”
The information specified on pronouns also varies between languages. Japanese, for instance, makes finer distinctions of distance on demonstrative pronouns than modern English. It differentiates not just between “this” (for close objects) and “that” (for objects farther away) but has a three-way division between kore (for an object near the speaker), sore (near the hearer), and are (far from both). Hebrew, on the other hand, makes no such distance distinctions at all and can use just one demonstrative pronoun regardless of distance.
Is the amount of information expressed within the word related to the complexity of a society? Are hunter-gatherer tribes, for example, more likely to speak in short and simple words? And are words likely to encapsulate more elaborate information in languages of advanced civilizations? In 1992, the linguist Revere Perkins set out to test exactly this question, by conducting a statistical survey of fifty languages. He assigned the societies in his sample to five broad categories of complexity, based on a combination of criteria that have been established by anthropologists, including population size, social stratification, type of subsistence economy, and specialization in crafts. On the simplest level, there are “bands” that consist of only a few families, don’t have permanent settlements, depend exclusively on hunting and gathering, and have no authority structure outside the family. The second category includes slightly larger groups, with incipient use of agriculture, semi-permanent settlement, and some minimal social organization. The third category is for “tribes” that produce most of their food by agriculture, have permanent settlements, a few craft specialists, and some form of authority figure. The fourth category refers to what is sometimes called “peasant societies,” with intensive agricultural production, small towns, craft specialization, and regional authorities. The fifth category of complexity refers to urban societies with large populations and complex social, political, and religious organizations.
In order to compare the complexity of words in the languages of the sample, Perkins chose a list of semantic features like the ones I mentioned above: the indication of plurality on nouns, tense on verbs, and other such bits of information that identify the participants, the time, and the place of events. He then checked how many of these features are expressed within the word, rather than through independent words, in each language. His analysis showed that there was a significant correlation between the level of complexity of a society and the number of distinctions that are expressed inside the word. But contrary to what Joe, Piers, and Tom might expect, it was not the case that sophisticated societies tend to have sophisticated word structures. Quite the opposite: there is an inverse correlation between the complexity of society and of word structure! The simpler the society, the more information it is likely to mark within the word; the more complex the society, the fewer semantic distinctions it is likely to express word-internally.
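For readers who want to see the logic of such a survey in concrete terms, here is a minimal, purely illustrative sketch in Python. The numbers below are placeholders invented for the example, not Perkins’s actual figures; the point is only to show how a rank correlation between a society’s complexity category and the number of distinctions marked inside the word could be computed, with a negative coefficient corresponding to the inverse relationship described above.

```python
# Illustrative sketch only: placeholder data, not Perkins's survey.
# Each pair is (societal complexity category from 1 to 5,
#               number of semantic distinctions expressed inside the word).
from scipy.stats import spearmanr

sample = [
    (1, 9), (1, 8), (2, 8), (2, 7), (3, 7),
    (3, 6), (4, 5), (4, 4), (5, 4), (5, 3),
]

complexity_levels = [level for level, _ in sample]
word_internal_marks = [marks for _, marks in sample]

# Spearman's rank correlation suits ordinal categories; a coefficient
# below zero indicates that more complex societies tend to mark fewer
# distinctions within the word.
rho, p_value = spearmanr(complexity_levels, word_internal_marks)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.4f}")
```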
Perkins’s study did not really make waves at the time, perhaps because linguists were too busy preaching equality to pay much heed. But more recently, the increased availability of information, especially in electronic databases of grammatical phenomena from hundreds of languages, has made it easier to test a much larger set of languages, so in the last few years a few more surveys of a similar nature have been conducted. Unlike Perkins’s study, however, the recent surveys do not assign societies to a few broad categories of cultural complexity but instead opt to use just one measure, which is both more easily determined and more conducive to statistical analysis: the number of speakers of each language. Of course, the number of speakers is only a crude indication for the complexity of social structures, but the fit is nevertheless fairly tight: at the one extreme the languages of the simplest societies are spoken by fewer than a hundred people, and at the other the languages of complex urban societies are typically spoken by millions. The recent surveys strongly support Perkins’s conclusions and show that languages of large societies are more likely to have simpler word structure, whereas languages of smaller societies are more likely to have many semantic distinctions coded within the word.
How can such correlations be explained? One thing is fairly clear. The degree of morphological complexity in a language is not usually a matter of conscious choice or deliberate planning by the speakers. After all, the question of how many endings there should be on verbs or nouns hardly features in party political debates. So if words tend to be more elaborate in simple societies, the reasons must be sought in the natural and unplanned paths of change that languages tread over time. In The Unfolding of Language, I showed that words are constantly buffeted by opposing forces of destruction and creation. The forces of destruction draw their energy from a rather unenergetic human trait: laziness. The tendency to save effort leads speakers to take shortcuts in pronunciation, and with time the accumulated effects of such shortcuts can weaken and even flatten whole arrays of endings and thus make the structure of words much simpler. Ironically, the very same laziness is also behind the creation of new complex word structures. Through the grind of repetition, two words that often appear together can be worn down and, in the process, fuse into a single word—just think of “I’m,” “he’s,” “o’clock,” “don’t,” “gonna.” In this way, more complex words can arise.
In the long run, the level of morphological complexity will be determined by the balance of power between the forces of destruction and creation. If the forces of creation hold sway, and at least as many endings and prefixes are created as are lost, then the language will maintain or increase the complexity of its word structure. But if more endings are eroded than created, words will become simpler over time.
The history of the Indo-European languages over the last few millennia is a striking example of the latter case. The nineteenth-century German linguist August Schleicher memorably compared the sesquipedalian Gothic verb habaidedeima (first-person plural past subjunctive of “have”) with its cousin in modern English, the monosyllabic “had,” and likened the modern form to a statue that has been rolling around on a riverbed and whose limbs have been worn away, so that hardly anything remains but a polished stone cylinder. A similar pattern of simplification is evident also with nouns. Some six thousand years ago, the ancient ancestor, Proto-Indo-European, had a highly complex array of case endings that expressed the precise role of the noun in the sentence. There were eight different cases, and most of them had distinct forms for singular, plural, and dual, creating a mesh of almost twenty endings for each noun. But over the last few millennia this elaborate mesh of endings largely eroded in the daughter languages, and the information that had previously been conveyed through endings is now mostly expressed with independent words (such as the prepositions “of,” “to,” “by,” “with”). For some reason, then, the balance tipped toward the destruction of complex morphology: old endings eroded, while relatively few new fusions materialized.
Can the balance between creation and destruction have anything to do with the structure of a society? Is there something about the way people in small societies communicate that favors new fusions? And when societies become larger and more complex, can there be something in the communication patterns that tilts the balance toward simplification of word structures? All the plausible answers suggested so far go back to one basic factor: the difference between communication among intimates and among strangers.
To appreciate just how often we who live in larger societies communicate with strangers, just try to do a quick count of how many unfamiliar people you talked to over the last week. If you live a normally active life in a big city, there would be far too many to remember: from shop assistants to taxi drivers, from phone salespeople to waiters, from librarians to policemen, from the repairman who came to fix the boiler to the random person who asked you how to get to such-and-such street. Now add up a second circle of people who may not be complete strangers but whom you still hardly know: those you only occasionally meet at work, at school, or at the gym. Finally, if you add to these the number of people you have heard without actively speaking to, on the street or on the train or on television, it will be obvious that you have been exposed to the speech of a vast crowd of strangers—all in just one week.
In small societies the situation is radically different. If you are a member of an isolated tribe that numbers a few dozen people, you hardly ever come across any strangers, and if you do you will probably spear them or they will spear you before you get a chance to chat. You know every single person you talk to extremely well, and all the people you speak to know you extremely well. They also know all your friends and relatives, they know all the places you frequent and the things you do.
But why should all that matter? One relevant factor is that communication among intimates more often allows compact ways of expression than communication among strangers. Imagine that you are speaking to a member of the family or to an intimate friend and are reporting a story about people you both know extremely well. There will be an enormous amount of shared information that you will not need to provide explicitly, because it will be understood from the context. When you say “the two of them went back there,” your hearer will know perfectly well who the two of them are, where “there” is, and so on. But now imagine you have to tell the same story to a complete stranger who doesn’t know you from Adam, who knows nothing about where you live, and so on. Instead of merely “the two of them went back there” you’ll now have to say “so my sister Margaret’s fiancé and his ex-girlfriend’s husband went back to the house in the posh neighborhood near the river where they used to meet Margaret’s tennis coach before she . . .”