Before the Dawn: Recovering the Lost History of Our Ancestors

by Nicholas Wade


  The Coming of the Indo-Europeans

  The Indo-European languages provide a leading test case for whether warfare or agriculture has been the dominant generator of new spread zones. The spread zone of Indo-European stretches from western Europe to the Indian subcontinent. The family includes extinct languages such as Latin, ancient Greek, Hittite and Tokharian, the last of which was once spoken in northwestern China. The living descendants of proto-Indo-European include, besides English, the other Germanic languages (German, Dutch, Icelandic, Norwegian), the Slavic languages (Russian, Serbo-Croat, Czech, Polish), the Baltic languages (Latvian, Lithuanian), the Italic languages (Italian, French, Spanish, Portuguese) and the Celtic languages (Breton, Welsh, Irish).

  Where was the homeland of the speakers of proto-Indo-European? When did they live? How did they and their language spread? On these questions there exist two main schools of thought, one of which asserts that Indo-European spread by the sword, the other by the plough.

  In a series of papers written between 1956 and 1979, the archaeologist Marija Gimbutas identified the Indo-Europeans with the people who built the characteristic burial mounds, called kurgan in Russian, in the steppe area to the north of the Black Sea and the Caspian. The Kurgan people, benefiting from the domestication of the horse, started expanding from their homeland sometime after 4000 BC. By 2500 BC, in Gimbutas’s estimation, these warrior-pastoralists had reached the extremities of Britain and Scandinavia, and their language developed into its many descendant tongues that are spoken from Europe to India today.

  This view is supported on linguistic grounds by Ehret, who argues that if the Indo-Europeans had been peaceful farmers, many words to do with cereals should trace back to them. But Indo-European literatures are full of allusions to fighting. “We find preserved in early myths and legends almost everywhere among Indo-Europeans a glorification of battle, and particularly of death in battle, not entirely unknown elsewhere in the world, but of an intensity not often matched. We also find widely in these stories a division of society that singles out warriors as an elite group,” Ehret says.

  A rival hypothesis was proposed in 1987 by the archaeologist Colin Renfrew.261 He argued that the Indo-Europeans must have been the first farmers, and that they spread out from their homeland because the new agricultural techniques allowed the population to grow and therefore expand. Looking to the archaeological evidence bearing on the spread of agriculture, Renfrew placed the homeland of the first Indo-European speakers in Anatolia, now Turkey, the region where some of the earliest Neolithic settlements have been found. Because the Neolithic revolution started expanding through Europe around 9,500 years ago, Renfrew’s hypothesis required the Indo-European languages to have arrived several thousand years earlier than implied by Gimbutas’s Kurgan warrior theory and indeed than the date favored by most historical linguists.

  It seemed for a time that genetics might decide the issue. The first genetic insight into the peopling of Europe came from Luca Cavalli-Sforza of Stanford University. Working just with the protein products of genes, since DNA sequencing was not then available, he showed there was a genetic gradient, based on 95 genetic markers, that spread across Europe in a southeast to northwest direction. He and the archaeologist Albert Ammerman suggested the gradient was caused by Neolithic farmers moving across Europe in a slow wave of advance. Although the farmers were assumed to intermarry with the existing foragers, giving rise to the observed genetic gradient, the basic engine behind the wave of advance was assumed to be the population growth of the more numerous farmers.262

  This idea lent serious but not conclusive weight to Renfrew’s theory. Cavalli-Sforza noted that several other genetic gradients emerged from his data besides the one possibly associated with farmers from the Near East. Another gradient suggested a flow of genes westward from the steppe area above the Black Sea. This gradient “supports Gimbutas’ hypothesis,” he and his coauthors said, just as the first gradient supported Renfrew’s.263

  New assessments of population numbers have undercut Renfrew’s original idea that population growth was the engine of Indo-European expansion. The archaeologist Marek Zvelebil, of the University of Sheffield in England, writes that “Demographically, there is no evidence for population pressure sufficient to encourage first farmers to migrate, nor is there evidence for rapid population growth. Archaeological evidence does not record rapid saturation of areas colonized by Neolithic farmers, or demographic expansion [with one possible exception].”264

  But Renfrew’s theory could still be correct even if Indo-European-speaking farmers did not overwhelm the indigenous population of Europe. The farmers’ language could have been adopted by the European hunter-gatherers along with the new agricultural technology. In terms of population numbers, relatively few farmers entering Europe from the Near East could have had a catalytic effect in spreading both their language and their farming techniques. Perhaps they bought or captured extra wives from the Paleolithic inhabitants, and the next generation moved a few miles farther into Europe, also adding wives from the existing forager population. The farther this wave of farmers advanced into Europe, the more its Neolithic genes would get diluted with Paleolithic genes. But regardless of the shifting composition of the genetic pool, each generation of farmers would speak the language of its parents’ community, presumably Indo-European.

  In this way, the new farming techniques would have triggered a language change throughout the area to which they were applied, but with only a small number of Anatolian immigrants relative to the indigenous forager population. This could explain how it is that Europeans speak Indo-European languages yet carry only 20% or less of the genes of those assumed to have introduced the languages.

  Can Languages Be Dated?

  European genetics seems at present compatible with both theories of Indo-European spread. A more decisive test would be to put a date on when proto-Indo-European was spoken, since the two theories imply very different times of expansion. The Kurgan warrior expansion started some 6,000 years ago, the spread of farming from the Near East some 9,500 years ago.

  The dating of languages is not yet a settled science. One approach is to estimate the rate of historical change in a group of languages by analyzing similarities in vocabulary. Glottochronology, one version of this method, depends on estimating the percentage of cognates that two languages have in common. (Cognates are words derived from a common ancestor; apple is a cognate of German’s Apfel but not of French pomme.)

  The cognates that glottochronologists examine are not chosen randomly but belong to special vocabularies, drawn up by the method’s inventor, Morris Swadesh, from items that are particularly resistant to linguistic change. These include words for numbers, pronouns and parts of the body. A Swadesh list of 100 words is the most commonly used.

  In comparing two languages, a linguist will decide how many Swadesh-list words in each are true cognates with each other. The fewer cognates, the longer ago the languages diverged, and there are various methods of translating the percentage of matching cognates into a date of language split. In Ehret’s view, a 5% match indicates a language split of about 10,000 years ago, a 22% agreement means a divergence around 5,000 years ago, and two languages that parted ways only 500 years ago will retain 86% of their Swadesh-list vocabulary in common.
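
  The three figures Ehret quotes are consistent with the classic glottochronological formula, t = ln(c) / (2 ln r), where c is the fraction of shared cognates and r is the fraction of the list that one language retains per thousand years. The short Python sketch below assumes r = 0.86, a value commonly quoted for the 100-word list; the constant and the function name are illustrative assumptions, not taken from Ehret.

```python
import math

# Assumed retention rate: fraction of a 100-word Swadesh list that a single
# language keeps after 1,000 years (a commonly quoted value, not from the text).
RETENTION_PER_MILLENNIUM = 0.86


def years_since_split(shared_fraction, retention=RETENTION_PER_MILLENNIUM):
    """Estimate years since two languages diverged from their shared cognate fraction."""
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    # Both lineages lose vocabulary independently, hence the factor of 2.
    millennia = math.log(shared_fraction) / (2.0 * math.log(retention))
    return 1000.0 * millennia


for pct in (86, 22, 5):
    years = years_since_split(pct / 100)
    print(f"{pct}% shared cognates -> about {years:,.0f} years")
```

  Under that assumed rate, 86%, 22% and 5% come out close to 500, 5,000 and 10,000 years respectively. A different retention rate would shift every date, which is one reason the method is contested.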

  Given the simplicity of the method, glottochronology can produce surprisingly plausible dates. But it has flaws. Linguists have put considerable effort into criticizing glottochronology, perhaps more than into trying to get it to work better. The result has been continuing disagreement among linguists as to whether it is a usable technique. At a conference held at Cambridge University in 1999, opinion ranged from one extreme to the other. Robert Blust, of the University of Hawaii, gave a paper explaining why the glottochronology kind of method “doesn’t work” for Austronesian languages, and James Matisoff, of the University of California, Berkeley, talked about “the uselessness of glottochronology for the subgrouping of Tibeto-Burman.” They were followed by Ehret, who explained how well glottochronology works for dating language splits in the Afroasiatic family.265

  Historical linguists are much more enthusiastic about a quite different dating technique called linguistic paleontology. The idea is to reconstruct words for objects of material culture in a language family and date the language by noting the times at which such objects first appear in the archaeological record.

  In many Indo-European languages, for example, there are words for wheel that are clear cognates of each other. Greek has kuklos (a word that is also the origin of circle), Sanskrit cacras, Tokharian kukäl, and Old English hweowol (initial “k”s in proto-Indo-European turn to “h” sounds in the Germanic family branch). Since the daughter languages of proto-Indo-European have cognate words for wheel, they must be derived from a common source, and linguists assert that this was the proto-Indo-European word for wheel, which they reconstruct as *kwekwlos (the asterisk indicates a reconstructed word).

  Now, the earliest known wheels in the archaeological record date from 3400 BC (5,400 years ago). The proto-Indo-European language must have split into its daughter languages sometime after this date, the argument goes, since how else could the daughter languages, spoken over an enormous region, all have cognate words for wheel?

  Similar arguments can be made for words like yoke, axle, and wool. Work on this issue by linguists like Bill Darden of the University of Chicago has encouraged many linguists in their belief that Indo-European was a single language as recently as 5,500 years ago and that its daughter languages could not have come into existence until after this date.266

  Linguistic paleontology is an ingenious exercise of the linguist’s craft. But it has two conceptual weaknesses. One is that a splendid new invention like the wheel is likely to spread like wildfire from one culture to the next, carrying its own name with it. Linguistic paleontologists claim they can spot such borrowed words. It’s true that “Coca-Cola” is easy enough to recognize as a foreign borrowing in many languages, but the more ancient the borrowing, the more a word may take on the coloration of its host language. One of the criticisms linguists level at glottochronology is that it is confounded by unrecognized borrowed words.

  Another weakness in linguistic paleontology is the danger of constructing highly plausible words that didn’t, in fact, exist. Related words for bishop exist in Greek (episkopos), Latin (episcopus), Old English (bisceop), Spanish (obispo) and French (évêque), from which the proto-Indo-European word *apispek for bishop could be reconstructed; but of course, in a language spoken at least 5,000 years ago, no such word existed. As for wheel, proto-Indo-European is thought to have had a word *kwel, meaning to turn or twist, of which *kwekwlos is assumed to be a reduplication. But it could be that proto-Indo-European had no word for wheel, and what happened was that its daughter languages each independently used their inherited *kwel/turn words to form their own words for wheel. In which case proto-Indo-European could have been spoken thousands of years before the invention of the wheel.

  A New Date for Proto-Indo-European

  A better, more systematic way of dating languages has long been needed, and biologists hope they may have provided it by adapting one of their own methods for drawing phylogenetic trees. The favored approach is called a maximum likelihood method because it asks what is the most probable shape of tree to account for the observed data. In the case of language families, the data are each language’s list of Swadesh words, along with a designation of which are cognates and which are not.

  The idea of applying a maximum likelihood method to language history was laid out by Mark Pagel, an evolutionary biologist at the University of Reading in England. Pagel showed that with a list of just 18 words he could generate a maximum likelihood tree for 7 languages (Welsh, Romanian, Spanish, French, German, Dutch and English) that was the same as the tree constructed by linguists with purely linguistic techniques.267

  The method has now been further developed by Russell D. Gray, an evolutionary biologist at the University of Auckland in New Zealand. Gray has carefully analyzed the problems of glottochronology and adapted the method so as to address them. One of the problems is unrecognized borrowing. Unrecognized loan words make languages appear younger than they are. But they also knit the side branches of a language together, making a netlike structure. Netlike structures can be tested for and the offending words eliminated.

  Another problem that has vexed glottochronology is that languages may evolve at different rates. Both modern Icelandic and Norwegian are known to have evolved from Old Norse, which was spoken between AD 800 and 1050. Norwegian and Old Norse have 81% of their Swadesh list words as cognates, correctly implying a separation of 1,000 years ago. But modern Icelandic, which has been much more isolated, shares 99% of its words with Old Norse, wrongly implying the two languages separated only 200 years ago.268 Rate variation can be accounted for in the maximum likelihood approach, essentially by choosing trees with the minimum amount of variation necessary to fit known dates of language divergence.

  The mathematical techniques for addressing both word borrowing and variation in evolution rate were available because biologists had encountered the same two problems in drawing up trees based on DNA data. As with languages, some genes evolve at faster rates than others. And just as words may be borrowed instead of inherited, an organism may acquire genes through borrowing as well as by inheritance; bacteria, for instance, transfer packets of genes to each other, which is why they so quickly acquire genes for resistance to antibiotics.

  In one maximum likelihood approach currently favored by biologists, called the Bayesian Markov chain Monte Carlo method, the DNA sequences of various genes are fed into a computer that generates a large number of possible trees by which the genes might be related. The program samples the classes of tree that seem most promising (there are far too many for even the fastest computer to examine each one), and then repeats the whole process a large number of times. At each iteration there are fewer promising trees, and eventually the process will converge on a single, most probable tree to account for the data.
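
  The sampling idea can be illustrated with a deliberately tiny Python sketch. It is not the software or the model used in the study: the three languages A, B and C, the cognate data and the scoring rule are all hypothetical, and a real analysis searches an enormously larger space of trees with a proper model of word evolution. The sketch runs a Metropolis sampler over the three possible trees and counts how often each is visited; the most probable tree is the one visited most often.

```python
import math
import random
from collections import Counter

# Hypothetical toy data: each entry names the two languages that share a
# cognate for one word meaning (the third language uses an unrelated word).
SHARED_PAIRS = [("A", "B")] * 6 + [("A", "C")] * 2 + [("B", "C")] * 1

# The three possible rooted trees for three languages, named by their sister pair.
TOPOLOGIES = [frozenset("AB"), frozenset("AC"), frozenset("BC")]


def log_likelihood(sister_pair, p_support=0.8):
    """Toy model: a shared cognate matches the tree's sister pair with probability p_support."""
    p_conflict = (1.0 - p_support) / 2.0
    total = 0.0
    for pair in SHARED_PAIRS:
        matches = frozenset(pair) == sister_pair
        total += math.log(p_support if matches else p_conflict)
    return total


def sample_trees(steps=5000, seed=1):
    """Metropolis sampler: propose another tree, accept it by likelihood ratio."""
    rng = random.Random(seed)
    current = rng.choice(TOPOLOGIES)
    visits = Counter()
    for _ in range(steps):
        # Symmetric proposal: pick one of the two other trees at random.
        proposal = rng.choice([t for t in TOPOLOGIES if t != current])
        log_ratio = log_likelihood(proposal) - log_likelihood(current)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        visits[current] += 1
    return visits


visits = sample_trees()
best_tree, _ = visits.most_common(1)[0]
print("Most visited sister pair:", sorted(best_tree))
```

  Because better-fitting trees are accepted more readily than worse ones, the chain spends nearly all its time on the tree that groups A with B, the pairing the toy data support most strongly.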

  With this powerful tree-drawing technique, Gray and his colleague Quentin Atkinson have constructed a family tree of Indo-European. For data, they relied on a 200-word Swadesh list for 84 Indo-European languages drawn up by the linguist Isidore Dyen, to which they added data from three extinct languages (Hittite and the two versions of Tokharian, known as Tokharian A and B).

  Gene trees can often be anchored in real time by matching a date from the fossil record to one of the tree’s branch points. The same can be done with maximum likelihood trees constructed for languages. Having found the statistically most likely tree to account for the Indo-European data, Gray then constrained certain branch points in the tree to fit attested historical dates for divergence of certain languages. Hittite must have been a separate language by 1800 BC, the date of the oldest known inscription. Greek must have been separate by 1500 BC, the date of the Linear B inscriptions. Latin and Romanian started to diverge when Roman troops withdrew south of the Danube in AD 270.

  Altogether Gray plugged in 14 known dates, constraining the tree to fit itself to the dates in the most statistically probable way. Because the branch lengths of the tree are proportional to elapsed time, anchoring the tree to historical events allows all the other branch points in the tree to be dated. Gray’s tree was published in Nature in November 2003, with a terse description of the rather complex methodology behind its construction.269 The first reaction of many historical linguists was that he had done nothing new because his tree of Indo-European was just like theirs. But that very fact, in Gray’s view, was the best possible validation of his method.
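
  The step from anchored branch points to dated ones can be sketched in a few lines of Python. The node names, depths and ages below are hypothetical, and the sketch assumes one constant rate of change across the whole tree, a simplification that the published method relaxes. It shows only the proportionality argument: once a few branch points have known ages, a single scale factor converts every other depth into years.

```python
# Hypothetical relative depths of branch points (arbitrary units, tips at 0).
relative_depth = {"root": 1.00, "node_x": 0.50, "node_y": 0.25}

# Hypothetical calibrations: branch points whose age in years is known from history.
known_age_years = {"node_x": 2000.0, "node_y": 1100.0}


def years_per_unit(depths, calibrations):
    """Least-squares scale factor (through the origin) mapping depth units to years."""
    numerator = sum(depths[node] * age for node, age in calibrations.items())
    denominator = sum(depths[node] ** 2 for node in calibrations)
    return numerator / denominator


scale = years_per_unit(relative_depth, known_age_years)
estimated_age = {node: depth * scale for node, depth in relative_depth.items()}
print(estimated_age)
```

  The two calibrations disagree slightly about the scale (4,000 versus 4,400 years per unit), so the fit settles on a compromise value and the undated root inherits its age from that compromise.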

  FIGURE 10.2. A GENETICIST’S TREE OF THE INDO-EUROPEAN LANGUAGE FAMILY.

  A tree of Indo-European was constructed by Russell Gray and Quentin Atkinson using an advanced statistical method. Because the tree is anchored to 14 known dates of recent language origin, the dates of its ancient branch points can be estimated. Figures show the years before the present at which languages split apart.

  According to the Gray-Atkinson tree, the original language, called proto-Indo-European by linguists, split 8,700 years ago into two branches, of which the first led to Hittite and the second to all the other Indo-European languages. The early date assigned to proto-Indo-European suggests that it was the language of the people who introduced farming into Europe from the Middle East.

  English is a member of the Germanic group of languages, as are Dutch, Swedish and Icelandic. The Romance language family includes French, Italian and Spanish. Russian, Czech and Lithuanian are among the members of Balto-Slavic. Hittite, now extinct, was the language of the Hittite empire in what is now Turkey; Tokharian was spoken in western China.

  The novel feature of Gray’s tree was not its shape but its dates. They were very different from anything the linguists had imagined. The tree showed that proto-Indo-European was spoken before 8,700 years ago, the date at which it underwent its first split, when the branch leading to Hittite split off from all the rest. This date is nearly 3,000 years older than the date of 5,500 to 6,000 years ago favored by many historical linguists for the breakup of Indo-European.

 
