Dark Matter of the Mind
But syntax became analyzable, and following this, recursion was able to play a role in the grammar. In this sense, recursion for McNeill, as for me (inter alia D. Everett 2005a, 2008, 2009a, 2010a, 2010b, 2012a, 2012b) is a nonessential, yet extremely useful component of language evolution (contra Hauser et al. 2002). Recursion is used to render the syntagmatic (string) paradigmatic (a slot), enabling speakers to pack more information into single utterances and, as I point out in D. Everett (2012b), making it easier to follow complex events via oral discourse. McNeill (2012, 223) cites Shelley’s “The Masque of Anarchy” to illustrate the syntagmatic to paradigmatic shift:
His big tears, for he wept full well,
Turned to millstones as they fell.
He then points out: “The rhyming ‘-ell’s, on the axis of combination, project a new semantic opposition.” Due to the rhyming, the words are highlighted—potentially leading to their separate storage and analysis as words, parts of utterances, introducing compositionality into grammar. This opposition is between paradigmatic parts of larger sentences (syntagmemes) that themselves derive from syntagmemes. Thus having provided a plausible scenario for the evolution of syntax, McNeill turns to consider the resultant spread of static grammar.
McNeill (2012, 92) suggests that “if we take the Tower of Babel story as a parable of migration, it is not as far-fetched as one might suppose. The insight is that migration leads to encounters and breeds diversification; and the further the migration, the more the encounters, and the greater the diversification.” Continuing with the Biblical metaphor, however, McNeill’s speculation has to overcome the “Who did Cain marry?” problem. Many students of the Bible find it curious that Cain was the son of the first man, Adam, yet he was able to find a wife in a neighboring city (Genesis 4:1–5:5). Where did the inhabitants of that city come from? By the same token, one must legitimately inquire as to who encountered whom in the migrations from Africa. If the first encounter of human languages following the rising of ur-language produced linguistic changes because of distinct languages coming into contact, then how did the first language change into the second? There must be change without contact. And in our discussion of blowgun manufacture among the Banawás, I showed how cultural change can take place without contact, just as linguistic change could and does (Schönpflug 2008). And in fact, specialists in diachronic linguistics tell us that this is correct—that language change can be internal (e.g., sociolinguistic shift) or external (via language contact). But if change can occur without contact, does McNeill’s hypothesis lose its force? No, because he also predicts that the trail of change should lead to greater and greater complexity the farther humans migrated from the geographical source of Ur-Language. The reasoning is that the earliest language would have had the simplest syntax, mapping meanings to temporal orders (i.e., iconically). But language contact would complicate that.
On the other hand, Thomason (2008) argues that contact does not necessarily make languages more complex. She presents a few cases where it can. But folks who work in language contact know that there are too many variables to say that a given language-contact situation will result in one of the languages becoming either more complex or more simple. Going further, theoretical linguists and typological linguists agree that there is no content to the claim “Language A is more complex than language B.” There is just no widely accepted means to evaluate linguistic complexity. About all we can say is, “Phenomenon x in language A is more complex than the same phenomenon in language B.” But even that is problematic because it assumes that we can say that this or that phenomenon in one language is the same as a phenomenon in another language, an idea I see little evidence for. That is, whether in smaller “societies of intimates” or larger “societies of strangers,” diversity of pronunciation, grammar, and meaning all enter languages for any number of reasons that the diachronic linguistics literature is full of. So even if McNeill were correct (2012, 92) that initial syntax mapped meanings onto temporal orders of events—and frankly, I see no evidence that this was ever the case—why would contact unidirectionally increase complexity? In fact, as Trudgill (2011) argues, contact can either simplify languages or complicate them. I am therefore unable to accept McNeill’s conjecture that languages with a greater history of contact and thus farther from Africa should be more complex than languages with a lesser history of contact (and how could one even measure that, in any case, apart from geography which need not necessarily entail greater contact?). 
It would also be necessary, were we to take McNeill’s proposal here seriously, to discuss the possible confounding factor of the “serial-founder effect” common to migrating populations and how this might also impact language (see, inter alia, Slatkin and Excoffier 2012). All of this is important for our theory here because it shows (i) the utility of gestures in communication; and (ii) the connection of gestures to speech that slowly aggregates over time and hence need not be “assumed” because it can be learned so naturally.
Gestures, Apperceptions, and Culture Research
At this point of the discussion of gesture and dark matter, I want to consider again briefly the possible role of gestures in language evolution in other species, in particular with regard to the possible language of Neanderthals and Denisovan hominins. McNeill says this (contrary to what he says about Homo sapiens):
Gesture-first may have existed in the two now extinct human lines, Neanderthals and Denisova hominin. It could have existed in our line and extinguished as well, but we have survived to evolve a new form of language, Mead’s Loop based on speech-gesture equiprimordiality. This new language, as we have seen could not have emerged from gesture-first and was a second origin. (2012, 165)
From this, McNeill speculates further that these two species lacked language and failed to survive in connection with that deficiency. To me (D. Everett, forthcoming) we know too little about these species to warrant such claims.15 One other speculation McNeill offers, a variant of “ontogeny recapitulates phylogeny,” is worth additional thought. It is that while children apparently use gesture-first in their initial language acquisition, this dies out before two years of age (McNeill 2012, 165). Then, at three or four years of age, GPs emerge, “suggesting that gesture-first had existed once phylogenetically but went extinct and was followed by a new form of language in which speech and gesture imagery merged into the unified packages inhabited by thought and being that we ourselves have now” (ibid.). Although, again, the significance of children’s acquisitional stages is somewhat conjectural, it is interesting enough to merit further investigation. It is a very positive indication of the quality and originality of his thinking that even where McNeill goes out on very thin empirical limbs, his suggestions are interesting and worth considering further. Nevertheless, even this is equally supportive of the more parsimonious idea that we communicate holistically, using our entire bodies, not just mouths, hands, and brains (but see Iverson and Goldin-Meadow 1997). This resonance with our communicative actions brings an entire body—facial expressions, gestures, words, body orientation—into the act (Kita 2000). Indeed, an alternative to McNeill’s ontogeny-recapitulates-phylogeny view immediately suggests itself. This is that the need or desire or instinct to communicate precedes words—obviously—and shows up initially in body movements (e.g., gestures) prior to learning lexical items. This makes sense, again, if communication is a holistic effort of the individual as a whole, not merely their mouths.
Other Research on Gestures
Though McNeill’s work is exceptionally fecund—in particular as it relates to language more broadly, culture, and dark matter—I want to transition now to briefly discuss other work on gestures, beginning with the work of Gianluca Giorgolo, of Carleton University and Oxford University. Giorgolo’s work is sophisticated and formal. As he says in the abstract of Giorgolo (2010):
The paper presents a formal framework to model the fusion of gesture meaning with the meaning of the co-occurring verbal fragment. The framework is based on the formalization of two simple concepts, intersectivity and iconicity, which form the core of most descriptive accounts of the interaction of the two modalities. The formalization is presented as an extension of a well-known framework for the analysis of meaning in natural language. We claim that a proper formalization of these two concepts is sufficient to provide a principled explanation of gestures accompanying different types of linguistic expressions. This supports my gesture-in-culture understanding. The formalization also aims at providing a general mechanism (iconicity) by which the meaning of gestures is extracted from their formal appearance.
Giorgolo’s work is clearly within the general framework of formal linguistics, and thus ought to have a growing influence in the years to come on the integration of gesture into formal syntax and semantics studies. Many of the ways in which mainstream linguistics ought to incorporate gesture research are already being pioneered by Giorgolo, though it won’t be easy, as Giorgolo’s work underscores the complexity of our tacit knowledge of how to communicate.
Another prominent researcher whose work has been heavily influenced by McNeill is his former student, Justine Cassell, at Carnegie Mellon University. Cassell’s work on computational communication is ground-breaking. This research, summarized as follows, includes:16
developing the Embodied Conversational Agent (ECA), a virtual human capable of interacting with humans using both language and nonverbal behavior. More recently Cassell has investigated the role that the ECA can play in children’s lives, as a Story Listening System (SLS): peer support for learning language and literacy skills. And Cassell has also employed linguistic and psychological analyses to look at the effects of online conversation among a particularly diverse group of young people on their self-esteem, self-efficacy, and sense of community.
Once machines have human-like capabilities, can they be used to evoke the best communicative skills that humans are capable of, the richest learning? This is the goal of Cassell’s research: to develop technologies that evoke from humans the most human and humane of our capabilities, and to study their effects on our evolving world.
Cassell’s work ties into dark matter by showing how even machines can learn nonverbal behavior connected to communication. This fascinating result shows that learning of gestures certainly need not be innate. However, it does not escape the general constraints on communication and thinking by machines that we have noted several times above.
The final researcher I would like to mention who is doing research on the multimodal nature of human language is Dr. Jennifer Green. Green’s (2014) work is important to our discussion because it shows that communication ranges into the environment, such that humans can use various communicative strategies exploiting their enveloping ecology (see also L. Green 2013; Kohn 2013; Descola 2013). Green, at the University of Melbourne, works on Aboriginal sand stories. Her summary is worth citing at length because of the originality and significance of her focus of multimodal research for linguistics more broadly.
Sand stories from Central Australia are a traditional form of Aboriginal women’s verbal art that incorporates speech, song, sign, gesture and drawing. Small leaves and other objects may be used to represent story characters. This detailed study of Arandic sand stories takes a multimodal approach to the analysis of the stories and shows how the expressive elements used in the stories are orchestrated together.
Speakers of the Arandic languages of Central Australia have a range of semiotic resources or “systems” in their communicative repertoire. These include everyday language, spoken auxiliary languages, such as those used to encode respect for certain kin, sign language, the esoteric language of songs, and symbolic or graphic conventions used in sand stories and in various forms of Aboriginal art. Spontaneous gesture is also part of this complexity. In everyday communication it is the norm for several of these systems to coexist and be interdependent. The performance of Arandic sand stories (called tyepety in some Arandic languages) is a traditional form of visual storytelling in which co-speech graphics form an essential part. A skilled narrator of these stories incorporates multiple semiotic systems and uses the potentials within these systems to great creative effect. Speech, sign, gesture and drawing are employed, in sequence and in unison. As well as drawing on the ground, narrators may also use a variety of objects to establish a visual field in front of them, somewhat like a miniature stage-set. Leaves or sticks are used to represent story characters, and other small items which come to hand may be used to symbolize objects that are part of everyday life, such as shelters, shades, windbreaks and fire pits. The use of the ground for illustrative and explanatory purposes is pervasive in the environment of Central Australia where there is ample inscribable ground, and this attention to the surface of the ground arises partly from a cultural preoccupation with observing the information encoded on its surface. (J. Green 2014, i, 1ff)
Giorgolo, Cassell, and Green are on the cutting edge of gesture and multimodal aspects of language research that one hopes will gain momentum. All of this research demonstrates, as clearly as one could ever hope, how misguided it was in the early days of linguistic theory to set the sentence as the “start symbol” of the grammar, rendering it in effect the exclusive empirical domain of the majority of formal linguistics research. Grammar itself is multimodal, from the conversation down to the morpheme. This research also strongly reinforces the view of language as primarily communicative in function, using various subtools, of which sentences are but one. It also shows that without a multimodal perspective, sentence grammars are extremely limited in their contribution to our understanding of either interaction or cognition. Earlier, we referred to this as the “reification” of linguistics and discussed many ways in which the otherwise innocuous idealization of restricting analytical focus to sentences has arguably retarded progress in the understanding of human languages.
Cognitive scientists, anthropologists, philosophers, linguists, and others should be grateful for the careful, painstaking, long-term research into the multimodal nature and origin of human language—research that hopefully will not continue to be ignored in debates on the evolution, use, and structuring of human languages.
Perhaps the greatest lesson from gesture research is that it settles—to my mind, at least—the debate of whether language evolved for mental life or for communication. If McNeill is correct about the role of the growth point in the evolution of syntax, for example, then the preeminence of communication as the key to language evolution (over expression of thought) is unavoidable. In this regard, McNeill’s arguments are essential. They make the case that language is both static and dynamic and that therefore compositional meaning is not sufficient to provide humans with language. His theoretical understanding of the role of the growth point, the theory of mind, and Mead’s loop in human language are or ought to be transformational for the discipline. This is strong support for the interactional instinct as well.
Another way in which gesture studies come to bear on fundamental issues of human language and cognition is their relevance to what has come to be known as “embodied cognition” (Gibbs 2005; Chemero 2011; Lakoff and Nuñez 2001; etc.; see also C. Everett 2013b for very different ways in which language can affect cognition). For example, a recent report on research at the University of Chicago, McNeill’s home institution, indicates that the use of gestures—that is, embodying cognition—can contribute to cognitive acquisition of concepts as difficult as mathematics (Ingmire 2014).
What we learn from gestures in normal human languages is that they vary from culture to culture in their forms and meanings, but they are found apparently in all cultures. There are important reasons for their universal appearance since the consonant-vowel speech stream, word order, and other grammatical devices need help to get across the informational richness and nuances of communicative intentions. Prosody—the use of pitch, loudness, and intensity—is one way to help out, as we have seen. Gestures are a complementary form.
So far, we have seen nothing in grammar, gestures, or other aspects of language that would lead us to believe that anything needs to be attributed to the genome of Homo sapiens that is specific to language. Cultural learning, statistical learning, and individual apperceptional learning complemented with human episodic memory seem up to the task, especially when considered with the arguments of D. Everett (2012a) and this essay. Nevertheless, the literature is rife with claims to the contrary, namely, that there are phenomena that can only be explained if language is acquired at least partially based on language-specific biases in the newborn learner. One of the sets of studies that has attracted a great deal of attention in this regard is the work of Goldin-Meadow on “homesigns,” gestural languages that emerge from the deaf children of non-signing parents or who otherwise, Goldin-Meadow claims, have no access to linguistic input.
HOMESIGNS
One thing that is clear from all claims of the emergence (what Goldin-Meadow [2015] calls “resilience”) of language features in communities that are claimed to otherwise lack language—from Nicaraguan Sign Language to Al-Sayyid Bedouin Sign Language to creole languages—is that these systems begin simple and then become more complex over time with more social interaction. Often it takes at least three generations to develop a complexity roughly like that of many older languages. Thus, even if homesigns and the like are evidence for nativist or Bastian-like knowledge of language, such knowledge is very limited, perhaps no more than vague biases or solution spaces (which is one way to interpret, for example, the work of Berlin and Kay [1969] on the development of color terminology from the biological bases of color perception). More important, what marks the work of Goldin-Meadow and many others is what I consider to be an over-charitable interpretation of the linguistic aspects of the signs and a less charitable view of the cultural input the child receives, as well as the nature of the task the child is facing. Absent a serious consideration of either the task or the input, such claims of nativism are severely weakened.