To see what is meant more clearly, consider a single-syllable word (monosyllable) such as sat, whose structure would be as shown in figure 9.1. Since [a] is the most sonorous element, it is in the nucleus position. [s] and [t] are at the margins—onset and coda—as they should be. Now take the hypothetical syllables mentioned earlier, [bli] and [lbi]. Both [bli] and [lbi] have what phonologists refer to as “complex onsets” (see figs. 9.2 and 9.3): multiple phonemes in a single onset (the same can happen with codas as with pant, in which [n] and [t] form a complex coda). Now, according to the SSG, since [b] is less sonorous than [l], it should come first in the onset. This means that [bli] is a well-formed syllable. In other words, it rises in sonority to the nucleus and falls from the nucleus to the end.
Figure 9.1
Figure 9.2
Figure 9.3
Such preferences emerge even when the speakers’ native languages otherwise allow grammatical strings that appear to violate the SSG. Since the SSG is so important to the work on a phonological instinct, we need to take a closer look at it. To make it concrete, let’s consider one proposal regarding the so-called sonority hierarchy (as we will see, not only do many phoneticians consider this hierarchy to be a spurious observation, but it is also inadequate to account for many phonotactic generalizations, suggesting that not sonority but some other principle is behind Berent’s experimental results).8 This hierarchy is illustrated below (Selkirk 1984, 116), from most sonorant on left to least on right:
[a] > [e o] > [i u] > [r] > [l] > [m n ŋ] > [z v ð] > [s f θ] > [b d g] > [p t k]
The hierarchy has often been proposed as the basis for the SSG, which might also be thought of as organizing syllables left to right into a crescendo, peak, and decrescendo of sonority, going from the least sonorant (least inherently loud) to the most sonorant (most inherently loud) and back down, in inverse order, to the least sonorant (in fact, I was once a proponent of the SSG myself; see D. Everett 1995 for a sustained attempt to demonstrate the efficacy of this hierarchy in organizing Banawá syllable structure).
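To make the hierarchy and the SSG concrete, here is a minimal Python sketch. It encodes the hierarchy above as numeric ranks and tests whether a string of segments rises strictly to a single sonority peak and then falls strictly. The names HIERARCHY, SONORITY, and obeys_ssg are mine, introduced only for illustration. The numeric values are arbitrary (only their order matters), the voiced stops are assumed to be [b d g], and each character is treated as one segment.

```python
# Sonority classes, most sonorous first, following the hierarchy quoted above
# (Selkirk 1984, 116). The voiced-stop class [b d g] is an assumption.
HIERARCHY = [
    ["a"], ["e", "o"], ["i", "u"], ["r"], ["l"],
    ["m", "n", "ŋ"], ["z", "v", "ð"], ["s", "f", "θ"],
    ["b", "d", "g"], ["p", "t", "k"],
]

# Map each segment to a rank: higher number = more sonorous.
SONORITY = {
    seg: len(HIERARCHY) - rank
    for rank, group in enumerate(HIERARCHY)
    for seg in group
}


def obeys_ssg(syllable):
    """Return True if sonority strictly rises to one peak, then strictly falls."""
    values = [SONORITY[seg] for seg in syllable]
    peak = values.index(max(values))  # position of the nucleus
    rising = all(a < b for a, b in zip(values[:peak], values[1:peak + 1]))
    falling = all(a > b for a, b in zip(values[peak:], values[peak + 1:]))
    return rising and falling
```

On this encoding, obeys_ssg("sat") and obeys_ssg("bli") hold, while obeys_ssg("lbi") does not, which is all the SSG itself claims about these three strings.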
Without reviewing all of her experimental results (which all roughly show the same thing—preference in subjects for the SSG in some conditions), consider the following evidence that Berent brings to bear:
The evidence comes from perceptual illusions. Syllables with ill-formed onsets (e.g., lba) tend to be systematically misidentified (e.g., as leba)—the worse formed the syllable, the more likely the misidentification. Thus, misidentification is most likely in lba followed by bda, and is least likely in bna. Crucially, the sensitivity to syllable structure occurs even when such onsets are unattested in participants’ languages, and it is evident in adults [64, 67–70, 73] and young children. (2013b, 322)
Again, as we have seen, a licit syllable should build from least sonorant to most sonorant and then back down to least sonorant, across its onset, nucleus, and coda. This means that while [a] is the ideal syllable nucleus for English, a voiceless stop like [p, t, k] would be the least desirable (though in many languages—e.g., Berber—this hierarchy is violated regularly). Thus a syllable like [pap] would respect the hierarchy, but there should be no syllable like [opa] (though of course there is a perfectly fine bisyllabic German word opa “grandpa”). For the latter word, the SSG would only permit this to be syllabified as two syllables, [o] and [pa], with each vowel its own syllable nucleus. This is because both [o] and [a] are more sonorous than [p], so [p] must be either the coda or the onset of a syllable in which one of these two vowels is the nucleus.9 Moreover, according to the SSG, a syllable like [psap] should be favored over a syllable like [spap]. This gets us to the obvious question of why “misidentification” by Korean speakers is least likely in bna (even though Korean itself lacks such sequences). The answer, according to Berent, is that all humans are born with an SSG instinct.
I do not think anything of the kind follows. My criticism of Berent’s conclusions takes the following form. First, I argue that there is no SSG either phonetically, grammatically, or even functionally. There is simply nothing there to have an instinct of. Second, even if some other, better (though yet undiscovered) principle than the SSG were appealed to, the arguments for a phonology instinct do not follow, as seen in my suggested alternative explanations of her results. Third, I offer detailed objections to every conclusion she draws from her work, concluding that there is no such thing as the “phonological mind.”
Let’s address first the reasons behind the claim that the SSG is not an explanation for phonotactics. The reasons are three: (i) there is no phonetic or functional basis for the generalization; (ii) the SSG that Berent appeals to is too weak—it fails to capture important, near-universal phonotactic generalizations; (iii) the generalization is too strong—it rules out commonly observed patterns in natural languages (e.g., English) that violate it. But then, if the SSG has no empirical basis in phonetics or phonology and is simply a spurious observation, it is unavailable for grammaticalization (i.e., to be incorporated as a grammatical principle) and cannot serve as the basis for the evolution of an instinct (though, of course, some other concept or principle might be; see below). One might reply that if the SSG is unable to explain all phonotactic constraints, that doesn’t mean that we should throw it out. Perhaps we can simply supplement the SSG with other principles. But, as we see, why accept a disjointed set of “principles” to account for something that may have an easier account based more solidly in phonetics and perception? Before we can see this, though, let’s look at the SSG in more detail.
The ideas of sonority and sonority sequencing have been around for centuries. Ohala (1992) claims that the first reference to a sonority hierarchy was in 1765. Certainly there are references to this in the nineteenth and early twentieth centuries. As Ohala observes, however, references to the SSG as an explanation for syllable structure are circular, descriptively inadequate, and not well integrated with other phonetic and phonological phenomena.
According to Ohala, both the SSG and the syllable itself are theoretical constructs that lack universal acceptance. There is certainly no complete phonetic understanding of either—a fact that facilitates circularity in discussing them. If we take a sequence such as alba, most phonologists would argue that the word has two syllables, and that the syllable boundary must fall between /l/ and /b/, because the syllable break a.lba would produce the syllable [a], which is fine, but also the syllable [lba] which violates the SSG ([l] is more sonorous than [b] and thus should be closer to the nucleus than [b]). On the other hand, if the syllable boundary is al.ba, then both syllables respect the SSG: [al] because [a] is a valid nucleus and [l] a valid coda, and [ba] because [b] is a valid onset and [a] is a valid nucleus. The fact that [l] and [b] are in separate syllables by this analysis means that there is no SSG violation, which there is in [a.lba]. Therefore SSG guides the parsing (analysis) of syllables. However, this is severely circular if the sequences parsed by the SSG then are used again as evidence for the SSG.
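The two candidate parses of alba discussed above can be checked mechanically. The sketch below is only an illustration: the ranks are assumed values for the three segments involved (arbitrary, but respecting [a] > [l] > [b]), and the names obeys_ssg, parse_respects_ssg, CANDIDATES, and RESULTS are hypothetical. Note that the check presupposes the SSG in order to choose a parse, which is exactly the circularity at issue if the resulting parse is then offered as evidence for the SSG.

```python
# Illustrative ranks for the segments of "alba"; only the order a > l > b matters.
SONORITY = {"a": 3, "l": 2, "b": 1}


def obeys_ssg(syllable):
    """Return True if sonority strictly rises to one peak, then strictly falls."""
    values = [SONORITY[seg] for seg in syllable]
    peak = values.index(max(values))
    rising = all(x < y for x, y in zip(values[:peak], values[1:peak + 1]))
    falling = all(x > y for x, y in zip(values[peak:], values[peak + 1:]))
    return rising and falling


def parse_respects_ssg(parse):
    """A parse (tuple of syllables) passes only if every syllable passes."""
    return all(obeys_ssg(syllable) for syllable in parse)


# The two syllabifications compared in the text.
CANDIDATES = [("a", "lba"), ("al", "ba")]
RESULTS = {".".join(parse): parse_respects_ssg(parse) for parse in CANDIDATES}
```

Here RESULTS maps "a.lba" to False and "al.ba" to True, mirroring the reasoning in the paragraph above.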
The SSG is also descriptively inadequate because it is at once too weak and too strong. For example, most languages strongly disprefer sequences such as /ji/, /wu/, and so on, or, as Ohala (1992, 321) puts it, “offglides with lowered F2 and F3 are disfavored after consonants with lowered F2 and F3.”10,11 Ohala’s generalization here is vital for phonotactics crosslinguistically; yet it falls outside the SSG, since the SSG allows all such sequences. This means that if a single generalization or principle, of the type Ohala explores in his article, can be found that accounts for the SSG’s empirical range plus these other data, it is to be preferred. Moreover, the SSG would then hardly be the basis for an instinct, and Berent’s experiments would be merely skirting the edges of the real generalization. As we see, this is indeed what seems to be happening in her work. The SSG simply has no way of allowing a [dw] sequence, as in dwarf, or [tw] in twin, while prohibiting [bw]. Yet [dw] and [tw] are much more common than [bw], according to Ohala (though this sequence is observed in some loanwords, e.g., bwana), facts entirely missed by the SSG.
Unfortunately, Berent neither notices the problem that such sequences raise for the SSG “instinct,” nor experimentally tests the SSG based on a firm understanding of the relevant phonetics. Rather, she assumes that since the SSG is “grammaticalized” and now an instinct, the phonetics no longer matter. But this is entirely circular. Here Berent’s lack of phonetic experience and background in phonological analysis seems to have led her to accept the SSG based on the work of a few phonologists, without carefully investigating its empirical adequacy. This is not a problem in some senses—after all, her results still show speakers do prefer some sequences and disprefer others—but it is a crucial shortcoming, as we see below, when it comes to imputing these behaviors to “core knowledge” (knowledge that all humans are hypothesized to be born with) that would have to have evolved. It hardly needs mentioning, however, that a spurious observation of a few phonologists is not likely to be an instinct.
To take another obvious problem for the SSG, sequences involving syllable-initial sibilants are common crosslinguistically, even though they violate the SSG. Thus the SSG encounters problems in accounting for English words like spark or start. Since [t], [k], [p]—the voiceless stops—are not as loud/sonorous as [s], they should come first in the complex onset of the syllable. According to the SSG, that is, [psark] and [tsart] should be grammatical words of English (false), while [spark] and [start] should be ungrammatical (also false). Thus the SSG is too strong (incorrectly prohibits [spark]) and too weak (incorrectly predicts [psark]) to offer an account of English phonotactics. Joining these observations to our earlier ones, we see not only that the SSG allows illicit sequences such as /ji/ while prohibiting perfectly fine sequences such as /sp/, but also that it simply is not up to the task of English phonotactics more generally. And although many phonologists have noted such exceptions, there is no way to handle them except via ancillary hypotheses (think “epicycles”) if the basis of one’s theory of phonotactics is the SSG.
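The mismatch between what the SSG predicts and what English actually does can be tabulated in a few lines. This is a rough sketch under stated assumptions: the ranks are illustrative values that only encode [s] as more sonorous than the voiceless stops, the attestation judgments are the ones given in the text (spark and start occur, *psark and *tsart do not), and the names ssg_allows_onset, ATTESTED, and MISMATCHES are hypothetical.

```python
# Illustrative ranks: the fricative [s] outranks the voiceless stops.
SONORITY = {"s": 2, "p": 1, "t": 1, "k": 1}


def ssg_allows_onset(onset):
    """Return True if sonority strictly rises across the onset cluster."""
    return all(SONORITY[a] < SONORITY[b] for a, b in zip(onset, onset[1:]))


# English onset attestation as described in the text.
ATTESTED = {"sp": True, "st": True, "ps": False, "ts": False}

# Record every cluster where the SSG prediction and the English facts disagree.
MISMATCHES = {
    onset: ("too strong" if attested else "too weak")
    for onset, attested in ATTESTED.items()
    if ssg_allows_onset(onset) != attested
}
```

All four clusters end up in MISMATCHES: the attested [sp] and [st] are wrongly excluded, and the unattested [ps] and [ts] are wrongly admitted.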
I conclude that Berent’s phonology instinct cannot be based on the SSG, because the latter doesn’t exist. She might claim instead that the instinct she is after is based on a related principle or that the SSG was never intended to account for all of phonotactics, only a smaller subset, and that phonotactics more broadly require a set of principles. Or we might suggest that the principles behind phonotactics are not phonological at all, but phonetic, having to do with relative formant relationships, along the lines adumbrated by Ohala (1992). But while such alternatives might better fit the facts she is invested in, a new principle or set of principles cannot rescue her proposal. This is because the evidence she provides for an instinct fails no matter what principle she might appeal to. To see why, let’s consider what Berent (2013b, 320) infelicitously refers to as “the seven wonders of phonology.” She takes all of these as evidence for “phonological core knowledge.” I see them all as red herrings that merely underscore faulty reasoning rather than as any evidence for a phonological mind or an instinct of any kind. These “wonders” are:
1. Phonology employs algebraic rules.
2. Phonology shows universal constraints or rules (e.g., the SSG).
3. Phonology shows shared design of all phonological systems.
4. Phonology provides useful scaffolding for other human abilities.
5. Phonology shows regenesis—phonological systems (e.g., sign languages) created de novo always draw on the same principles; they never emerge ex nihilo.
6. Phonological constraints such as the SSG show early ontogenetic onset.
7. Phonology shows a unique design, unlike other cognitive domains.
I believe that every one of these “wonders” is insignificant and tells us nothing about language, offering no evidence whatsoever, individually or together, for a “phonological mind.” Let’s consider each in turn.
“Algebraic rules” are nothing more than the standard rules that linguists have used since Panini (fourth century BCE). For example, Berent uses an example of such a rule that she refers to as the “AAB rule” in Semitic phonologies. Thus, in Semitic languages, consonants and vowels mark the morphosyntactic functions of words, using different spacings and sequences (internal to the word) of Cs or Vs based on conjugation, or binyanim—the order of consonants and intercalated vowels. Examples of such variables are illustrated below:
Modern Hebrew
CaCaC      katav      “write”
niCCaC     niršam     “register”
hiCCiC     himšix     “continue”
CiCeC      limed      “teach”
hitCaCeC   hitlabeš   “get dressed”
In other languages, such functions would most frequently be marked by suffixes, infixes, prefixes, and so on. So, clearly, taking only this single, common example, variables are indeed found in phonological rules.
Now, Berent’s AAB rule (more precisely, it should be stated as a constraint, “*AAB,” where * indicates that the sequence AAB is ungrammatical) is designed to capture the generalization that the initial consonants of a word cannot be the same. Thus a word like *sisum would be ungrammatical, because the first two consonants are /s/ and /s/, violating the constraint. The constraint is algebraic because A and B are variables ranging across different phonological features (though A must be a consonant). But calling this an algebraic rule and using this as evidence for an instinct makes little sense. Such rules are regularly learned and operate in almost every area of human cognition. For example, one could adopt a constraint on dining seating arrangements of the type *G1G1X—that is, the first two chairs at a dinner table cannot be occupied by people of the same gender (G), even though between the chairs there could be flower vases and the like. Humans learn to generalize across instances, using variables frequently. Absolutely nothing follows from this regarding instincts.
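How little machinery such a constraint requires can be shown with a short sketch. It is a simplification under stated assumptions: words are treated as transliterated strings with one character per segment, the vowel set is an assumed stand-in, and the names first_two_identical and violates_aab are hypothetical. The same variable-based check applies unchanged to the seating example, which is the point: nothing in it is specific to phonology.

```python
# Assumed vowel inventory for the transliterated examples; purely illustrative.
VOWELS = set("aeiou")


def first_two_identical(items):
    """Domain-neutral check: are the first two items of a sequence the same?"""
    items = list(items)
    return len(items) >= 2 and items[0] == items[1]


def violates_aab(word, vowels=VOWELS):
    """Return True if the first two consonants of `word` are identical (*AAB)."""
    consonants = (seg for seg in word if seg not in vowels)
    return first_two_identical(consonants)
```

For instance, violates_aab("sisum") is True and violates_aab("katav") is False, while first_two_identical(["woman", "woman", "man"]) flags the dinner-table arrangement in exactly the same way.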
Universality is appealed to by Berent as further evidence for a phonology instinct. But as any linguist can affirm (especially in light of controversies over how to determine whether something is universal or not in modern linguistic theory), there are many definitions, uses, and abuses of the term universality in linguistics. For example, some linguists, such as Joseph Greenberg (1966) and N. Evans and Levinson (2009), argue that for something to be meaningfully universal, it actually has to be observable in every language. That is, a universal is a concrete entity. If it is not found in all languages, it is not universal. That is simple enough. But some linguists, such as Berent, Jackendoff (2003), and Chomsky (1986), prefer a more abstract conception of universal such that for something to be universal, it need only be available to human linguistic cognition. This set of universal affordances is referred to as the “toolbox.” I have argued against this approach in many places, for being imprecise and often circular (in particular D. Everett 2012a, 2012b). But in any case, Berent clearly follows the notion of “universal” advocated by Chomsky and Jackendoff, among many others. Such universals need not be observed in all languages. Thus Berent would claim that the SSG is universal, not because it is obeyed in all its particulars in every language—like me, she would recognize that English allows violations of the SSG—but because her experiments with speakers of various languages show that they have preferences and so on that seem to be guided by knowledge of the SSG, even when their own native languages do not follow the SSG or have a simple syllable structure that is by definition unable to guide their behavior in experiments. 
If a Korean speaker, for example, shows preference for or perceptual illusions with some onset clusters and not others, in spite of the fact that there are no such clusters in Korean (and thus he or she could not have learned them, presumably), then this shows the universality of the SSG (as part of the linguistic toolbox). But there is a huge and unjustifiable leap taken in reasoning from this type of behavior to the presence of innate constraints on syllable structure. For example, there are phonetic reasons why Korean (or any) speakers prefer or more easily perceive, let us say, [bna] sequences rather than [lba], even though neither sequence is found in Korean. One simple explanation that comes to mind (one highlighted by phoneticians, though overlooked by many phonologists) is that the sequence [bna] is easier to perceive than [lba] because the interconsonantal transition in the onset of the former syllable produces better acoustic cues than in the second. Berent tries to rule out this kind of interpretation by arguing that the same restrictions show up in reading. But reading performance is irrelevant here for a couple of reasons. First, we know too little about the relationship between speaking and reading cognitively to draw any firm conclusions about similarities or dissimilarities in their performance to use as a comparison, in spite of a growing body of research on this topic. Second, in looking at new words, speakers often try to create the phonology in their heads, and so this “silent pronunciation” could guide such speakers’ choices. Everyone (modulo pathology) has roughly the same ears matched to roughly the same vocal apparatus. Thus, although phonologies can grammaticalize violations of functionally preferable phonotactic constraints, one would expect that in experiments that clearly dissociate the experimental data from the speaker’s own language, the functionality of the structures (e.g., being auditorily easier to distinguish) will emerge as decisive factors, accounting for speakers’ reactions to nonnative sequences that respect or violate sonority sequencing, and so on. In fact, there is a name for this, though with a somewhat different emphasis, in optimality theoretic phonology (Prince and Smolensky [1993] 2004; McCarthy and Prince 1994): the “emergence of the unmarked.” So there is nothing special I can see about the universality of these preferences. First, as we have seen, the SSG is not the principle implicated here, because there is no such principle. It is a spurious generalization.
Second, local phonologies may build on cultural preferences to produce violations of preferable phonetic sequences, but the hearers are not slaves to these preferences. Let us say that a language has a word like lbap. In spite of this, the phonetic prediction would be that in an experimental situation, the speakers would likely prefer blap and reject lbap, since the former is easier to distinguish clearly in a semantically or pragmatically or culturally neutral environment. In other words, when asked to make judgments in an experiment about abstract sequences, it is unsurprising that the superiority of the functionality of some structures emerges as decisive. Such motivations reflect the fact that the ear and the vocal apparatus evolved together. Therefore, what Berent takes to be a grammatical and cognitive universal is neither, but rather a fact about perceptual ability, showing absolutely nothing about a phonology instinct.