The Science of Language
Appendix VIII: Variation, parameters, and canalization
The naturalistic theory of language must speak not only to ways in which languages are the same (principles, UG), but also to ways in which languages can differ. A descriptively and explanatorily adequate naturalistic theory of language should have the resources available to it to describe any given I-language and, to do that, it must have the theoretical resources to describe any biophysically possible I-language.
Some differences between I-languages are, however, beyond the reach of naturalistic study. People can and do differ in how they pair ‘sound’ information with ‘meaning’ in their lexicons (Chomsky 2000). To one person, the sound “arthritis” is associated with JOINT AILMENT; to another, perhaps with LIMB DISEASE (or with whatever else a person understands the sound “arthritis” to mean). From the point of view of the natural scientist, these pairings are simply irrelevant; they are examples of what Chomsky calls “Saussurean arbitrariness.” Natural science must ignore the pairings because they are conventional, social, or idiosyncratic. They are not due to natural factors, to the way(s) that nature “cuts things at the joints,” paraphrasing Plato. This is not to say that these differences are unimportant for practical purposes: if you want to communicate easily with another person, your pairings had better overlap the other person's. It is only to say that the pairings are a matter of choice, not nature, and so irrelevant to a natural science of language.
To put the point in another way, the natural science of language focuses on what is innate. So it focuses on the lexicon, but not on the pairings found in a particular person's head. It focuses (or should) on the sounds and meanings that are available for expression – that is, for pairing in the lexicon, and for appearance at relevant interfaces. These sounds and meanings are innate in that they are built into whatever kinds of acquisition mechanisms make them available, at the same time limiting the ones that are available, for a mechanism can only yield what it can. Its possible variants are built into it.
There seem to be limits on the sounds that are available within any specific natural language. Chomsky remarks (1988) that while “strid” could be a sound in English, it cannot in Arabic. On the other hand, “bnid” is available in Arabic, but not in English. To deal with what is or is not available in the class of I-languages that are sometimes called “natural languages,” one appeals to parameters. Parameters, it is assumed, are built into the acquisition mechanisms. They might be biological in nature (built into the genome), or due to other factors, those that Chomsky labels “third factor” contributors to acquisition/growth mechanisms.
Far more attention is paid, however, to the parametric differences in ‘narrow syntax,’ to the different ways available for different natural languages (taken here to be classes of I-languages that are structurally similar) to carry out computations. The conception of parameters and their settings has changed since their introduction with the Principles and Parameters program in the late 1970s and early 1980s. The original conception – and the one that is easiest to describe and illustrate, so that it is useful for explication in texts like this – held that a parameter is an option available in a linguistic principle (a universal ‘rule’). The “head” parameter is often mentioned. Beyond lexical features and morphemes, the next most basic units of language are phrases. Phrases are understood to consist of a “head” (a lexical item of a specific category, such as N(oun) or V(erb)) and a “complement” which may be another phrase, such as an adjective/adverb phrase. So one might have a Verb Phrase (VP) amounting to wash slowly with a V head followed by an adjectival/adverbial complement, an AP reduced to an A. Rather, that is what you would find in English and many other languages. That is because these are “head first” languages. In others, such as Japanese or Miskito, the order is reversed. Phrases in these languages also have the structure of a head and a complement, but in these languages, the head appears after the complement. Stating the relevant parameter with options included ‘in’ the parameter, it is:
XP = X – YP
P is ‘phrase,’ and the variables X and Y can have the values N, V, or A, and (on earlier views) P (for pre/postposition). The dash (–) is unordered, allowing X to be before YP, or YP to be before X. The dash allows for initial (X-YP) or final (YP-X) heads. In this sense, the parametric options are ‘in’ the formal statement of the “phrase principle.”
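The unordered head-complement schema can be made concrete with a small sketch. It is offered only as an illustration, not as anyone's formal proposal: the names Phrase and linearize are hypothetical, and the single boolean head_initial stands in for the early, macroparametric conception of the head parameter. The same structure is spelled out in two orders depending on that one setting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phrase:
    """A phrase XP: a head X plus an optional complement phrase YP (hypothetical type)."""
    category: str                          # N, V, A, or P
    head: str                              # the lexical item serving as head
    complement: Optional["Phrase"] = None  # YP, itself a phrase


def linearize(phrase: Phrase, head_initial: bool) -> list[str]:
    """Spell out a phrase in the order fixed by the head parameter setting."""
    if phrase.complement is None:
        return [phrase.head]
    comp = linearize(phrase.complement, head_initial)
    # Head-initial yields X-YP; head-final yields YP-X.
    return [phrase.head] + comp if head_initial else comp + [phrase.head]


# The VP from the text: a V head with an AP complement reduced to an A.
vp = Phrase("V", "wash", Phrase("A", "slowly"))
print(" ".join(linearize(vp, head_initial=True)))   # wash slowly
print(" ".join(linearize(vp, head_initial=False)))  # slowly wash
```

The head-final output reuses English words purely to show the reversed order; it is not a claim about any actual Japanese or Miskito expression.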
More recent discussion of parameters reconceives them in two significantly different ways. One is due to the suggestion that all parametric differences between languages are found in what are called “functional categories.” A functional category amounts to any category of (oversimplifying here) lexical item that indicates a difference not in content, but in the ‘shape’ of a grammar. Relevant differences arise in verb phrase structure, in complementation (that . . ., which . . .), and in the forms of ‘auxiliary’ operation that determine subject–verb agreement, and the like. A functional category might be expressed in different languages in different ways. In English, some prepositions (such as of) express a grammatical or functional difference, others (such as under) a lexical content one. By assuming that some lexical items express functional categories or grammatical information alone (and not lexical ‘content’ information), and by assuming further that parametric differences between languages are different ways the language faculty has available to it to meet the “output conditions” set by the systems with which it must ‘communicate’ at its interfaces, it came to be assumed that parametric differences are lodged in “function words,” rather than in the principles themselves, as was assumed in the early account.
The other major line of development is due largely to the work of Richard Kayne (2000, 2005), who pointed to many more parameters than had been thought to be required before. He called them “microparameters,” thereby turning the older parameters into “macroparameters.” Microparameters detail relatively fine-grained differences between aspects of what are usually claimed to be closely related languages (“closely related” is not always clearly defined). The microparameter thesis came to be wedded also to the idea that all parametric differences are located ‘lexically’ (although some might be unpronounced). That idea also became a significant assumption of much work in the Minimalist Program, thereby effectively abandoning the early idea that parameters are ‘in’ principles. Discussion continues, and focuses on topics that one might expect: can macroparameters be analyzed in terms of microparameters? Is there room for an unqualified binary macroparametric distinction between languages such as that expressed in the head parameter? If so, how does one conceive it? And so on. The answer to the next-to-last question, by the way, seems at the moment to be “no,” but perhaps major features of the old conception can be salvaged. On that, see Baker (2008). The discussion continues, but I do not pursue it further here. Chomsky adds something in the 2009 supplement (see pp. 54–55), however – among other things, the possibility that there are infinitely many parameters.
Parameters continue as before to have a central role in discussions of language acquisition or growth. Imagine a child growing up in an English-speaking environment, and take the headedness macroparameter as an example. He or she – or rather, his or her mind, for this is not a conscious decision – will set the “headedness” switch/parameter to “head initial.” The same child's mind in a Miskito-speaking environment will automatically set the parameter to “head final.” The details of how the setting takes place are a matter for discovery; for interesting discussion, see Yang (2004) and the discussion in the main text.
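The switch-setting picture can be illustrated with a deliberately crude sketch. It assumes, quite unrealistically, that the input already arrives analyzed into head and complement roles, and it sets the switch by simple counting. The function name set_head_parameter and the role labels are hypothetical, and nothing here reconstructs the learning model of Yang (2004) or any other actual proposal about how setting takes place.

```python
from collections import Counter


def set_head_parameter(observations):
    """Set a toy headedness switch from observed surface orders.

    Each observation is a pair of role labels in surface order, either
    ("head", "complement") or ("complement", "head"). Returns None when
    the evidence is absent or evenly split.
    """
    first_roles = Counter(first for first, _second in observations)
    initial, final = first_roles["head"], first_roles["complement"]
    if initial == final:
        return None
    return "head-initial" if initial > final else "head-final"


# Idealized input from two environments (assumed data, for illustration only).
english_like = [("head", "complement")] * 5 + [("complement", "head")]
miskito_like = [("complement", "head")] * 5

print(set_head_parameter(english_like))  # head-initial
print(set_head_parameter(miskito_like))  # head-final
```

The point of the sketch is only that the same mechanism, fed different input, returns different settings without anything resembling a conscious decision.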
Canalization is a label for what is on the face of it a surprising phenomenon. Humans and other organisms seem to manage to develop into a relatively uniform ‘type’ despite different environments, ‘inputs,’ and genetic codings. It is not at all clear what explains the observations, although there are suggestive ideas. One is that “control” or “master” genes play a role. Waddington, who first used the term, spoke of epigenetic factors influencing development. Another possible factor is the limited set of options made available, given non-genetic physiochemical, ‘processing,’ and other constraints. Since these limit possible mutation too, it would not be surprising if they limited possible organic structures and operations. Canalization is discussed further in the text.
Appendix IX: Simplicity
Seeking simplicity (elegance, austerity, beauty, optimality . . .) for one's theory of a system or phenomenon, it has often been pointed out, is a crucial aspect of scientific investigation and has a prominent place in natural science methodology. Some aspects of it are discussed elsewhere in this volume: an insistence on seeking ‘atoms’ or what Newton called “corpuscles,” Galileo's focus on inclined planes and not on how plants grow, and Goodman's nominalism and constructive systems, as well as his effort to find a completely general conception of simplicity. Simplicity of several sorts (theory-general, computational, optimization, efficiency) is exhibited – remarkably, given earlier work and the complications that were a part of the ‘format’ picture of grammars found until the introduction of the principles and parameters framework – in Chomsky's Minimalist Program conception of the language faculty. This conception, as indicated, suggests that linguistic structure and possible variants in it amount to Merge and the developmental constraints built into parameters, where these could be based on the genome or third factor contributions. It also suggests that the human language system is a perfect (or as close to perfect as possible) solution to the problem of linking sounds and meanings over an unbounded range – or at the least, putting together complexes of concepts that we can think of as thoughts, or perhaps as language's contribution to thoughts. If the minimalist approach continues to make progress, we can with some confidence say that the language faculty appears to be an oddity among biological systems as they are usually conceived. They are usually kludges: “bricolage,” in François Jacob's phrase (cf. Marcus 2008). They are seen to be the result of the accidents of history, environment, and adventitious events: they are functioning systems that come out of millennia of gradual change, as conceived in the usual selectional story about evolution. 
However, the language faculty appears to be more like a physical system, one that exhibits elegance and simplicity – for example, atomic structure and the structured table of elements that it underwrites.
Getting this kind of result might have been a desideratum in Chomsky's early efforts to construct a theory of language, but it could not have been more than a dream at the time. The focus of early work (e.g., Aspects of the Theory of Syntax) was to find a theory of language that would be descriptively adequate – that is, provide a way to describe (with a theory/grammar) any possible natural language – while also answering the question of how a child could acquire a given natural language in a short time, given minimal input which is often corrupt and without any recourse to training or ‘negative evidence.’ The acquisition issue – called in more recent work “Plato's Problem” because it was the problem that confronted Plato in his Meno – was seen as the task of providing an explanatorily adequate theory. Taking a solution to the acquisition problem as the criterion of explanatory adequacy might seem odd, but it is plausible: if a theory shows how an arbitrary child can acquire an arbitrary language under the relevant poverty of the stimulus conditions, we can be reasonably confident that the theory tracks the nature of the relevant system and the means by which it grows in the organism. Unfortunately, though, early efforts to meet descriptive adequacy (produce a theory of language with the resources that make it able to describe any of the thousands of natural languages, not to mention the indefinitely large number of I-languages) conflicted with meeting explanatory adequacy. If we all had a single language and its structure were simple so that we could understand how it developed quickly in the human species, and if our theory of it and of how it develops in an individual within the relevant time constraints were fully adequate, we would have a theory that meets both requirements. However, this counterfactual has nothing to do with the facts.
It was thought at the time that the only route available to the theoretician was to conceive of the child as being endowed with something like a format for a possible language (certain conditions on structure, levels of representation, and possible computations) and a relative optimization routine. The child, endowed with a format for a possible language and given input from his or her speech community, would automatically apply this routine so that the rules of his or her language faculty somehow converged on those that contribute to speech behaviors in the relevant community. The format would specify ways of ‘chunking’ linguistic data (word, phrase) and tying it together (rule, general linguistically relevant computational principles such as what was called “the principle of the cycle” . . .) and the routine would yield a measure of simplicity in terms of, say, number of rules to encompass the data. This routine, one that is internal to the system as conceived by the theory, yields a way of speaking of how one grammar is better than another, neighboring one: ‘better’ is cashed out in terms of a relative simplicity measure. Chomsky called this routine an “evaluation” procedure. The child's mind is conceived to have some devoted (language-specific) relative optimization principle available, one that within the relevant time period comes up with the (relatively) best theory (grammar) of the rather thin data set that the mind is offered. It was an intuitively obvious way to conceive of acquisition at the time for – among other things – it did appear to yield answers and was at least more computationally tractable than what was offered in structural linguistics, whose alternatives could not even explain how the child managed to get anything like a morpheme out of data. But the space of choices remained far too large; the approach was theoretically implementable, but completely unfeasible.
It clearly suffers in comparison to the new version. Moreover, it blocked progress by making it very difficult to conceive of how the specification of a format, and of UG as conceived this way, could have developed in the species. UG – thought of as that which is provided in the way of language-specific genetic information – would have to be rich and complex, and it was difficult to see how something both devoted and rich and complex could have developed in the human species.
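The logic of an evaluation procedure of this kind can be sketched in a few lines. The sketch is a toy under loud assumptions: a ‘grammar’ is just a set of category-sequence patterns (with "*" matching any one category), the ‘format’ is whatever list of candidate grammars is handed in, and simplicity is measured by rule count, following the text's “say, number of rules.” The names matches, licenses, and evaluate are hypothetical and do not come from the early generative literature.

```python
def matches(rule, sentence):
    """A rule matches a sentence if lengths agree and each slot is '*' or the same category."""
    return len(rule) == len(sentence) and all(r in ("*", s) for r, s in zip(rule, sentence))


def licenses(grammar, sentence):
    """A grammar licenses a sentence if at least one of its rules matches it."""
    return any(matches(rule, sentence) for rule in grammar)


def evaluate(candidates, data):
    """Among candidates compatible with all the data, return the one with the fewest rules."""
    compatible = [g for g in candidates if all(licenses(g, s) for s in data)]
    return min(compatible, key=len) if compatible else None


# Thin data set: two category sequences (assumed, for illustration only).
data = [("N", "V"), ("N", "V", "N")]

g1 = frozenset({("N", "V"), ("N", "V", "N"), ("V", "N")})  # covers the data, 3 rules
g2 = frozenset({("N", "V"), ("N", "V", "N")})              # covers the data, 2 rules
g3 = frozenset({("N", "V")})                               # 1 rule, but misses a datum

best = evaluate([g1, g2, g3], data)
print(len(best))  # 2
```

Note that evaluate must be handed its candidates and must check each one against the data. With a realistic format the list of candidates would be astronomically large, which is one way of seeing why the approach was implementable in principle but not feasible.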
To solve the acquisition problem and meet the condition on explanatory adequacy (understood as solving Plato's Problem), it is far better to have a theory that provides something like very few universal, invariant principles, plus language-universal acquisition algorithms that automatically return the ‘right’ language/grammar, given a set of data. That would be a selection procedure, a procedure that yielded a single solution without weighing alternatives. It – or a reasonably close approximation – became a prospect with the introduction in the late 1970s and early 1980s of the principles and parameters view of the language faculty. Intuitively, the child is provided through UG at birth with a set of principles – grammatical universals or rules common to all languages. Among these principles are some that allow for options. The options are parameters. The parameters – conceived originally as options ‘internal’ to a principle – can be ‘set’ with minimal experience (or at least, with the amount of experience actually afforded children in the relevant developmental window). (See Appendix VIII on parameters and their role.) Setting them one way as opposed to another would determine one class of possible natural languages as opposed to another. This was real progress on the road to meeting explanatory adequacy. Moreover, with the Minimalist Program's growing acceptance of the idea that Merge is all that one needs in the way of an exceptionless principle, and the further suggestion that parameters might even amount to general constraints on development constituted by – and set by – the non-biological factors included in what Chomsky calls “third factor” contributions to language growth and the shape that a language takes, the burden placed on the language-specific instruction set included in the human genome becomes less and less. Maybe the only genetically specified language-specific contribution is Merge. 
If this were the case, it would be much easier to understand how language came to be introduced into the species at a single stroke. It would also make it easy to understand how and why language acquisition is as quick and automatic as it appears, while allowing for different courses of development. And it would allow linguists such as Chomsky to begin to raise and provide tentative answers to questions such as what is biologically crucial to language. We would begin to have answers to ‘why things are the way they are.’
With this in mind, where Chomsky speaks of biolinguistics (a term first introduced by Massimo Piattelli-Palmarini in 1974 as the title for a joint MIT-Royaumont Foundation conference, held in Paris), perhaps we should speak instead of “biophysical linguistics” or perhaps “bio-compu-physical linguistics,” so that it becomes clear that the set of possible natural languages and I-languages depends not just on genetic coding but also on other factors – all, though, conceived of as somehow built into nature and the ways in which it permits development/growth. And if UG is thought of as what we are provided by biology alone (i.e., genomic specification), perhaps UG becomes nothing but the specification for Merge.
Interestingly, the principles and parameters framework seems to allow us to abandon the theory-internal conception of simplicity that played such an important role in early efforts. If the child's mind knows what the switches or options are, relative optimization of simplicity plays no role. You can think of the language acquisition matter as solved (at least for narrow syntax) and turn to other explanatory matters. That is no doubt part of the reason why in a recent paper Chomsky speaks of minimalism as going “beyond explanation” – part of the reason, not the whole, for third factor considerations appear to begin to allow answers to questions concerning why principles X rather than alternatives Y, Z . . . Explanation in the sense of solving Plato's Problem remains crucial, of course, but with parameters, solving Plato's Problem no longer need be the single, central goal of linguistic explanation.