by Martin Erwig
The philosopher Ludwig Wittgenstein famously wrote: “The limits of my language mean the limits of my world.”1 In chapter 8 I use the language of music to illustrate concepts of language relevant to computer science and to show that the application of language has few limits in this world.
8
The Prism of Language
We use language effortlessly every day, unaware of the complicated mechanisms involved when language does its work. It is similar to walking. Once it is learned, we can easily wade through water, plow through sand, climb up stairs, and step over obstacles. The complexity of the task becomes apparent when we try to build a robot that mimics the behavior. The same is true of languages. Once you try to instruct machines how to use language effectively, you realize how difficult this actually is. Did you ever get frustrated using Siri or failing to find what you were looking for on Google? The Turing test for machine intelligence reflects this state of affairs perfectly. According to this test, a machine is deemed intelligent if a user cannot distinguish it from a human being in a conversation. In other words, language competence is used as a benchmark for artificial intelligence.
The phenomenon of language has been studied in several disciplines, including philosophy, linguistics, and sociology, so there is no single agreed-upon definition of what a language is. From a computer science point of view, a language is a precise and effective means to communicate meaning. Chapter 3 showed how signs form the basis for representation and thus can give meaning to computations by linking symbols to the concepts they stand for. While signs can represent merely individual concepts, a language defines the meaningful combination of signs into sentences and represents relationships between such concepts. Signs and languages are both representations, but while signs are sufficient to represent objects of interest for a particular discourse, it takes a language to represent computations through algorithms.
This chapter explains what a language is and how languages can be defined. I begin by illustrating how languages enable the representation of algorithms and thus computation, which accounts for the importance of languages as a subject in computer science. I then show how languages can be defined through grammars. A language consists of sentences, and while a sentence is often considered to be simply a list of symbols or words, this is too narrow a view. It is like looking at the drawing of a scene and only recognizing the objects in it but ignoring their relationships. Imagine a drawing that shows a man, a dog, and a stick. It matters whether the dog is carrying the stick and is walking toward the man or whether the man is holding the stick and the dog is biting the man. Similarly, each sentence of a language has an internal structure that plays an important role in assembling its meaning. To illustrate this aspect, I demonstrate how a grammar not only defines the external appearance of sentences as sequences of words (concrete syntax) but also their internal structure (abstract syntax).
This chapter is not about computation per se but about the description of computation, or rather about the description of the description of computation. What does that mean? An algorithm is a description of a computation, but this chapter is not about specific algorithms; it’s about how to describe algorithms. It turns out that such a description amounts to a language.
When we have to explain what a car or a flower is, we do not talk just about a few specific cars or flowers but rather about what all cars or flowers have in common, the essence of cars or flowers, so to speak. Individual specimens of cars and flowers will likely be mentioned as examples, but to talk about all possible cars or flowers requires the development of a model of what a car or a flower is. A model defines the type of all cars or flowers (see chapter 14).
Templates, which consist of a fixed structure with some variable parts, can be effective models for cars and flowers, since much of their inherent structure is fixed. While templates work well to describe some kinds of algorithms, for example, spreadsheets or the forms for ordering blood work, a template-based model would be too rigid and not expressive enough for algorithms in general. The flexibility required for the description of arbitrary algorithms is provided by languages.
In the story accompanying this chapter, language itself plays a prominent role. I chose music as the domain in which to study the concepts of language because musical languages are simple and structured enough so that the concepts to be explained are easily appreciable. The domain of music is narrow enough to lend itself to a specialized notation, yet it is universally understood and thus can be taken for granted while the discussion focuses directly on musical languages.
Regarding music as a language is not a new idea. For example, using ideas that go back to Pythagoras, the astronomer Johannes Kepler employed musical concepts to explain the laws governing the orbiting frequencies of the planets. Francis Godwin describes in the novel Man in the Moone (1638) how the inhabitants of the moon use musical language to communicate. In the movie Close Encounters of the Third Kind, humans communicate with aliens using a five-tone scale. Since music is a highly structured yet concrete and culture-transcendent medium, it is well suited to explain the concepts of language.
Take Note of the Melody
“Over the Rainbow” is one of the most popular songs of the twentieth century in the United States.1 It was composed by Harold Arlen2 for the movie The Wizard of Oz. At the beginning of the movie, Dorothy, played by Judy Garland, wonders whether there is a place without trouble, which leads to her singing the song.
If this were an audio book or a YouTube video, you could listen to the song right now. However, in the absence of an audio medium, how can I communicate the music to you? I can show you a representation of the music that can be interpreted to produce the melody. One such representation, widely used today, is the standard music notation, also called staff notation. Here is a small part of the song represented in this notation:
Don’t worry if you don’t know music notation. I explain the necessary concepts over the course of this chapter. At this point, it is sufficient to know that the balloon-shaped symbols represent individual notes of the melody, their vertical position determines the pitch of the note, and that the duration of a note is indicated by the kind of stem attached and whether the note oval is black or empty. Pitch and duration are the basic elements to form melodies. By singing or listening to the tune and following the score one can get a pretty good understanding of the meaning of the individual notes and the notation in general.
This score can be regarded as a sentence in the language of music notation. What’s more, the score is the description of an algorithm to produce music. Anyone who understands this language can execute this sentence, which effectively turns the musician into a computer. The analogy with computing is depicted in figure 8.1. The generated computation produces representations of sounds, which can take quite different forms. A vocalist will exert movement of her vocal chords, and a pianist, guitarist, or violinist will start and stop string vibrations, using keys and hammers, fingers, or a bow.
Figure 8.1 Playing (executing) a music score (an algorithm) generates a performance (a computation), which transforms silence into music. The performance is produced by a musician (a computer), for example, a person or a machine that can understand the music notation (language) in which the piece is written. See figure 2.1 on page 36.
The remarkable thing about music notation is that it facilitates the faithful reproduction of a melody, even by people who have never heard it before. The composer Harold Arlen, who invented the melody and then coded it by writing it down in music notation, could have simply sent the score to Judy Garland, and this would have been enough to ensure that she could sing the song properly. Without such a notation, the only way to share music would be through a recording or by hearsay (hearplay?), that is, to perform it in front of others who would then have to remember it to reproduce it. If you have ever played the telephone game, you know how unreliable such a process is.3
Pondering the design of the staff notation, one can notice how ar
bitrary it is. Even though a note is an indexical sign (see chapter 3) whose height reflects the pitch of the represented note, there seems to be no particular reason for having five staff lines, for the elliptical shape of notes, or for the particular length and direction of stems. This could all be done quite differently. In fact, it is sometimes done differently.
For example, the tablature notation for guitar is based on a very different representation. It shows how to interact directly with the strings of the guitar and thus avoids an abstract concept of notes. A number on a line indicates where (on which fret) to hold down the string while plucking it. One advantage of this notation lies in its directness: it allows beginners to quickly play tunes without first having to learn an abstract music notation. A disadvantage is that the notation does not precisely reflect the duration of notes. Moreover, it is limited to one kind of instrument.
The tablature notation reflects the physical structure of the guitar and is thus less arbitrary. For example, since a guitar usually has six strings, the design of the notation requires that it have six horizontal lines and that numbers appear only on those lines. By contrast, the number of five lines in the score notation is arbitrarily chosen and can be extended when needed. In fact, the very first note in the staff notation example does just that and employs an auxiliary line.
Tablature and staff notation are two different languages for representing the domain of music. Each language is defined by a set of rules that specify what kind of symbols can be used and how they can be combined. The rules define the syntax of the language; they are used to distinguish proper elements of the language, called sentences, from gibberish. A properly constructed music score or tablature is a sentence. Any notation that violates the rules cannot be properly executed as an algorithm. For example, it would make no sense to use negative numbers in tablature. A guitar player would not know how to interpret them and what to do. Similarly, if we had a note in the staff notation that was spread out over several staff lines, Judy Garland would not have known which note to sing.
A music performer can reproduce music only if the music notation is clear and unambiguous.4 This is another way of saying that the music notation, like any other algorithmic notation, must represent effectively executable steps that can be interpreted unambiguously. Thus, to ensure the effectiveness of any musical (and other algorithmic) language we first need a precise definition of what counts as a sentence of that language.
Grammar Rules
The syntax of a language can be defined with a grammar, which is understood as a set of rules for constructing sentences of the language. Using natural languages like Spanish or English to describe syntax is not a good alternative, because the descriptions tend to be long and imprecise. This is why math equations or physics laws are generally not expressed in prose but in some specialized notation. In fact, many scientific and technical fields have developed their own terminology and notation to enable effective communication about the ideas in their respective domains. And the same is true for the study of languages, whether it is in the field of linguistics or computer science, and one special notation to define languages is that of grammars.
One problem in describing a language is how to represent in a finite way a potentially infinite set of sentences. Science faces a similar challenge by having to describe an infinite set of facts through a small set of laws. Science’s solution to this problem is helpful in explaining some concepts of grammars. Consider, for example, the famous physics equation E = mc2, which relates the energy (E) that an object contains to its mass (m). The exact meaning of this equation is not really important here; what matters is that the equation contains a constant c and two variables, m and E, which can stand for arbitrary positive numbers. In particular, the two variables in this equation allow the computation of the energy content of any object, no matter how large or small, how simple or complicated. The use of variables enables the single equation to represent an infinite number of physical facts. A variable in an equation serves the same purpose as a parameter in an algorithm.
Similar to equations, a grammar also contains constants and variables. The constants are called terminal symbols, or terminals for short, and the variables are called nonterminal symbols, or nonterminals for short. The reason for these names will become clear shortly. A sentence of a language is given by a sequence of terminal symbols and does not contain any nonterminals, which play only an auxiliary role in constructing sentences. In the case of staff notation, terminals are, for example, individual notes appearing on staffs and bars that group notes into measures. Of course, the notation contains other terminal symbols, but notes and bars are sufficient to illustrate the grammar concepts needed to define simple melodies.
For example, the terminals that constitute the first measure of “Over the Rainbow” are the two note symbols and , followed by a bar . Similar to a constant in an equation, a terminal symbol represents a fixed, unchangeable part of a sentence, and a sentence corresponds to one particular scientific fact that results from an equation by substituting numbers for the variables.
In contrast, a nonterminal symbol acts like a variable in an equation that can take on different values. A nonterminal can be substituted by a sequence of other (terminal and nonterminal) symbols. However, substitution is not arbitrary. Instead, all the possible substitutions are defined by a set of rules. A nonterminal is like a placeholder and is typically used to represent a particular part of a language. The substitution rules define what those parts look like in terms of sequences of terminals. For example, in the definition of a simplified grammar for the staff notation we find a nonterminal note that can stand for an arbitrary note terminal symbol.5
Table 1.1
Grammars Equations Algorithms
Nonterminal Variable Parameter
Terminal Constant, operation Value, instruction
Sentence Fact Algorithm with only values
Rule Instruction
A grammar consists of several rules. Each such rule is given by a nonterminal to be substituted, an arrow to indicate the substitution, and a sequence of symbols to replace the nonterminal with. This replacement sequence is also called the rule’s right-hand side (RHS). A simple example is the rule . Here the RHS consists of just one terminal symbol. The correspondence between the components of grammars, equations, and algorithms is summarized in table 8.1. A grammar consists of rules, much like an algorithm is built out of individual instructions. For equations there are no corresponding components.
Since notes vary in pitch and duration, and since we have only one note nonterminal, we need a large number of rules for note, as illustrated in figure 8.2.6 The nonterminal note represents the category of individual notes, and this category is defined through all the rules for ; note.
In general, the RHS of a rule can contain several symbols, and those symbols can be terminals as well as nonterminals. A sequence that consists only of terminal symbols cannot be changed, since rules can only substitute nonterminal symbols. This also explains the two names terminals and nonterminals. A finished sequence of terminals is a sentence of the language described by the grammar rules. On the other hand, a sequence that still contains nonterminal symbols is not finished yet and is also not a sentence of the language. Such a sequence is also called a sentential form, since it generally describes a whole class of sentences that can be obtained from it by further substitution of the nonterminals. A sentential form is similar to an equation in which some but not all variables have been substituted or an algorithm in which some but not all parameters have been replaced by input values.
Sentential forms can be seen in the rules that define a melody. The nonterminal melody stands for all melodies that can be formed by the grammar. In a first approach, we can define melodies by the following three rules. (Each rule is given a name for later reference.)
Figure 8.2 Grammar rules defining the possible substitutions for the note nonterminal. Since an arbitrary note is represented by one nonterminal, a separ
ate rule is required for each combination of pitch and duration. Each column shows the rules for notes for a particular duration: left, half notes (notes that last one half of a measure); middle, quarter notes; right, eighth notes.
The first rule, NewNote, says that a melody starts with some note and is followed by another melody. This may seem strange at first, but if we consider that a melody is a sequence of notes, this rule says that a sequence of notes starts with a note and is followed by another sequence of notes. The form of the rule looks peculiar because it replaces a symbol by an RHS that contains that very symbol. Such a rule is said to be recursive. It seems that a recursive rule does not really achieve the substitution that it is supposed to accomplish. And if we had only recursive rules for melody, the grammar would indeed be problematic because we could never get rid of the nonterminal. However, in the example, the third rule, LastNote, is not recursive and can always be employed to replace a melody nonterminal by a note nonterminal, which could then be replaced by a terminal note symbol. Instead of the rule LastNote we could also employ the following alternative third rule that has an empty RHS:
This rule says to replace melody by an empty sequence of symbols (that is, nothing). Such a rule can be used to effectively remove a melody nonterminal from a sentential form.