Once Upon an Algorithm
Page 18
Recursive rules are typically used to generate some number of symbols through repeated applications.
The second rule, NewMeasure, is similar to the first and also allows the repeated generation of note nonterminals. In addition, however, it generates a bar terminal symbol , which indicates the end of one measure and the beginning of a new one.
Starting with a nonterminal such as melody, we can generate sequences of symbols by repeatedly replacing nonterminals through the application of grammar rules. For example, the first measure of “Over the Rainbow” can be produced as follows:
The labeled arrow in each line indicates which rule was applied. For example, first the rule NewNote is applied to generate the first note nonterminal. Then this nonterminal is immediately replaced by a terminal that represents the first note of the song. The specific rule from figure 8.2 that is used is indicated by the pitch and duration annotations of the rule (C for the pitch and ½ for a note that lasts for one half of a whole measure). The process then continues by using the NewMeasure rule to generate another note nonterminal, followed by a bar terminal that ends the current measure. The new note nonterminal is then also substituted in the next step by a note terminal.
The decisions about which rules to use determine the melody that is generated. Notice that the order in which the rules are applied is flexible to some degree. For example, we could have swapped the application of the rules and NewMeasure and still obtained the same sentential form:
We could have also swapped the order in which the two note nonterminals have been substituted and obtained the same result.
We can terminate the series of rule applications by applying the third rule (either LastNote or EndMelody), which eliminates the remaining melody nonterminal. (If we use LastNote, we also have to apply one more Note rule to eliminate the resulting note nonterminal.) The produced sequence of terminals is a sentence of the language, and the sequence of sentential forms and rule applications, from the initial melody nonterminal to the final sentence, is called a derivation. A sequence of terminals is a sentence of the language only if such a derivation exists, and deciding a sentence’s membership in a language boils down to finding a corresponding derivation. A derivation is proof that its resulting sequence of terminals is an element of the language.
One of the nonterminals of a grammar is designated as the start symbol, which represents the main category of sentences that are defined by the grammar. This nonterminal also gives the grammar its name. For example, since the purpose of this grammar is to define melodies, the start symbol of the grammar should be melody, and we can refer to the grammar as the melody grammar. The language defined by the melody grammar is the set of all sentences that can be derived from the melody start symbol.
Structure Grows on Trees
Having more than one language for a particular domain (such as staff notation and tablature for music) may seem peculiar, and one may wonder whether there are good reasons for this. Perhaps it would be better to have only one language that could serve as a standard notation? We appreciate variety in food, clothing, vacation destinations, and so on, but having to work with different languages is often an annoying and costly business. It may require translation and clarification, and it can lead to misunderstandings and mistakes. In the story about the tower of Babel the existence of many languages is considered a punishment. Efforts to create universal languages such as Esperanto were an attempt to eliminate the problems caused by having too many languages. And language standardization committees find themselves in a constant struggle to keep the diversification of technical languages under control.
Why do we have so many different languages, despite all the costs? Often a new language is adopted because it serves a particular purpose. For example, languages such as HTML or JavaScript can help represent information on the internet, an extremely useful thing to do for many businesses and organizations. In the domain of music, the tablature notation works well for many guitar players, in particular, for those who do not know the staff notation. With the advent of programmable music machines (such as sequencers and drum machines) the language MIDI (Musical Instrument Digital Interface) was developed to encode control messages to tell synthesizers to produce sounds. Here is the beginning of the MIDI version of “Over the Rainbow”:
4d54 6864 0000 0006 0001 0002 0180 4d54 ⋯
However, while effective in controlling a synthesizer, this representation is not very user friendly. It is not clear what the numbers and letters mean and how they relate to the music they are supposed to represent. It is convenient to keep staff notation and tablature for people’s use and to translate scores to MIDI if we want to feed the music to a synthesizer. We may also want to translate between staff notation and tablature. Of course, we want this translation between languages to be done automatically, using an algorithm, and we also want it to preserve the meaning of what is represented, that is, the music.
Translation between different notations is best accomplished via an intermediate representation, called abstract syntax. In contrast to concrete syntax, which defines the textual or visual appearance of a sentence, abstract syntax reveals the structure of a sentence in a hierarchical form. The concrete syntax of a music piece in staff notation is given by a sequence of note, bar, and other symbols, which is not adequate for capturing the hierarchical structure of a music piece. For that we can employ trees, which were used in chapters 4 and 5 to represent a family hierarchy and a dictionary. The abstract syntax is represented in an abstract syntax tree.
The transformation from concrete syntax into an abstract syntax tree can be achieved in two steps. First, the sequence of symbols in the concrete syntax is turned into a parse tree, which captures the hierarchical relationships between the symbols. Second, the parse tree is simplified into an abstract syntax tree. A parse tree can be constructed alongside the derivation of a sentence. Remember that in each step of a derivation a nonterminal is replaced with the RHS of a rule for that nonterminal. Now instead of replacing the nonterminal with the RHS, we simply add a node for each symbol of the RHS and connect it by an edge to the nonterminal. Thus, each derivation step extends the abstract syntax tree by adding new nodes to a nonterminal at the fringe of the tree. If we follow the previous derivation, we first obtain the following sequence of trees, which illustrates how the application of a rule extends a tree downwards:
These two steps are simple and straightforward. The next two steps lead to a somewhat unexpected result because the parse tree does not reveal the structure of the song. The parse tree does not show how a melody consists of measures and how measures consist of notes:
The lack of structure is a consequence of how the earlier grammar was defined: its rules expanded a melody simply into a sequence of notes. Therefore, it is not surprising that the parse tree does not mention any measures. By changing the grammar definition to account for measures, we can remedy the situation. Figure 8.3 shows parts of the parse and abstract syntax trees. In computer science trees are drawn upside down with the root at the top and the leaves at the bottom. Nonterminals are the locations of branching in the tree, and terminals occur as the leaves of the tree.
The parse tree in the figure is the immediate result of turning a derivation into a tree; it retains all the details of the derivation, even those parts that are not required for translating the tree into a different notation. In contrast, the abstract syntax tree ignores terminals and nonterminals that are not essential for the structure of the sentence. For example, the bar terminals are redundant, since the grouping of notes into measures is already captured by measure nonterminals. The note nonterminals are not needed either, since each one of them is always expanded into exactly one note terminal. Adding the note terminals directly as children to the measure nonterminals captures the same structural information and thus justifies the removal of the note nonterminals. The parse tree and the abstract syntax tree share a common structure: both roots are given by the melody nonterminal, and all the l
eaves are terminal symbols. The abstract syntax tree reflects the structure of the sentence more directly and is the basis for analysis and translation; it also facilitates the definition of a sentence’s meaning.
Figure 8.3 The structure of a melody represented by syntax trees. A syntax tree captures the structural result of a derivation but ignores details, such as which rules have been applied in which order. Left: A parse tree, which captures all the details of the derivation. Right: An abstract syntax tree, which omits unnecessary details and retains only the structurally relevant information.
The process of constructing a parse tree or an abstract syntax tree for a given sentence is called parsing. The relationships between sentences, syntax trees, and parsing are illustrated in figure 8.4. Parsing is a computation for which several different algorithms exist. While a grammar provides a clear definition of which sentences belong to a language, parsing also needs a strategy for turning a given sentence into a syntax tree. The difficulty in parsing lies in deciding which grammar rules to select during the analysis of a sentence. There are different strategies for parsing sentences of a grammar. We have seen one, called top-down parsing, which begins with the start symbol of the grammar and repeatedly applies rules to gradually build a syntax tree by expanding nonterminals. This process is repeated until the sentence appears as the sequence of terminals in the leaves of the tree. In contrast, bottom-up parsing tries to match right sides of rules in the sentence and applies rules in reverse. The goal is to build the syntax tree by adding the nonterminals of the left side of matched rules as parents to the tree. This is repeated until a tree with a single root node is obtained.
Figure 8.4 Parsing is the process of identifying the structure of a sentence and representing it as a syntax tree. Pretty printing turns a syntax tree into a sentence. Since an abstract syntax tree may omit certain terminal symbols (for example, bars), pretty printing cannot just collect the terminals in the leaves of the syntax tree but must generally employ grammar rules to insert additional terminal symbols. The parsing arrow is missing for the tablature notation because it is an ambiguous notation and thus doesn’t allow the construction of a unique abstract syntax tree.
The opposite process, turning a syntax tree into concrete syntax, is called pretty printing. This is a mostly straightforward computation, since the structure of the sentence is already given and provides guidance for the creation of the concrete representation.
With parsing and pretty printing we have the necessary tools to translate between languages. A translation between two languages that have the same abstract syntax is obtained by parsing a sentence in the first language and applying pretty printing for the second. For example, in figure 8.4 we can see how to get from staff notation to tablature by first parsing a sentence in the staff notation and then applying the tablature pretty printer to the resulting abstract syntax tree. To translate between languages that do not share the same abstract syntax requires an additional transformation between the abstract syntax trees.
Parsing is also the necessary first step in determining the meaning of a sentence, since it depends on the structure of the sentence, which is represented by its abstract syntax tree. This is a crucial observation that deserves repeating. To understand a sentence one has to first establish the structure of the sentence in the form of an abstract syntax tree.7 This is also true for listening to music. To understand the song “Over the Rainbow” one has to parse the sounds and identify different notes that give rise to the melody. Moreover, the grouping of notes into measures goes along with noticing emphasis on notes and making sense of phrases as part of the melody. Finally, the higher-level structuring of the song into chorus and verses provides a framework for recognizing repetitions and motifs.
Since syntax trees are a gateway to the meaning of sentences, the question arises whether parsing always succeeds and what happens to the understanding of a sentence if.8 The previous (non)sentence was lacking a syntax tree and thus had no clear meaning. But what happens if we can construct several different syntax trees for one sentence? This question is considered in chapter 9.
The Pharmacy Calls Back
You are at the pharmacy to pick up your medication, but there is a problem. The prescription lacks information about the dosage form—capsules or liquid? Since the prescription is ambiguous and lacks a precise meaning, it does not represent an algorithm. The prescription does not describe a set of effective steps that would allow the pharmacist to assemble a medication.
Ambiguities can occur for several reasons. In this case the lack of information is reflected in the abstract syntax tree by a nonterminal for the dosage form that is unexpanded. Since the pharmacist cannot execute the prescription provided by the physician, she has to call and ask for clarification. Once the ambiguity is resolved, she has an algorithm to execute and can successfully prepare your medication.
The separation of work between physician and pharmacist is successful because both use a shared language of prescriptions. But that alone is not enough. Both of them also have to understand the sentences of the language in the same way. This is not a trivial requirement, evidenced by the fact that abbreviations of drug names and dosing have actually led to dispensing mistakes. A naive idea for defining the meaning, or semantics, for a language is to explicitly assign a meaning for each possible sentence. This cannot work for languages that contain an infinite number of sentences, but even for finite languages such an approach is highly impractical.
Instead the semantics definition has to work in two related steps. First, it assigns semantics to individual words, and then it defines rules for how to derive the semantics for a sentence from the semantics of its parts. For example, in the form for ordering blood work, the semantics of an individual check box is to order one particular blood test. The semantics of a group of such check boxes is defined as a collection of orders, that is, to carry out all the orders that are obtained from the semantics of each individual check box in the group. This compositional approach requires the syntax definition of the language to make the structure of sentences available to the semantics definition, which it does in the form of abstract syntax trees.
The languages for blood work and prescriptions show that the semantics of a language can take different forms. For example, a prescription defines an algorithm for filling prescriptions, whereas the form for blood work provides instructions to draw blood and perform lab tests. The computers for executing the algorithms in these different languages need to understand the different semantics and consequently need different skill sets to do their work successfully. The form language for blood work also illustrates that even a single language can have different semantics, namely, instructions to draw a specific amount of blood and to perform different lab tests. Thus one sentence in one language can be executed differently, often by different computers (for instance, a phlebotomist and a lab technician) and produce completely unrelated results. This can be beneficial and supports the division of labor.
One important application of having different semantics for one language is the analysis of algorithms to detect and eliminate potential mistakes, which can help prevent the computation of incorrect results. Compare this with a document that can be read by different people to find typos or grammar mistakes, to judge the content, or to check for compliance with typesetting conventions. All of these tasks have different goals and can be performed before the document is published. Similarly, a pharmacist should double-check a prescription before filling it to prevent any unintended side effects for the patient.
Languages are everywhere. In addition to prescriptions, lab work, music, and numerous other specific domains, computer science itself is full of languages, not even counting the thousands of programming languages. We have already seen a language for describing grammars. There are also languages to define the semantics of languages, to define parsers and pretty printers, and many others. Languages are an essential computer science tool for representing data and computation. The effec
tive use of languages depends on having a precise semantics, just as the health and safety of patients depends on a clear semantics for prescriptions. Based on the song “Over the Rainbow,” I will illustrate in chapter 9 how the semantics of a language can be defined and some of the challenges involved in establishing such a definition.
9
Finding the Right Tone: Sound Meaning
In chapter 8 we saw that the score for the song “Over the Rainbow” is an algorithm that can be executed by musicians to produce music. The algorithm was written (that is, invented and coded) by the composer Harold Arlen in the language of staff notation. The syntax of this language can be defined by a grammar, which defines the appearance of a sentence as a score and its internal structure as an abstract syntax tree.
We also saw that one language can be defined by different grammars, which may at first seem like an oddity. However, differences in grammars lead to differences in the abstract syntax and represent different views of music structure. These differences matter because when Judy Garland wants to perform “Over the Rainbow,” she first has to parse the music notation to uncover the song’s structure. In other words, the meaning of a language builds on its abstract syntax.
This is why ambiguity poses a problem for any language. If a sentence can have more than one abstract syntax tree, then it is not clear which structure to follow when applying the semantics definition. Consequently, an ambiguous sentence may have more than one potential meaning, which is how the word ambiguous is usually defined.
This chapter looks more closely at the problem of ambiguity and addresses the question of how sentences acquire meaning. A key insight is that a systematic definition of language meaning hinges on the concept of compositionality, which says that the semantics of a sentence is obtained by combining the semantics of its parts in a systematic way, as defined by the sentence’s structure. This shows that the structure of a language plays a pivotal role in defining its meaning.