by Martin Erwig
Both restrictions of physical dictionaries, the need for sorted keys and the fixed size and content, do not apply to electronic dictionaries. A widely used dynamic dictionary is Wikipedia,5 which not only lets you use it but also extend and update information in it. In fact, Wikipedia’s content has been assembled by its users, a remarkable success of crowdsourcing and a testimony to the power of collaboration. If Sherlock Holmes and Watson were working on the The Hound of the Baskervilles case these days, they might very well, instead of sending letters back and forth, use a wiki6 for maintaining information about the suspects and the case.
The dynamic nature of the dictionary is not limited to only inserting and removing information about suspects; it also allows updating that information. For example, the fact that the escaped convict, Selden, is Eliza Barrymore’s brother is not known when Selden becomes a suspect and has to be added later to an already existing dictionary entry for him. But how can this be done? Since we have only three operations for adding, removing, and finding entries in a dictionary, how can we update information once it is stored with a key in the dictionary? We can achieve this by combining operations: find the entry using the key, take the returned information, modify it as needed, remove the entry from the dictionary, and finally add it back with the updated information.
We can add new operations to the set data type in a similar fashion. For example, if Sherlock Holmes maintains a set of people who benefit from Sir Charles’s death, he might want to add a motive for some of the suspects. To do that he could compute the intersection of the set of beneficiaries with the set of suspects. Or perhaps he wants to identify new suspects by determining the beneficiaries who are not in the set of suspects. To do that he could compute the set difference between the two sets. Assuming that the set data type has an operation for reporting all elements in a set, an algorithm for computing the intersection or set difference of two sets can simply go through all elements of one set and for each element check whether it is in the second set. If it is, it reports the element as a result of set intersection. If it is not, it reports the element as a result of set difference. Such computations are quite common, since they correspond to the combination of predicates. For example, the intersection of the suspect and beneficiary sets corresponds to the predicate “is a suspect and a beneficiary,” and the difference between beneficiaries and suspects corresponds to the predicate “is a beneficiary and not a suspect.”
Finally, we need a data structure to implement dictionaries so that we can actually compute with them. Since the dictionary data type differs from the set data type only by associating additional information with elements, most data structures for sets, including arrays and lists, can be extended to also implement dictionaries. This is the case for all those data structures that explicitly represent each element of the set, because in that case we can simply add the additional information to the key. This also means that any data structure for implementing a dictionary could be used to implement a set; one can simply store an empty or meaningless piece of information together with the key.
When Order Matters
As mentioned in chapter 3, a computation can only be as good as the representation it is working with. Thus Sherlock Holmes and Watson would like the set of suspects to accurately reflect the state of their investigation. In particular, they want the set to be as small as possible (to avoid wasting effort on false leads)7 but also as large as needed (to avoid having the murderer go undetected). But otherwise the order in which suspects are added to or removed from the set doesn’t really matter to them.
For other tasks the order of items in a data representation does matter. Consider, for example, the heirs of the deceased Sir Charles. The difference between the first and second in line is the entitlement to a heritage worth one million pounds. This information is not only important for determining who gets rich and who doesn’t but also provides Holmes and Watson with clues about potential motives of their suspects. In fact, the murderer Stapleton is second in line and tries to kill the first heir, Sir Henry. While the succession of heirs matters, the ordering of heirs is not determined by the time people enter the collection of heirs. For example, when a child is born to the bequeather, it does not become the last in line but takes precedence in the ordering over, say, nephews. A data type in which the ordering of elements is not determined by the time of entry but by some other criterion is called a priority queue. The name indicates that the position of the element in the queue is governed by some priority, such as the relationship to the bequeather in the case of heirs, or the severity of the injury of patients in an emergency room.
In contrast, if the time of entry does determine the position in a collection, the data type is called a queue if the elements are removed in the order they were added or a stack if they are removed in reverse order. A queue is what you go through in a supermarket, a coffee shop, or a security check at the airport. You enter at one end and come out at the other. People are served (and then leave the queue) in the order they entered. A queue data type therefore establishes a first come, first serve policy. The order in which elements enter and leave a queue is also called FIFO for first in, first out.
In a stack, on the other hand, elements leave in the opposite order in which they were added. A good example is a stack of books on your desk. The topmost book, which was placed on the stack last, has to be removed before any other book can be accessed. When you have a window seat in an airplane, you are effectively on the bottom of a stack when it comes to going to the bathroom. The person in the middle seat who sat down after you has to leave the row before you can, and they have to wait for the person in the aisle seat, who sat down last and is on top of the stack, to get out first. Another example is the order in which Hansel and Gretel place and visit the pebbles. The pebble that is placed last is the one they visit first. And if they use the improved algorithm for avoiding loops, the last-placed pebble is the first they pick up.
At first, using a stack data type seems like a strange way of processing data, but stacks are really good for staying organized. For Hansel and Gretel the stack of pebbles allows them to systematically go to the place where they have been before, which ultimately leads all the way back to their home. In a similar way, Theseus escaped the Minotaur’s labyrinth with the help of a thread given to him by Ariadne: unraveling the thread amounts to adding it inch by inch to a stack, and coming out of the labyrinth by rewinding the thread amounts to removing the thread from the stack. If in our daily lives we are interrupted in one task by, say, a phone call, which is then interrupted by, say, someone knocking at the door, we mentally keep those tasks on a stack, and we finish the last one first and then go back and pick up where we left the previous one. The order in which elements enter and leave a stack is therefore also called LIFO for last in, first out. And to complete the mnemonics, we could call the order in which elements enter and leave a priority queue HIFO for highest (priority) in, first out.
As for the set and dictionary data types, we would like to know which data structures can be used to implement the (priority) queue and stack data types. It is easy to see that a stack can be well implemented by a list: by simply adding and removing elements always at the front of a list we implement the LIFO ordering, and this takes only constant time. Similarly, by adding elements at the end of a list and removing them at the front, we obtain the FIFO behavior of a queue. It is also possible to implement queues and stacks with arrays.
Queues and stacks preserve the ordering of the elements during their stay in the data structure and thus exhibit a predictable pattern of elements leaving them. Waiting in the security line at an airport is usually rather boring. Things get more exciting when people cut into line or when special lanes for priority customers exist. This behavior is reflected in the priority queue data type.
It Runs in the Family
The heirs of Sir Charles Baskerville provide a nice example of a priority queue. The priority criterion that defines each heir’s posit
ion in the queue is the distance to the deceased Sir Charles. But how is this distance to be determined? A typical simple inheritance rule says that the children of the deceased are first in line to inherit in order of their age. Assuming the wealth of the deceased is itself inherited, this rule implies that if the deceased has no children, his or her oldest siblings will continue the succession, followed by their children. And if there are no siblings, the succession continues with the oldest aunts and uncles and their children, and so on.
This rule can be illustrated algorithmically: all family members are represented in a tree data structure that reflects their ancestor/descendant relationships. Then an algorithm for traversing the tree constructs a list of the family members in which the position of a person defines their priority for inheriting. This priority would then also be the criterion used in a priority queue of heirs. But if we have a complete list of heirs that is correctly sorted, why do we need a priority queue? The answer is, we don’t. We need the priority queue only if the complete family tree is not known at first, or when it changes, for example, when children are born. In that case the family tree would change, and the list computed from the old tree would not reflect the correct inheritance ordering.
The information provided in The Hound of the Baskervilles indicates that the old Hugo Baskerville had four children. Charles was the oldest child and inherited from Hugo. The next-oldest brother, John, had one son named Henry, and the youngest brother, Rodger, had a son who is Stapleton. The story mentions that Hugo Baskerville had a daughter, Elizabeth, who I assume to be the youngest child. The names in the tree are called nodes, and as in a family tree, when a node B (such as John) is connected to node A above it (such as Hugo), B is called A’s child, and A is called B’s parent. The topmost node in a tree, which has no parent, is called its root, and nodes that don’t have any children are called leaves.
The inheritance rule applied to the Baskerville family tree says that Charles, John, Rodger, and Elizabeth should inherit from Hugo, in that particular order. Since the rule also says that children should inherit before siblings, this means that Henry is in line before Rodger, and Stapleton precedes Elizabeth. In other words, the ordered list of heirs should be as follows:
Hugo → Charles → John → Henry → Rodger → Stapleton → Elizabeth
This inheritance list can be computed by traversing the tree in a particular order, which visits each node before its children. The traversal also visits the grandchildren of older children before visiting younger children (and their grandchildren). The algorithm for computing an inheritance list for a node in a tree can be described as follows:
To compute the inheritance list for a node N, compute and append the inheritance lists for all of its children (oldest to youngest), and place the node N at the beginning of the result.
This description implies that the inheritance list for a node that does not have children simply consists of that node alone, so the inheritance list for that tree is obtained by computing the inheritance list for its root. It might seem peculiar that the algorithm refers to itself in its own description. Such a description is called recursive (see chapters 12 and 13).
The following illustrates how the algorithm works by executing it on an example tree, one with a few more members. Assume Henry had two children, Jack and Jill, and a younger sister, Mary:
If we execute the algorithm on this tree, starting with Hugo, we have to compute the inheritance list for each of Hugo’s children. Beginning with the oldest child, the inheritance list for Charles consists of only himself, since he has no children. The situation is more interesting for John. To compute his inheritance list, we need to compute the inheritance lists for Henry and Mary and append them. Henry’s inheritance list consists of his children plus himself, and since Mary has no children, her inheritance list contains only herself. So far we have computed the following inheritance lists:
Node Inheritance List
Charles Charles
John John → Henry → Jack → Jill → Mary
Henry Henry → Jack → Jill
Mary Mary
Jack Jack
Jill Jill
The inheritance list for a node always starts with that node itself, and the inheritance list for a node without children is a list that only contains that node. Moreover, the inheritance lists for Henry and John demonstrate how the inheritance lists for nodes with children are obtained by appending their children’s inheritance lists. The inheritance lists for Rodger and Elizabeth are computed in a similar way, and if we append the inheritance lists for Hugo’s four children, adding him at the front, we obtain the following ordered list of heirs:
Hugo → Charles → John → Henry → Jack → Jill → Mary → Rodger → Stapleton → Elizabeth
The inheritance algorithm is an example of a tree traversal, an algorithm that systematically visits all the nodes of a tree. Since it proceeds top-down, from the root to the leaves, and visits nodes before their children, it is called a preorder traversal. Think of a squirrel that scours a tree for nuts. In order not to let any nut go undetected, the squirrel has to visit every branch of the tree. It can do that by employing different strategies. One possibility is to ascend the tree in levels (of height) and visit all branches on one level before visiting any branch on the next level. A different approach is to follow each branch to the end, no matter how high, before switching to another branch. Both strategies ensure that every branch will be visited and every nut will be found; the difference is the order in which branches are visited. This difference doesn’t really matter for the squirrel, since it will collect all nuts either way, but the order of visiting the nodes in a family tree does matter as far as inheritance is concerned, since the first person encountered will get it all. The preorder traversal of the inheritance algorithm is a tree traversal of the second kind, that is, following a branch until the end before switching.
A major use of data types and their implementation through data structures is to gather data at one point in a computation and use it later on. Since searching for an individual item in a collection is such an important and frequently used operation, computer scientists have spent considerable effort on investigating data structures that support this operation efficiently. This topic is explored in depth in chapter 5.
Further Exploration
The examples in The Hound of the Baskervilles have illustrated how signs work as representations and how data types and data structures organize collections of data.
Many Sherlock Holmes stories are replete with signs and their interpretation, including the analysis of fingerprints, footprints, and handwriting. Even in The Hound of the Baskervilles, there are several additional signs that were not discussed. For example, Dr. Mortimer’s stick contains scratches that indicate its use as a walking stick, and it also has bite marks indicating that Dr. Mortimer has a pet dog. Or, examining the anonymous message sent to Sir Henry, Sherlock Holmes concludes that based on their typeface, the letters used must have been cut out from the Times newspaper. He also concludes from the size of the cut marks that a small pair of scissors must have been used. For a detective, the conclusions to be drawn can differ when signs have multiple interpretations or when some signs are ignored. This is illustrated in Pierre Bayard’s Sherlock Holmes Was Wrong, where he argues that Sherlock Holmes identified the wrong person as a murderer in The Hound of the Baskervilles. Of course, many other detective stories, movies, and popular TV series, such as Columbo or CSI, provide examples of signs and representations and demonstrate how they are used in computing conclusions about unsolved crimes.
Umberto Eco’s The Name of the Rose is the story of a murder mystery that takes place in a monastery in the fourteenth century. It embeds semiotics into a detective story. The protagonist William of Baskerville (the last name is actually a signifier for the Sherlock Holmes story), who is in charge of the murder investigation, is open-minded in his interpretation of signs and thus encourages the reader to play an active role in
analyzing the events in the story. More recently, Dan Brown employed many signs in the popular novels The Da Vinci Code and The Lost Symbol.
In Gulliver’s Travels, by Jonathan Swift, we are told about a project in the grand academy of Lagado that aims at avoiding the use of words, and thus signs, because they only stand for things. Instead of uttering words, it is suggested that people carry objects that they produce whenever they need to talk about them. Swift’s satire illustrates the importance of signs in contributing to the meaning of languages. Similarly, Lewis Carroll’s books Alice’s Adventures in Wonderland and Through the Looking-Glass contain examples of clever wordplay that sometimes challenge the traditional role of words as signs.