During and immediately after World War II, scientists and engineers embarked on ambitious plans in information design, communication systems, and artificial intelligence. One goal was to calculate, control, and mobilize the nation’s resources to wage modern war. During the Cold War, a massive war with the Soviet Union seemed imminent. The mobilizations of World War II had also created an optimistic, even utopian sense that societies could be quickly improved and modernized via scientific methods. On the civilian side of things, researchers tried to reduce the unpredictable clutter of human activity to formulas. By quantifying the human world, they figured they could predict it. By teasing out the rules that govern human behavior, they could engineer better schools, prisons, and cities.
Mahl’s work was part of this wave of interest in communication and information theory that sought more advanced “command and control” technologies. Take, for instance, the problem of how to hit a moving airplane with an anti-aircraft gun, which involved calculating not only the arc of the shell, the firing speed of the gun, and the speed and direction of the plane but the ways that smaller, faster fighter planes and bombers might zigzag to avoid getting hit. To solve the problem, Norbert Wiener of MIT developed sophisticated algorithms that relied on real-time data from radar (itself a new technology) to tweak a set of tracking orders and compensate for the zigzag.*26 As the scope of research into the dynamics of communication broadened from an individual’s speaking in isolation to the interaction between individuals, the focus also shifted from the self to the transmission of information signals. Disfluencies were treated as noise in signals that disrupted communication.
Social engineering put language under the microscope as well, and scientists began to observe a range of communicative behaviors in humans and animals. Some tried to figure out the universal algorithms that underlie all languages, in order to build computers that could translate from one language to another. Others tried to teach human language to primates or investigated the communication of dolphins.
Frieda Goldman-Eisler was a psychiatrist who pioneered the study of pauses and hesitating in speaking—later dubbed “pausology.” A former student of Freud’s from Austria, she arrived in London in 1940 and worked menial jobs at Maudsley Hospital, a prominent psychiatric hospital in Camberwell, England, until she was eventually appointed as a therapist there. A year later she met and married Paul Eisler, an Austrian engineer who was trying to find a market for the electronic circuit board he’d invented. She published widely and was eventually appointed to the faculty of University College, London.
One of Goldman-Eisler’s strengths was her curiosity and willingness to try new ideas. (One of her graduate students, Brian Butterworth, now a neuropsychologist at University College, said she tried to instill the same values in him when they ate lunch at the Indian restaurant around the corner: she always ordered curries for Butterworth, expecting him to try hotter and hotter concoctions.) In the 1940s, she discovered the work of Eliot Chapple, a Harvard anthropologist who was timing human behavior to explore the basic temporal principles of human interactions, such as how long two individuals talked to each other, how much time they spent saying certain things, and how much time they were silent. He’d invented a device called the “chappleograph,” a typewriter through which a paper tape moved at a constant rate; the observer pressed certain keys when events started and ended, then went back and calculated the time interval from the marks. By measuring gestures, head nods, and body postures as well as speaking, he was able to capture the texture of interaction more closely, especially its use of silences.
Goldman-Eisler first used the chappleograph to measure breathing behavior, especially among psychiatric patients, the population she knew best. This led her to look at their pauses in speaking, which were related to their thinking. What were they thinking, anyway? How could anyone ever figure that out? In her formulation, speaking was a peripheral activity related to a central activity: thought. “If activity in conversation, if vocal action, is a peripheral phenomenon,” she wondered, “might not absence of activity indicate the presence of central activity?”
In other words, pauses in speech pointed to thinking, not, as had been previously thought, to a lack of thinking, a gap between two thoughts, some psychic tension, or embarrassment. The pauses were parts of the cycles of thinking and speaking that involved sequences of planning what to say, then actually saying it. Goldman-Eisler discovered these cycles when she asked subjects to describe some pictures. On the first attempt, half of what her subjects said came out in phrases fewer than three words long. Longer phrases were fairly rare: those with ten or more words occurred only 10 percent of the time. Perhaps, she thought, this would change if the speakers had more practice. Indeed, subjects who rehearsed used fewer short phrases and fewer medium-length phrases. Yet they increased their long phrases (those with ten words or more) by a mere 5 percent. Even with practice, speech remained stubbornly fragmented: two-thirds of speaking, Goldman-Eisler concluded, came in chunks of fewer than six words apiece.*27
We commonly think that some people speak faster than others, but Goldman-Eisler found that someone who sounds like a fast speaker simply uses shorter pauses. Speed has more to do with the amount of time left between sounds than how quickly the sounds are spoken, which is a fairly constant ten to twelve sounds per second.
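The arithmetic behind this observation can be sketched in a few lines. The figures below are hypothetical, chosen only for illustration: two speakers produce the same 330 sounds at the same articulation rate of 11 sounds per second (inside the ten-to-twelve range mentioned above) and differ only in how long they pause. The function name `overall_rate` is likewise an assumption, not anything taken from Goldman-Eisler's work.

```python
def overall_rate(sounds: int, articulation_rate: float, pause_seconds: float) -> float:
    """Sounds per second over the whole utterance, with pauses included."""
    speaking_seconds = sounds / articulation_rate  # time spent actually making sounds
    return sounds / (speaking_seconds + pause_seconds)


# Hypothetical speakers: identical sounds and articulation rate, different pausing.
quick = overall_rate(sounds=330, articulation_rate=11.0, pause_seconds=10.0)
slow = overall_rate(sounds=330, articulation_rate=11.0, pause_seconds=30.0)

print(f"short pauses: {quick:.2f} sounds per second")  # 8.25
print(f"long pauses:  {slow:.2f} sounds per second")   # 5.50
```

Both speakers articulate equally fast, yet the one who pauses less would sound markedly quicker to a listener.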
She also found that speakers paused strategically, at moments in sentences that presented them with more choices. The more predictable the next word or phrase was, the less likely a speaker was to pause. The notion that the length of a pause, whether it was silent or filled, equaled the length of the mental delay itself became the basis for observations of many types of talkers. In the 1990s, Stanley Schachter, a psychologist at Columbia University, looked at how professors spoke in lectures. He wanted to investigate the notion that speakers are disfluent when they’re deciding, choosing, or selecting the next increment of language they’re going to say. Since academic knowledge varies across fields, and since professors communicate that knowledge with language, their speaking should be different. One particularly common form of speaking is lecturing, which was also easy for Schachter to study.
Indeed, in their lectures humanities professors said “you know” and “uh” more frequently than social science professors—4.85 “uhs” per minute to the social scientist’s 3.84. Before the social scientists erupt in huzzahs, however, they should know that they blundered more than the hard scientists, 3.84 per minute to 1.39. On the whole, humanists said “uh” at a higher rate than the others.*28
The same ordering held when “uh” was counted per word rather than per minute: humanists said it 4.76 times per one hundred words, more often than the social scientists (2.67 per hundred words), who in turn outstripped the hard scientists (1.47 per hundred words).
Why was this the case? What might these patterns say about the people who choose different fields? Schachter interviewed the professors in their offices, where they all blundered at rates similar to each other; only in lectures did their blundering profiles diverge. Schachter surmised that it was the content of their lectures and ultimately the shape of knowledge in the professors’ respective disciplines that led them to blunder as they did, not who they were as individuals.
Thus, it was no surprise that the humanists blundered more: their field gave them more thoughts and ideas to express, and they hesitated and paused more often because they had more options for self-expression. Their hesitations didn’t reflect their level of certainty; they reflected their creativity. By contrast, what the scientists had to say seemed to be more cut and dried. It wouldn’t be the first time that passionate disfluency would be romanticized, or science stereotyped as cold and uncreative. To be fair to both sets of speakers, Schachter doesn’t rule out the possibility that professors deliberately train in the acceptable level of disfluency in their field; in other words, whether or not the humanities are more creative, it’s important that practitioners act (and sound) as if they are. As odd as it sounds, community norms shaped the blundering.
Goldman-Eisler’s idea that delay often indicates active decision making found a darker expression in the work of Basil Bernstein, a British sociologist, who applied it to a problem in educational reform in the 1960s. It was a time when educators in the United States and the United Kingdom were attempting to determine how well a child’s language performance could predict his or her success in school. In the United States, psychologists focused on the supposedly “impoverished” language of African American children, while in the United Kingdom, they looked at class differences.
To Bernstein, longer pauses seemed to be a sign of greater language resources. After all, one paused to think, and part of thinking involved planning what to say based on one’s stash of words. He hypothesized that working-class students would pause less than middle-class students. His findings confirmed this notion. Even among the middle-class boys, the ones with the higher IQ paused longer. Because they had more complex sentence structures and larger vocabularies at their mental fingertips, Bernstein supposed that the boys had to pause to riffle through their verbal riches. By contrast, the working-class boys had an empty cupboard, so they could speak more quickly. Here was the language deficit that endangered their educations, Bernstein concluded. If we fill their verbal larders, they’ll be more successful in school.
This conclusion set off a firestorm that’s never really subsided. Though he was well-meaning, Bernstein had overlooked a crucial point that had also escaped Goldman-Eisler: the rates of speech, hesitations, and length of pauses may have depended as much on the norms of the boys’ communities as on their own mental processes. What Bernstein counted had more to do with environment than with intelligence, vocabulary, or cognitive style. His conclusions pointed out the pitfalls of doing class-based research that comes with hidden assumptions about class.
Bernstein’s research also showed, albeit inadvertently, how one group’s norms for speaking aren’t always greeted happily by another group. In America, we encounter regional dialects all the time, as well as ethnic or racial forms of speaking. We readily link accents with cultural identity. And while we hold stereotypes about aspects of speech style—for instance, Northerners speak quickly; Southerners speak more slowly—we tend not to think that other stylistic characteristics, such as pausing, may be shaped in communities, too.*29
In the United States, we often admire people for their brilliance when they’re merely glib and smooth. Yet if we’re interested in other qualities, such as honesty, authenticity, and charisma, then glib, uninterrupted speaking may not be what we want to hear. In fact, we might want to begin to mistrust perfect fluency. In the 1960s, Walter Weintraub, a psychiatrist at the University of Maryland School of Medicine, developed a method for analyzing the personality traits of political leaders through their speech styles. After listening to millions of words by public figures, Weintraub concluded that true fluency is a chimera. “While several sentences of spontaneous and fluent ad-libbing are possible, lengthy impromptu remarks free of qualifications and hesitation are extremely rare,” he wrote.*30
Yet the notion that speaking is usually (and should properly be) hitch free remains deeply ingrained in ideas about language. For instance, the communications research literature is filled with studies confirming that research subjects perceive disfluent speakers negatively: the speakers’ messages are corroded, their character impugned. However, these subjects are almost always college students in large speech courses, where their attitudes about disfluencies may already have been altered. Such research isn’t well equipped to understand what happens on the speaker’s side of things. As a result, the research serves only to confirm the existence of a stereotype, not to uncover anything remarkable about human communication.
These types of studies cannot account for the large number of disfluencies in spontaneous talking or conversation, which are permeated with halts, interruptions, backtracks, repetitions, and grammatical tangents. In 1994, the Scottish speech researcher Robin Lickley found a disfluent phenomenon every 9.4 words, if “well” and “sort of” were not counted. Every 36.8 words, the speaker repeated a word, mostly a word like “the,” “that,” “was,” “of,” or other function words. Every 44 words, there was an attempt to fix a word or a sentence that the speaker had already embarked on. Pause fillers occurred once every 33 words. Most of these, about 74 percent, were “um.” Many fewer, or 24 percent, were “uh.”
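Figures of the “once every N words” kind are easier to compare once they are converted to a rate per one hundred words. The following is a minimal sketch of that conversion, using only the intervals quoted above; the dictionary labels and the function name `per_hundred` are illustrative assumptions, not Lickley's terminology.

```python
# Average number of words between occurrences, as quoted in the text.
WORDS_BETWEEN = {
    "any disfluency": 9.4,
    "repeated word": 36.8,
    "repair attempt": 44.0,
    "pause filler": 33.0,
}


def per_hundred(words_between: float) -> float:
    """Convert 'one occurrence every N words' into occurrences per 100 words."""
    return 100.0 / words_between


rates = {label: round(per_hundred(n), 1) for label, n in WORDS_BETWEEN.items()}

for label, rate in rates.items():
    print(f"{label}: about {rate} per 100 words")
```

Run as written, this gives roughly 10.6 disfluencies of any kind, 2.7 repeated words, 2.3 repair attempts, and 3.0 pause fillers per one hundred words.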
A bigger problem for the idea that disfluencies limit, block, or negate communication is a group of experiments showing the opposite. Repeated words don’t impede word recognition, and “uh” can actually speed recognition (for some reason, “um” provides no such benefit). Disfluencies can also help listeners determine what new information is contained in a sentence and parse ambiguous sentences. Such advantages are measured in milliseconds, but so are any actual disadvantages.
The idea that speaking is smooth and unbroken results, in part, from the fact that we don’t hear the fragments. They simply don’t register. “Speech may be peppered with the various disturbances,” Mahl wrote, “but most of them escape the awareness of both speaker and listener. Only when they occur at relatively rapid rates and are ‘bunched’ does the listener perceive them, sensing an episode of ‘flustered’ speech.” For instance, normal speakers usually repeat sounds at the beginnings of words, not in the middle of them; repetition in the middle of a word can be a sign of chronic stuttering. Making more than ten disfluencies per one hundred words (not counting editing words like “well” and “like”) may be a sign of an organic disorder like aphasia or dementia.
This presents additional questions: Why are we such inattentive listeners? And should we be worried about it? One reason is that humans have evolved to be that way; it’s only a slight exaggeration to say that we do it for the sake of survival. In order to hear what someone says, we filter out noise all the time: party chatter, the traffic, the rumble of the train, regional accents, and even “uh” and “um” (which tend to be said at a lower pitch anyway, so they’re more easily tuned out). We filter them out so unconsciously that even scientists who study verbal blunders admit that they don’t always hear completely. Robin Lickley says that he’s often had the experience of pulling out a fluent snippet of conversation from tapes only to find a disfluency in it. “I’ve realized there’s something in there I didn’t notice before: this disfluency just appears.”
On the other hand, the experts also admit that after they’ve learned to tune in to disfluencies, it requires a deliberate effort to ignore them. Dan O’Connell, a retired psycholinguist, spent more than thirty years studying pauses and hesitations in people’s speaking. As a result, “when I’m speaking with people in a cocktail situation or an academic situation,” he told me, “I make it a very deliberate decision, I do not listen for these things. I think that’s profoundly impolite. I simply refuse to do it. If something comes up that you can’t ignore, then we joke about it and talk about it.” When it comes time to be a scientist, he looks at his watch, often in lectures. “I sit there with my sweep on my watch and count per minute. If that person begins to say something intelligent, I shift into my other mode of listening. That’s the way I do that. Just for fun. It’s really a matter of mental hygiene.”
The truth is, our language systems don’t have to work perfectly; they simply have to be good enough. Part of the way we deal with the complexity of language is to make guesses (ever so brief, and with low stakes) about what’s coming next. Such predictions can be a weakness: what happens when a listener predicts incorrectly? Or when he or she needs time to recover from an infelicitous prediction? For the most part, predictions are a strength; they keep us from getting overly hung up on noise and other errors in the signal. They also allow us to patch any gaps. This is how some disfluencies aid predictions, in the way that an “um” in the right place signals to the listener that he or she should pay more attention. Far from hindering communication, some disfluencies lubricate it. Their function has little to do with how we feel about them.
Social considerations also filter our listening: our ears perk up for everything a president says, but we pay less attention to what the ordinary working person says. Erving Goffman argued that listeners notice verbal blunders according to the social role that the speakers are expected to play. When a blunder indicates a deviation from the assigned social role, listeners are more likely to notice it. Formal speaking situations are one such case. (So are experimental situations.) Differences in social status trigger more attention, as well. The blunders aren’t themselves inherently funny; when the janitor says “this venereal institution,” you probably won’t laugh quite as hard as when the governor says it (as Ohio governor James A. Rhodes did in 1964). When ordinary people are on television, the audience has the same high expectations for their speaking. “And are you enjoying your honeymoon?” the announcer asks the newlywed. “I’m enjoying every inch of it!” the young woman exclaims, and the laughter from the audience breaks slowly, then engulfs her, as she puts her finger to her forehead, her eyes ablaze. In general, Goffman says, audiences are “on the prowl for faultables,” and what puts them on the prowl is an important but unanswered question.
Um-- Slips, Stumbles, and Verbal Blunders, and What They Mean