The Information

by James Gleick

But they were no slaves to fashion, these Oxford lexicographers. As a rule a neologism needs five years of solid evidence for admission to the canon. Every proposed word undergoes intense scrutiny. The approval of a new word is a solemn matter. It must be in general use, beyond any particular place of origin; the OED is global, recognizing words from everywhere English is spoken, but it does not want to capture local quirks. Once added, a word cannot come out. A word can go obsolete or rare, but the most ancient and forgotten words have a way of reappearing—rediscovered or spontaneously reinvented—and in any case they are part of the language’s history. All 2,500 of Cawdrey’s words are in the OED, perforce. For thirty-one of them Cawdrey’s little book was the first known usage. For a few Cawdrey is all alone. This is troublesome. The OED is irrevocably committed. Cawdrey, for example, has “onust, loaden, overcharged”; so the OED has “loaded, burdened,” but it is an outlier, a one-off. Did Cawdrey make it up? “I’m tending towards the view that he was attempting to reproduce vocabulary he had heard or seen,” Simpson said. “But I can’t be absolutely sure.” Cawdrey has “hallucinate, to deceive, or blind”; the OED duly gave “to deceive” as the first sense of the word, though it never found anyone else who used it that way. In cases like these, the editors can add their double caveat “Obs. rare.” But there it is.

For the twenty-first-century OED a single source is never enough. Strangely, considering the vastness of the enterprise and its constituency, individual men and women strive to have their own nonce-words ratified by the OED. Nonce-word, in fact, was coined by James Murray himself. He got it in. An American psychologist, Sondra Smalley, coined the word codependency in 1979 and began lobbying for it in the eighties; the editors finally drafted an entry in the nineties, when they judged the word to have become established. W. H. Auden declared that he wanted to be recognized as an OED word coiner—and he was, at long last, for motted, metalogue, spitzy, and others.♦ The dictionary had thus become engaged in a feedback loop. It inspired a twisty self-consciousness in the language’s users and creators. Anthony Burgess whinged in print about his inability to break through: “I invented some years ago the word amation, for the art or act of making love, and still think it useful. But I have to persuade others to use it in print before it is eligible for lexicographicizing (if that word exists)”♦—he knew it did not. “T. S. Eliot’s large authority got the shameful (in my view) juvescence into the previous volume of the Supplement.” Burgess was quite sure that Eliot simply misspelled juvenescence. If so, the misspelling was either copied or reprised twenty-eight years later by Stephen Spender, so juvescence has two citations, not one. The OED admits that it is rare.

As hard as the OED tries to embody the language’s fluidity, it cannot help but serve as an agent of its crystallization. The problem of spelling poses characteristic difficulties. “Every form in which a word has occurred throughout its history”♦ is meant to be included. So for mackerel (“a well-known sea-fish, Scomber scombrus, much used for food”) the second edition in 1989 listed nineteen alternative spellings. The unearthing of sources never ends, though, so the third edition revised entry in 2002 listed no fewer than thirty: maccarel, mackaral, mackarel, mackarell, mackerell, mackeril, mackreel, mackrel, mackrell, mackril, macquerel, macquerell, macrel, macrell, macrelle, macril, macrill, makarell, makcaral, makerel, makerell, makerelle, makral, makrall, makreill, makrel, makrell, makyrelle, maquerel, and maycril. As lexicographers, the editors would never declare these alternatives to be wrong: misspellings. They do not wish to declare their choice of spelling for the headword, mackerel, to be “correct.” They emphasize that they examine the evidence and choose “the most common current spelling.” Even so, arbitrary considerations come into play: “Oxford’s house style occasionally takes precedence, as with verbs which can end -ize or -ise, where the -ize spelling is always used.” They know that no matter how often and how firmly they disclaim a prescriptive authority, a reader will turn to the dictionary to find out how a word should be spelled. They cannot escape inconsistencies. They feel obliged to include words that make purists wince. A new entry as of December 2003 memorialized nucular: “= nuclear a. (in various senses).” Yet they refuse to count evident misprints found by way of Internet searches. They do not recognize straight-laced, even though statistical evidence finds that bastardized form outnumbering strait-laced. 

For the crystallization of spelling, the OED offers a conventional explanation: “Since the invention of the printing press, spelling has become much less variable, partly because printers wanted uniformity and partly because of a growing interest in language study during the Renaissance.” This is true. But it omits the role of the dictionary itself, arbitrator and exemplar.

For Cawdrey the dictionary was a snapshot; he could not see past his moment in time. Samuel Johnson was more explicitly aware of the dictionary’s historical dimension. He justified his ambitious program in part as a means of bringing a wild thing under control—the wild thing being the language, “which, while it was employed in the cultivation of every species of literature, has itself been hitherto neglected; suffered to spread, under the direction of chance, into wild exuberance; resigned to the tyranny of time and fashion; and exposed to the corruptions of ignorance, and caprices of innovation.”♦ Not until the OED, though, did lexicography attempt to reveal the whole shape of a language across time. The OED becomes a historical panorama. The project gains poignancy if the electronic age is seen as a new age of orality, the word breaking free from the bonds of cold print. No publishing institution better embodies those bonds, but the OED, too, tries to throw them off. The editors feel they can no longer wait for a new word to appear in print, let alone in a respectably bound book, before they must take note. For tighty-whities (men’s underwear), new in 2007, they cite a typescript of North Carolina campus slang. For kitesurfer, they cite a posting to the Usenet newsgroup alt.kite and later a New Zealand newspaper found via an online database. Bits in the ether.

When Murray began work on the new dictionary, the idea was to find the words, and with them the signposts to their history. No one had any idea how many words were there to be found. By then the best and most comprehensive dictionary of English was American: Noah Webster’s, seventy thousand words. That was a baseline. Where were the rest to be discovered? For the first editors of what became the OED, it went almost without saying that the source, the wellspring, should be the literature of the language—particularly the books of distinction and quality. The dictionary’s first readers combed Milton and Shakespeare (still the single most quoted author, with more than thirty thousand references), Fielding and Swift, histories and sermons, philosophers and poets. Murray announced in a famous public appeal in 1879:

A thousand readers are wanted. The later sixteenth-century literature is very fairly done; yet here several books remain to be read. The seventeenth century, with so many more writers, naturally shows still more unexplored territory.

He considered the territory to be large but bounded. The founders of the dictionary explicitly meant to find every word, however many that would ultimately be. They planned a complete inventory. Why should they not? The number of books was unknown but not unlimited, and the number of words in those books was countable. The task seemed formidable but finite.

It no longer seems finite. Lexicographers are accepting the language’s boundlessness. They know by heart Murray’s famous remark: “The circle of the English language has a well-defined centre but no discernable circumference.” In the center are the words everyone knows. At the edges, where Murray placed slang and cant and scientific jargon and foreign border crossers, everyone’s sense of the language differs and no one’s can be called “standard.”

Murray called the center “well defined,” but infinitude and fuzziness can be seen there. The easiest, most common words—the words Cawdrey had no thought of including—require, in the OED, the most extensive entries. The entry for make alone would fill a book: it teases apart ninety-eight distinct senses of the verb, and some of these senses have a dozen or more subsenses. Samuel Johnson saw the problem with these words and settled on a solution: he threw up his hands.

My labor has likewise been much increased by a class of verbs too frequent in the English language, of which the signification is so loose and general, the use so vague and indeterminate, and the senses detorted so widely from the first idea, that it is hard to trace them through the maze of variation, to catch them on the brink of utter inanity, to circumscribe them by any limitations, or interpret them by any words of distinct and settled meaning; such are bear, break, come, cast, full, get, give, do, put, set, go, run, make, take, turn, throw. If of these the whole power is not accurately delivered, it must be remembered, that while our language is yet living, and variable by the caprice of every one that speaks it, these words are hourly shifting their relations, and can no more be ascertained in a dictionary, than a grove, in the agitation of a storm, can be accurately delineated from its picture in the water.

Johnson had a point. These are words that any speaker of English can press into new service at any time, on any occasion, alone or in combination, inventively or not, with hopes of being understood. In every revision, the OED’s entry for a word like make subdivides further and thus grows larger. The task is unbounded in an inward-facing direction.

The more obvious kind of unboundedness appears at the edges. Neologism never ceases. Words are coined by committee: transistor, Bell Laboratories, 1948. Or by wags: booboisie, H. L. Mencken, 1922. Most arise through spontaneous generation, organisms appearing in a petri dish, like blog (c. 1999). One batch of arrivals includes agroterrorism, bada-bing, bahookie (a body part), beer pong (a drinking game), bippy (as in, you bet your ———), chucklesome, cypherpunk, tuneage, and wonky. None are what Cawdrey would have seen as “hard, usual words,” and none are anywhere near Murray’s well-defined center, but they now belong to the common language. Even bada-bing: “Suggesting something happening suddenly, emphatically, or easily and predictably; ‘Just like that!’, ‘Presto!’ ” The historical citations begin with a 1965 audio recording of a comedy routine by Pat Cooper and continue with newspaper clippings, a television news transcript, and a line of dialogue from the first Godfather movie: “You’ve gotta get up close like this and bada-bing! you blow their brains all over your nice Ivy League suit.” The lexicographers also provide an etymology, an exquisite piece of guesswork: “Origin uncertain. Perh. imitative of the sound of a drum roll and cymbal clash. Perh. cf. Italian bada bene mark well.”

The English language no longer has such a thing as a geographic center, if it ever did. The universe of human discourse always has backwaters. The language spoken in one valley diverges from the language of the next valley, and so on. There are more valleys now than ever, even if the valleys are not so isolated. “We are listening to the language,” said Peter Gilliver, an OED lexicographer and resident historian. “When you are listening to the language by collecting pieces of paper, that’s fine, but now it’s as if we can hear everything said anywhere. Take an expatriate community living in a non-English-speaking part of the world, expatriates who live at Buenos Aires or something. Their English, the English that they speak to one another every day, is full of borrowings from local Spanish. And so they would regard those words as part of their idiolect, their personal vocabulary.” Only now they may also speak in chat rooms and on blogs. When they coin a word, anyone may hear. Then it may or may not become part of the language.

If there is an ultimate limit to the sensitivity of lexicographers’ ears, no one has yet found it. Spontaneous coinages can have an audience of one. They can be as ephemeral as atomic particles in a bubble chamber. But many neologisms require a level of shared cultural knowledge. Perhaps bada-bing would not truly have become part of twenty-first-century English had it not been for the common experience of viewers of a particular American television program (though it is not cited by the OED).

The whole word hoard—the lexis—constitutes a symbol set of the language. It is the fundamental symbol set, in one way: words are the first units of meaning any language recognizes. They are recognized universally. But in another way it is far from fundamental: as communication evolves, messages in a language can be broken down and composed and transmitted in much smaller sets of symbols: the alphabet; dots and dashes; drumbeats high and low. These symbol sets are discrete. The lexis is not. It is messier. It keeps on growing. Lexicography turns out to be a science poorly suited to exact measurement. English, the largest and most widely shared language, can be said very roughly to possess a number of units of meaning that approaches a million. Linguists have no special yardsticks of their own; when they try to quantify the pace of neologism, they tend to look to the dictionary for guidance, and even the best dictionary runs from that responsibility. The edges always blur. A clear line cannot be drawn between word and unword.

So we count as we can. Robert Cawdrey’s little book, making no pretense to completeness, contained a vocabulary of only 2,500. We possess now a more complete dictionary of English as it was circa 1600: the subset of the OED comprising words then current.♦ That vocabulary numbers 60,000 and keeps growing, because the discovery of sixteenth-century sources never ends. Even so, it is a tiny fraction of the words used four centuries later. The explanation for this explosive growth, from 60,000 to a million, is not simple. Much of what now needs naming did not yet exist, of course. And much of what existed was not recognized. There was no call for transistor in 1600, nor nanobacterium, nor webcam, nor fen-phen. Some of the growth comes from mitosis. The guitar divides into the electric and the acoustic; other words divide in reflection of delicate nuances (as of March 2007 the OED assigned a new entry to prevert as a form of pervert, taking the view that prevert was not just an error but a deliberately humorous effect). Other new words appear without any corresponding innovation in the world of real things. They crystallize in the solvent of universal information.

What, in the world, is a mondegreen? It is a misheard lyric, as when, for example, the Christian hymn is heard as “Lead on, O kinky turtle …” In sifting the evidence, the OED first cites a 1954 essay in Harper’s Magazine by Sylvia Wright: “What I shall hereafter call mondegreens, since no one else has thought up a word for them.”♦ She explained the idea and the word this way:

When I was a child, my mother used to read aloud to me from Percy’s Reliques, and one of my favorite poems began, as I remember:

Ye Highlands and ye Lowlands,

Oh, where hae ye been?

They hae slain the Earl Amurray,

And Lady Mondegreen.

There the word lay, for some time. A quarter-century later, William Safire discussed the word in a column about language in The New York Times Magazine. Fifteen years after that, Steven Pinker, in his book The Language Instinct, offered a brace of examples, from “A girl with colitis goes by” to “Gladly the cross-eyed bear,” and observed, “The interesting thing about mondegreens is that the mishearings are generally less plausible than the intended lyrics.”♦ But it was not books or magazines that gave the word its life; it was Internet sites, compiling mondegreens by the thousands. The OED recognized the word in June 2004.

A mondegreen is not a transistor, inherently modern. Its modernity is harder to explain. The ingredients—songs, words, and imperfect understanding—are all as old as civilization. Yet for mondegreens to arise in the culture, and for mondegreen to exist in the lexis, required something new: a modern level of linguistic self-consciousness and interconnectedness. People needed to mishear lyrics not just once, not just several times, but often enough to become aware of the mishearing as a thing worth discussing. They needed to have other such people with whom to share the recognition. Until the most modern times, mondegreens, like countless other cultural or psychological phenomena, simply did not need to be named. Songs themselves were not so common; not heard, anyway, on elevators and mobile phones. The word lyrics, meaning the words of a song, did not exist until the nineteenth century. The conditions for mondegreens took a long time to ripen. Similarly, the verb to gaslight now means “to manipulate a person by psychological means into questioning his or her own sanity”; it exists only because enough people saw the 1944 film of that title and could assume that their listeners had seen it, too. Might not the language Cawdrey spoke—which was, after all, the abounding and fertile language of Shakespeare—have found use for such a word? No matter: the technology for gaslight had not been invented. Nor had the technology for motion pictures.

The lexis is a measure of shared experience, which comes from interconnectedness. The number of users of the language forms only the first part of the equation: jumping in four centuries from 5 million English speakers to a billion. The driving factor is the number of connections between and among those speakers. A mathematician might say that messaging grows not geometrically, but combinatorially, which is much, much faster. “I think of it as a saucepan under which the temperature has been turned up,” Gilliver said. “Any word, because of the interconnectedness of the English-speaking world, can spring from the backwater. And they are still backwaters, but they have this instant connection to ordinary, everyday discourse.” Like the printing press, the telegraph, and the telephone before it, the Internet is transforming the language simply by transmitting information differently. What makes cyberspace different from all previous information technologies is its intermixing of scales from the largest to the smallest without prejudice, broadcasting to the millions, narrowcasting to groups, instant messaging one to one.
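
The arithmetic behind that claim can be sketched in a few lines. This is only one simple reading of "connections," assumed here for illustration: every pair of speakers counts as one possible link. The function name `pairwise_links` is hypothetical, and the speaker counts are the round figures given above (5 million and a billion).

```python
from math import comb


def pairwise_links(speakers: int) -> int:
    """Count the distinct two-person links possible among `speakers` people."""
    return comb(speakers, 2)


# Round figures from the text; the pairwise model is an illustrative assumption.
speakers_1600 = 5_000_000
speakers_now = 1_000_000_000

links_1600 = pairwise_links(speakers_1600)
links_now = pairwise_links(speakers_now)

# How much the population of speakers grew.
speaker_growth = speakers_now // speakers_1600

# How much the number of possible pairs grew.
link_growth = links_now / links_1600

print(f"speakers grew {speaker_growth}-fold")
print(f"possible pairs grew roughly {link_growth:,.0f}-fold")
```

Under this assumption a 200-fold rise in speakers yields roughly a 40,000-fold rise in possible pairs. Counting groups of every size, rather than pairs alone, would grow faster still.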
