Nabokov's Favorite Word Is Mauve


by Ben Blatt




  Introduction

  1. Use Sparingly

  2. He Wrote, She Wrote

  3. Searching for Fingerprints

  4. Write by Example

  5. Guiltier Pleasures

  6. U.K. vs. U.S.

  7. Clichés, Repeats, and Favorites

  8. How to Judge a Book by Its Cover

  9. Beginnings and Endings

  Epilogue

  Acknowledgments

  About Ben Blatt

  Notes

  For my mother, Faith Minard.

  And for my friends at 44 Bow Street.

  Alexander Hamilton, James Madison, or John Jay?

  For more than 150 years, historians argued over the authorship of 12 essays in The Federalist Papers, founding documents in the American march toward democracy. Though the essays are world-famous hallmarks in the lexicon of American history, the specific authors of each one remained unknown. The question of which Founding Father penned the essays had sparked such endless debate that it had devolved into a popular parlor game among historians. Just who exactly wrote the stirring arguments upon which our governing structure was based?

  The answer was hidden in the words themselves, but to find it, scholars needed not a close reading but a close counting. They needed to look only at the numbers.

  The mystery began in late 1787, when a series of essays advocating the ratification of the Constitution was published in New York newspapers under the pen name “Publius.” Shielding the true identities of the authors with the patriotic nom de plume was a somewhat farcical endeavor. In fact, of the nearly 4 million people living in the United States in 1787, all but three could be eliminated from contention.

  It was an open secret that Hamilton, Madison, and Jay were the authors, but none of the three wanted to step forward and admit to writing any particular essays. Each had political ambitions, later rising to the ranks of Secretary of the Treasury, President, and Chief Justice of the Supreme Court, respectively, so they weren’t without good reason. But their excess of caution left the mystery of authorship intact, titillating history professors and armchair enthusiasts alike for many years to come.

  You might think that the scholars and astute politicos of the day would have been able to determine the authorship on their own. There were only three potential candidates, after all, each with his own political slant and style of communication. It would have been the equivalent of an anonymous editorial in the New York Times, penned by Barack Obama, Hillary Clinton, or Bernie Sanders. Or an unsigned manifesto by George W. Bush, John McCain, or Donald Trump. All might be coming from the same side, but they were certainly not all identical.

  In 1804, a solution finally seemed to emerge. Hamilton wrote a letter to his friend Egbert Benson listing the author of each essay. Hamilton was preparing to duel Aaron Burr. He sensed both the historical significance of The Federalist Papers and the chances of his survival. He decided not to let his knowledge of the authorship die with him.

  This should have been the end of the mystery. A nation of curious observers had no reason to doubt Hamilton’s firsthand knowledge. Yet 13 years later, soon after ending his second term as President, Madison put out his own list of authorship—one that differed from Hamilton’s. Twelve of the essays that Hamilton claimed to have written were also claimed by Madison.

  This reopened the debate with a new fervor, fueling spats among historians for more than a century. In 1892, future senator Henry Cabot Lodge wrote on the topic siding with Hamilton, while noted historian E. G. Bourne went with Madison.

  Most historians tried to tease out the authors based on the political ideology presented in each essay. Would Madison really have argued for a central bank in those certain terms? Would Hamilton have supported limits on Congress so freely? Or maybe that’s something John Jay would have written?

  It wasn’t until 1963, nearly two centuries later, that the mystery was at long last solved. The definitive answer came from respected professors Frederick Mosteller of Harvard University and David Wallace of the University of Chicago. However, unlike the many professors who had attempted to answer the question before them, Mosteller and Wallace were not historians. They were not known for their scholarly work on early America. They had never published a paper on historical figures at all. Mosteller and Wallace were statisticians.

  One of Mosteller’s most noteworthy papers dealt with the World Series and whether or not seven games was enough to statistically find the best baseball team. Just a few years prior to looking into the authorship problem, Wallace had published a paper named “Bounds on Normal Approximations to Student’s and the Chi-Square Distributions,” which probably sounds as close to nonsense to you as the thought of probability functions solving a historical mystery sounded to history professors in 1963.

  Mosteller and Wallace’s methodology for ending the authorship debate had nothing to do with politics or ideologies. Instead, they were two of the first statisticians to leverage word frequency and probability.

  Their process was in some ways complex, featuring equations with factorials, exponents, summations, logarithms, and t-distributions. But the heart of their methods was strikingly simple:

  • Count the frequency of common words in essays that we know either Hamilton or Madison wrote.

  • Count the frequency of those same words in essays where the author is unknown.

  • Compare these frequencies to determine the author of the disputed essays.
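  The three steps above can be sketched in a few lines of Python. This is a toy illustration only, not Mosteller and Wallace’s actual procedure, which rested on a fuller probability model. The function names, the invented sample sentences, and the simple nearest-profile comparison are all assumptions made for the sake of the example.

```python
import re
from collections import Counter

def rates_per_thousand(text, markers):
    """Return how often each marker word occurs per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return {m: 1000 * counts[m] / total for m in markers}

def closest_author(known_texts, disputed_text, markers):
    """Pick the known author whose marker-word rates are nearest the disputed text's."""
    target = rates_per_thousand(disputed_text, markers)

    def distance(author):
        profile = rates_per_thousand(known_texts[author], markers)
        return sum(abs(profile[m] - target[m]) for m in markers)

    return min(known_texts, key=distance)

# Invented stand-in sentences, not real Federalist text.
known = {
    "Hamilton": "while the states act upon this plan, while others wait upon it",
    "Madison": "whilst the states also deliberate, whilst others also wait",
}
disputed = "whilst the convention also met"
print(closest_author(known, disputed, ["while", "whilst", "upon", "also"]))
```

  Rates are expressed per 1,000 words so that long and short essays can be compared on equal footing. A serious analysis would need far more text and a proper statistical model rather than a simple sum of differences.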

  Even before any of the fancy probabilistic equations come into play, the results of the statisticians’ approach seem wonderfully obvious in retrospect. In The Federalist Papers, Madison used the word whilst in over half the essays in which his authorship had been confirmed—but he never once used the word while. Hamilton, meanwhile, used the word while in about one-third of his essays but never once used whilst.
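  The while/whilst comparison is a statement about how many essays contain a word at all, rather than how often the word appears. A minimal sketch of that calculation follows; the function name and the placeholder essays are hypothetical and are not drawn from The Federalist Papers.

```python
import re

def share_of_essays_using(essays, word):
    """Fraction of essays in which `word` appears at least once as a whole word."""
    if not essays:
        return 0.0
    # \b keeps "while" from matching inside a longer word such as "meanwhile".
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return sum(1 for essay in essays if pattern.search(essay)) / len(essays)

# Invented placeholder essays, not real Federalist text.
madison_essays = [
    "Whilst the union endures, liberty is safe.",
    "The states, whilst sovereign, must cooperate.",
    "A republic requires virtue.",
]
print(share_of_essays_using(madison_essays, "whilst"))  # 2 of the 3 essays
print(share_of_essays_using(madison_essays, "while"))   # none of the 3 essays
```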

  Mosteller and Wallace did not rely on a single word for their analysis, however. That would not have been statistically sound. Instead, they systematically chose dozens of basic words and then found the frequency of each in the disputed essays. Many words, entirely nonpolitical in meaning, turned out to have drastically different usage rates between the two authors. For example, Madison used also twice as often as Hamilton, while Hamilton used according much more frequently than Madison.

  Mosteller and Wallace had falsifiability on their side. They could show that by using the same methods on papers where the author was known, they could determine the authorship with perfect results. Of the 12 disputed essays, Mosteller and Wallace concluded that James Madison was the actual author of all 12.

  In the written summary of their results the two statisticians proceeded with caution, perhaps out of fear of angering historians who had been scratching their heads for generations. The numbers presented in their experiment told a different story; the two had complete confidence in the method. It was flawless in all the test cases where authorship was known, and its results were consistent in the essays with unknown authorship. Hamilton’s claim of authorship was wrong.

  Today, after countless more studies of the papers in both statistical and nonstatistical manners, Mosteller and Wallace’s finding that Madison was the author has become the consensus among statisticians and historians alike. Mosteller and Wallace were ahead of their time. Their study, though it involved some formulaic complexity, relied essentially on counting words. With today’s computers, word counts and frequencies are trivial pursuits. In 1963, this was not the case.

  Word counts were done by hand; to find the number of times the word upon appeared in each of the essays, for example, they tallied the usage page by page. To understand what Mosteller and Wallace went through (or at least what their research assistants went through), I printed out a complete collection of The Federalist Papers and set out to count the number of times upon appeared. After 30 minutes I was only one-eighth of the way through—about 40 pages—and had counted 37 instances of the word upon. It wasn’t long before my eyes were pounding and my brain went numb. “Where’s Upon?” was like a devilish version of “Where’s Waldo?”

  I gave up on pretending I was in 1963. Instead I did some counting only possible with twenty-first-century technology: I went to Google, searched “Federalist Papers Complete Text File,” downloaded a link from the first result, and opened the file in Microsoft Word. After a grand total of two minutes, a “Find All” on upon turned up 46 occurrences of the word in the section I had covered. Not only was the computerized method 28 minutes faster, it was far more accurate than my weary eyes could be.

  Even more staggering: Though the amount of time needed for a person to scan The Federalist Papers in full for an additional word would hover around four hours, scanning via computer for all words would take a negligible amount of time. Doing a similar analysis on the complete works of Shakespeare, the Bible, Moby Dick, or even the corpus of English literature would have been unfathomable to Mosteller and Wallace. Today, using computers to count the instances of a single word in a large text is a task mastered by most teenagers.
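  To make the contrast concrete, here is a minimal sketch of counting every word in a text in one pass. The sample sentence is invented, and the tokenizing rule (lowercase letters with an optional internal apostrophe) is an assumption; a real text would be read from a file instead.

```python
import re
from collections import Counter

def word_counts(text):
    """Tally every word in the text in a single pass, ignoring case."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))

# Invented sample sentence standing in for a full text file.
sample = "Upon reflection, the decision rests upon the people; upon them alone."
counts = word_counts(sample)

print(counts["upon"])         # 3
print(counts.most_common(2))  # the two most frequent words with their counts
```

  Because the tally covers every word at once, looking up a second or a hundredth word afterward costs essentially nothing, which is the point made above about scanning for all words.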

  In the fifty years since Mosteller and Wallace published their study, the field of computer-aided text processing has grown rapidly. Google uses text analysis both in its search results and in deciding what ads to show you. Researchers have tried to use text analysis to determine what makes a tweet go viral, while media outlets often run similar versions of the same headline with slight tweaks in wording to maximize page views. But the uses thought up so far by tech companies are only one possible route.

  Mosteller and Wallace used statistics to investigate a single question of authorship, but the success of their experiment pointed to something more profound: writers have distinct styles that are both consistent and predictable. As it turns out, it’s not just eighteenth-century politicians who leave a stylistic fingerprint. Authors of all books, whether they be popular and renowned or obscure and reviled, repeat their words and structure over decades of writing.

  The question Mosteller and Wallace asked and answered was limited in its scope, but text analysis can answer a huge range of questions that have intrigued curious writers and readers for generations. Did Ernest Hemingway actually use fewer adverbs than other writers? How does reading level affect the popularity of a book? Do men and women write differently? Do writers follow their own advice, and is that advice any good? What, besides superficial spellings, distinguishes American and British novelists? From Vladimir Nabokov to E L James, what are our favorite authors’ favorite words?

  While there has been a slowly growing movement in academia to investigate the writing patterns of successful authors, there are still enormous questions that have yet to be explored. And for everyone from the casual reader to the literature major to the aspiring writer, these questions are both fascinating and useful. You probably don’t care about the Poisson distribution or the parsing programs used to decipher parts of speech, but you probably do want to know how your favorite author writes—and what that might mean about you as a reader.

  The analytical approach to writing can be amusing and informative and often downright funny. Moreover, it can teach us about the writers we read every day and the words we use in our own writing. That’s what we’ll delve into in this book, devoting each chapter to a new literary experiment.

  The research won’t be painfully complex. It doesn’t need to be, and shouldn’t be, in order to be worthwhile. Many obvious and intriguing questions about classic literature or the modern bestseller can be viewed through a statistical lens but just haven’t been framed that way yet. This book is about tackling these simple yet unique questions in a new way. It’s a book about words that is, paradoxically, written with numbers.

  The road to hell is paved with adverbs.

  —STEPHEN KING

  In literary lore, one of the best stories of all time is a mere six words. “For sale: baby shoes, never worn.” It’s the ultimate example of less is more, and you’ll often find it attributed to Ernest Hemingway.

  It’s unclear whether it was in fact Hemingway who penned these words—the story of its creation did not appear until 1991—but it’s natural that writers and readers would want to attribute the story to the Nobel winner. He’s known for his economical prose, and the shortest-of-short stories is, at the very least, emblematic of his style.

  Hemingway’s simple style was an intentional choice. He once wrote in a letter to his editor, “It wasn’t by accident that the Gettysburg address was so short. The laws of prose writing are as immutable as those of flight, of mathematics, of physics.” He believed that writing should be cut down to the bare essentials and that extra words end up hurting the final product.

  Ernest Hemingway is far from alone in this belief. The same idea is raised in high-school classrooms and writing guides of every variety. And if there’s one part of speech that’s the worst offender of all, as anyone who’s ever had an exacting English teacher will know, it’s the adverb.

  After listening to enough experts and admirers, it’s easy to come away with the impression that Hemingway is the paragon of concision. But is this because he succeeded where others were tempted by extraneous language, or is he coasting on reputation alone? Where does Hemingway rank, for instance, in his use of the dreaded adverb?

  I wanted to find out if he lived up to the hype. And if not, who does use the fewest adverbs? Which author uses them the most? Moreover, when we look at the big picture, can we find out whether great writing does indeed hew to those efficient “laws of prose writing”? Do the best books use fewer adverbs?

  * * *

  I looked around and found that no one had ever attempted to determine the numbers behind these questions. So I sought to find some answers—and I started by analyzing the almost one million words in Hemingway’s ten published novels.

  If Hemingway believed that the “laws of prose writing are as immutable as those of flight, of mathematics, of physics,” then I’d like to think he would have found this mathematical analysis equal parts illuminating and outlandish.

  It’s outlandish at first glance because of the way we study writing. Many of us have spent days in middle school, high school, and college English classrooms dissecting a single striking excerpt from a Hemingway novel. If you want to study a great author’s writing, their most remembered passages are often the best place to start. Looking at a spreadsheet of adverb frequencies, on the other hand, won’t teach you much in the way of writing a novel like Hemingway.

  But from a statistician’s point of view, it’s just as outlandish to focus on a small sample and never look at the whole picture. When you study the population of the United States, you wouldn’t look at just the population of a small town in New Hampshire for an understanding of the entire country, no matter how emblematic of the American spirit it may seem. If you want to know how Hemingway writes, you also need to understand the words he chooses that have not been put under the microscope. By looking at adverb rates throughout all his books, we can get a better sense of how he used language.

  So instead of digging through snippets of Hemingway’s text and debating specific spots where he chose to use or shirk adverbs, I used a set of functions called Natural Language Toolkit to count the number of adverbs in all of his novels. The toolkit relies on specific words and the relationships between them to tag words with a part of speech. For example, here’s how it processes the previous sentence:
  [Figure: the previous sentence with each word tagged by part of speech]

  It’s not 100 % perfect, so all the numbers below should be seen with that wrinkle in mind, but it’s been trained on millions of human-analyzed texts and fares as well as any person could be expected to do. It’s considered the gold standard in sussing out whether a word is an adjective, adverb, personal pronoun, or any other part of speech.
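  As a rough sketch of how an adverb rate can be computed once words carry part-of-speech tags, the snippet below works on (word, tag) pairs. The tags follow the Penn Treebank convention, in which RB, RBR, and RBS mark adverbs. The function name is hypothetical, the sample is tagged by hand rather than by the toolkit, and skipping punctuation tokens is an assumption about how words are counted.

```python
# Penn Treebank tags for adverbs: RB (base), RBR (comparative), RBS (superlative).
ADVERB_TAGS = {"RB", "RBR", "RBS"}

def adverb_rate(tagged_tokens):
    """Share of word tokens tagged as adverbs; punctuation tokens are skipped."""
    words = [(tok, tag) for tok, tag in tagged_tokens
             if any(ch.isalnum() for ch in tok)]
    if not words:
        return 0.0
    adverbs = sum(1 for _, tag in words if tag in ADVERB_TAGS)
    return adverbs / len(words)

# Hand-tagged for illustration. With NLTK and its tagger data installed,
# similar pairs would come from nltk.pos_tag(nltk.word_tokenize(text)).
tagged = [
    ("The", "DT"), ("adverb", "NN"), ("is", "VBZ"), ("not", "RB"),
    ("your", "PRP$"), ("friend", "NN"), (".", "."),
]
print(adverb_rate(tagged))  # 1 adverb out of 6 words
```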

  So what do we find when we apply the toolkit to Hemingway’s complete works?

  In all of Hemingway’s novels, he wrote just over 865,000 words and used 50,200 adverbs, putting his adverb use at about 5.8 % of all words. On average, for every 17 words Hemingway wrote, one of them was an adverb.
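  The arithmetic behind those two figures can be checked directly. The inputs below are the rounded totals quoted above, so the results are approximate.

```python
total_words = 865_000  # "just over 865,000", rounded
adverbs = 50_200

rate = adverbs / total_words              # share of words that are adverbs
words_per_adverb = total_words / adverbs  # average number of words per adverb

print(f"{rate:.1%}")            # 5.8%
print(round(words_per_adverb))  # 17
```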

  This number without context has no meaning. Is 5.8 % a lot or a little? Stephen King, an outspoken critic of adverbs, has a usage rate of 5.5 %.

  It turns out that by this standard King and Hemingway are not leaps and bounds ahead of other writers. Looking at a handful of contemporary authors who one might assume (based on stereotype alone) would use an abundance of adverbs, we see that King and Hemingway are not anomalous. E L James, author of the erotica novel Fifty Shades of Grey, used adverbs at a rate of 4.8 %. Stephenie Meyer, whom King has called “not very good,” used adverbs at a rate of 5.7 % in her Twilight books, putting her right between the horror master and the legendary Hemingway.

  Expanding our search, Hemingway used more adverbs than authors John Steinbeck and Kurt Vonnegut. He used more adverbs than children’s authors Roald Dahl and R.L. Stine. And, yes, the master of simple prose used more adverbs than Stephenie Meyer and E L James.

  All the sentences above are true—but they also need a giant asterisk next to them and a full explanation. Because the answer is not as simple as the numbers above first suggest.

  Those tallies are counts of total adverb usage. An adverb is any word that modifies a verb, adjective, or another adverb—and no adverbs were excluded or excused. But when Stephen King says, “The adverb is not your friend,” he’s not talking about any word that modifies a verb, adjective, or another adverb. In the sentence “The adverb is not your friend,” the word not is an adverb. But not is not King’s issue. Nobody reads “For sale: baby shoes, never worn” and thinks never is an adverb that should have been nixed.

 
