How to Wreck a Nice Beach
Bell Labs described the vocoder in terms of analysis and synthesis. It divided voice frequencies into ten bands at 300 hertz each, one-tenth of the bandwidth required for phone conversations at the time. Each filter measured the voltage required for its speech frequency range. This information was low-pass filtered at 25 hertz and transmitted to the receiving end, which determined a fundamental pitch frequency (in the eleventh channel) and an unvoiced noise signal to synthesize the voice. Bell Labs favored synthesize over reconstruct because the latter implied flawless duplication.
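The arithmetic behind that description can be sketched in a few lines of Python. This is only an illustration, not Bell Labs' own reckoning: it assumes the ten 300-hertz bands sit edge to edge starting at 0 hertz, and that each of the eleven channels (ten bands plus pitch) occupies about 25 hertz once low-pass filtered. The function and constant names are hypothetical.

```python
# Rough bandwidth arithmetic for the vocoder described above.
# Assumptions (not from Bell Labs documents): bands are contiguous and
# start at 0 Hz; every control channel occupies about 25 Hz.

BAND_COUNT = 10          # analysis bands
BAND_WIDTH_HZ = 300      # width of each band
CONTROL_WIDTH_HZ = 25    # low-pass cutoff applied to each measurement
PITCH_CHANNELS = 1       # the eleventh channel carries the pitch


def band_edges(count=BAND_COUNT, width=BAND_WIDTH_HZ):
    """Return (low, high) edges in hertz for each analysis band."""
    return [(i * width, (i + 1) * width) for i in range(count)]


def speech_bandwidth_hz():
    """Total span of voice frequencies covered by the analysis bands."""
    return BAND_COUNT * BAND_WIDTH_HZ


def transmitted_bandwidth_hz():
    """Bandwidth of the slowly varying control signals actually sent."""
    return (BAND_COUNT + PITCH_CHANNELS) * CONTROL_WIDTH_HZ


if __name__ == "__main__":
    for low, high in band_edges():
        print(f"{low:>4}-{high:<4} Hz")
    ratio = transmitted_bandwidth_hz() / speech_bandwidth_hz()
    print(f"sent {transmitted_bandwidth_hz()} Hz of control signal "
          f"for {speech_bandwidth_hz()} Hz of speech ({ratio:.1%})")
```

Under these assumptions the control signals add up to 275 hertz against 3,000 hertz of speech, a little under a tenth.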
“The first telephone cable in 1956 had fourteen telephone channels,” said Manfred Schroeder, hired by Bell Labs in 1954. “With the vocoder we could’ve made this into four hundred. That would’ve been wonderful. Between England and the United States, just fourteen channels wasn’t enough. By the time we were ready, they had their satellites up with hundreds and thousands of channels. And by the time we had things that would’ve augmented the satellites, they had fiber-optic cables running through the Atlantic. So there was never much technical interest in what we did. Bandwidth was no longer a concern—this thing we had worked on for all our lives. But then came the Internet and cell phones. Now you are using compressed speech. That is Linear Prediction vocoder technology.”
Teeth sibilant sketches from Homer Dudley’s lab notebook. Dudley had been researching the early mechanical throats of Kratzenstein (1779, Russia), W.R. von Kempelen (Vienna) and Abbe Mical (France). (Courtesy AT&T Archives and History Center) (illustration credit 1.10)
Bell Labs’ classified “secrecy system” vocoder bible from 1939-1945, with redactions done by razor blade. “One scarcely realizes how many sounds are of a random nature until he starts giving voice to them,” wrote Homer Dudley, who spent much of the Cold War studying the speech mechanism of parrots. (Courtesy Ralph Miller) (illustration credit 1.11)
The Bell Labs vocoder bible, with memos typed on tracing paper, leaving tiny windows of “subtext.” Includes testing for the Whisper Condition for SIGSALY, the first transmission of digital speech. (Courtesy Ralph Miller) (illustration credit 1.12)
Linear Predictive Coding originated as a computer model of the vocal tract shape. “Compression was the starting point of what you see in cell phones,” says Bishnu Atal, an innovator of LPC who worked closely with Dudley at Bell Labs in the Fifties. “The idea [of compression] was dormant until 1980, when we started applying it to cell phones. The vocoder was a dirty narrow word because, after thirty years, it still couldn’t carry a telephone conversation. But now it’s all about compression—Dudley’s original idea.”
SHOE BENCH
Homer Dudley, a physicist who kept bees, joined the Bell Labs Acoustics Research Department in 1921, the year the word robot was first placed in circulation by the Czechoslovakian writer Karel Capek, in his play R.U.R. Ralph Miller remembers Bell Labs being somewhat insulated from the Depression. “One of the things about Bell Labs in those days [was] if you had ideas about something that might help the telephone, they let you go off in the corner and work on it. When Homer got out of the hospital he started making proposals.”
SMILE, SNEEZE AND PREACH
The Euphonia was a mechanical speech keyboard invented in 1835 by Joseph Faber, a hypochondriac land-surveyor for the emperor of Vienna. Controlled by sixteen keys and a foot pedal for its lung bellows, Faber’s “verse-grinding machine” was well received in London, such that the press suggested replacing the House of Commons with Automatons. The Euphonia would later appear in the Fred Perkins story “Manufactory,” published in 1873. Perkins’ robot Patent Ministers could smile, sneeze and preach “The Discourse of Lukewarmness and Zeal.” One Presbyterian android malfunctions during a sermon and explodes, causing “permanent derangement” in the congregation. (illustration credit 1.13)
At Bell Labs, located at 463 West Street in Manhattan, the tongue was referred to as a lumped impedance structure and the letter k was a miniature explosive impulse. There, human speech was generated from the most inhuman of noises. Donkeys talked, storms howled sentences, and church bells scolded, “Stop! Stop! Don’t do that!” The witches of Macbeth cackled “deeper than the deepest bass.” The roar of the surf, Niagara Falls, submarine engines, tap dancing, birds, disturbed leaves—all had something to say. The vocoder had an anthropomorphic ball, often on reality’s dime. Memos typed on tracing paper addressed problems with “buzz-saw quality” and “jars of severity.” The term “shoe bench” was subjected to rigorous testing and spectral analysis, and ultimately recorded to vinyl. Other times, a simple “shh” would do, as if engineers were shushing the machine. Wrote Dudley in 1944: “One scarcely realizes how many sounds are of a random nature until he starts giving voice to them.”
In 1933, Homer Dudley was promoted to head of the Bell Labs Acoustics Research Department. Not being a people person, he suffered a nervous breakdown and was diagnosed with acute colitis, an inflammation of the colon that triggered a near-fatal effluxion. So the beekeeper who tore speech to pieces was back in the hospital, staring at the ceiling and thinking about the vocoder, waiting for his blood to coagulate, while the goldfish twitched about in its bowl without a memory to occupy itself.
BUZZ HISS DRAGON
In Robert Heinlein’s 1951 book, Between Planets, the Voder appears strapped to the chest of a seven-eyed dragon from Venus. (Pictured above with the glassed-in tricycle bug coffin.) The dragon pounds the keyboard with his tendrils, as if to clear his throat, and thus discovers his favorite word: Shucks. (illustration credit 1.14)
MY TEACHER’S SCREWY
According to a lab notebook entry from 1929, Dudley’s vocoder was influenced by the work of Karl Willy Wagner, a German pioneer in automated speech research. Wagner developed a vowel synthesizer in 1936, a year after Dudley filed his signal transmission patent for the vocoder. In 1936, the vocoder made its public debut at Harvard University’s Tercentenary Celebration in Cambridge, Massachusetts. Standing before his seven-foot tower of dials, Homer Dudley conducted paralinguistic cues, exaggerating the pitch and jumping between English and Swedish, often landing on the last syllable with both feet. He called it “The Greta Garble Effect,” a joke on coherence. He then dropped into a power monk drone, as if trying to inhale the audience gasp. He harmonized with his “electrical doubles,” did the Trans-Homer Express with a steam engine, and rasped an Exorcist version of Suzy Seashells. There were vocoder recordings of Protestant hymns, forlorn ballads, and an ad for Silly Willie Toothpaste (“You can use it to polish your car!”). For “Barnacle Bill,” the pitch bottomed out like an ogre suffering from moat throat. “Happy Birthday” sounded like Mr. Bill in drag, pitched so high that the Hopewell Herald would later write that “a coloratura soprano canary would have burst its vocal plumbing.”
At another vocoder demonstration at the Franklin Institute in Philadelphia, Dudley had a power line humming as “the voice of electricity,” lighting houses and claiming to be power itself, before wartime blackouts revived the candle. Thirty-six years later, Düsseldorf electronic group Kraftwerk—named after a power plant—would cover this plug tune in German, through a vocoder.
In 1939, while the Voder played the ham at the World’s Fair, a screenwriter named Gilbert Wright discovered his electric razor could speak by essentially putting his Adam’s apple on vibrate. This shaving epiphany would be patented as the Sonovox, a device that would give machines—and products—a voice. When held to the throat, the Sonovox’s vibrating speaker cones could channel phonograph recordings of airplanes, engines and vacuum cleaners from the studio console, allowing humans to pantomime noise into speech. Ideal for radio spots and jingles—and ultimately counting down the hits with Casey Kasem on American Top 40—the Sonovox’s major debut would be in a wheezing steam engine in Dumbo.
At the Acoustical Society of America in Manhattan, Homer Dudley would hype the vocoder’s commercial appeal for cartoon voices, something beyond the helium quacks of Disney. In the summer of 1939, while the Sonovox appeared in Time magazine, Dudley visited MGM Studios in Hollywood, offering the vocoder as “a scientific aid to movie stars,” claiming his invention could revive silent stars canned by the advent of talkies. Actors could essentially swap larynges, enunciating the pitch provided by a “surrogate throat.” With its overdubbing potential, the vocoder could airbrush defective voices and create the illusion that actors could sing. This pitch-doctoring anticipated the Auto-Tune software in pop music today. Gee-whizzing in the Los Angeles Times, Philip Scheuer claimed the vocoder could transform any voice into “The Voice True”: “a squeak into an oratorio, a bumpkin into Barrymore, a hash-slinger into Lily Pons.”
At the Hollywood demo, Dudley’s associate Charles Vadersen sang “How Dry I Am” through the vocoder, multiplexing his voice, and then became an airplane taking a nosedive. Dudley then “fluttered” the pitch controls and triggered a domestic squabble with himself, spanning three generations. Father scolds daughter (“Nevermind that flip talk!”), daughter back-talks (“My teacher’s screwy, Daddy!”), Grandpa Gizzard warbles in, and mother takes a hit of scotch. Said the LA Times, “Anything so wondrous, so stupendous, so complicated and so confusing must find a place in movie-making.”
BIRD WINGS AND FROGS, LINE ONE
In his 1910 book, The History of the Telephone, Herbert Casson describes the telephone as a “living, conscious being, half human and half machine.” In addition to picking up transmissions from the deceased, the phone also acquired “noises! … Spluttering and bubbling, jerking and rasping, whistling and screaming. There were the rustling of leaves, the croaking of frogs, the hissing of steam, and the flapping of bird wings … scraps of talk from other phones, and curious little squeals that were unlike any known sound.”
WORD’S FEAR
Hollywood would have to wait. All things wondrous, stupendous, complicated and confusing must report to the army first. Though the World’s Fair could make claims on the future, the military officially had dibs on tomorrow. Long before the vocoder played the voice of a missile-happy Cold War supercomputer in 1970’s Colossus: The Forbin Project, it held an underground desk job, scrambling the phone calls of the army’s triple-chinned brass. Patriotic orders to fill, eggs to scramble. Things to come, things to do.
Writing in the New Yorker, Martian-mongerer H.G. Wells predicted that the World’s Fair would introduce teleconferencing, a snooze button of a prophecy but less dooming than the atomic conflict he foresaw in his 1914 book The World Set Free. Ray Bradbury, the loud blond dreamer, was terrified. No topless squid lady could distract him from the prospect of the sky above whistling straight to hell.
Those at the Fair who eavesdropped on Bradbury’s free call to Los Angeles probably just admired the clarity, marveling at voices shooting across time zones. Perhaps they mistook his modulated quaver for homesickness, not the fear that he’d never again see his parents. I love you. I miss you. I’m broke.
“We were a few weeks away from World War Two,” he tells me. “The sense then was that in a few months the world was going to destroy itself. The world then proceeded to kill forty million people. I thought I might be destroyed too. I looked up into the sky, smelled gun powder and saw the war coming.” That night, July 4th, standing in the glow of the fireworks, the world’s blindest stegosaurus fan saw the sky on fire and cried.
Charles Vadersen may be the first vocoder singer. Here he demonstrates the “Greta Garble Effect,” circa 1936. Vadersen would also sing vocoder versions of “Old Man River” and “Barnacle Bill.” (Courtesy AT&T Archives and History Center) (illustration credit 1.15)
It is not advisable to be without SIGSALY service.
— General Franklin E. Stoner
Ralph Miller’s patents for artificial speech reconstruction and encoding, photographed in 2009. Miller also engineered a condensed, mobile version of SIGSALY called “Junior X,” which used an eight-channel vocoder that fit in a van but was never deployed. (Courtesy Ralph Miller) (illustration credit 2.1)
PLAYOFEEN CRINKONOPE
While Bell Labs thought the vocoder could rehabilitate shell-shocked mutes, the voice of Winston Churchill—England’s prolific speech synthesizer—was caught hurtling through the ether along the coast of South Holland. In 1941, a German radio outpost near the beach in Noordwijk had been intercepting the prime minister’s phone calls, unscrambling them as if they were no more than a boy scout flipping blinds. The Bell Labs scrambler of choice—the A-3—had been compromised by Hitler’s post office (Forschungsstelle). “High-placed people in the government had been using the Trans-Atlantic Telephone,” says Ralph Miller, assigned by Bell Labs to resolve the problem. “They thought it was secret. It was secret after a fashion. You split speech up into bands and mess it all up by scrambling. But it wasn’t hard to undo, either, and apparently the Germans had done just that.”
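Miller’s point, that a band scramble is easy to undo, can be illustrated with a toy example. The sketch below is not the A-3’s actual scheme: it assumes speech has already been split into five labeled bands and that scrambling is nothing more than a fixed reordering of them. The band count, the key, and all function names are hypothetical.

```python
# Toy band scrambler: reorder frequency bands with a fixed permutation.
# Hypothetical illustration only; the A-3's real band plan is not given
# in the text, so the band count and permutation here are made up.

def scramble(bands, permutation):
    """Place the band at position permutation[i] into output slot i."""
    return [bands[p] for p in permutation]


def invert(permutation):
    """Build the permutation that undoes `permutation`."""
    inverse = [0] * len(permutation)
    for slot, source in enumerate(permutation):
        inverse[source] = slot
    return inverse


def unscramble(scrambled, permutation):
    """Recover the original band order from a scrambled list."""
    return scramble(scrambled, invert(permutation))


if __name__ == "__main__":
    bands = ["band0", "band1", "band2", "band3", "band4"]
    key = [3, 0, 4, 1, 2]
    mixed = scramble(bands, key)
    print(mixed)
    print(unscramble(mixed, key))
```

With five bands there are only 120 possible orderings, few enough to try exhaustively, which is one way to see why a reshuffling of bands offers little protection by itself.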
I’m sitting in Ralph Miller’s living room in Concord, Massachusetts, among piles of documents, many of which haven’t seen daylight since the Eisenhower Administration. The floor is covered with patents. Transmission and Reconstruction of Artificial Speech. Conference Calls Between Vocoder and Analogue Stations. Determination of Pitch Frequency. There is a diagram for synchronizing turntables on opposite sides of the world.
Ralph glances at the vocoder scheme under his penny loafer and shakes his head. “When did I do all of that?”
It’s been a while. At the age of 102, Ralph Miller can still tell you how to subdivide human speech. Dressed sharply in blue houndstooth pants and a putting-green cardigan, Ralph introduced himself to me as the guy who “once controlled the frequency of New York City.” He doesn’t own a computer, but his memory is beholden to the technology he helped innovate, including Pulse Code Modulation (PCM), a conversion of analog to digital signals now applicable to cell phones, computers and synthesizers. PCM’s digitization of the voice would be key in the sampling and coding aspect of the SIGSALY system Miller helped design during World War II. Among the documents on his floor is a letter from an AT&T attorney, dated 1976, issuing Miller a SIGSALY patent, finally declassified thirty-five years after it was filed. “[Bell Labs] had found eighty patents for speech scrambling,” Ralph says. “Not one of them was any good. They all could be unscrambled by scientific people.” One Bell memo described a speech inverter that turned the word “telephone” into “playofeen crinkonope.” Even before German spies were cracking codes like beer nuts, rumors circulated about people eavesdropping on London stockbrokers and calling New York before breakfast to score on a hot tip when the market opened.
Ralph Miller (on phone) talks seashells through the vocoder, circa 1953. (Courtesy AT&T Archives and History Center) (illustration credit 2.2)
On September 30, 1941, US Army Chief of Staff General George Marshall wrote Bell Labs requesting help with the military’s “communications problems.” Three months later, on the morning of December 7, after Marshall had received an intercept of a Japanese ultimatum, he notified Pearl Harbor by Western Union, due to his distrust of the phone. His warning would spend its day of infamy in transit. By the time the teletype message finally arrived in Honolulu, Pearl Harbor had been transformed from a base to a motive. As the US publicly entered the war, there was a cry for “Indestructible Speech,” illocution that could withstand the codebreaker.
During that first week of December in 1941, an article titled “Methods for the Automatic Scrambling of Speech” appeared in the Swiss journal Brown Boveri Review, describing a device that was a vocoder in all but name. The National Defense Research Committee (NDRC), then co-chaired by Bell Labs president O.E. Buckley, took note, and a test unit designed by Ralph Miller, O.O. Gruenz and others was ready by March 1942.
In the spring of 1942, the crowd at the New York Yankees season opener heard a dive bomber growling “Remember Pearl Harbor!” and “Slap a Jap” over the PA system. It was the Sonovox, the vocoder’s Hollywood rival. That year, General Electric’s WGEO had been broadcasting Sonovox messages to Allied forces over shortwave radio. The wind whispered, “Give us revenge” and a bomb screamed, “Kill, kill, kill!” Back home, NBC’s signature doorbell chimed, “Buy war bonds!” while the naval air school used Gilbert Wright’s invention to teach Morse Code by enunciating signals into their corresponding letters.
By November 1942, the NDRC had commissioned Bell Labs and its manufacturing contractor Western Electric to begin work on an unbreakable phone system under the supervision of Ralph K. Potter and Robert Mathes. Construction took place on the twelfth floor of the Graybar Building at 180 Varick Street in Manhattan and at West Street, where the windows were painted black. At the time, 80 percent of the Bell Labs budget was funding military research, “new instrumentalities for warfare,” including radars, sonic deception research, missile guidance systems, ghost armies on vinyl, bazookas, magnetic mines, cameras that photographed bullets in mid-flight, and the “Water Heater”—a twenty-one-foot acoustic torpedo equipped with a 500-watt sound system. At a preordained moment, the speaker would pop out of the torpedo’s nose and play tape recordings of “tactical sounds” to distract enemy ships before self-destructing. A more practical acoustic device was the Laryngophone, a sensory receiver that attached to the throat and transmitted laryngeal vibrations directly into the radio, allowing pilots and tank commanders to be heard over “noisy mechanized forces.”
The vocoder, on the other hand, conveyed intelligence as deformation. As Ralph Miller had reported in Bell’s preliminary testing, the machine committed various atrocities on the voice, reducing speech to a “series of miserable grunts.” “Badly mutilated,” he wrote in the Bell Technical Journal. A clumsy homily soon began making the rounds: “Thirty kilowatts of power for one milliwatt of poor quality speech.”