by DAVID KAHN
Among the great code compilers of this era were John Charles Hartfield, who published The Merchants Code of 15,000 dictionary words in 1877, and followed it with eleven others before 1890, when he was joined by his son, John W. Hartfield, in the business; Henry Harvey, who published 21 codes or lists of codewords between 1878 and 1899; and Benjamin Franklin Lieber, compiler of eight codes, one of them becoming widely used and being translated into French and German. In France there was F. J. Sittler, whose four-digit code sold widely; Bazeries and de Viaris published codes as well. Italy had its Baravelli. These Continental codes, mostly numerical, which lent themselves easily to superencipherment, seemed to aim quite as much at secrecy as at economy, in contradistinction to the American public codes, which emphasized dictionary words as affording greater savings than code-numbers.
But the use of dictionary words, chosen at random, of varying length and irregular construction as regards placement of vowels and consonants, and often closely resembling one another, entrained difficulties. The words were subject to phonetic, orthographic, and telegraphic errors which, unlike errors in plain language, could not be corrected from context. For example, in codetexts spoken aloud, as they were in the days when the mirror galvanometer served in cable telegraphy and one operator watched its movements and called out the signals to an operator who wrote them down, codewords like ACCEPT and EXCEPT or SERIAL and CEREAL would be confused. In handwritten codetexts, JEERING might be confused with PEERING, or MORNING with MOANING. The most prolific source of errors was the telegraphic transmissions themselves. A telegraph company’s records showed that fully half its errors stemmed from the loss of a dot in transmission, and another quarter from the insidious false spacing of signals. These errors often turned one word into another. For example, dropping the single dot that represents E would convert the French verb CITERONS (“[we] shall point out”) to the French word for “lemons,” CITRONS. AMENDING might become ATTENDING if the two dashes of its M sounded, not as a single letter (−−), but as two separate dashes to make two T’s (− −). With two spacing errors in a single word, the result might bear almost no similarity to the original, as BANEFUL might become DUTIFUL.
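The dot-and-spacing errors described above can be checked mechanically. The following is a minimal sketch, assuming the International Morse alphabet (the passage does not say which alphabet the telegraph company's records referred to); the helper names `signals`, `stream`, and `drop_lone_dot` are hypothetical and exist only for this illustration.

```python
# International Morse for the letters needed here (an assumed alphabet).
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "I": "..", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "R": ".-.", "S": "...", "T": "-", "U": "..-",
}
LETTER = {signal: letter for letter, signal in MORSE.items()}


def signals(word):
    """Return the word as a list of per-letter Morse signals."""
    return [MORSE[ch] for ch in word]


def stream(word):
    """Return the dots and dashes with all letter spacing removed."""
    return "".join(signals(word))


def drop_lone_dot(word):
    """Lose the first letter sent as a single dot (E), as if it never arrived."""
    sig = signals(word)
    i = sig.index(".")
    return "".join(LETTER[s] for s in sig[:i] + sig[i + 1:])


# A lost dot turns one French word into another.
print(drop_lone_dot("CITERONS"))

# Words that differ only by false spacing share one dot-dash stream.
print(stream("AMENDING") == stream("ATTENDING"))
print(stream("BANEFUL") == stream("DUTIFUL"))
```

Under this alphabet, BANE and DUTI are the same nine dots and dashes divided at different points, which is why two misplaced spaces are enough to carry one word into the other.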
These errors sometimes transmuted a codeword into one whose decode made sense, or, because telegrams were often only partially encoded, into what the recipient took for a plain-language word. When the recipients acted upon the basis of this erroneous information, financial losses sometimes ensued. The senders of the messages then sued the telegraph companies to recover these losses, on the ground that the faulty transmission had caused the loss. The classic case went to the Supreme Court of the United States.
In June of 1887, Frank J. Primrose, a Philadelphia wool dealer, sent William B. Toland as his agent out to Kansas and Colorado with instructions to buy 50,000 pounds of wool and then await further instructions. Toland did just this, exchanging many messages with Primrose in their telegraphic code during the course of his buying. On June 16, Primrose encoded the following message to Toland: Yours of the 15th received; am exceedingly busy; I have bought all kinds, 500,000 pounds; perhaps we have sold half of it; wire when you do anything; send samples immediately, promptly of purchases. He wrote out the codetext in his own hand: DESPOT AM EXCEEDINGLY BUSY BAY ALL KINDS QUO PERHAPS BRACKEN HALF OF IT MINCE MOMENT PROMPTLY OF PURCHASES. He gave it to Western Union, which transmitted it correctly to the relay station at Brookville, Kansas, but added a dot between Brookville and Ellis, Kansas. The extra dot changed the A (·−) of BAY into a U (··−), and so when the message reached Toland at Waukeney, instead of reading the I have bought that BAY represented, he interpreted BUY as another plain-language word. He consequently bought 300,000 pounds of wool. Primrose, in settling with the sellers, lost more than $20,000 because of that one dot. He sued the Western Union Telegraph Company for this amount, on the ground that they had been negligent in performing their contract with him to transmit the message correctly. But the Supreme Court, in a 33-page decision, ruled that Primrose could not recover more than the cost of the message, as the terms printed on the back of the message blank stipulated, because he had not requested that the message be repeated back to him, which could have made Western Union liable. The telegram had cost $1.15.
Even before this landmark decision, however, code compilers had begun to recognize the danger of promiscuously using any dictionary words as codewords. They employed experienced telegraphers to eliminate words telegraphically too similar. They deleted words that might make sense in the business in which the code was used. Most important, they included only words that differed from one another in spelling by at least two letters. Thus, if MORNING were admitted to the code, MOANING, which differed from it by only one letter, would not be, but LOANING, which differed from MORNING by two letters, would be. This principle became known as that of the “two-letter differential.” Finally, although eight languages were allowed in cable traffic, some American codemakers deemed foreign words too hard for Americans to spell and to telegraph and struck them out as well. All these restrictions so limited the number of usable words that code compilers made up codewords by tacking English suffixes onto English words, even though the suffixes made no sense. For example, to the word NIGH, one code added 49 suffixes, resulting in such strange neologisms as NIGHANT, NIGHBAKE, NIGHCAST, and so on. The compilers justified these on the very practical ground that both code clerks and telegraphers found them easier to handle than many legitimate foreign words, such as AARDMIJTEN, and this was undoubtedly true.
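The two-letter differential amounts to a minimum-distance rule on spellings. The sketch below is one way such a filter might work, assuming words are compared position by position and admitted in the order offered; the function names and the greedy order are assumptions for illustration, not a description of how any compiler actually proceeded.

```python
def letter_difference(a, b):
    """Count the positions at which two words differ.

    Words of unequal length are compared after padding the shorter one,
    so every extra letter counts as a difference.
    """
    longer, shorter = (a, b) if len(a) >= len(b) else (b, a)
    padded = shorter.ljust(len(longer))
    return sum(x != y for x, y in zip(longer, padded))


def admit(candidates, minimum=2):
    """Keep each candidate that differs from every kept word by `minimum` letters."""
    kept = []
    for word in candidates:
        if all(letter_difference(word, k) >= minimum for k in kept):
            kept.append(word)
    return kept


print(letter_difference("MORNING", "MOANING"))  # 1
print(letter_difference("MORNING", "LOANING"))  # 2
print(admit(["MORNING", "MOANING", "LOANING"]))  # ['MORNING', 'LOANING']
```

With such a rule, any single-letter mutilation of an admitted word yields a group that is not in the code at all, so the recipient can at least see that something went wrong.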
These were among the first artificial codewords. Others were created by hooking code syllables, each with a particular meaning, onto dictionary words to modify them. In one such code, for example, the syllable FI meant you or yours, TI meant it, MI meant me, I, or mine, ZI meant they, them, theirs, and so on. The codeword ACCESA meant What do ----- advise ----- to do? and the addition of the syllables FI and ZI, making the codeword ACCESAFIZI, filled in the blanks to make the completed plaintext What do you advise them to do? Some codes provided syllables that the user could combine into an entire artificial codeword that included several ideas. Usually each syllable stood for a variation of a particular idea, as the FI, TI, MI … series of pronouns. But the syllable systems did not conform to the principle of the two-letter differential and, the dangers of transmission error rendering such systems too risky, code compilers moved to the root-and-terminal system. Instead of just using two- or three-letter syllables, they provided groups of four or five letters to indicate different ideas. The code clerk would combine two of these into a single artificial codeword. For example, in one root-and-terminal system, the root APARL meant We order 1500 at 28 shillings, the terminal ANFRO meant 140 jute sacks Duluth Imperial, net c.i.f. London, and the codeword for the entire order was APARLANFRO. The terminal ANERE changed the destination to Liverpool.
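Both schemes reduce to table lookups on fixed slices of the codeword. The sketch below uses only the groups quoted above. It makes three assumptions: each pronoun syllable is given a single reading (the codes allowed several), the full wording of the ANERE terminal is inferred from the statement that it changes the destination to Liverpool, and roots and terminals are taken to be five letters each.

```python
# Syllable code: a base codeword with blanks, filled in by two-letter syllables.
BASES = {"ACCESA": "What do {} advise {} to do?"}
PRONOUNS = {"FI": "you", "TI": "it", "MI": "me", "ZI": "them"}  # one reading each (assumed)

# Root-and-terminal code: two five-letter groups joined into one codeword.
ROOTS = {"APARL": "We order 1500 at 28 shillings"}
TERMINALS = {
    "ANFRO": "140 jute sacks Duluth Imperial, net c.i.f. London",
    # Assumed wording: the text says only that ANERE changes the destination.
    "ANERE": "140 jute sacks Duluth Imperial, net c.i.f. Liverpool",
}


def decode_syllable(codeword, base_length=6):
    """Split off the base word, then fill its blanks with the pronoun syllables."""
    base, tail = codeword[:base_length], codeword[base_length:]
    syllables = [tail[i:i + 2] for i in range(0, len(tail), 2)]
    return BASES[base].format(*(PRONOUNS[s] for s in syllables))


def decode_root_terminal(codeword):
    """Split a ten-letter codeword into its root and terminal and join the meanings."""
    root, terminal = codeword[:5], codeword[5:]
    return f"{ROOTS[root]}, {TERMINALS[terminal]}"


print(decode_syllable("ACCESAFIZI"))
print(decode_root_terminal("APARLANFRO"))
print(decode_root_terminal("APARLANERE"))
```

ANFRO and ANERE differ in two letters, whereas neighboring pronoun syllables such as FI and TI differ in only one, which illustrates why the syllable systems fell short of the two-letter differential.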
Still another, and perhaps the most voluminous, source of artificial codewords was the code condenser. A condenser converts figure codegroups into letter groups, usually resembling artificial words. Because there are more letters than numbers, it is possible to reduce a seven-figure group to a five-letter group (26⁵, or 11,881,376, being greater than 10⁷, or 10,000,000), but most condensers reduce only six figures to five letters because they want to retain a certain alternation of vowels and consonants to keep their letter groups pronounceable. Condensers are essentially tables of letter-number equivalents. In one condenser, the code clerk would convert the group 484704 into ILIKE by finding that 04 is E on the first page of the condenser, and, using the tables on that page, substituting IL for 48 and IK for 47. To reverse the process, he determines the vowel-consonant pattern of the first two syllables. Since it is vowel-consonant, vowel-consonant, or VCVC, he goes to the first page and reads off the equivalents. If the combination were VCCV, the codeword would have been taken from the second page and accordingly those tables would have been used; if CVVC, the third page; and if CVCV, the fourth. Condensers offered several advantages. Words usually cost less to cable than figures. They are less subject to error. Condensers further compress messages: twelve 5-digit codegroups could be reduced to ten 5-letter codewords. Moreover, each 5-digit codegroup usually has counted as a single cable word, while for codewords a 10-letter group usually constituted the unit of charge. This would cut the toll in half. A final advantage sacrificed economy for accuracy. To ensure correct reception, code clerks would add up the five digits of a codegroup and tack on the units digit of the result as a “check digit.” If the codegroup was 18250, the check digit would become 6, and the clerk would then pass 182506 through his condenser. If the codeword was mutilated in transmission, the failure of the check digit to confirm the total would alert the recipient to the error, and he could request a retransmission.
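The capacity arithmetic and the check digit can both be verified in a few lines. This is a minimal sketch of the digit-sum rule as described; the function names are hypothetical, and the condenser tables themselves are not reproduced because the text gives only three of their entries.

```python
def fits(figures, letters):
    """True if every group of `figures` digits has a distinct group of `letters` letters."""
    return 26 ** letters >= 10 ** figures


def check_digit(codegroup):
    """Units digit of the sum of the digits in a figure codegroup."""
    return str(sum(int(d) for d in codegroup) % 10)


def add_check(codegroup):
    """Append the check digit, giving the six-figure group fed to the condenser."""
    return codegroup + check_digit(codegroup)


def verify(group):
    """True if the last digit confirms the digit sum of the rest."""
    return check_digit(group[:-1]) == group[-1]


print(26 ** 5, 10 ** 7)        # 11881376 10000000
print(fits(7, 5), fits(8, 5))  # True False
print(add_check("18250"))      # 182506
print(verify("182506"))        # True
print(verify("182606"))        # False: one digit was mutilated
```

A plain digit sum catches any single wrong digit but not two digits that change places, since the sum is then unchanged.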
The code compilers strove constantly to find new ways of reducing cable tolls for users—this was, after all, their raison d’être. Consequently, many of their innovations can be best understood as efforts to circumvent the tariff regulations of the International Telegraph Union, to which most of the nations of Europe belonged. In 1875, the Union’s conference in St. Petersburg reduced the maximum length of a word in extra-European traffic from seven syllables—a regulation that had given rise to considerable abuses, such as CHINESISKSLUTNINGSDON, which had 21 letters but only six syllables—to ten letters. Four years later, the London conference promulgated two regulations that occasioned innumerable disputes, which, in turn, eventually led to the creation of the modern commercial code. Article 8 of the convention stated, in part: “In the extra-European regime code-language telegrams can contain only words belonging to the German, English, Spanish, French, Italian, Dutch, Portuguese, or Latin languages. Every telegram can contain words taken from all of the aforementioned languages.” Article 9 stated that “The following are considered as telegrams in cipher language: (a) those which contain a text in figures or in secret letters; (b) those which include either series or groups of figures or letters, the significance of which is not known to the office of origin; or of words or names, or of groups of letters not complying with the conditions for plain language or code language.” This article threw into the high-priced category of cipher language all the systems employing artificial and invented words, and the code-using public at once began violating it.
But though the counter clerks of the government-owned communication monopolies of European states contested these evasions, the privately owned cable companies did not fight them too hard—for if they did, the user would simply take his business to a more complaisant company. This tendency was aggravated by the fact that domestic telegraph companies in the United States—which did not adhere to the International Telegraph Union—counted any pronounceable group or any dictionary word as a single word. American codes had come to use these artificial groups, and American users saw no reason why they should not use them outside the United States just as they did within. Moreover, the telegraph personnel themselves often found the artificial words, composed as they were of fairly regular alternations of vowels and consonants, simpler to handle than the clusters of consonants sometimes found in English or German.
To end the increasing number of abuses, the Union’s Paris conference of 1890 provided for an official code-language vocabulary. Within Europe, all code-language words would have to come from this vocabulary, but it would be optional on the Europe-America cables. This did not make much sense, since nearly all the abuses occurred in the transatlantic traffic. Nevertheless, the International Telegraph Bureau, the secretariat of the Union, compiled the vocabulary, consisting of 256,740 words of from five to ten letters in the eight authorized languages, and published it in an edition of 15,000 copies in 1894. It met with a clamor of opposition, primarily because it would eventually outlaw many existing codes at great financial loss. So the Budapest conference in 1896 authorized the Bureau to approve or disapprove the words in existing codes. Submitted were 218 codes, containing more than 5,750,000 codewords. The Bureau actually completed its herculean task and published four gigantic volumes in 1900 and 1901 with 1,174,864 words, plus a small appendix, bringing the total of approved words to 1,190,000. But all that immense labor went for nought. The London conference of 1903 dropped the entire idea of an official vocabulary, and, bowing to the pressures of business and to common sense, authorized the use of artificial words. These were to be “formed of syllables capable of being pronounced” in one of the eight standard languages and were to be no more than ten letters long. The Union had in mind words of from five to ten letters that, by alternations of vowels and consonants, would resemble real words. It was in for a shock.
In February of 1904, four months before the new regulations were to go into effect, there appeared in England Whitelaw’s Telegraph Cyphers: 400 Millions of Pronounceable Words. The volume consisted of 20,000 codewords, or “cyphers,” all of five letters each—FREAN, LUFFA, FORAB, LOZOJ—without phrases attached to them. Whence the 400,000,000? Since the maximum permissible length of codewords was ten letters, and since each of Whitelaw’s five-letter words was pronounceable, any one could be combined with any other one in a single ten-letter word, making 20,000 × 20,000, or the 400,000,000. Through this loophole, unforeseen by the Union, the combining of two codewords into one made it possible to halve cable tolls. Whitelaw’s gimmick was immediately adapted by many private firms. In 1905 Ernest Lungley Bentley, 45, who had revised the private code of a shipping agency where he was private secretary to a partner, founded a code company, which the following year published the compact, well-constructed, moderately priced Bentley’s Complete Phrase Code, first of the modern five-letter codes. It has sold well—about 100,000 copies—and remains today perhaps the best known and most widely used of commercial codes. Bentley, a plump, jovial man of medium height, who had a good baritone and always sang in the choir of the church he attended, including St. Paul’s Cathedral’s honorary evening choir, saw this success, living until 1939. The cut in cable costs that the five-letter codes made possible led to an upsurge in cable traffic and inspired the publication of many new codes. Within half a decade, the new five-letter codes had swept the dictionary-word type from the field.
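Whitelaw's arithmetic, and the halving of tolls that followed from it, can be restated as a short sketch. It assumes that every codeword is exactly five letters and that a message is charged per ten-letter word; the function names are hypothetical.

```python
import math

UNIT = 5  # letters in each of Whitelaw's codewords


def pair_count(vocabulary_size):
    """Number of ten-letter words formed by joining any codeword to any other."""
    return vocabulary_size * vocabulary_size


def combine(first, second):
    """Join two five-letter codewords into one chargeable ten-letter word."""
    if len(first) != UNIT or len(second) != UNIT:
        raise ValueError("each codeword must be five letters long")
    return first + second


def split(word):
    """Recover the two five-letter codewords from a ten-letter word."""
    return word[:UNIT], word[UNIT:]


def chargeable_words(codewords):
    """Words charged for when codewords are sent joined in pairs."""
    return math.ceil(len(codewords) / 2)


print(pair_count(20_000))                                    # 400000000
print(combine("FREAN", "LUFFA"))                             # FREANLUFFA
print(split("FREANLUFFA"))                                   # ('FREAN', 'LUFFA')
print(chargeable_words(["FREAN", "LUFFA", "FORAB", "LOZOJ"]))  # 2
```

Because every codeword has the same length, the recipient can split a ten-letter word without any separator, which is what made the loophole workable.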
Eventually, codes were compiled for virtually every industry that was not strictly local. A list of even some of them suggests the incredible diversity of modern commerce. There were codes for automobile dealers, bankers, brokers, canned goods, clothing, coal, coffee, commission merchants, cotton, cottonseed, dry goods, electric supplies, flour, fruit, fur, grain, groceries, hay, insurance, iron and steel, leather, liquor, livestock, lumber, meat packing, mining, oil, papermaking, phonographs, potatoes, produce, railroads, rice, rubber goods, the sash-door-and-blinds trade, seeds, ship brokers, shipping, sugar, tailors, textiles, theaters, ticket brokers, tobacco, transportation, travelers, vegetables, wastes, wool. In addition, private firms published their own codes in the fields of butter and cheese, boots and shoes, cordage, dentists’ supplies, drugs, elevators, fire insurance, flaxseed, harness, hides, hops, lead, lime, machinery, millinery, peanuts, printing ink, smelting and refining, soap, spices, steam and gas fittings, steam engines, steamboats, suretyship and guaranty, tanning, tea, wagons, and yarn.
To open these books is to feel the life pulse of the business. The Waste Merchant’s Standard Code offers a consignment of cast iron scrap, excessively rusty with IQUA. Using Tilton’s Income Tax Code, the taxpayer declares firmly MIRASOL for I (we) will not pay, and the tax advisor retorts promptly NASA (The penalty is…). An airline pilot regretfully wires VAOIK (Forced landing account engine trouble) using the Avico Aviation Code, and a lawyer sternly advises IYGWG (habeas corpus) in the Legal Telegraphic Code, which is even bound to resemble a law book. A U.S. immigration agent, using the Telegraphic Code of the Immigration and Naturalization Service, embarrassedly telegraphs his chief GAXEW (…Escaped after being placed on shipboard for deportation). A missionary seeks out HAUCD in the 724-page The Missions Code to sadly report to his home church that (Mission) property (at-----) has been destroyed, and then adds a hopeful SWAMK (Join us in prayer for funds). Sometimes the codebooks reveal not just the life of an organization or industry, but also its very soul. Thus the Cinema-code of 1923 has under the heading Picture: is a charming love story = EPWCY, is a classic production = EPWMI, is a country life drama = EPWOK, is a detective story = EPWSO, … is a marvelous, vivid drama = EPXOX, is a spectacular production = EPXUD. But even the Hollywood fairyland met with brutal reality at times, and the compiler, Richard Poillon, felt compelled to include EPXIR (is a great disappointment).
In the 1920s, the explosion of international commerce that had been bottled up by the World War created the golden age of the commercial code. More codes were produced in the five or six years after the war than in the 20 years before it. Many of the great commercial codes date from this era: the ABC 6th edition, the Acme, the Boe, Farquhar’s, the Lombard, the Rudolf Mosse, Peterson’s, the United Telegraph, the Western Union. They were large tomes of hundreds or even a thousand pages, comparing favorably in poundage with a Webster’s Unabridged, and costing in the neighborhood of $25. Many of the codes of this period were produced by the world’s handful of code compilers, nearly all Americans, representing the second generation of workers in this recondite field: John W. Hartfield, son of John C. Hartfield, C. Bensinger, Ernest F. Peterson, Thomas C. Wilwerth, Cyrus F. Tibbals, Cosmo Farquhar, and William J. Mitchel. At least two made fortunes: Peterson and Tibbals.
[Illustration: A trilingual commercial code: The Marconi International Code]