by Kōji Suzuki
Given that the mysterious string of bases had only been found in the virus drawn from Ryuji, it was safe to assume that he was the one sending the code. It was an undeniable fact, of course, that Ryuji had died and his body been reduced to ashes, but a sample of his tissue still remained in the lab. Countless copies of his DNA, the blueprint for the individual entity that was Ryuji, survived in the cells of that tissue sample. What if that DNA had inherited Ryuji’s will, and was trying to express something in words?
It was a nonsensical theory completely unworthy of an anatomist like Ando. But if he did succeed in making the string of letters yield words by means of substitution, then that would trump all other readings of the situation. Theoretically, it was possible to take DNA from Ryuji’s blood sample and use it to make an individual exactly like Ryuji—a clone. This assemblage of DNA sharing the same will had exerted an influence over the virus that had entered its bloodstream, inserting a word or words. Ando could suddenly sense Ryuji’s cunning and sheer genius behind this. Why had he inserted the message only into the virus, an invader, and not into his red blood cells? Because, with his medical background, Ryuji knew that there was no chance that DNA from the other cells would be sequenced. He’d known that he could only count on the virus responsible for the cluster of deaths being run through a sequencer, and so he’d concentrated his efforts on the virus’s DNA. So that the words he sent would be received.
All of which finally led Ando to one conclusion. Since this code looked to him like a code, it was no longer functioning, in essence, as a code should. Rather, it was just that Ryuji’s DNA had no other way to communicate with the outside. The DNA double helix was composed of four bases represented by the letters ATGC. Ando couldn’t think of any other way for it to make its will known but by combining those four letters in various ways. It had chosen this way because there was no other available to it. It was the only means Ryuji had at his disposal.
Suddenly all the despair Ando had felt a few moments ago was gone, replaced by a buoying confidence.
Maybe I’ll be able to decipher this after all.
He felt like shouting. If Ryuji’s will, lingering in his DNA, was trying to speak to Ando, then it stood to reason that the words it used would be ones easy for Ando to decode. Why should they be more difficult than they needed to be? Ando went back and checked his line of reasoning to see if there were any holes in his deductions. If he started off on the wrong foot, he could wander around forever without finding the answer.
He no longer saw what he was doing as merely a way of killing time. Now that he felt that he would actually be able to decipher the message, he couldn’t wait to find out what it said.
The rest of the morning, until lunchtime, Ando spent working on two approaches.
The sequence he had to work with was:
ATGGAAGAAGAATATCGTTATATTCCTCCTCCTCAACAACAA
The first question was how to divide the letters up. He tried dividing them up in twos and in threes.
First, by twos:
Taking a pair of letters as one unit, the four letters available yielded a possible sixteen different combinations. He wondered if each combination might represent one letter.
But this immediately led him to another problem: what language was this message written in?
It probably wasn’t the Japanese syllabary. There were nearly fifty characters in that, far more than the sixteen allowed by the pair method. The English and French alphabets both had twenty-six letters, while Italian only used twenty-one. But he also knew he couldn’t overlook the possibility that the message was in romanized Japanese. Identifying the language of a code is sometimes half the battle.
But this was a problem that had already been solved for Ando. The fact that he’d been able to replace the numerals 178136 with the word “ring” could probably be taken as a hint from Ryuji that the present code would also yield something in English. Ando was sure of this point. And so the question of language was as good as settled.
The forty-two base letters could be split into twenty-one pairs. But several pairs were identical: there were four AA’s, three TA’s, three TC’s, and two CC’s. There were only thirteen unique pairings. Ando jotted these numbers down on a piece of paper and then paged through a book on code-solving until he found a chart showing the frequency of appearance in English of different letters of the alphabet.
He knew that although the English alphabet contains twenty-six letters, not all of them occur in equal numbers in everyday use. E, T, and A, for example, are common, while Q and Z might appear only once or twice per page. Most handbooks on code-breaking will include various kinds of letter frequency charts in the back, among other statistical references. Using such tables and statistics made it easier to determine the language a coded message was in.
In this case, what the figures told him was that in an English phrase of twenty-one letters, the average number of different letters used was twelve. Ando clicked his heels. What he had was thirteen different letters, not far off the average at all. This told him that, statistically speaking, there was nothing wrong with him dividing the sequence into twenty-one pairs and assuming that each pair stood for a letter.
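The pair tally described above can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `SEQUENCE` and `split_units` are invented here and do not come from the text, and the code checks the counting only, not the frequency figures from Ando's charts.

```python
from collections import Counter

# The 42-letter base sequence quoted in the text.
SEQUENCE = "ATGGAAGAAGAATATCGTTATATTCCTCCTCCTCAACAACAA"


def split_units(seq, size):
    """Cut seq into consecutive, non-overlapping units of `size` letters."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


pairs = split_units(SEQUENCE, 2)
pair_counts = Counter(pairs)

print(len(pairs))                  # 21 pairs
print(len(pair_counts))            # 13 distinct pairs
print(pair_counts.most_common(4))  # [('AA', 4), ('TA', 3), ('TC', 3), ('CC', 2)]
```

The output agrees with the figures Ando jotted down: four AA's, three TA's, three TC's, two CC's, and thirteen distinct pairings in all.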
Putting that possibility on hold for a moment, Ando next tried dividing up the sequence into sets of three:
ATG GAA GAA GAA TAT CGT TAT ATT CCT
CCT CCT CAA CAA CAA
This produced fourteen trios, or seven unique varieties: ATG, GAA, TAT, CGT, ATT, CCT, and CAA. The charts told him that an English phrase of fourteen letters contained an average of nine different letters. Not far off from the seven he had.
Ando immediately noticed that there was a lot of overlap produced by this system. GAA, CCT, and CAA each occurred three times, and TAT appeared twice. But what really bothered Ando was the fact that GAA, CCT, and CAA each appeared three times in a row. If he assigned each triplet a single letter of the alphabet, there were three separate cases in this short passage of the same letter being repeated three times. He knew enough English to know that double letters were not at all uncommon. But he couldn’t think of any English words with triple letters. The only possibility he could think of was situations in which one word ended with a double letter and the next word began with the same letter, e.g., “too old” or “will link”.
He picked up an English book he happened to spy nearby and started examining a page at random to see just how often the same letter occurred three times in succession. He’d gone through four or five pages before he found a single instance. The chances of it happening three times in one fourteen-letter sequence were basically nil, he concluded. By contrast, dividing up the forty-two letters into pairs produced just one double letter. As a result, he decided that statistically it made more sense to go with the first option and divide the bases into pairs of letters.
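The comparison between the two groupings can be sketched in Python as follows. The helper names `split_units` and `repeated_runs` are hypothetical, chosen for this illustration; the code only counts back-to-back repeats and says nothing about how likely such repeats are in English.

```python
from itertools import groupby

# The 42-letter base sequence quoted in the text.
SEQUENCE = "ATGGAAGAAGAATATCGTTATATTCCTCCTCCTCAACAACAA"


def split_units(seq, size):
    """Cut seq into consecutive, non-overlapping units of `size` letters."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def repeated_runs(units, min_length=2):
    """Return (unit, run_length) for each stretch of identical adjacent units."""
    runs = []
    for unit, group in groupby(units):
        length = len(list(group))
        if length >= min_length:
            runs.append((unit, length))
    return runs


triplets = split_units(SEQUENCE, 3)
pairs = split_units(SEQUENCE, 2)

print(len(triplets), len(set(triplets)))  # 14 7
print(repeated_runs(triplets, 3))         # [('GAA', 3), ('CCT', 3), ('CAA', 3)]
print(repeated_runs(pairs))               # [('TA', 2)]
```

Grouping by threes gives three separate triple repeats, while grouping by twos gives a single doubled unit, which is the contrast that led Ando to prefer pairs.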
He’d narrowed down the possibilities. From here he could proceed through trial and error.
The AA pair appeared four times, which meant it must correspond to a letter used with great frequency. Consulting another chart, Ando confirmed that the most frequently used letter in English is, of course, E. So he hypothesized that AA stood for the letter E. The second most common pairs in his sequence were TA and TC, occurring three times each. He also noticed that AA was followed by TA once, while TC was followed by AA once. This might be important, since there were also statistics for various combinations of letters. He started trying out various possibilities for TA and TC, constantly referring to his charts.
As far as letters which often follow the letter E and which are also common in and of themselves, the letter A seemed like the best candidate, which meant that TA could stand for A. By the same logic, he thought that TC might correspond to the letter T. Further, by the way it combined with other letters, he guessed that CC might be N. Thus far the statistics seemed to be serving him well. At least, he hadn’t run into any problems.
This is what he had:
_ _E_ _EAT_AA_NT_ NTE_ _E
What had once seemed a random jumble of letters now seemed to be taking on the aura of English. Next he tried filling in the blanks based on what he knew of consonant-vowel combinations, always consulting the charts.
SHERDE ATYAALNTINTE CME
The first three letters seemed to form the word “she”, but the rest of it didn’t form words no matter how he divided it up. He tried switching the positions of the E’s, A’s, T’s, and N’s, and changed other letters around on hunches. When it became too time-consuming to write down the possibilities on paper, he tore sheets out of his notebook, first to make twenty-six cards, one for each letter. It was beginning to feel like a game.
THEYWERBORRLNBINBECME
When he hit on this combination, the first thing that popped into Ando’s mind was the phrase “they were born”. He knew the spelling was a bit off, but maybe it wasn’t too much of a stretch. And the meaning struck a chord with him somehow. But he had a feeling there was a better match out there somewhere, so he kept at the game.
After about ten minutes of playing around, Ando thought he could guess what the result would be, and he stopped. If he had a computer with him, things would be much easier, he thought. The third, sixth, eighteenth, and twenty-first letters were the same. The seventh, tenth, and eleventh were the same. The eighth, fourteenth, and seventeenth were the same. The thirteenth and the sixteenth were the same. The phrase was twenty-one letters long. If he fed those conditions into a computer it would probably come up with the answer, provided he made the proper adjustments for frequency of letter usage. But the computer would undoubtedly come up with several possible solutions. There had to be an infinite number of meaningful phrases in English that satisfied those conditions. How would he be able to tell which one was Ryuji’s message to him? Only if there was something about the right answer that would tell him at first glance that it was from Ryuji, like a signature at the end of a letter. But if there wasn’t, he’d be lost.
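The conditions Ando imagines feeding into a computer amount to a repetition pattern that any candidate phrase must share with the sequence of pairs. A minimal sketch in Python follows. The function names `pattern` and `fits` are invented for this illustration, and the check assumes a simple substitution in which different pairs always stand for different letters, which is slightly stricter than the conditions listed in the text.

```python
# The 42-letter base sequence quoted in the text, cut into 21 pairs.
SEQUENCE = "ATGGAAGAAGAATATCGTTATATTCCTCCTCCTCAACAACAA"
PAIRS = [SEQUENCE[i:i + 2] for i in range(0, len(SEQUENCE), 2)]


def pattern(symbols):
    """Replace each symbol with the order in which it first appeared."""
    first_seen = {}
    return [first_seen.setdefault(s, len(first_seen)) for s in symbols]


def fits(candidate):
    """True if the candidate's letters repeat exactly where the pairs repeat.

    Assumption: one pair maps to one letter and vice versa.
    Spaces and punctuation in the candidate are ignored.
    """
    letters = [c for c in candidate.upper() if c.isalpha()]
    return pattern(letters) == pattern(PAIRS)


print(fits("THEYWERBORRLNBINBECME"))    # True
print(fits("SHERDE ATYAALNTINTE CME"))  # True
print(fits("THEY WERE BORN"))           # False: wrong length and pattern
```

Both of Ando's trial strings pass the check, which illustrates his worry: the pattern alone cannot single out one answer.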
Ando realized he was at a dead end. He hung his head, feeling stupid that it had taken him this long to notice. Back in his student days, when his code-breaking intuition had been more finely honed, he would have caught on to this impasse in a minute or two. He’d have to change the way he thought about this. He needed a new hypothesis.
Ando was so absorbed he hadn’t noticed the passage of time. He looked at his watch now to find it was nearly one in the afternoon. He realized he was hungry. He stood up, thinking to go have lunch in the cafeteria on the fourth floor. A change of surroundings would do him good. Trial-and-error and inspiration: he was going to need both if he was going to come up with a solution. And he often got his inspirations while he ate.
The answer to this is going to have to be obvious.
He whispered it almost like an incantation as he headed for the fourth floor.
4
As he ate the set lunch, Ando gazed out the window at the trees down below, and at the kids playing on the swings and the seesaws in the park. It was past one now. The cafeteria had been packed when he arrived, but now there were empty seats here and there. The printout with the base sequence sat on the table next to his aluminum tray, but he wasn’t looking at it.
One wall of the cafeteria had floor-to-ceiling windows, so there was nothing to obstruct his view of the children playing. It was like watching a silent movie. Whenever he saw a boy of about five, Ando’s gaze was riveted to him. Without even realizing it, he’d stare at the child, and it would take him several minutes to snap out of it.
He’d come to this library with his son once. It was a Sunday afternoon two years before, when they were living in the South Aoyama condo. Ando had suddenly realized he needed to look up some data for a presentation he was scheduled to give at a research conference, so he decided to come to the library. He took Takanori along for the walk. But when they got there, a sign at the entrance said NO ONE UNDER 18 ADMITTED. He couldn’t very well make the boy wait outside while he did his research, so he gave up and they played in the park instead. He could remember standing behind the swings, pushing Takanori; he could remember the rhythm of the swing. That same swing was in motion now, under the golden gingko leaves. He couldn’t hear a sound, couldn’t even see the expressions on the faces of the children as they alternately stretched out their legs and tucked them in. But in his mind’s ear he could hear his son’s voice.
But he was getting off track. He brought his gaze back to the page and picked up his pen.
It was time to get back to the basics of code-breaking. There was no other way to crack this kind of code but to come up with several hypotheses, and then pursue each one of them in turn. When it became clear that one theory wasn’t working out, the best thing to do was abandon it with alacrity and move on to the next one. For a message of only twenty-one letters, he wouldn’t be able to rely solely on frequency charts and letter-combination rules. In fact, if the code was complicated enough to require a specific conversion key, it ran the risk of being too hard, in which case it wouldn’t be able to convey what it wanted to. No, he needed to simply work through a bunch of theories by trial and error. If an idea wasn’t working, he needed to abandon it, that was all.
There was one hypothesis that Ando thought he had abandoned too soon, though. It occurred to him that the code might be an anagram.
He returned to the reading room and once again split the forty-two letters into groups of three.
ATG GAA GAA GAA TAT CGT TAT ATT CCT CCT CCT CAA CAA CAA
He’d abandoned this approach because it resulted in triple repetitions of the same letter, a very unusual thing in English. But what if the letters themselves needed to be rearranged? He thought of an example he’d read once, where the phrase “Bob opened the door” had been encoded as OOOOEEEBBDDTPNHR. As it was, the sequence contained far too many letter repetitions to make sense as English, but when rearranged according to a certain set of rules, it yielded a perfectly normal sentence.
This might work, he thought.
But just as he was about to get to work, he stopped. He could see where this was going, too. If he not only had to decide what letters each triplet stood for, but also had to figure out how to rearrange the letters, the task suddenly became a mammoth one. And it wasn’t just a question of time. Without a key of some sort, he’d end up with the same sort of problem he had run into earlier: a plethora of possible solutions with no way to choose among them. He thought of the numbers that had led him to “ring” and wondered if they might somehow be that key, pointing him toward the right order in which to arrange the letters now. But first he’d have to figure out what letters the triplets stood for.
Another dead end.
You need a fresh angle on this, Ando told himself. He was trying to proceed by trial and error, but he felt like he was just trying the same thing over and over. Maybe he was too fixated on the idea of making each set of two or three bases correspond to one English letter.
The solution has to be something unambiguous, something that I can figure out without going through a long, complicated process.
He felt his concentration faltering, his eyes wandering away from the page. He suddenly realized he was staring at the hair of a young woman seated at the other end of the same table. With her head down like that, she looked like Mai Takano, especially her forehead.
Where is she now?
He worried about her safety, especially when he considered that she used to be Ryuji’s lover.
Could Ryuji be trying to tell me where she is with this code?
He considered the possibility for a moment, but then discarded it with a derisive laugh as being too comic-book. How adolescent, to imagine himself as the famous detective out to save the heroine from mortal danger. Suddenly the whole thing seemed foolish to Ando. This probably wasn’t a code at all. There was probably a perfectly scientific explanation for how that sequence of bases got into the virus’s DNA. And once Ando admitted that possibility, he could feel his passion for code-breaking vanish. He was just killing time anyway, right? He was working awfully hard at it.
The setting sun was turning the hairs on his upper arm golden. All the intensity he’d had that morning was gone now. He thought about moving to another seat, where the sun didn’t hit him, and started to get up. Looking around, though, he saw he was surrounded by kids, college students or high school kids studying for entrance exams, all dozing behind mountains of books. Moving wouldn’t help him get his concentration back. The entire reading room was enveloped in drowsiness. Ando sat back down where he was.
Think about it logically, he told himself. There has to be a formula.
He sat up straight. He’d been trying to assign letters of the alphabet to trios of bases, but that didn’t work out to a formula. If he could get it down to a one-to-one function, or even a several-to-one function, then the answer would become obvious. One-to-one, perhaps several-to-one … There had to be a formula like that to be discovered.
He stood up. Logically speaking, there was no other way. His intuition told him that he’d moved one step closer to a solution, and the realization blew away his torpor, spurring him to action.
He went to the natural sciences section, found a book on DNA, and started flipping madly through the pages. As his excitement mounted, his palms grew sweaty. What he was looking for was a chart that gave what amino acid each trio of bases formed.
Eventually he found one. He took the book back to his table and laid it out flat, opened to the chart, next to the coded message.
When a trio of bases, a codon, forms a protein, the codon is translated into an amino acid. The principles by which the translation takes place were contained in the chart Ando had found. There are twenty varieties of amino acid. There are four bases, meaning there are sixty-four separate combinations of three that can be formed. With sixty-four combinations standing for only twenty amino acids, it meant there was quite a bit of overlap. It was several-to-one mapping. Each trio of bases signified one amino acid or another (or a stop).
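The several-to-one mapping can be illustrated with a short Python sketch. It assumes the chart in Ando's book is the standard genetic code, and it uses the conventional one-letter amino acid abbreviations with `*` for a stop; neither detail is stated in the text. The names `CODON_TABLE` and `translate` are invented for this illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# Each character is a one-letter amino acid code; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a DNA string codon by codon, ignoring any trailing leftover bases."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


print(len(CODON_TABLE))                       # 64 codons
print(len(set(CODON_TABLE.values()) - {"*"}))  # 20 amino acids
print(CODON_TABLE["GAA"], CODON_TABLE["GAG"])  # E E (two codons, one amino acid)
```

Calling `translate` on the forty-two-base message would turn its fourteen trios into fourteen amino acids, one per codon, with no guesswork about which letter each trio stands for.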