The first part of their talk was given over to their techniques for translating characters of plain, written text into quaternary (base four) numbers, and then mapping each of the quaternary digits onto one of the four chemical bases of DNA. For example, the letter H in plain text would correspond to the number 1020 in the quaternary number system, which in turn would be expressed as TACA in DNA (where 1 is represented by T, 0 by A, and 2 by C). Using such correspondence principles, the plain text word “Hello” would become 10201211123012301233 in quaternary encoding and TACATCTTTCGATCGATCGG in DNA encoding. (A program for converting any given text message into quaternary numbers and then into DNA nucleotides can be found at 2010.igem.org/Team:Hong_Kong-CUHK/Model.)
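The correspondence described above can be sketched in a few lines of Python. This is a minimal illustration, not the team's own program. It assumes each character is taken as its 8-bit ASCII code and written as four base-four digits, and that the digit 3 maps to G (the text names only the mappings for 0, 1, and 2, so G is inferred as the one remaining base). The function names are hypothetical. Note that the digit string quoted above begins with 1020, the code for a capital H.

```python
# Digit-to-base mapping. 0->A, 1->T, 2->C are given in the text;
# 3->G is an assumption (G is the only base left over).
DIGIT_TO_BASE = {"0": "A", "1": "T", "2": "C", "3": "G"}


def char_to_quaternary(ch: str) -> str:
    """Return the four-digit base-four form of a character's 8-bit code."""
    code = ord(ch)
    if code > 255:
        raise ValueError(f"{ch!r} does not fit in four quaternary digits")
    digits = []
    for _ in range(4):  # four base-4 digits cover the values 0..255
        digits.append(str(code % 4))
        code //= 4
    return "".join(reversed(digits))


def text_to_quaternary(text: str) -> str:
    """Concatenate the quaternary form of every character in the text."""
    return "".join(char_to_quaternary(ch) for ch in text)


def quaternary_to_dna(quaternary: str) -> str:
    """Replace each quaternary digit with its DNA base."""
    return "".join(DIGIT_TO_BASE[digit] for digit in quaternary)


def text_to_dna(text: str) -> str:
    """Encode plain text as a DNA base string."""
    return quaternary_to_dna(text_to_quaternary(text))


if __name__ == "__main__":
    print(text_to_quaternary("Hello"))  # 10201211123012301233
    print(text_to_dna("Hello"))         # TACATCTTTCGATCGATCGG
```

Under these assumptions, the sketch reproduces both strings quoted in the text when given the word with a capital H.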
The students described additional techniques for data compression, for deleting repeated sequences, and for ensuring the accurate representation of the message by means of a checksum algorithm. Using these and other means, the team members had calculated that it would take eighteen individual bacterial cells to reliably store the full Declaration of Independence in E. cryptor.
Their offering ended with the conjecture that, since essentially any type of information can be digitized, it will one day be possible to reliably store not only text but also pictures, music, and even video—in bacteria. Even among iGEM projects, which are characterized by bold, out-of-the-box thinking, this scheme was notably ambitious.
There were some precedents for writing human language text and images into DNA, however. In 2009 Claes Gustafsson wrote a paper for Nature, “For Anyone Who Ever Said There’s No Such Thing as a Poetic Gene.” In it, he described how his company, DNA2.0, Inc., during the Christmas season of 2005 gave away free synthetic DNA that encoded the first verse of “Tomten,” a poem by Viktor Rydberg. The verse amounted to fifty words, about eight hundred base pairs long. The protein sequence was back-translated to DNA using the codon bias of reindeer (Rangifer tarandus; well, it was Christmas!). Gustafsson’s final claim was: “To our knowledge, this is the first example of an organism that ‘recites’ poetry.”
An even earlier precedent was set by Joe Davis, who in 1984–1988 cloned a 28-mer length of DNA representing a 5x7 pixel line drawing (described in a 1996 article). W. Wayt Gibbs, in his 2001 Scientific American piece, “Art as a Form of Life,” also described the process and gave additional insight into Joe’s work as well as various earlier Nobel-level molecular engineering pranks.
Going forward, we could amplify libraries of 200-mers using flanking universal primers, carefully minimizing variation in abundance. The design would have extensive overlaps between adjacent sequences. This can be made much easier to read than a shotgun human genome, for example, by using (1) precise overlaps instead of ragged, random overlaps; (2) extra base pairs to disambiguate repeats; (3) compression algorithms (as used for Internet images) to reduce the number of repeats; and (4) check-bits and other tricks to mitigate synthesis and PCR errors. The primers at the ends are designed to enable immediate plug-and-play compatibility with standard next-generation sequencing. We will skip two of the hard steps: the assembly of oligos at the beginning, and the fragmentation and library-making at the end. High synthetic redundancy and similar levels of analytic (sequencing) redundancy help ensure low error rates.
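To see why precise overlaps make reading easier than ragged, random ones, consider the following Python sketch. It is only an illustration under stated assumptions: the 200-base fragment length comes from the text, but the 50-base overlap, the function names, and the random test sequence are all hypothetical, and primers, check-bits, and compression are left out.

```python
import random


def split_into_oligos(seq: str, oligo_len: int = 200, overlap: int = 50) -> list:
    """Cut seq into fixed-length fragments that overlap by exactly `overlap` bases.

    The overlap of 50 is an assumed value chosen for illustration.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than the oligo")
    step = oligo_len - overlap
    oligos = []
    for start in range(0, len(seq), step):
        oligos.append(seq[start:start + oligo_len])
        if start + oligo_len >= len(seq):  # this fragment reached the end
            break
    return oligos


def reassemble(oligos: list, overlap: int = 50) -> str:
    """Rejoin fragments, checking that each known overlap matches exactly."""
    seq = oligos[0]
    for oligo in oligos[1:]:
        if seq[-overlap:] != oligo[:overlap]:
            raise ValueError("fragments do not overlap as designed")
        seq += oligo[overlap:]
    return seq


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demonstration is repeatable
    message = "".join(rng.choices("ACGT", k=1000))
    fragments = split_into_oligos(message)
    print(len(fragments), "fragments")
    print(reassemble(fragments) == message)
```

Because every fragment boundary is known in advance, reassembly is a matter of checking and trimming a fixed overlap rather than searching for the best alignment among random reads.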
Some hints on what to digitize in an intentionally representative earth message can be seen from the 200 megabyte golden record on the Voyager spacecraft that was launched in 1977. We can store millions of copies of data sealed in small (optionally phosphorescent) plastic time capsules around the world. Nearly every major company, person, and copyright holder would want to be represented in these time capsules.
So, putting pictures, music, and even video into bacteria, as the Hong Kong iGEMites suggested, is not out of the question. As Donald Trump once said, “If you’re going to be thinking at all, you might as well think big.” In that spirit, the all-languages version of Wikipedia amounts to 53 GBytes, the DNA version of which would be 90 billion base pairs long. At 1 bp per cubic nm this would fit into a 5 micron diameter sphere (the size of a human red blood cell) at 100X redundancy and would cost about $1 per 10^5 copies. By comparison, Blu-ray video disk digital storage is on a 12 cm diameter disc that is 1.2 mm thick, and contains 50 GBytes (which is fully a billionfold bulkier than DNA per bit). At $1 per disk in bulk, the disks would be 100,000-fold more expensive than DNA. Compared to what you can do with biology, Blu-ray data storage is a real waste of money and space! Believe it or not!
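The “billionfold bulkier” comparison can be checked with rough arithmetic. The Python sketch below uses only the figures quoted above (53 GBytes in 90 billion base pairs, 1 bp per cubic nanometer, 100X redundancy, and a 12 cm by 1.2 mm disc holding 50 GBytes). It further assumes decimal gigabytes and a solid disc with no center hole, so the result is an order-of-magnitude estimate and nothing more. The variable names are hypothetical.

```python
import math

BITS_PER_BYTE = 8
NM_PER_CM = 1e7
NM_PER_MM = 1e6

# DNA side, using the figures quoted in the text.
dna_bits = 53e9 * BITS_PER_BYTE   # 53 GBytes (decimal gigabytes assumed)
dna_base_pairs = 90e9             # length of the DNA version
redundancy = 100                  # 100X redundancy
nm3_per_base_pair = 1.0           # 1 bp per cubic nanometer
dna_volume_nm3 = dna_base_pairs * redundancy * nm3_per_base_pair
dna_nm3_per_bit = dna_volume_nm3 / dna_bits

# Blu-ray side: a solid cylinder 12 cm across and 1.2 mm thick
# (the center hole is ignored, which is an assumption).
disc_bits = 50e9 * BITS_PER_BYTE
disc_radius_nm = 6 * NM_PER_CM
disc_thickness_nm = 1.2 * NM_PER_MM
disc_volume_nm3 = math.pi * disc_radius_nm ** 2 * disc_thickness_nm
disc_nm3_per_bit = disc_volume_nm3 / disc_bits

bulk_ratio = disc_nm3_per_bit / dna_nm3_per_bit

if __name__ == "__main__":
    print(f"DNA:     {dna_nm3_per_bit:.1f} cubic nm per bit")
    print(f"Blu-ray: {disc_nm3_per_bit:.2e} cubic nm per bit")
    print(f"Blu-ray is about {bulk_ratio:.1e} times bulkier per bit")
```

Under these assumptions the ratio comes out a little over one billion, in line with the figure given in the text.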
Presentations continued throughout the next day, Sunday, starting at 9:30 AM and ending at 5:00 PM. That night, there was an enormous party (the iGEM 2010 Jamboree Social Event), held at Jillian’s, an epic food, drink, billiards, bowling, sports, dancing, and entertainment paradise adjacent to Fenway Park, located across the Charles River in Boston. (Free shuttle bus service was provided to and from the saturnalia.)
The awards ceremony was held the next day, Monday, starting at 9:00 AM, in the Kresge Auditorium of Building 46, which was a curvaceous, glassed-in structure designed by Eero Saarinen and resembled what a hangar for flying saucers probably ought to look like. The auditorium seated 1,226 people, approximately equal to the total number of students, faculty advisers, visitors, photographers, videographers, and other members of the media who were in attendance.
Randy Rettberg, the director of iGEM, was master of ceremonies. After the ritual return of the BioBrick Trophy from the prior year’s Grand Prize winner (the University of Cambridge), the judges announced the six finalists whose names were projected on a screen: the University of Bristol, Cambridge, Imperial College London, Peking University, Slovenia, and the University of Technology, Delft.
Clearly Slovenia was the team to beat. Based at the University of Ljubljana, Team Slovenia had been the grand prize winner at two previous iGEM competitions in 2006 and 2008, with impressive projects in each case: in 2006, a scheme for preventing infection of human cells by HIV, and in 2008, the development of a synthetic vaccine for Helicobacter pylori.
The six finalists now had to repeat their presentations before all the competitors and all of the judges. This took nearly three hours, under hot lights, general restlessness, and rising ambient temperatures. After a break, the assembled iGEMites, dazzling in their team colors, gathered together for the rip-roaring “iGEM from Above” group photo (2010.igem.org/Main_Page).
The results were announced in reverse order of importance, accompanied by the distribution of Best Application and other “Best” prizes, prizes for the runners-up, and so on, until, finally and at last, the grand prize winner was revealed to be . . . Team Slovenia.
And appropriately so, for their project this year recapitulated, on the level of DNA engineering, the way in which assembly-line mass production had improved upon the prior system of individual craftsmen working by themselves. Whereas craftsmen turned out their products slowly and laboriously, the assembly line had organized the production of a given object into an orderly, fast, and efficient process that furthermore occurred in a single place, such as in a factory.
Similarly, DNA expression normally took place as a result of enzymes randomly bumping into and around the molecule and then fitting in when and where possible. This was wasteful and inefficient, and could be improved. And so the Slovenians vowed to make DNA engineering more like an assembly-line process. First the team members created a custom DNA molecule by putting specific sequence blocks together in a predefined order. Then they enhanced a group of enzymes by means of proteins that enabled the enzymes to bind selectively to given, predefined sites on the molecule. The custom DNA and the newly enhanced enzymes significantly increased the speed and yield of the DNA expression reaction. Finally, by changing the order of the DNA sequence blocks, the team could change the reaction’s output. The result, they announced, was “DNA guided assembly lines.”
Essentially, it was the industrial revolution all over again, but played out this time on the level of molecules. In the course of doing all this, Team Slovenia submitted 151 new biobricks to the registry.
Team Citadel, as it turned out, did not figure in the rankings. The three team members, Brian Burnley, Patrick Sullivan, and Hunter Matthews, had taken their turns, giving an abbreviated presentation on Sunday, the last day of the event, in the last time slot of the day. Their talk was abbreviated because, owing to delays in getting their design synthesized, they hadn’t completed the construction of their appetite-suppressing bacterium. The students learned the hard way that advertised DNA synthesis times were decidedly on the optimistic side, for in their case what was supposed to take three days instead took three weeks. In consequence, their obesity-controlling bacterial devices—which they had since named Appetuners—did not meet all of the criteria required for formal recognition at the jamboree.
The Chinese University of Hong Kong, by contrast, received a Gold Medal for its “living data storage system.” And so did Team Valencia for its scheme to terraform Mars with strains of dark-colored yeast.
Team Harvard also received a Gold Medal for its iGarden project. Its aim was to enable gardeners, either with or without a scientific background, to genetically engineer their own plants—to “personalize” them according to their own individual tastes and preferences. Team members had taken a poll of regular folk gardeners and found that the most commonly requested plant improvement was to increase the nutritional value of fruits and vegetables. This could be done, for example, by enhancing the gene for lycopene or beta carotene production.
To prevent an engineered plant from spreading beyond its intended boundaries, the Harvard team members advanced the idea of building a molecular “fence” around an iGarden. This would take the form of a “death gene” that would be activated if and when the engineered plant escaped from the lab. They hoped that this strategy would help allay some people’s negative attitudes toward genetically modified foods. (Of course, it might also cause them to run screaming from their greenhouses.)
With the end of the 2010 competition, iGEM reached a turning point. It had become so popular, and participation had become so large and unwieldy, that the jamboree could no longer be held all at one venue. iGEM 2011, therefore, would be split into three regional divisions, Asia, Europe, and the Americas, with separate competitions and regional jamborees held within each division. The winning teams would then advance to the World Championship Jamboree at MIT. (The grand prize winner at the 2011 iGEM was the University of Washington, whose team project had the dual goals of engineering E. coli to produce diesel fuel and engineering an enzyme to break down gluten in the digestive tract.)
The growth of the iGEM competition surpassed the wildest hopes of its creators. What it all meant was that, increasingly, some of the world’s most imaginative, significant, and potentially even the most powerful biological structures and devices were now coming not from biotech firms or from giant pharmaceutical companies, but from the ranks of university, college, and even secondary school students, who were doing it mainly in the spirit of advanced educational recreation. Proof of the power and allure of redesigning life.
CHAPTER 9
- 1 YR, HOLOCENE
From Personal Genomes to Immortal Human Components
The Holocene epoch has lasted from about 10,000 years ago to the present. During this period human civilization as we know it arose and blossomed. One of the most important elements of civilization, across all cultures, times, and regions, has been the development of medicine, a system of beliefs intended to explain and influence the course of the main stages and changes that characterize human life: birth, death, health, and disease. Back in the most ancient eras, these events and conditions were often regarded as the work of spirits, demons, gods, and various astral or mystical phenomena. (Belief in faith healing is still with us today.)
The history of medical folkways is unsavory, gory, and even gruesome, filled with wild beliefs, horrifying practices and devices, and useless nostrums and potions, all thought to promote health or cure illness. Bloodlettings, libations, purgations, emetics, piercings, shamans, the ritual sacrifice of animals or humans, concoctions featuring eye of newt and/or other ground-up animal parts, various and sundry snake oils, brews, spices, tinctures, tonics and elixirs, syrups, salves, rubs, pills, drops, lotions, airs, diets, spells, incantations, rites, sacred dances, trances, chants, sexual practices, special bodily ornaments and charms, tricks with magnets, and the like—all of these and more are treasured elements of medical (mal)practice through the ages.
Western medicine, as a science (as opposed to folklore), is commonly viewed as starting with Hippocrates, who was associated with a medical school, the Asklepieion, on the Greek isle of Kos. Unlike those who attributed disease to occult entities or forces, Hippocrates offered physical, rational, and empirically founded explanations of health and disease, which he furthermore viewed as physical states or conditions. The subsequent history of medical science is the story of reducing health, disease, and bodily functions to their natural and material underpinnings. Well-known medical milestones include Vesalius’s description of human anatomy, William Harvey’s theory of blood circulation, and Pasteur’s germ theory of disease. But as important as they are, all of these advances are dwarfed by the advent of molecular genetics, and in particular by the decoding of the human genome.
The human genome is the recipe for building and maintaining a human being, and its component genetic structures, discrete sequences of DNA, are the underlying molecular sources of disease and health. More than three thousand human diseases are known to be caused by a specific gene or a combination of them (acting in concert with the environment), and nowadays hardly a week goes by without an announcement of the discovery of a new gene for a given disease or condition, everything from asthma to zoophobia.
To speak of a “gene for a disease” is actually to speak of a mutation of a gene that would otherwise not play a role in causing illness. The gene for cystic fibrosis, for example, was discovered in 1989 by a team headed by Francis Collins (whom we have already met as head of the Human Genome Project). In cystic fibrosis, cysts and fibrous scars in the pancreas block the pancreatic ducts, restricting the flow of digestive enzymes. The intestines and lungs can also be involved, in the latter case resulting in respiratory infections. Collins and his team located a long DNA sequence on chromosome 7, and the sequence later became known as the CFTR gene. It normally codes for a protein that transports salt and water across the cell membranes of various organs. The Collins group found that the deletion of just three bases, CTT, from the normal DNA sequence of the CFTR gene was sufficient to cause the disease. (Other researchers later discovered more than 1,000 other mutations that can also play a role in causing cystic fibrosis.)
Other illnesses are caused by the difference of as little as a single base in an otherwise normal genetic structure. Sickle-cell anemia, the classic example, is the result of the substitution of just one letter of the HBB gene sequence that codes for normal hemoglobin. The HBB gene for normal hemoglobin includes the triplet GAG, which is the three-letter code for glutamic acid. In mutant hemoglobin, however, the triplet GTG, which codes for the amino acid valine, appears where GAG should be. The occurrence of the base T (thymine) instead of the proper base A (adenine), a difference of a single molecular structure on the gene, is enough to induce the malformation of the blood cells characteristic of sickle-cell anemia. For want of a nail good health is lost, here in the form of a severe, chronic, and generally incurable condition. (The variation of a single base from what’s normal is known as a single nucleotide polymorphism, or SNP.)
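The single-letter change described above is easy to make concrete. The short Python sketch below compares the two triplets named in the text and reports where they differ. The codon table holds only the two entries the text supplies, and the function name is hypothetical; this is an illustration of the idea, not a tool for analyzing real HBB sequence data.

```python
# Only the two codons mentioned in the text are listed here.
CODON_TO_AMINO_ACID = {
    "GAG": "glutamic acid",  # normal hemoglobin
    "GTG": "valine",         # sickle-cell hemoglobin
}


def base_differences(normal: str, mutant: str) -> list:
    """Return (position, normal_base, mutant_base) for every mismatch."""
    if len(normal) != len(mutant):
        raise ValueError("sequences must be the same length")
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(normal, mutant))
        if a != b
    ]


if __name__ == "__main__":
    normal, mutant = "GAG", "GTG"
    for index, a, b in base_differences(normal, mutant):
        print(f"base {index + 1} of the triplet: {a} replaced by {b}")
    print(f"{CODON_TO_AMINO_ACID[normal]} becomes {CODON_TO_AMINO_ACID[mutant]}")
```

A single mismatch, A replaced by T in the middle position, is all that separates the codon for glutamic acid from the codon for valine.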
As these examples show, we can now pinpoint, with literally atomic accuracy, the molecular basis of many human pathologies. For many such cases, we are reaching a major goal: the reduction of health and disease states to their ultimate physical foundations in the human genome. The discovery of “dark matter” in the genome, material distinct from known genes that plays a causal role in some sicknesses, is of ongoing interest.
Thus the importance of the Human Genome Project, as well as that of its latter-day descendant, the Personal Genome Project.
Our ability to trace states of disease and health to their atomic underpinnings is a manifestation of the fifth industrial revolution, which focused on atoms. This revolution originated in the discovery and applications of strange quantum phenomena in physics and chemistry. In a single year (1905) a twenty-six-year-old patent clerk, unable to get a job in academia, published four papers (all in the same journal, Annalen der Physik). Any one of these papers could have earned him a Nobel Prize, and indeed the first one did so. The topics were the photoelectric effect (in which light photons dislodge electrons, the outer components of atoms, from a solid or liquid surface), Brownian motion (in which small particles are buffeted by the thermal vibration of molecules), special relativity (which pertains to motion at nearly maximal speeds), and the mathematical relation between mass and energy. This last paper included what is without a doubt the best-known equation of all time, E = mc². It is famous both for its brevity and for its Promethean recipe for the release of earth-shattering amounts of energy resulting from the splitting or fusion of small concentrations of atomic nuclei. Understanding these phenomena required measurements of matter interacting with light (atomic spectra) and other kinds of particle beam radiation.
The quantum revolution has made an enormous impact on chemistry and materials sciences. These effects can be roughly divided into nuclear and electronic phenomena. The radius of an atomic nucleus is about 100,000 times smaller than the radius of the surrounding electron cloud that defines most of the atom’s chemical properties. Quantum-mechanical breakthroughs concerning electronic bonding have been fundamental to understanding the nature and strength of chemical reactions and interactions. Exploration of the nuclear realm enabled the introduction of radioisotopes, which have been crucial for many advances in biochemistry and medicine as well as for the dating of ancient specimens and for DNA sequencing. The use of stable isotopes and mass spectrometry extended such studies considerably. The diffraction of X rays from material objects ranging from simple salts to organelles like the ribosome, and information from the interactions of nuclear spins in nearby atoms in NMR, have provided a window into the world at atomic resolution, paving the way for precise molecular engineering.
Regenesis Page 22