When We Are No More


by Abby Smith Rumsey


  In 2000, an experimental physicist at the Lawrence Berkeley National Laboratory learned by chance about the problem of at-risk audio recordings. Carl Haber was at work “developing devices to image the tracks of subatomic particles which would be created at the Large Hadron Collider.” He realized the same noncontact imaging could be turned on fragile recordings: what about “playing” a disc by creating a map of its surface and converting the image into sound waves? No contact, no damage. The sound could even be cleaned up, amplified, and noise-reduced. Working with sound recording engineers and preservationists, Haber and his research partners developed a cluster of technologies, called IRENE (Image, Reconstruct, Erase Noise, Etc.), that image sound in both 2-D and 3-D (the latter useful for cylinders). It is now possible to hear the voices of Native Americans performing tribal rites that had been lost to their descendants, and the voice of Alexander Graham Bell himself as he tests out his new machine. The forensic imagination means that there are now almost limitless possibilities for the extraction of information from any piece of matter, no matter how fragile.
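
  To make the idea concrete, here is a minimal sketch, in Python, of the final step of such a pipeline: turning a one-dimensional groove-displacement trace (already extracted from the surface image) into a playable sound file. The trace format, the sample rate, and the use of groove velocity as the signal are illustrative assumptions, not a description of IRENE’s actual software.

    import numpy as np
    from scipy.io import wavfile

    def trace_to_audio(displacement, out_path, sample_rate=44100):
        # One displacement sample per image column, left to right.
        trace = np.asarray(displacement, dtype=np.float64)
        # On a lateral-cut disc the recorded signal corresponds to groove
        # velocity, so differentiate the displacement (a simplification;
        # a real transfer also needs equalization and noise filtering).
        velocity = np.gradient(trace)
        peak = float(np.max(np.abs(velocity))) or 1.0
        pcm = np.int16(velocity / peak * 32000)  # scale to 16-bit PCM
        wavfile.write(out_path, sample_rate, pcm)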

  BIG SCALE AND CONTESTED RIGHTS

  The old paradigm of memory was to transfer the contents of our minds onto a stable, long-lasting object and then preserve the object. If we could preserve the object, we could preserve our knowledge. This does not work anymore. We cannot simply transfer the content of our minds to a machine that encodes it all into binary script, copy the script onto a tape or disk or thumb drive (let alone a floppy disk), stick that on the shelf, and expect that fifty years from now, we can open that file and behold the contents of our minds intact. Chances are that file will not be readable in five years, far less fifty, if we do not check periodically to see whether it has been corrupted or whether the data need to be migrated to fresher software. The new paradigm of memory is more like growing a garden: Everything that we entrust to digital code needs regular tending, refreshing, and periodic migration to make sure that it is still alive, whether we intend to use it in a year, one hundred years, or maybe never. We simply cannot be sure now what will have value in the future. We need to keep as much as we can as cheaply as possible.
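
  In daily preservation practice, that “regular tending” has a concrete form: fixity checking, in which each file’s checksum is periodically recomputed and compared against a stored manifest, so that silent corruption is caught while a good copy still exists. A minimal sketch in Python, assuming a hypothetical JSON manifest that maps relative file paths to SHA-256 digests:

    import hashlib
    import json
    from pathlib import Path

    def checksum(path, algo="sha256", chunk=1 << 20):
        # Hash the file in chunks so large files do not exhaust memory.
        h = hashlib.new(algo)
        with open(path, "rb") as f:
            while block := f.read(chunk):
                h.update(block)
        return h.hexdigest()

    def audit(manifest_path):
        # Return the files whose current digest no longer matches the
        # manifest -- candidates for repair from a replica.
        root = Path(manifest_path).parent
        manifest = json.loads(Path(manifest_path).read_text())
        return [rel for rel, digest in manifest.items()
                if checksum(root / rel) != digest]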

  But what is possible? Such a model defies the fundamental proposition that whatever we do in the digital realm needs to be able to scale up almost infinitely. To be able to store petabytes and exabytes and other orders of magnitude of data for long periods of time, we will have to invent ways to essentially freeze-dry data, to store data at some inexpensive low level of curation, and at some unknown time in the future be able to restore it—to add water and grow, not unlike the little party favors made of tightly wadded pieces of colored paper that, when you put them in a glass and fill it with water, expand to become villages or buildings or gardens or animals. Although this represents a serious challenge for computer science, it may be possible to store not the full file, but the instructions on how to re-create the file, just as the genome does not store an animal or plant itself, but the instructions on how to grow a particular animal or plant. Until such a long-term strategy is worked out, preservation experts focus on keeping digital files readable by migrating data to new hardware and software systems periodically. Even though this looks like a short-term strategy, it has been working well for relatively simple text, image, and numeric formats like books, photographs, and tables for three decades and more.
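
  The genome analogy can be made concrete in miniature: store a compact “recipe” sufficient to regenerate the artifact on demand, rather than the artifact itself. The generator and its parameters below are invented for illustration:

    import json
    import random

    # Store a tiny recipe instead of the (large) generated artifact.
    recipe = {"generator": "random_walk", "seed": 42, "steps": 100_000}

    def regrow(recipe):
        # Deterministically re-create the artifact from its instructions.
        rng = random.Random(recipe["seed"])
        x, values = 0, []
        for _ in range(recipe["steps"]):
            x += rng.choice((-1, 1))
            values.append(x)
        return values

    artifact = regrow(recipe)
    print(len(json.dumps(recipe)), "bytes of recipe,",
          len(artifact), "values regrown")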

  The need for scale presents a second problem, of course, and that is how we make sense of all these data. This is a hard problem, but not insuperable. It will be solved by machine intelligence that identifies patterns in a set of data and thereby determines the context in which the data “make sense.” (This is essentially how brains figure out what they perceive, by comparing real-time perceptions with those stored in memory to determine what something means and what its significance is.) Only machines can read computer data, and only lots of machines can read data at scale. But machines are extensions of us, tools that are simply the means to ends of our choosing. We are the ones who will design, build, program, and run the machines. We will decide how they get used, by whom, and for what purposes. And we will be the ones who must make sense of the results they give us. That said, we can imagine making sense of declarative memory—facts, figures, and whatever can be rendered in binary code—as immensely complicated but still tractable. How machines encode affective memory, denoting emotional salience, ambivalence, ambiguity, even something as simple in natural language as a double entendre, is a different order of complexity.
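
  A toy sketch of that kind of pattern matching, echoing the brain analogy above: compare a new observation against stored prototypes (the “memories”) and adopt the context of the nearest one. The feature vectors and context labels are invented for the example:

    import numpy as np

    # Stored "memories": prototype feature vectors for known contexts.
    prototypes = {
        "text":  np.array([0.9, 0.1, 0.0]),
        "image": np.array([0.1, 0.8, 0.1]),
        "audio": np.array([0.0, 0.1, 0.9]),
    }

    def nearest_context(observation):
        # Compare the new perception with each stored memory and return
        # the context whose prototype it most resembles.
        return min(prototypes,
                   key=lambda c: np.linalg.norm(observation - prototypes[c]))

    print(nearest_context(np.array([0.05, 0.15, 0.85])))  # -> "audio"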

  Beyond the problem of sheer scale, there are formidable social, political, and economic challenges to building systems that effectively manage an abundance of data, of machines, and of their human operators. These are not technical matters like storage and artificial intelligence that rest in the hands of computer scientists, engineers, and designers. They are social. Digital infrastructure is not simply hardware and software, the machines that transmit and store data and the code in which data are written. It comprises the entire regime of legal and economic conditions under which these machines run—such as the funding of digital archives as a public good, creating a robust and flexible digital copyright regime, crafting laws that protect privacy for digital data but also enable personal and national security, and an educational system that provides lifelong learning for digital citizens. We need to be competent at running our machines. But much more, we need to understand how to create, share, use, and ultimately preserve digital data responsibly, and how to protect ourselves and others from digital exploitation.

  In similar fashion, each organization, society, laboratory, photography studio, medical practice, architectural firm, law practice, financial services company, and so on is now obliged to manage its own corporate files. In professions with legal and fiduciary obligations to maintain specified types of data, data management and archiving systems are standard operating procedure now. Commercial firms—particularly those in the business of selling “creative content” (music, films, video games) protected by copyright—have complete control over the long-term fate of what they produce, despite what is clearly in many cases the strong public interest in preserving that content and making it available after its commercial value is fully exhausted. At present, such companies have no financial incentives to hand off their cultural assets to an institution that can ensure their long-term public access. Economic and tax policies can be used in these cases to ensure the continued growth and support of public domain cultural materials.

  The copyright clause of the U.S. Constitution was created by the founding legislators to provide incentives for creators to circulate their ideas in the marketplace. In 1787, its framers made provision “to promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” The first federal copyright act introduced financial incentives to publish by granting the copyright holder exclusive rights to disseminate a work for fourteen years. Monetary incentives were deemed more democratic than the patronage of the church, the aristocracy, or royalty. Copyright law then coevolved with new technologies and new institutions, usually a step or two behind the facts of innovation. The role of public access to information through libraries and archives also evolved and became more critical as the United States matured and absorbed more immigrants into the population. In light of the increasing importance of libraries to the economic, political, and cultural life of the nation, the copyright code was updated to grant exemptions allowing libraries to lend copyrighted content to the public and to create preservation copies to ensure continued access.

  Yet in the digital age, the fundamental mission of libraries and archives to preserve and make knowledge accessible is at risk because there is no effective exemption from copyright law that covers the specific technical preservation needs of digital data. It is unrealistic to assume that in market capitalism, publishers, movie studios, recording companies, and other commercial enterprises will preserve their content after it loses its economic and commercial value or becomes part of the public domain. This is what libraries do. Nor is there a provision in the copyright code, comparable to the one that exists now for printed materials, that allows libraries to lend their books, films, and audio recordings digitally.

  The World Wide Web is not a library. It is a bulletin board. That was how it was originally designed by the computer scientist Tim Berners-Lee—to be a neutral medium of exchange—and that is what it continues to be. It will be a challenge to re-create the traditional public library online, because a public library exists in large part to provide access to contemporary copyrighted materials. The current copyright law, in particular the provisions that allow public libraries to lend their materials, does not apply to digital books or any other content in digital form. This means that any material that is under copyright, even if held by the library, cannot be put online without express permission of the copyright owner. This includes for all intents and purposes everything, in all formats, including audiovisual material, that has been published in the United States since 1923. Given the recent history of copyright extension, this means the twentieth century may be dark for quite a long time. Any material created since 1923 and still under copyright protection in 1998, when the Sonny Bono Copyright Term Extension Act was passed, will not enter the public domain until at least 2019 and possibly later than that.

  The scope of the necessary modifications to the legal regime of copyright and contract law is clear enough, though they will be slow to effect because they require balancing the interests of private parties against those of the public domain, a balance that only the U.S. Congress can strike. Far less clear are the economic models that will support the value of information—both short-term and long-term—including the cost of creating, using, and keeping it available for use. Much of the content on the open web, for example, has been contributed “for free,” which really means that the user pays no transaction fee directly to the creator to use the content. It does not mean that the expert who contributes an article to Wikipedia, the scholar who writes a blog, and the member of the general public who joins the crowdsourcing project to curate old menus put online by the New York Public Library do not contribute valuable time and labor. The very norms that decide who owns what information—even which information is public and which private—are still a matter of great debate. Some have suggested that each person who contributes information to the open web should be reimbursed by some micropayment every time it is used for commercial purposes. In truth, there is no consensus yet on how to determine the economic value of online information (other than the reliable rule that data are worth what someone will pay for them). This is particularly true of the kind of data contributed by users who upload information freely for one purpose—contributing to a review site such as Yelp or placing a classified ad on a bulletin board such as Craigslist—only to have that information used by a third party for different purposes, such as demographic analysis that might be marketed. The privacy policy of these sites will tell users what they can expect to happen with the data they contribute, so the choice to add data or not rests, in the end, with the contributor. That said, the scope for abuse is vast and may over time discourage the contribution of content for public dissemination.

  IN THE MEANTIME …

  Building resilient and ubiquitous digital memory systems will take time. It will require concerted investments of human and financial capital to model and test approaches. There will be near misses along the way, but failure can be very instructive. There will be social, political, economic, and legal wrangling as people and corporations scramble to secure rights and revenue streams before they even know which business models will support growth and which will crush it. We are still in the early days of the digital era. The best, if not the only, way to understand the powers and limitations of the technology is to use it. In the meantime, until the arrangements work themselves out, the opportunity for individuals to make a difference will be almost unlimited.

  Because of the distributed nature of the web, it is easier than ever to be a collector. There are no culturally agreed-upon norms regarding which content has value in the digital age and which does not, so individuals and small organizations that collect at scale today will help to determine the value and authenticity of today’s content in the future. There are new populations of digital historians, librarians, and archivists who are aggressively collecting and preserving online information they think has long-term value. One of the earliest examples of such foresight began within days after September 11, 2001. People were rushing online to express their feelings and share information about events as they witnessed them. A pioneering group of historians at the Center for History and New Media at George Mason University immediately set up a site to solicit personal testimonies about the attacks of 9/11, collecting raw, rare, and invaluable eyewitness accounts. This early example of crowdsourced documentation was gathered and put in good order by historians at the center, then transferred to the Library of Congress for posterity. This was the first digital collection acquired by the Library of Congress, most fittingly a collection born of allying private and public purposes and entrusted to the stewardship of a publicly supported library.

  Great collections are made by great collectors. Between 1996 and 2014, the Internet Archive collected and preserved over 450 billion web pages. To a large extent the archive serves as a preview of what research libraries in the future may become, collecting and preserving for future access vast amounts of publicly available digital content. In addition to preserving significant portions of web communications, the Internet Archive enables people to archive personal digital collections. It also allows free uploading of digital files, and it scans books, films, television, and all manner of analog materials to broaden access to them. National libraries, archives, and research institutions around the globe that are in the collecting business have created a consortium to coordinate and broaden the scope of their digital collecting. But so far no other organization has accomplished what this small nonprofit organization has. The Internet Archive is a classic start-up—nimble, opportunistic, driven by strong ambition and breathtaking vision, the very model of the Internet culture itself in its not-for-profit mode.

  Crowdsourced collecting depends on the open web, and like the American West in the nineteenth century, what was once a frontier of open range is closing fast. Not only is the web being fenced in by commercial entities, but to an alarming degree, it is being ignored and altogether sidelined as more digital content is distributed through closed proprietary systems and apps that are wedded to specific operating systems, software, and hardware, a model pioneered by Apple for music on mobile devices and quickly imitated by Amazon for e-book readers. To some extent, these changes are part of a battle for market share among several major technology companies, and the market will eventually sort out which services consumers want the most, how they want them delivered, and what they are willing to pay for them. Already we have seen mobile devices take the place of desktop computers, laptops, cameras, music players, landline telephones, maps, atlases, watches, journals, and newspapers for delivering information.

  In Jefferson’s vision, access to organized knowledge is necessary to promote the progress and well-being of humanity. In the developed world, market capitalism plays important roles, but long-term investment in the public good is not one of them. Google boasts that it organizes knowledge for the world. But the vision of Jefferson and the Founders proposes that the organization of access to knowledge be a public utility, wholly owned by the people for the purpose of self-government. Unless there is a handoff from the private entities that have the power to create, disseminate, and own content to the long-lived nonprofit institutions capable of providing stewardship, it will be hard to avoid collective amnesia in the digital age.

  PART THREE

  WHERE WE ARE GOING

  At the present moment, something new, and on a scale never witnessed before, is being born: humanity as an elemental force conscious of transcending Nature, for it lives by memory of itself, that is, in History.

  —CZESLAW MILOSZ, “ON HOPE,” 1982

  CHAPTER TEN

  BY MEMORY OF OURSELVES

  Time and accident are committing daily havoc on the [documents] deposited in our public offices. The late war has done the work of centuries in this business. The last cannot be recovered, but let us save what remains; not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident.

  —THOMAS JEFFERSON TO EBENEZER HAZARD, 1791

  THE PREDICTABLE UNPREDICTABILITY OF THE FUTURE

  Much as Thomas Jefferson had expected, today scientists, engineers, and technologists play leading roles in advancing knowledge and making it useful in our daily lives. In the twenty-first century, we command powers to release massive amounts of energy by splitting an atom and alter the script of life by splicing genes. Just as Jefferson anticipated, every advance in knowledge has brought with it new powers and new responsibilities. He built a library to ensure access to knowledge so that we may better govern ourselves. But we outgrew Jefferson’s library, with its driving ambition to comprehend all of human knowledge by gathering printed volumes into one place, over a century ago. As the volume of information demanded by technology proliferates, it is increasingly difficult for us to know what we know, let alone take responsibility for it. How will digital memory shape our world in the next fifty years?

 
