Info We Trust

by R J Andrews


  Numero-ideographic tablets used in ancient Sumerian accounting represented discrete transactions. Each circle equals 10 in this 5,000-year-old Uruk example: 12 + 19 = 31.

  Crops beget surpluses, which beget markets. Trade depends on people trusting one another. Individually, people are likely to forget (or manipulate) what exactly was traded. Arguments between traders do not make good markets. So, to ease friction, traders recorded agreements by documenting them at the time of the trade.

  Memories were frozen by pressing them into clay. Then, these clay tablets were made public, or published. Abby Smith Rumsey highlights that these tablets, unlike traders, were unable to lie or forget. She calls them objective witnesses, truth tokens, and warrants of trust.

  Clay tablets are a curiously durable method of information storage; a library fire simply hardens them.

  An army of bureaucrats grew to manage clay records. Tablets helped tax subjects, sustain communities through drought, raise armies, trade with neighbors, and glorify civilization. These records of counting were stored in vast centralized library warehouses. And data just sat in racks, curated by castes of scholars, for a long time.

  The Way of the Future

  In January of 1662, a London peddler named John Graunt published Observations on the Bills of Mortality. Hundreds of years later, we now look back and see that his book was a notable surge in a deluge that has since swallowed up the world.

  The application of figures of arithmetic to “the condition and prospects of society,” dates from those early times when families first clustered together and grew into tribes, and tribes into nations; when cities and fortified places came into existence; and men, impelled by very natural motives, took to measuring their wealth and strength. This they did by counting their tents, their herds and flocks, their camels, horses, cattle, sheep, goats, and slaves; their warriors, arms, and war chariots; their money and articles of barter. They even traveled so far on the road to development as to take censuses of their populations, and to make muster-rolls of their fighting men; and they were precise in their statements of the number of victims by their plagues and pestilences.

  WILLIAM A. GUY, 1885

  In the 1600s, London suffered from incessant plagues of disease. John Graunt went to work to help King Charles II understand what was going on. First, Graunt gathered proxy birth and death data from a mess of records scattered across London's parishes. His book began with neat tables of local christenings and burials going back to the prior century. The cause of death was included when available. After organizing the newly uniform demographic data, excerpted below, Graunt included some written observations.

  A TABLE of the CHRISTENINGS and MORTALITY
  For the Year 1645 and 1646

  Weeks.  Days of the Month.  Christ.  Bur.  Plague
  1       Dec. 25.            125      205   6
  2       January 1.          143      217   2
  3       8.                  161      171   0
  4       15.                 138      241   2
  5       22.                 165      200   0
  6       29.                 160      202   1
  7       Febr. 5.            146      192   0
  8       12.                 152      188   5
  9       19.                 157      171   1
  10      26.                 128      155   1

  The word state is all wrapped up in how we talk about the condition of things: the political state, status, stature, standing, statement. State is a perfect root word for the field that works with data: statistics, from Italian statista, one skilled in statecraft.

  Graunt's Observations on the Bills of Mortality was a success. The book propelled its author into the prestigious Royal Society. The data tables established a foundation for future demographers. A new science of population study emerged by analyzing Graunt's data and refining his approach. Probability, the mathematics of chance, was once transfixed by the odds of gambling dice and playing cards. Now, it had a new domain to explore. Probability had empirical numbers from the real world. Probability had data.

  John Graunt's tables are a notable wave in an ever-surging flood of data. Since then, data has swept many domains of life into modernity. Insurance funds emerged from studying population records. Physical sciences materialized from the numbers and diagrams in laboratory logbooks. Performance measurement rapidly improved mechanical contraptions. These machines helped an economic revolution roar out of English mines. Statecraft hatched a new political science called statistics. This is just a sample of what data helped create, all before 1800. Data, the capture and organization of facts for analysis, has ever since helped advance civilization.

  As knowledge increases amongst mankind, and transactions multiply, it becomes more and more desirable to abbreviate and facilitate the modes of conveying information from one person to another, and from one individual to the many.

  WILLIAM PLAYFAIR, 1786

  We can no longer rely on the skills we have honed over millennia to manage our knowledge by managing physical objects, be they papyrus scrolls or paperback books. Instead we must learn to master electrical grids, computer code, and the massive machines that create, store, and read our memory for us.

  ABBY SMITH RUMSEY, 2016

  Information inventions create periods of information inflation. These technology impulses include Mesopotamian writing, Greek libraries, and European movable type. Each new technology outsourced more of our memory to “ever more durable, compact, and portable” objects. Storage evolved from tablets, to scrolls, to books, to microfilm. Every time, the information surge overpowered our ability to manage information. Archaic institutions groaned under the weight of information inflation. When traditional institutions failed, people became disoriented. The public had to figure out for itself what information was trustworthy. Crises of authority ensued.

  The cost of producing and caring for clay tablets (or scrolls, books, or microfilm) used to limit the rate of data production. Objects always forced us to decide what we wanted to save. Today, the latest digital information inflation has uncoupled data from physical objects. Today, we have arrived again at a world with more data than we know how to manage.

  The utility of new information, in a strict sense, is quantifiable. It is a foundational pillar of statistics, according to Stephen Stigler, who asks: more evidence is better than less, but how much better? The central limit theorem, first clearly defined by Pierre-Simon Laplace, implies a root-n relationship: if you wish to double the accuracy of an investigation, it is insufficient to double the effort; you must increase the effort fourfold.
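
  The root-n relationship can be made concrete with a small sketch. It assumes that "accuracy" means the standard error of a sample mean, which shrinks as sigma divided by the square root of n. The function names and the spread of 10.0 are illustrative assumptions, not anything taken from Stigler or Laplace.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return sigma / math.sqrt(n)


def observations_needed(n: int, accuracy_factor: float) -> int:
    """Observations needed to shrink the standard error by accuracy_factor.

    Because the error falls with sqrt(n), the effort grows with the
    square of the desired improvement.
    """
    return math.ceil(n * accuracy_factor ** 2)


sigma = 10.0  # hypothetical spread of individual measurements

# Doubling the effort (100 -> 200) does not halve the error;
# quadrupling it (100 -> 400) does.
for n in (100, 200, 400):
    print(n, round(standard_error(sigma, n), 3))

print(observations_needed(100, 2))
```

  Under these assumptions, going from 100 to 200 observations trims the error by only about 29 percent, while halving it takes 400.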

  New data often arrives to us by way of tremendous exertion. At the dawn of the 1900s, Annie and Walter Maunder voyaged around the world to observe solar eclipses. One hundred years later, their efforts are mirrored by the struggle to put the enormous James Webb Space Telescope into orbit. On the ground, remote corners are photographed for street maps. Cosmic telescopes and streetview-harvesting vehicles are both visibly impressive data producers. But efforts are not always so showy. Smartphones quietly archive a trove of data about every user: what you browse, where you go, and who you know.

  A river cannot, we are told, rise above its source.

  DARRELL HUFF, 1954

  Incentives nudge us toward better knowledge of the world. New data is at the heart of much value creation. Today, tech entrepreneurs create new data to connect additional slices of reality to the Internet. Interesting research papers often apply rote analysis to clever data sets. Journalists fight hard for a good scoop, that is, first access to data. Unique, untouched, never-before-seen data is the foundation from which new insights and competitive advantages flow. These advantages can be secured if you generate new data yourself. I have spent years working on new datasets before attempting any serious visualization. Storytelling icing is delicious, but only if you are frosting the finest data cake.

  Immersive Data

  Objects and anecdotes that survive from the first data pioneers afford only a glimpse into their worlds. History lost most of the context that surrounded early data like Graunt's tables. Today, we can consider a much richer context. We have the power to expand what gets considered when we analyze modern data.

  If, for example, it is a gasoline account, read books on oil geology and the productions of petroleum products. Read the trade journals in the field. Spend Saturday mornings in service stations, talking to motorists. Visit your client's refineries and research laboratories.…

  DAVID OGILVY, 1985

  Do not expect data to tell you all the questions it has answers to. You must seek many perspectives if you are to exploit every line of inquiry. The entire world the data arrives from is a player in the story, whether we acknowledge it or not. Invite it in. Trust it as a collaborator. Let the world speak. Let it guide you to a more truthful understanding. To probe data origins is to develop a feel for the world the data comes from, sometimes before even opening the dataset. Probing reveals who is involved and how they think about their world. It also gives insight into what did not make it into the data set. What hard decisions had to be made to capture this data? How was it recorded?

  We live in an era of social science, and have become accustomed to understanding the social world in terms of “forces,” “pressures,” “processes,” and “developments.” It is easy to forget that those “forces” are statistical summaries of the deeds of millions of men and women who act on their beliefs in pursuit of their desires. The habit of submerging the individual into abstractions can lead not only to bad science (it's not as if the “social forces” obeyed Newton's laws) but to dehumanization.

  STEVEN PINKER, 2014

  Ideally, you are able to talk to the people who know the data and the data's world. A conversation might begin by trying to understand their needs and concerns. Questions that invite interviewees to visualize help me learn. If I were to follow you, what would I see? Could you show me how you think about this?

  People who know the data's world always have something powerful to teach. It is my responsibility as interviewer to unearth the lesson. Pursue incomplete responses with curiosity. Are there other reasons? Can you give me an example of that? Why do you think, feel, need? Probing and open questions with no easy yes/no answer encourage interviewees to reveal more context. Can you tell me something you know to be true that the data does not show? What do you know that you wish your boss's boss knew? That is all I have to ask. Is there anything you would like to add?

  Human centered design [emphasizes] solving the right problem, and doing so in a way that meets human needs and capabilities.

  DON NORMAN, 2013

  To know the human side of data is to fortify yourself against cold summaries that dehumanize it. I try to observe a real patient's experience whenever I work with health-care data. These visits are always worth the effort. I get vivid encounters with patients, clinicians, and facilities. They help correct my misconceptions and move me to better understand how the data works. Stay too long behind a computer screen and the reality behind each anonymous patient ID begins to fade. An in-person journey into data origins gives a sensory experience that you can reference across data work.

  We are ready to question the impersonality of a merely technical approach to data, and to begin designing ways to connect numbers to what they really stand for: knowledge, behaviors, people.

  GIORGIA LUPI, 2017

  Unfortunately, these journeys are not always practical. You cannot always experience a day in the life of your Census, social media, or consumer data. But you can make some effort. Watch videos online, pick up the phone, read a few trending articles. Get to know the data's world however you can.

  Trips into data origins go by many labels. “Design thinking” encourages one to become emotionally intimate with a problem's environment. Described as human centered, its curiosity for new insights powers energetic observations, exchanges, and iterations. The design thinker darts through the world of the problem. A journalist's trip into data origins is part of their investigative reporting. Reporting on the shifting economy might include interviews with people affected. Their testimony can give a nuanced sensitivity to what macroeconomic trends actually mean. Similarly, a consumer product manager with widget performance data might test similar gadgets in their own daily life. To them, this is competitive research. Whatever you label it, consider threading your own life experiences through the world the data comes from. You will learn more.

  Engineers and businesspeople are trained to solve problems. Designers are trained to discover the real problems. A brilliant solution to the wrong problem can be worse than no solution at all: solve the correct problem. Good designers never start by trying to solve the problem given to them: they start by trying to understand what the real issues are.

  DON NORMAN, 2013

  Toward Information

  Imagine a messy pile of books in the middle of a classroom floor. Curious, you approach the pile, pick one book out, and leaf through its pages. Setting it back down, you begin to wonder what other titles are lurking in the confused jumble. Then, you notice a nearby empty bookshelf, and decide to clean up the mess. But how should the books be arranged? They could be simply placed on the shelf in the order you pick them off the floor. But perhaps they should be sequenced by author name or date of publication. Finally, you decide to catalog them by topic and dive in. Once done, you stand back to admire your work.

  The foundation of an edifice is of vast importance. Still, it is not the foundation but the structure built upon the foundation which gives the result for which the whole work was planned. As the cathedral is to its foundation so is an effective presentation of facts to the data.

  WILLARD C. BRINTON, 1914

  Facts are incomplete without context. … “compared to what?” gives it power.

  HOWARD WAINER, 1997

  The original messy pile of books was random, unordered, and formless. Shelved, the individual books did not change at all. But their arrangement has changed quite a lot. You have put the books in formation. Through sorting the books, you learned a lot about what the pile contains. Now, you can read all the spines. The titles are more accessible. You can also see which topics are better represented by seeing which shelves are more full. This recognition and inspection would be impossible if we were to continue rummaging through the messy pile book by book. The shelved arrangement creates a more navigable collection for anyone to explore.

  To see an object in space is to see it in context.

  RUDOLF ARNHEIM, 1969

  We can imagine an almost infinite number of messy piles of the same collection of books, all equally frustrating. But there are comparably few useful arrangements that afford access and show patterns. Information involves uncommon arrangements. Unlike any messy heap, the shelved form is not likely to come about by chance. Similarly, a pile of stones is not a house. The arrangement matters.

  So far, we have only discussed data conceptually. We haven’t looked at data directly yet because that requires something more. To see data we must have information, data in a form consumable by people. So, let's begin with some data.

  \x4e\x61\x6d\x65\x2c\x32\x30\x30\x30\x2c\x32\x30\x31

  \x30\x0a\x41\x6e\x63\x72\x61\x6d\x2c\x31\x35\x31\x33

  \x2c\x31\x35\x37\x33\x0a\x41\x75\x73\x74\x65\x72\x6c

  \x69\x74\x7a\x2c\x31\x34\x35\x33\x2c\x31\x36\x35\x34

  Name,2000,2010

  Ancram,1513,1573

  Austerlitz,1453,1654

  The above snippet of data is encoded in Unicode, the global standard for handling text. Unicode was designed in the late 1980s to be backwards compatible with the previous encoding standard, ASCII, which developed from telegraph code. Why bother stripping all the way down to coded data? It lets us distinguish data from information. The Unicode is data, but it is not in a form we can consume. Form underpins our ability to interact with data. We take a big first step out of data chaos by decoding that data to text. The little squiggle symbols of letters and numerals snap data into information. Thankfully, our programs use a lookup table to decode data like this into familiar forms. These data are populations in the county I grew up in, recorded in the 2000 and 2010 U.S. Census. Spaces can make it even easier to appreciate this information.

  Name        2000  2010
  Ancram      1513  1573
  Austerlitz  1453  1654

  Spacing helps the eye distinguish each value and keep track of which ones are located in the same column. Just like that, visual design begins to help better inform. Tabulated data is hardly the ultimate design, but it is a basic reminder of how form is elemental to any information. Better information is possible because better forms for data are possible. We can keep going.
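
  The two steps just described, decoding bytes into text and then spacing the text into columns, can be sketched in a few lines of Python. This is only an illustration. It assumes the bytes are UTF-8 (every byte here also falls in the ASCII range), and the helper names decode and tabulate are made up for the example.

```python
# The coded data from the snippet, written as a Python bytes literal.
raw = (
    b"\x4e\x61\x6d\x65\x2c\x32\x30\x30\x30\x2c\x32\x30\x31"
    b"\x30\x0a\x41\x6e\x63\x72\x61\x6d\x2c\x31\x35\x31\x33"
    b"\x2c\x31\x35\x37\x33\x0a\x41\x75\x73\x74\x65\x72\x6c"
    b"\x69\x74\x7a\x2c\x31\x34\x35\x33\x2c\x31\x36\x35\x34"
)


def decode(data: bytes) -> str:
    """Look up each byte sequence and return the characters it stands for."""
    return data.decode("utf-8")  # assumption: the bytes are UTF-8


def tabulate(text: str) -> str:
    """Pad comma-separated values with spaces so the columns line up."""
    rows = [line.split(",") for line in text.splitlines()]
    # Each column is as wide as its longest value.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


text = decode(raw)
print(text)            # comma-separated text
print(tabulate(text))  # the same values, aligned in columns
```

  Nothing about the values changes between the two print calls. Only the form does, which is the point of the passage.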

  The word table references the stone tablets that information used to be inscribed upon, often in compact lists, such as in a book's table of contents. Tabulate means to put in the form of a table. The typewriter's tabulator helped make indentations, or tabs.

 
