The Internet of Us

by Michael P. Lynch


  The other reason educators have been wary of MOOCs is that some see them as hastening what we might call the Walmarting of the university. As I noted above, a hallmark of the global economy is cheaper goods, produced and sold by poorly compensated workers, made possible by amazing models of distribution. This trend has been dominating education as well. According to a leading study of the American professoriate, in 1969 over three-quarters of faculty at American colleges and universities were in reasonably well-paid and stable tenure-track positions.17 By 2009, that number had almost flipped, with only about one-third of faculty now being tenure-track. In short, most students are now taught by temporary workers who are largely not unionized and paid well below the minimum wage. The worry that many have had about MOOCs is that they will only exacerbate this process, should universities (as some initially proposed to do) replace their own course offerings with MOOCs purchased for their students from other entities.

  Whether that will come to pass is hard to say. MOOCs are in their infancy, and their path is hard to predict. But it’s doubtful that MOOCs are the biggest changes looming in education due to technology. Instead, those changes will likely come more directly via the Internet of Things. As I stated at the outset of this section, the big questions concern the more obvious fact: what do you make of education when people have all the “facts” at hand? If you had neuromedia, you’d be able to access tons of information about history, philosophy, mathematics, art, etc. You’d have dates and names at your disposal, just as you do now on your phone. You’d Google-know all sorts of stuff—that is, you’d have potential receptive knowledge, as I’ve put it. And the more Google-knowledge we have, the greater the “room” in students’ minds, you might think, for more important stuff.

  This isn’t only a modern problem. The use of technology to outsource mental activities is hardly new. At one time, calculators were verboten in math classrooms; not anymore. Similarly, students today routinely access the Internet during instruction, and often do so in an interactive way designed and monitored by the instructor. (I’ve done this in my own courses.) Let’s also remember that libraries have long provided huge riches of knowledge for those who want them. Thus the question, “Why go to college if you have neuromedia?” is not much different from the question (one I took seriously myself as a know-it-all youth), “Why go to college when you have a library?”

  You already know the answer to that one. In the ideal world, if not always the reality, we go to college to find pilots who can guide us across the vast seas of knowledge. We need them to tell us what is already charted and what still remains to be charted. Such guides shouldn’t merely make us more receptive knowers; they should aim to make us more reflective, reasonable ones and, what’s more, they should help us to understand.

  8

  Understanding and the Digital Human

  Big Knowledge

  Google knows us so well that it finishes our sentences. This program, known to any user of the Internet, is called Google Complete. Search as I just did for “Web 3.0 and . . .” and Google will suggest “big data” and “education”; search for “knowledge and . . .” and you might get “power” and “information systems.” Complete is a familiar, if rather gentle, form of big data analysis. It works because Google knows not only what much of the world is searching for on the Web, but also what you’ve been searching for. That data is useless without Google’s proprietary analytic tools for transforming the numbers and words into a predictive search. These predictions aren’t perfect. But they are amazingly good, and getting better all the time.
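
  To make the idea concrete, here is a minimal sketch of prefix-based completion (a toy Python example with invented queries and counts, not a description of Google’s actual system), showing how a search can be “finished” simply by ranking past searches that begin the same way:

```python
from collections import Counter

# Invented sample of past searches (a stand-in for the world's queries plus your own history).
search_log = [
    "web 3.0 and big data",
    "web 3.0 and education",
    "web 3.0 and big data",
    "knowledge and power",
    "knowledge and information systems",
]

# Count how often each full query has been issued.
query_counts = Counter(search_log)

def complete(prefix, k=3):
    """Suggest the k most frequent past queries that begin with the prefix."""
    matches = [(q, n) for q, n in query_counts.items() if q.startswith(prefix)]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return [q for q, _ in matches[:k]]

print(complete("web 3.0 and"))    # ['web 3.0 and big data', 'web 3.0 and education']
print(complete("knowledge and"))  # ['knowledge and power', 'knowledge and information systems']
```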

  Google has done more than perhaps any other single high-profile company or entity to usher in the brave new world of big data. As I noted in the first chapter, the term “big data” can refer to three different things. The first is the ever-expanding volume of data being collected by our digital devices. The second is analytical tools for extracting information from that data. And the third is the firms like Google that employ them.

  One of the lessons of previous chapters is that big data and our digital form of life, while sometimes making it easier to be a responsible and reasonable believer, often make it harder as well—while at the same time setting up conditions that make reasonable belief more important than ever before. The same thing could be said for understanding—except even more so. And that’s important, because understanding is what keeps the “human” in what I earlier called the digital human.

  The End of Theory?

  In 2008, Chris Anderson, then editor of Wired, wrote a controversial and widely cited editorial called “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete.” Anderson claimed that what we are now calling big data analytics was overthrowing traditional ways of doing science:

  This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves. . . . Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.1

  Traditional scientific theorizing aims at model construction. Collecting data is just a first step; to do good science, you must explain the data by constructing a model of how and why the phenomenon in question occurred as it did. Anderson’s point was that the traditional view assumes that the data is always limited. That, he says, is the assumption big data is overthrowing.

  In 2013, the data analytics expert Christian Rudder (a cofounder of the dating website OkCupid) echoed Anderson’s point. In talking about the massive amount of information that OkCupid (and other) dating sites collect, Rudder writes:

  Eventually we were analyzing enough information that larger trends became apparent, big patterns in the small ones, and even better, I realized I could use the data to examine taboos like race by direct inspection. That is, instead of asking people survey questions or contriving small-scale experiments, which was how social science was often done in the past, I could go and look at what actually happens, when, say, 100,000 white men and 100,000 black women interact in private.2

  Anderson and Rudder’s comments are not isolated; they bring to the surface sentiments that have been echoed across discussions of analytics over the last few years. While Rudder has been particularly adept at showing how huge data gathered by social sites can provide eye-opening correlations, and data scientists and companies the world over have been harvesting a wealth of surprising information using analytics, Google remains the most visible leader in this field. The most frequently cited, and still one of the most interesting, examples is Google Flu Trends. In a now-famous journal article in Nature, Google scientists compared the 50 million most common search terms used in America with the CDC’s data about the spread of seasonal flu between 2003 and 2008.3 What they learned was that forty-five search terms could be used to predict where the flu was spreading—and do so in real time, as they did with some accuracy in 2009 during the H1N1 outbreak.

  Google Flu Trends—which we will look at again below—is really only an extension of the design methods behind Google’s main search engine. Its algorithms (and their creators) don’t know why one page is what you want rather than another; they just apply mathematical techniques to find patterns in incoming links. That’s all. Similarly, Google Flu Trends doesn’t care why people are searching as they do; it just correlates the data. And Walmart doesn’t care why people buy more Pop-Tarts before a hurricane, nor do insurance companies care why certain credit scores correlate with certain medication adherences; they care only that they do. As Viktor Mayer-Schönberger and Kenneth Cukier put it, “predictions based on correlations lie at the heart of big data. Correlation analyses are now used so frequently that we sometimes fail to appreciate the inroads they have made. And the uses will only increase.”4
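
  Here is a minimal sketch of what “just correlating the data” amounts to. The weekly numbers are invented and the method is far cruder than anything behind the real Google Flu Trends, but it shows how a search term’s volume can be correlated with reported cases and turned into a prediction without any hypothesis about why the two move together (Python 3.10’s standard statistics module is enough):

```python
import statistics

# Invented weekly numbers: searches for a flu-related term, and the flu cases
# a health agency later reported for the same weeks.
searches = [120, 150, 310, 420, 800, 950, 700, 400]
cases    = [ 10,  14,  30,  45,  85, 100,  72,  41]

# Pearson correlation: how tightly the two series move together.
r = statistics.correlation(searches, cases)

# A least-squares line turns this week's search volume into a case estimate,
# with no hypothesis about why the link holds.
slope, intercept = statistics.linear_regression(searches, cases)
this_week_searches = 600
predicted_cases = slope * this_week_searches + intercept

print(f"correlation r = {r:.3f}")
print(f"predicted cases for this week: about {predicted_cases:.0f}")
```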

  Does the use of big data in this way, however, really signal the end of theory, as Anderson alleged? The answer is no. And, as we’ll see, that is a very good thing.

  Start with Rudder and Anderson’s remarks. As Rudder puts it, big data seems to allow us to investigate by direct inspection. We don’t have to look through the lens of a model or theory; we can let the numbers speak for themselves. Big data brings us to the real-life correlations that exist, and because those correlations are so perfectly . . . well, correlated, we can predict what happens without having to worry about why it happens.

  But can we ever look at the “data in itself” without presupposing a theory? In The Structure of Scientific Revolutions, Thomas Kuhn famously argued that you cannot: data is always “theory-laden.” His point was that there is no direct observation of the world that isn’t at least somewhat affected by prior observations, experiences and the beliefs we’ve formed as a result. These beliefs in turn set up expectations. In short, theory permeates data.

  This operates even at the level of deciding what experimental techniques or devices to employ. As Kuhn put it, “consciously or not, the decision to employ a particular piece of apparatus and to use it in a particular way carries an assumption that only certain sorts of circumstances will arise. There are instrumental as well as theoretical expectations.”5 In support of the claim, Kuhn cited the now-classic 1949 article by Bruner and Postman on perceptual incongruity. Bruner and Postman showed their subjects playing cards, some of which had abnormalities (a spade card was red, for example).6 What they found was that, primed with ordinary cards, respondents identified the abnormal cards as perfectly normal; their expectations seemingly affected what they saw. The last seventy-five years of psychology have only underlined the lesson (if not necessarily the letter) of Bruner and Postman’s experiment. What you believe can affect what you observe.

  Rudder and Anderson may well protest that they don’t mean to deny Kuhn’s point. They aren’t worried about perceptual observations but about mathematics. But even when it comes to mathematical correlations detected by mindless programs, our theoretical assumptions will matter: they will determine how we interpret those correlations as meaningful and, most importantly, what we do with them.

  A trivial example of how assumptions can matter in this way occurs in Rudder’s book. When discussing a well-known data map that tracks the “emotional epicenter” of an earthquake by looking at Twitter reactions, Rudder notes that “Knowing nothing else about a quake, if it were your job to distribute aid to victims, the contours of the Twitter reaction would be a far better guide than the traditional shockwaves around an epicenter model.”7 Maybe so; and the data map, and others like it, certainly are interesting. But Rudder’s point here rests on some key assumptions. First, it assumes that aid workers won’t be concerned about aftershocks (which will be better predicted by models employing geological and geographical data). Second, it assumes that all types of quakes generate equally explicable Twitter reactions. (What happens, for example, if people are too injured to type?) Third, it not only assumes that people have equal access to smartphones, but that their first priority is to tweet rather than rescue the injured. In an extreme quake, the emotional epicenter as charted by Twitter may be far away from the point of truest need. My point is not to overhype what is a passing remark in a much longer work; it is to show that data correlations themselves are useful only under certain background assumptions. And where do those assumptions ultimately come from? Theory.

  Another word for this is context. Without it, correlations can be as misleading as they are informative. A recent and extremely striking example is art historian Maximilian Schich’s video map of cultural history (reported in Science, with an accompanying video posted by Nature).8 Schich and his colleagues, employing data gleaned from Freebase (a huge set of data owned by, who else, Google), used the mathematical techniques of network analysis to map what they referred to as the development of cultural history. After collecting data about the locations and times of the births and deaths of 150,000 “notable” people over 2,000 years, they made a video map of the data (with births in blue and deaths in red). What resulted showed how, over time, “culture” moved and migrated—sometimes, it seems according to the map, clustering around certain cities (Paris, a center of red) and sometimes more widely distributed. The video is arresting (if you haven’t seen it, google “Schich and cultural history”). The idea, Schich said, was to show that you could do in history what is done in the sciences: use data to show actual correlations rather than relying on armchair theorizing.
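
  To see the kind of aggregation such a map rests on, here is a toy sketch (invented records, not Schich’s Freebase data or his actual network-analysis code) in which each “notable” person contributes one edge from birth place to death place, and cities are ranked by the notable deaths they attract:

```python
from collections import Counter

# Invented records of (name, birth place, death place). A real study would draw
# hundreds of thousands of these from a biographical database.
people = [
    ("A", "Rome",   "Paris"),
    ("B", "Vienna", "Paris"),
    ("C", "Rome",   "Paris"),
    ("D", "Paris",  "London"),
]

# Weighted "migration" edges: birth city -> death city, counted across lives.
edges = Counter((birth, death) for _, birth, death in people)

# Cities ranked by the notable deaths they attract: the kind of signal that
# makes a city like Paris show up as a red cluster on the video map.
death_hubs = Counter(death for _, _, death in people)

print(edges.most_common())       # [(('Rome', 'Paris'), 2), (('Vienna', 'Paris'), 1), (('Paris', 'London'), 1)]
print(death_hubs.most_common())  # [('Paris', 3), ('London', 1)]
```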

  But Schich’s data map relies on a host of assumptions. A good deal of discussion of the video on Twitter and elsewhere following its release concerned the Eurocentric nature of the map. The notable figures chosen by Schich were almost entirely white, European and male (and in many cases, possessing some wealth). This makes the widely viewed video—which talks about cultural history simpliciter—not just striking but strikingly cringeworthy at points. In fairness, Schich was well aware of this bias; the researcher’s point, as he noted, was to use the available data to discover patterns in broadly European cultural history.

  Yet Schich’s assumptions don’t stop at race, gender and ethnicity, nor are they all the products of available data. Some of his assumptions are about how to define “culture.” Schich’s map suggests that culture is driven by notable figures (from scientists to movie stars). But is what used to be called the “great man” theory the only or best way to understand what shapes cultural change? What about economics or politics, for example? Other assumptions concern how the drift of culture is measured. Why think that where someone died has more predictive value for cultural development than where they spent their most productive years? Descartes, for example, died in Sweden, but he spent most of his productive life in the Netherlands. Once again, theoretical assumptions drive work in big data as much as they do in any other field. Kuhn would not be surprised.

  None of this diminishes the importance of network analyses as tools for research, including fields typically not associated with data, like history. It’s a growing and exciting mode of research across anthropology, literature, the digital arts and the humanities. But as the historian and digital humanities scholar Tom Scheinfeldt has remarked, this work is only as good as the theoretical context in which we place it.9

  So, like it or not, we can’t do data analytics without theory. It’s what gives us the context in which to pose questions and interpret the correlations we discover. But we should like theory; the process of theorizing employs a composite of cognitive capacities, ones that when employed together make up understanding, another way of knowing that is important to human beings.

  Understanding Understanding

  Suppose you want to learn why your apple tree is not producing good apples. You google it and the first website you look at (for example, the ACME Apple Research Center) is a source of scientific expertise on apples. It tells you the correct answer, call it X. But there are many other websites (e.g., the nefarious Center for Apple Research) that came up during your search and would have told you the wrong answer, and many others (e.g., MysticApples.com) that would have given you the right answer but for the wrong reasons. So you could have easily been wrong about X or right about it but for the wrong reasons.10

  Silly as it is, this example replicates how we now know much of what we know, as I’ve been pointing out in this book. We know by Google-knowing. Not that there is anything wrong with that. After all, in the above case, we are being responsible in believing X, based on an investigation and on the basis of a reliable source.11 In several ways, then, you could be said to know X. And for most purposes, that’s good enough. Yet it is clear that something valuable can sometimes go missing even when you go about the process responsibly. Sometimes we need to know more than the facts; sometimes we want to understand. And it is our capacity to understand that our digital form of life often undersells, and which more data alone can’t give us.

  Understanding is a complex form of knowing, one that has several facets or elements. The first is that understanding isn’t piecemeal; it involves seeing the whole. For example, think about the difference between knowing a pile of individual facts about some subject, theory or person and actually understanding that subject or theory or person. Understanding involves knowing not just the facts, but also how or why something is the case. You understand more about the Civil War if you understand why and how it came about; you understand string theory if you understand why it predicts certain events; you understand a person to the extent that you don’t just know that she is unhappy, but what makes her unhappy. In each of these cases you are going beyond mere data to grasp something deeper and more profound.

  The philosopher Stephen Grimm, who has thought as much about this topic as anyone recently, has pointed out that there is something in common between understanding how something is and why it is.12 In both cases, he argues, we “grasp” or “see” not just individual elements, but the structure of the whole. This sounds grand, and it can be, as when we understand how a particular equation works or why a great historical event occurred. But it can also happen on a smaller scale. Consider, for example, the lucky person who understands how her car works. She has this understanding in part because she has certain skills, skills that give her the ability to see how various parts of a mechanism depend on one another: you can’t get the car to move without the battery and the battery won’t be charged without the alternator. You understand when you see not just the isolated bits, but how those bits hang together. Similarly with understanding why. When we understand why something is the case, such as why a certain disease spreads or why your friend is unhappy, or why a given apple tree produces good apples, we grasp various relationships. These relationships are what allow us to see the difference between possibilities, between one hypothesis and another.

 
