A Beautiful Math

Home > Other > A Beautiful Math > Page 17
A Beautiful Math Page 17

by Tom Siegfried


  Using math to study networks is not entirely new, though. In fact, network math goes back at least to the 18th century, when the Swiss mathematician Leonhard Euler gave intellectual birth to the field with his analysis of a network of bridges in Königsberg, in eastern Prussia. In the mid-20th century, Paul Erdös and Alfréd Rényi developed the math to describe networks in their most abstract representation—essentially dots on paper connected by lines. The dots are known as nodes (or sometimes vertices); the lines are officially called edges, but are more popularly referred to as links. Such drawings of dots and lines are technically known as graphs, so traditional network math is known as graph theory.2

  A graph's dots and edges can represent almost anything in real life. The nodes may be any of various objects or entities, such as people, or companies, or computers, or nations; the links may be wires connecting machines, friendships connecting people, common film appearances connecting movie actors, or any other common property or experience. People, of course, belong to many different kinds of networks, such as networks of family, networks of friends, networks of professional collaborators. There are networks of people who share common investments, political views, or sexual partners.

  Traditional graph theory math does not do a very good job of describing such networks, though. Its dots and lines resemble real networks about as much as a scorecard resembles a baseball field. The scorecard does record all the players and their positions, but you won't get much of an idea of what baseball is like from reading the scorecard. Same with graphs. Standard graph math describes fixed networks with nodes connected at random, whereas in the real world, networks usually grow, adding new parts and new connections, while perhaps losing others—and not always at random. In a random network, every node is an equal among many, and few nodes get much more or less than a fair share of links. But in many real-world networks, some nodes possess an unusually high number of links. (In a network of sexual partners, for example, some people have many more "links" than average—an important issue in understanding the spread of HIV.) And real networks form clusters, like cliques of close friends.

  Erdös and Rényi knew full well that their dots and lines did not capture the complexities of real-world networks. As mathematicians, they didn't care about reality—they developed their model to help understand the mathematical properties of random connections. Describing random connections was a mathematically feasible thing to do; describing all the complexities of real-world networks was not. Nobody knew how to go about doing it.

  But then a paper appearing in the British journal Nature began to change all that. Looking back, the birth of network mania can be dated to June 4, 1998, when Duncan Watts and Steven Strogatz published a brief paper (taking up only two and a half Nature pages) called "Collective Dynamics of ‘Small-World' Networks."3

  NETWORK MANIA

  A few years later, when I met Strogatz at a complexity conference, I asked him why networks had become one of math's hottest topics in the late 1990s. "I think our paper started it," he said. "If you ask me when did this really start, I think it started in 1998 when our paper appeared in Nature on what we called small-world networks."

  So I quizzed Strogatz about that paper's origins. It really was a case of culture preparing the conditions for the advance of science.

  "When Watts and I started our work in 1995 or so, we were very aware of the whole Kevin Bacon thing, and we had heard of six degrees of separation, and the movie had come out of that play," said Strogatz. "So it was in the air."4

  Of course, Kevin Bacon didn't revolutionize science totally on his own. The Bacon game became famous just about the time that the public became aware of the Internet, thanks to the arrival of the World Wide Web.

  "I think the Web got us thinking about networks," Strogatz said. Not only was the Web a high-profile example of a vast, elaborate network, it made many other networks accessible for research. Web crawlers and search engines made it possible to map out the links of the Web itself, of course, and the Web also made it possible to catalog other large networks and store them for easy access (the database of movie actors being one prime example). Data on the metabolic reactions in nematode worms or the gene interactions in fruit flies could similarly be collected and transmitted.

  "Big databases became available, and researchers could get their hands on them," Strogatz observed. "People started to think about things as networks." Before that, he said, even actual networks weren't usually viewed in network terms—the electric power network was known as a grid, and you were just as likely to hear the term telephone "system" as telephone network. "We didn't think of them so much as networks," Strogatz said. "I don't think we had the visceral sensation of moving through a network from link to link."

  With the Web it was different. It was almost impossible to think of it as a whole. You had to browse, link by link. And the Web touched all realms of science, linking specialists of all sorts with network ideas. "In many different branches of science," Strogatz observed, "the kind of thinking that we call network thinking started to take hold."

  Still, the revolution in network math did not begin until after the Watts-Strogatz paper appeared in 1998. They showed how to make a model of a "small-world" network, in which it takes only a few steps on average to get from any one node of the network to any other. Their model produced some surprises that led to a flurry of media coverage and the subsequent network mania. But Strogatz thinks some of those surprises have been misrepresented as being responsible for network math's revival. Some experts would say, for example, that the Watts-Strogatz paper's major impact stemmed from identifying the small-world nature of some particular real-world networks. Others have suggested that "clustering" of links (small groups of nodes connected more than randomness would suggest) was the key discovery. "This is to me the bogus view of what was important about our paper," Strogatz said. "The reason it caught on, I think, is because we were the first to compare networks from different fields and find that there were similar properties across fields."

  In other words, diverse as networks are, many share common features that can be described in a mathematically precise way. Such commonalities gave people hope that network math could be more than a tedious chore of sorting out links in one kind of network and then moving on to the next. Instead, it seemed, general laws of networks might be possible, enabling accurate forecasts of how different kinds of networks would grow, evolve, and behave—chemical networks like proteins in cells, neural networks like nerve cells in brains, or social networks, such as actors in movies or stock traders in the economy.

  SMALL WORLDS

  One of the key common features of different networks is that many of them do in fact exhibit the small-world property. When a network's nodes are people, for instance, small worlds are the rule. So discovering the rules governing small-world networks may be the key to forecasting the social future.

  Watts and Strogatz uncovered the small-world nature of certain networks by focusing on networks intermediate between those that were totally regular or totally random. In a regular network (what is often called a regular lattice), the nodes are connected only to their nearby neighbors. For an ultra-simple example, think of a series of nodes arranged in a circle. The dots representing the nodes are connected to their immediate neighbors on either side by the line representing the circle.

  Regular network

  For a more elaborate (but still regular) network, you could connect the dots to their second-nearest neighbors as well. Each node would then be connected to four others—two neighbors on each side.

  In a random network, on the other hand, some nodes would be connected to many others, some maybe connected to only one. Some nodes would be linked only to other nodes nearby; some would be connected to nodes on the other side of the circle; some would be connected both to neighbors and to distant nodes. It would look like a mess. That's what it means to be random.

  In a random network, it is usually easy to find a relatively shor
t path from one node to any other, thanks to the random long-range links making connections across the circle. Regular networks, though, are not so easy to navigate. To get from one side of the circle to the other, you have to take the long way around, via linked neighbors.

  Random network

  But what happens, Watts and Strogatz wondered, with an "in-between" network—neither completely regular nor completely random? In other words, suppose you started with a regular network and then added just a few links at random between other nodes. It turned out that if even just a tiny percentage of the links created shortcuts between distant nodes, the new intermediate network would be a small world (that is, you could get anywhere in the network in a small number of steps). But that intermediate network retains an important feature of the regular network—its nearby nodes are still more highly connected than average (that is, they are "clustered"), unlike random networks where clustering is mostly absent.

  The mathematical existence of graphs combining these properties of random and regular networks was nice, if not necessarily important. But the fact that you needed very few shortcuts to make the network world small implied that small-world networks might be common in nature. Watts and Strogatz tested that possibility on three real-world examples: the film actors' network starring Kevin Bacon, the electrical power grid in the western United States, and the network of nerve cells in the tiny roundworm C. elegans.5 In all three cases, these networks exhibited the small-world property, just like the models of hypothetical networks that were intermediate between regular and random.

  Intermediate (small-world) network

  "Thus," Watts and Strogatz concluded, "the small-world phenomenon is not merely a curiosity of social networks nor an artifact of an idealized model—it is probably generic for many large, sparse networks found in nature."6

  If so (and it was), Watts and Strogatz had opened a new frontier for mathematicians and physicists to explore, where all sorts of important networks could be analyzed with a common set of tools. In just the way that statistical physics made it possible to tame the complexities of a jumble of gas molecules, mathematicians could use similar math to compute a network's defining properties. And just as all gases, no matter what kinds of molecules they contained, obeyed the same gas laws, many networks observed similar mathematical regularities. "Everybody pointed out, isn't this remarkable that these totally different networks have these properties in common—how would you have ever thought that?" Strogatz said.

  Several network features can be quantified by numbers analogous to the temperature and pressure of a gas, what scientists call the parameters describing a system. The average number of steps to get from any one node to any other—the "path length"—is one such parameter. Another is the "clustering coefficient"—how likely two nodes are to be directly linked if they are both linked to a third. A relatively high clustering rate is one of the counterintuitive features of small-world networks. The short path length in small-world networks is similar to the situation in random networks. On the other hand, the high clustering coefficient is completely unlike that in random networks, but is more similar to that in regular networks.

  This clustering property (you could call it the measure of "cliquishness") is especially of interest in social networks. Since my sister Sue, for instance, has a friend Debby and a friend Janet, it is more likely than average that Debby and Janet also know each other. (They do.) "There is a tendency to form triangles, and you wouldn't see that in random networks," Strogatz pointed out.

  Besides clustering coefficient and path length, another critical number is the average number of links connecting one node to another, known as the degree coefficient. (The "degree" of a node is the number of other nodes it is linked to.) As a node in the actor network, Kevin Bacon would be ranked very high in degree, being connected to so many others. Being well connected, after all, is what makes the average path length between Bacon and other actors so short. But in a shocking development, it turned out that Bacon is far from the most connected of actors. Taking the average number of steps to link to another actor as the gauge, he doesn't even rank in the top 1,000!

  It turns out, in fact, that Bacon's true importance for networks had nothing to do with how special he is, but rather how typical he is. Many actors, like Bacon, serve as "hubs" connecting lots of other members of the acting community. And the existence of such hubs turns out to be a critical common feature of many real-world networks.

  THE POWER OF SCALE FREEDOM

  As of mid-2004, the actor leading the list as "most connected" (based on the average number of steps to link him with all the other actors) was Rod Steiger, at 2.679 steps. Bacon, at 2.95, ranked 1,049th. (Still, as Bacon is more connected than 99 percent of all actors, he does qualify as an important hub.) In second place was Christopher Lee (2.684), followed by Dennis Hopper (2.698). Donald Sutherland ranked 4th. Among women, the most connected was Karen Black, in 21st place.

  It's such highly connected actors, or hubs, that make it easy to get from any actor to any other in just a few steps. Choose any two actors at random. You can probably connect them in three steps or less. And it would be unusual to need more than four. If you search and search, you can find a few who would require more steps, but that is likely only if you deliberately choose actors who would not be very connected, like someone who appeared in only one film. (And remember, I said you need to choose two at random.)

  So for example, just off the top of my head I'll say Basil Rathbone (because I saw a Sherlock Holmes movie last night) and Lindsay Lohan (no reason—I will not admit to having seen Herbie: Fully Loaded). These two actors are from entirely different eras, and Lohan is young and has been in relatively few films. But you can link them in only three steps. An aging Rathbone appeared in Queen of Blood (1966) with the pre–Easy Rider Dennis Hopper. Hopper was in Knockaround Guys with Bruce McFee, who appeared in Confessions of a Teenage Drama Queen (2004) with Lohan. The short path between Rathbone and Lohan was made possible by the hub provided by Hopper, who is, in fact, much more connected then Bacon.7 (Hopper is connected directly to 3,503 other actors, about 1,500 more than Bacon.)

  Hubs like Hopper make the actor network a small world. They make getting from one node to another easy, in much the way that major airport hubs like Chicago's O'Hare or Dallas–Fort Worth unite the smaller airports in airline networks so you can get from one town to another without too many plane changes.

  Such large hubs generally do not exist, though, in either random or regular networks. In a regular network, every node has the same number of links, so there are no hubs. In a random network, shortcuts exist, but they might be very hard to find because prominent hubs would be very rare. In random networks, any one node (actor or airport) is as likely to be linked as much as any other, so most of them are linked to about the same degree. Only a few would have a lot more links than average, or a lot less. If actors were linked randomly, their rankings by number of links would form a bell curve, with most of them close to the middle. But in many small-world networks, there is no such "typical scale" of the number of links.

  Such distributions—with no typical common size—are known as "scale free." In scale-free networks, many lonely nodes will have hardly any connections at all, some nodes will be moderately well connected, and a few will be superconnected hubs. To mathematicians and physicists, such a scale-free distribution is a sure sign of a "power law."

  In a groundbreaking paper published in Science in 1999, Réka Albert and Albert-László Barabási of Notre Dame University noted the scale-free nature of many kinds of networks, and consequently the usefulness of power laws for describing them. The revelation that networks could be described by power laws struck a responsive chord among physicists. (They "salivate over power laws," Strogatz says—apparently because power law discoveries in other realms of physics have won some Nobel Prizes.)

  Power laws describe systems that include a very few big things and lots of little things. Cities, for example. There
are a handful of U.S. cities with populations in the millions, a larger number of medium-sized cities in the 100,000 to a million range, and many, many more small towns. Same with earthquakes. There are lots of little earthquakes, too weak to notice; a fewer number of middling ones that rattle the dishes; and a very few devastating shocks that crumble bridges and buildings.

  In their Science paper, Barabási and Albert showed how the probability that a node in a scale-free network is linked to a given number of other nodes diminishes as the number of links increases. That is to say, scale-free networks possess many weakly linked nodes, fewer with a moderate number of links, and a handful of monsters—like Google, Yahoo, and Amazon on the World Wide Web. Nodes with few links are common, like small earthquakes; nodes with huge numbers of links are relatively rare, like huge earthquakes. And the rate at which the probability of links goes down is quantified by a power law, just like the math describing the distribution of city or earthquake sizes. In other words, a good theory of networks should explain not only how Kevin Bacon (or Dennis Hopper) can be so connected, but also why networks are analogous to earthquakes.

 

‹ Prev