The Theory That Would Not Die
Wrestling with the mathematics of probability in 1773, he reflected on its philosophical counterpoint. In a paper submitted and read to the academy in March, the former abbé compared ignorant mankind, not with God but with an imaginary intelligence capable of knowing All. Because humans can never know everything with certainty, probability is the mathematical expression of our ignorance: “We owe to the frailty of the human mind one of the most delicate and ingenious of mathematical theories, namely the science of chance or probabilities.”7
The essay was a grand combination of mathematics, metaphysics, and the heavens that Laplace held to his entire life. His search for a probability of causes and his view of the deity were deeply congenial. Laplace was all of one piece and for that reason all the more formidable. He often said he did not believe in God, and not even his biographer could decide whether he was an atheist or a deist. But his probability of causes was a mathematical expression of the universe, and for the rest of his days he updated his theories about God and the probability of causes as new evidence became available.
Laplace was struggling with probability when one day, ten years after the publication of Bayes’ essay, he picked up an astronomy journal and was shocked to read that others might be hot on the same trail. They were not, but the threat of competition galvanized him. Dusting off one of his discarded manuscripts, Laplace transformed it into a broad method for determining the most likely causes of events and phenomena. He called it “Mémoire on the Probability of the Causes Given Events.”
It provided the first version of what today we call Bayes’ rule, Bayesian probability, or Bayesian statistical inference. Not yet recognizable as the modern Bayes’ rule, it was a one-step process for moving backward, or inversely, from an effect to its most likely cause. As a mathematician in a gambling-addicted culture, Laplace knew how to work out the gambler’s future odds of an event knowing its cause (the dice). But he wanted to solve scientific problems, and in real life he did not always know the gambler’s odds and often had doubts about what numbers to put into his calculations. In a giant and intellectually nimble leap, he realized he could inject these uncertainties into his thinking by considering all possible causes and then choosing among them.
Laplace did not state his idea as an equation. He intuited it as a principle and described it only in words: the probability of a cause (given an event) is proportional to the probability of the event (given its cause). Laplace did not translate his theory into algebra at this point, but modern readers might find it helpful to see what his statement would look like today:
$$P(C \mid E) = \frac{P(E \mid C)}{\sum_{C'} P(E \mid C')}$$
where P(C|E) is the probability of a particular cause (given the data), and P(E|C) represents the probability of an event or datum (given that cause). The sum in the denominator, written with the sigma sign, runs over every possible cause and makes the probabilities of all the causes add up to one.
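For readers who prefer to see the principle at work, here is a minimal sketch in Python. It assumes, as Laplace did in 1774, that every cause is equally likely beforehand. The function name and the two-urn example are illustrative inventions, not taken from his mémoire.

```python
from fractions import Fraction

def probability_of_causes(likelihoods):
    """Return P(cause | event) for each cause, assuming equally likely causes.

    `likelihoods` maps each candidate cause to P(event | cause).
    """
    total = sum(likelihoods.values())  # the sum over all causes
    return {cause: value / total for cause, value in likelihoods.items()}

# Hypothetical example: one urn holds 3/4 white tickets, the other 1/4.
# The observed event is a single white ticket.
likelihoods = {
    "mostly_white_urn": Fraction(3, 4),
    "mostly_black_urn": Fraction(1, 4),
}
posterior = probability_of_causes(likelihoods)
print(posterior)
```

Dividing by the total is what guarantees that the probabilities of the causes add up to one.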
Armed with his principle, Laplace could do everything Thomas Bayes could have done—as long as he accepted the restrictive assumption that all his possible causes or hypotheses were equally likely. Laplace’s goal, however, was far more ambitious. As a scientist, he needed to study the various possible causes of a phenomenon and then determine the best one. He did not yet know how to do that mathematically. He would need to make two more major breakthroughs and spend decades in thought.
Laplace’s principle, the proportionality between probable events and their probable causes, seems simple today. But he was the first mathematician to work with large data sets, and the proportionality of cause and effect would make it feasible to make complex numerical calculations using only goose quills and ink pots.
In a mémoire read aloud to the academy, Laplace first applied his new probability of causes to two gambling problems. In each case he understood intuitively what should happen but got bogged down trying to prove it mathematically. First, he imagined an urn filled with an unknown ratio of black and white tickets (his cause). He drew a number of tickets from the urn and, based on that experience, asked for the probability that his next ticket would be white. Then in a frustrating battle to prove the answer he wrote no fewer than 45 equations covering four quarto-sized pages.
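The urn problem has a compact modern answer, usually called the rule of succession: if every ratio of white to black tickets is assumed equally likely at the outset, then after drawing w white and b black tickets the chance that the next one is white is (w + 1)/(w + b + 2). The sketch below checks this with exact fractions. The function names are illustrative, and the closed-form integral is the standard textbook one rather than Laplace's own route through his 45 equations.

```python
from fractions import Fraction
from math import factorial

def beta_integral(a, b):
    """Exact integral of x**a * (1 - x)**b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def prob_next_white(white, black):
    """P(next ticket is white | draws so far), assuming all ratios equally likely."""
    return beta_integral(white + 1, black) / beta_integral(white, black)

print(prob_next_white(3, 1))  # 2/3, the same as (3 + 1) / (3 + 1 + 2)
```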
His second gambling problem involved piquet, a game requiring both luck and skill. Two people start playing but stop midway through the game and have to figure out how to divide the kitty by estimating their relative skill levels (the cause). Again, Laplace understood instinctively how to solve the problem but could not yet do so mathematically.
After dealing with gambling, which he loathed, Laplace moved happily on to the critical scientific problem faced by working astronomers. How should they deal with different observations of the same phenomenon? Three of the era’s biggest scientific problems involved gravitational attraction on the motions of our moon, the motions of the planets Jupiter and Saturn, and the shape of the Earth. Even if observers repeated their measurements at the same time and place with the same instrument, their results could be slightly different each time. Trying to calculate a midvalue for such discrepant observations, Laplace limited himself to three observations but still needed seven pages of equations to formulate the problem. Scientifically, he understood the right answer—average the three data points—but he would have no mathematical justification for doing so until 1810, when, without using the probability of causes, he invented the central limit theorem.
Although Bayes originated the probability of causes, Laplace clearly discovered his version on his own. Laplace was 15 when the Bayes-Price essay was published; it appeared in an English-language journal for the English gentry and was apparently never mentioned again. Even French scientists who kept up with foreign journals thought Laplace was first and congratulated him wholeheartedly on his originality.
Mathematics confirms that Laplace discovered the principle independently. Bayes solved a special problem about a flat table using a two-step process that involved a prior guess and new data. Laplace did not yet know about the initial guess but dealt with the problem generally, making it useful for a variety of problems. Bayes laboriously explained and illustrated why uniform probabilities were permissible; Laplace assumed them instinctively. The Englishman wanted to know the range of probabilities that something will happen in light of previous experience. Laplace wanted more: as a working scientist, he wanted to know the probability that certain measurements and numerical values associated with a phenomenon were realistic. If Bayes and Price searched for the probability that, on the basis of today’s puddles, it had rained yesterday and would rain tomorrow, Laplace asked for the probability that a particular amount of rain would fall and then refined his opinion over and over with new information to get a better value. Laplace’s method was immensely influential; scientists did not pay Bayes serious heed until the twentieth century.
Most strikingly of all, Laplace at 25 was already steadfastly determined to develop his new method and make it useful. For the next 40 years he would work to clarify, simplify, expand, generalize, prove, and apply his new rule. Yet while Laplace became the indisputable intellectual giant of Bayes’ rule, it represented only a small portion of his career. He also made important advances in celestial mechanics, mathematics, physics, biology, Earth science, and statistics. He juggled projects, moving from one to another and then back to the first. Happily blazing trails through every field of science known to his age, he transformed and mathematized everything he touched. He never stopped being thrilled by examples of Newton’s theory.
Although he was fast becoming the leading scientist of his era, the academy waited five years before electing him a member on March 31, 1773. A few weeks later he was formally inducted into the world’s leading scientific organization. His mémoire on the probability of causes was published a year later, in 1774. At the age of 24, Laplace was a professional researcher. The academy’s annual stipend, together with his teaching salary, would help support him while he refined his research on celestial mechanics and the probability of causes.
Laplace was still grappling with probability in 1781, when Richard Price visited Paris and told Condorcet about Bayes’ discovery. Laplace immediately latched onto the Englishman’s ingenious invention, the starting guess, and incorporated it into his own, earlier version of the probability of causes. Strictly speaking, he did not produce a new formula but rather a statement about the first formula assuming equal probabilities for the causes. The statement gave him confidence that he was on the right track and told him that as long as all his prior hypotheses were equally probable, his earlier principle of 1774 was correct.8
Laplace could now confidently marry his intuitive grasp of a scientific situation with the eighteenth century’s passion for new and precise scientific discoveries. Every time he got new information he could use the answer from his last solution as the starting point for another calculation. And by assuming that all his initial hypotheses were equally probable he could even derive his theorem.
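The idea of feeding each answer back in as the next starting point can be sketched in a few lines of Python. Everything specific here is assumed for illustration: the nine candidate values for the chance of a boy, the four made-up birth records, and the function names.

```python
from fractions import Fraction

def likelihood(hypothesis, observation):
    """P(observation | hypothesis), where hypothesis is the chance of a boy."""
    return hypothesis if observation == "boy" else 1 - hypothesis

def update(belief, observation):
    """One round: weight each hypothesis by the new datum, then renormalize."""
    unnormalized = {h: p * likelihood(h, observation) for h, p in belief.items()}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}

# Start with equal probabilities for every hypothesis (0.1, 0.2, ..., 0.9).
hypotheses = [Fraction(k, 10) for k in range(1, 10)]
belief = {h: Fraction(1, len(hypotheses)) for h in hypotheses}

# Made-up observations; the answer from each round is the next starting point.
for observation in ["boy", "girl", "boy", "boy"]:
    belief = update(belief, observation)

most_probable = max(belief, key=belief.get)
print(most_probable)  # 7/10
```

Updating one record at a time gives exactly the same final probabilities as processing all four records in a single step, which is why the order in which evidence arrives does not matter here.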
As Academy secretary, Condorcet wrote an introduction to Laplace’s essay and explained Bayes’ contribution. Laplace later publicly credited Bayes with being first when he wrote, “The theory whose principles I explained some years after, . . . he accomplished in an acute and very ingenious, though slightly awkward, manner.”9
Over the next decade, however, Laplace would realize with increasing clarity and frustration that his mathematics had shortcomings. It limited him to assigning equal probabilities to each of his initial hypotheses. As a scientist, he disapproved. If his method was ever going to reflect the actual state of affairs, he needed to be able to differentiate dubious data from more valid observations. Calling all events or observations equally probable could be true only theoretically. Many dice, for example, that appeared perfectly cubed were actually skewed. In one case he started by assigning players equal probabilities of winning, but with each round of play their respective skills emerged and their probabilities changed. “The science of chances must be used with care and must be modified when we pass from the mathematical case to the physical,” he counseled.10
Moreover, as a pragmatist, he realized he had to confront a serious technical difficulty. Probability problems require multiplying numbers over and over, whether tossing coin after coin or measuring and remeasuring an observation. The process generated huge numbers—nothing as large as those common today but definitely cumbersome for a man working alone without mechanical or electronic aids. (He did not even get an assistant to help with calculations until about 1785.)
Laplace was never one to shrink from difficult computations, but, as he complained, probability problems were often impossible because they presented great difficulties and numbers raised to “very high powers.”11 He could use logarithms and an early generating function that he considered inadequate. But to illustrate how tedious calculations with big numbers could be, he described multiplying 20,000 × 19,999 × 19,998 × 19,997, etc. and then dividing by 1 × 2 × 3 × 4 up to 10,000. In another case he bet in a lottery only to realize he could not calculate its formula numerically; the French monarchy’s lottery drew its winning numbers five at a time from a pool of 90.
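To get a feel for the size of the product and quotient Laplace described, one can add logarithms instead of multiplying, which is the shortcut he had available. The following is a rough sketch; the function name is an illustrative choice, and the term-by-term sum is only one of several ways to do this.

```python
import math

def log10_binomial(n, k):
    """log10 of n * (n-1) * ... * (n-k+1) divided by 1 * 2 * ... * k."""
    return sum(math.log10(n - i) - math.log10(i + 1) for i in range(k))

log_value = log10_binomial(20_000, 10_000)
digits = math.floor(log_value) + 1
print(f"roughly 10^{log_value:.2f}, a number with {digits} digits")
```

The result is a whole number more than six thousand digits long, which gives some sense of why the calculation was out of reach with quill and ink.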
Such big-number problems were new. Newton had calculated with geometry, not numbers. Many mathematicians, like Bayes, used thought experiments to separate real problems from abstract and methodological issues. But Laplace wanted to use mathematics to illuminate natural phenomena, and he insisted that theories had to be based on actual fact. Probability was propelling him into an unmanageable world.
Armed with the Bayes–Price starting point, Laplace broke partway through the logjam that had stymied him for seven years. So far he had concentrated primarily on probability as a way to resolve error-prone astronomical observations. Now he switched gears to concentrate on finding the most probable causes of known events. To do so, he needed to practice with a big database of real and reliable values. But astronomy seldom provided extensive or controlled data, and the social sciences often involved so many possible causes that algebraic equations were useless.
Only one large amalgamation of truly trustworthy numbers existed in the 1700s: parish records of births, christenings, marriages, and deaths. In 1771 the French government ordered all provincial officials to report birth and death figures regularly to Paris; and three years later, the Royal Academy published 60 years of data for the Paris region. The figures confirmed what the Englishman John Graunt had discovered in 1662: slightly more boys than girls were born, in a ratio that remained constant over many years. Scientists had long assumed that the ratio, like other newly discovered regularities in nature, must be the result of “Divine Providence.” Laplace disagreed.
Soon he was assessing not gambling or astronomical statistics but infants. For anyone interested in large numbers, babies were ideal. First, they came in binomials, either boys or girls, and eighteenth-century mathematicians already knew how to treat binomials. Second, infants arrived in abundance and, as Laplace emphasized, “It is necessary in this delicate research to employ sufficiently large numbers in view of the small difference that exists between . . . the births of boys and girls.”12 When the great naturalist Comte de Buffon discovered a small village in Burgundy where, for five years running, more girls had been born than boys, he asked whether this village invalidated Laplace’s hypotheses. Absolutely not, Laplace replied firmly. A study based on a few facts cannot overrule a much larger one.
The calculations would be formidable. For example, if he had started with a 52:48 ratio of newborn boys to girls and a sample of 58,000 boys, Laplace would have had to multiply .52 by itself 57,999 times—and then do a similar calculation for girls. This was definitely not something anyone, not even the indomitable Laplace, wanted to do by hand.
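Even a modern computer cannot carry out this multiplication naively. In the sketch below, which uses only the two figures from the example above, ordinary floating-point arithmetic collapses the product to zero, while the logarithm stays perfectly manageable.

```python
import math

ratio = 0.52     # share of boys, from the example in the text
count = 58_000   # number of boys, from the example in the text

direct = ratio ** count                    # underflows to 0.0 in 64-bit floats
log10_value = count * math.log10(ratio)    # the same quantity on a log scale

print(direct)
print(f"0.52 multiplied out is about 10^{log10_value:.1f}")
```

Working on the log scale turns tens of thousands of multiplications into one multiplication and one table lookup, which is the kind of reduction Laplace needed.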
He started out, however, as Bayes had suggested, by pragmatically assigning equal probabilities to all his initial hunches, whether 50–50, 33–33–33, or 25–25–25–25. Because their sums equal one, multiplication would be easier. He employed equal probabilities only provisionally, as a starting point, and his final hypotheses would depend on all the observational data he could add.
Next, he tried to confirm that Graunt was correct about the probability of a boy’s birth being larger than 50%. He was building the foundation of the modern theory of testing statistical hypotheses. Poring over records of christenings in Paris and births in London, he was soon willing to bet that boys would outnumber girls for the next 179 years in Paris and for the next 8,605 years in London. “It would be extraordinary if it was the effect of chance,” he wrote, tut-tutting that people really should make sure of their facts before theorizing about them.13
To transform probability’s large numbers into smaller, more manageable terms Laplace invented a multitude of mathematical shortcuts and clever approximations. Among them were new generating functions, transforms, and asymptotic expansions. Computers have made many of his shortcuts unnecessary, but generating functions remain deeply embedded in mathematical analyses used for practical applications. Laplace used generating functions as a form of mathematical wizardry to trick a function he could deal with into providing him with the function he really wanted.
To Laplace, these mathematical pyrotechnics seemed as obvious as common sense. To students’ frustration, he sprinkled his reports with phrases like, “It is easy to see, it is easy to extend, it is easy to apply, it is obvious that. . . .”14 When a confused student once asked how he had jumped intuitively from one equation to another, Laplace had to work hard to reconstruct his thought process.
He was soon asking whether boys were more apt to be born in certain geographic regions. Perhaps “climate, food or customs . . . facilitates the birth of boys” in London.15 Over the next 30-odd years Laplace collected birth ratios from Naples in the south, St. Petersburg in the north, and French provinces in between. He concluded that climate could not explain the disparity in births. But would more boys than girls always be born? As each additional piece of evidence appeared, Laplace found his probabilities approaching certainty “at a dramatically increasing rate.”
He was refining hunches with objective data. In building a mathematical model of scientific thinking, where a reasonable person could develop a hypothesis and then evaluate it relentlessly in light of new knowledge, he became the first modern Bayesian. His system was enormously sensitive to new information. Just as each throw of a coin increases the probability of its being fair or rigged, so each additional birth record narrowed the range of uncertainties. Eventually, Laplace decided that the probability of boys exceeding girls was as “certain as any other moral truth” with an extremely tiny margin of being wrong.16
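How quickly the evidence piles up can be illustrated with a small calculation. The sketch assumes that every birth ratio is equally likely at the start and uses invented samples in a fixed 52:48 proportion, not Laplace's actual parish figures. It relies on a standard identity linking this probability to a sum of binomial terms; the function name is illustrative.

```python
def prob_boys_favoured(boys, girls):
    """P(chance of a boy > 1/2 | data), assuming all ratios equally likely at first.

    Uses the identity that this equals P(X <= boys) for X ~ Binomial(boys + girls + 1, 1/2).
    """
    n = boys + girls + 1
    term, favourable = 1, 0          # term holds the binomial coefficient C(n, j)
    for j in range(boys + 1):
        favourable += term
        term = term * (n - j) // (j + 1)
    return favourable / 2 ** n

results = []
for total in (100, 1_000, 10_000):   # hypothetical sample sizes
    boys = total * 52 // 100
    girls = total - boys
    results.append(prob_boys_favoured(boys, girls))
    print(total, results[-1])
```

With the same 52:48 split, a hundred births leave the question wide open, while ten thousand make a surplus of boys all but certain.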
Generalizing from babies, he found a way to determine not just the probability of simple events, like the birth of one boy, but also the probability of future composite events like an entire year of births—even when the probability of simple events (whether the next newborn will be male) was uncertain. By 1786 he was determining the influence of past events on the probability of future events and wondering how big his sample of newborns had to be. By then Laplace saw probability as the primary way to overcome uncertainty. Pounding the point home in one short paragraph, he wrote, “Probability is relative in part to this ignorance, in part to our knowledge . . . a state of indecision, . . . it’s impossible to announce with certainty.”17
Persevering for years, he used insights gained in one science to shed light on others, researching a puzzle and inventing a mathematical technique to resolve it, integrating, approximating, and generalizing broadly when there was no other way to proceed. Like a modern researcher, he competed and collaborated with others and published reports on his interim progress as he went. Above all, he was tenacious. Twenty-five years later he was still eagerly testing his probability of causes with new information. He combed 65 years’ worth of orphanage registries, asked friends in Egypt and Alexander von Humboldt in Central America about birth ratios there, and called on naturalists to check the animal kingdom. Finally, in 1812, after decades of work, he cautiously concluded that the birth of more boys than girls seemed to be “a general law for the human race.”18