The question being addressed by Bayes and his subsequent followers is simple to state, yet forbidding in its scope: How well do we know what we think we know? If we want to tackle big-picture questions about the ultimate nature of reality and our place within it, it will be helpful to think about the best way of moving toward reliability in our understanding.
Even to ask such a question is to admit that our knowledge, at least in part, is not perfectly reliable. This admission is the first step on the road to wisdom. The second step on that road is to understand that, while nothing is perfectly reliable, our beliefs aren’t all equally unreliable either. Some are more solid than others. A nice way of keeping track of our various degrees of belief, and updating them when new information comes our way, was the contribution for which Bayes is remembered today.
Big Picture - UK final proofs.indd 69
20/07/2016 10:02:40
THE BIG PICTURE
Among the small but passionate community of probability-theory aficionados, fierce debates rage over What Probability Really Is. In one camp are the frequentists, who think that “probability” is just shorthand for “how frequently something would happen in an infinite number of trials.” If you say that a flipped coin has a 50 percent chance of coming up heads, a frequentist will explain that what you really mean is that an infinite number of coin flips will give equal numbers of heads and tails.
In another camp are the Bayesians, for whom probabilities are simply expressions of your states of belief in cases of ignorance or uncertainty. For a Bayesian, saying there is a 50 percent chance of the coin coming up heads is merely to state that you have zero reason to favor one outcome over another. If you were offered a bet on the outcome of the coin flip, you would be indifferent between choosing heads and choosing tails. The Bayesian will then helpfully explain that this is the only thing you could possibly mean by such a statement, since we never observe infinite numbers of trials, and we often speak about probabilities for things that happen only once, like elections or sporting events. The frequentist would then object that the Bayesian is introducing an unnecessary element of subjectivity and personal ignorance into what should be an objective conversation about how the world behaves, and they would be off and running.
•
Our job here isn’t to decide anything profound about the nature of probability. We’re interested in beliefs: things that people think are true, or at least likely to be true. The word “belief” is sometimes used as a synonym for “thinking something is true without sufficient evidence,” a concept that drives nonreligious people crazy and causes them to reject the word entirely. We’re going to use the word to mean anything we think is true regardless of whether we have a good reason for it; it’s perfectly okay to say “I believe that two plus two equals four.”
Often, in fact all the time if we’re being careful, we don’t hold our beliefs with 100 percent conviction. I believe the sun will rise in the east tomorrow, but I’m not absolutely certain of it. The Earth could be hit by a speeding black hole and completely destroyed. What we actually have are degrees of belief, which professional statisticians refer to as credences. If you think there’s a 1 in 4 chance it will rain tomorrow, your credence that it will rain is 25 percent. Every single belief we have has some credence attached to it, even if we don’t articulate it explicitly. Sometimes credences are just like probabilities, as when we say we have a credence of 50 percent that a fair coin will end up heads. Other times they simply reflect a lack of complete knowledge on our part. If a friend tells you that they really tried to call on your birthday but they were stuck somewhere with no phone service, there’s really no probability involved; it’s true or it isn’t. But you don’t know which is the case, so the best you can do is assign some credence to each possibility.
LEARNING ABOUT THE WORLD
Bayes’s main idea, now known simply as Bayes’s Theorem, is a way to think about credences. It allows us to answer the following question. Imagine that we have certain credences assigned to different beliefs. Then we gather some information, and learn something new. How does that new information change the credences we have assigned? That’s the question we need to be asking ourselves over and over, as we learn new things about the world.
•
Say you’re playing poker with a friend. The game is five-card draw, so you each start with five cards, then choose to discard and replace a certain number of them. You can’t see their cards, so to begin, you have no idea what they have, other than knowing they don’t have any of the specific cards in your own hand. You’re not completely ignorant, however; you have some idea that some hands are more likely than others. A starting hand of one pair, or no pairs at all, is relatively likely; getting dealt a flush (five cards of the same suit) right off the bat is quite rare. Running the numbers, a random five-card hand will be “nothing” about 50 percent of the time, one pair about 42 percent of the time, and a flush less than 0.2 percent of the time, not to mention the other possibilities. These starting chances are known as your prior credences. They are the credences you have in mind to start, prior to learning anything new.
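For readers who want to check the numbers quoted above, here is a minimal sketch that counts hands directly. It assumes a standard 52-card deck and, for simplicity, ignores the small correction from knowing the five cards in your own hand; the variable names are ours, chosen only for illustration.

```python
from math import comb

# Every distinct five-card hand from a 52-card deck.
TOTAL_HANDS = comb(52, 5)

# "Nothing": five different ranks that do not form a straight (there are
# 10 straight rank patterns), in suits that are not all the same.
nothing = (comb(13, 5) - 10) * (4**5 - 4)

# One pair: choose the paired rank and its two suits, then three other
# distinct ranks, each in any of the four suits.
one_pair = 13 * comb(4, 2) * comb(12, 3) * 4**3

# Flush: five cards of one suit, leaving out the 40 straight flushes.
flush = 4 * comb(13, 5) - 40

priors = {
    "nothing": nothing / TOTAL_HANDS,
    "one pair": one_pair / TOTAL_HANDS,
    "flush": flush / TOTAL_HANDS,
}

for hand, credence in priors.items():
    print(f"{hand}: {credence:.2%}")
```

The three fractions come out close to the 50 percent, 42 percent, and under 0.2 percent figures given in the text.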
But then something happens: your friend discards a certain number of cards, and draws an equal number of replacements. That’s new information, and you can use it to update your credences. Let’s say they choose to draw just one card. What does that tell us about their hand?
It’s unlikely that they have one pair; if they had, they probably would have drawn three cards, maximizing the chance that they would improve to three or four of a kind. Likewise, if they had three of a kind to start, they probably would have drawn two cards. But drawing one card fits very well with the idea that they have two pair or four of a kind, in which case they would want to hold on to all four of the relevant cards. It’s also somewhat consistent with them having either four cards of the same suit (hoping to draw to a flush) or four cards in a row (hoping to complete a straight). These likely behaviors, sensibly enough, are called the likelihoods of the problem.
By combining the prior credences with the likelihoods, we arrive at updated credences for what their starting hand was. (Figuring out what their hand probably is after the drawing is complete requires a bit more work, but nothing a good poker player can’t handle.) Those updated chances are naturally known as the posterior credences.
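To make the combination of priors and likelihoods concrete, here is a rough sketch of the update for the one-card draw. The priors are rounded dealing frequencies, and the likelihoods (the chance that a player holding each hand would draw exactly one card) are invented purely for illustration; a real opponent’s habits would differ. The function name `update` is ours.

```python
def update(priors, likelihoods):
    """Return posterior credences: prior times likelihood, renormalized."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}


# Rounded prior credences for the opponent's starting hand.
priors = {
    "nothing": 0.50,
    "one pair": 0.42,
    "two pair": 0.05,
    "three of a kind": 0.02,
    "other": 0.01,
}

# ASSUMED likelihoods of drawing exactly one card given each starting hand.
# These are illustrative guesses, not measured poker statistics.
likelihoods = {
    "nothing": 0.10,          # e.g. four to a flush or a straight
    "one pair": 0.05,         # would usually draw three
    "two pair": 0.90,         # keeps all four relevant cards
    "three of a kind": 0.10,  # would usually draw two
    "other": 0.05,
}

posteriors = update(priors, likelihoods)
for hand, credence in posteriors.items():
    print(f"{hand}: {priors[hand]:.1%} -> {credence:.1%}")
```

With these made-up likelihoods, the credence for two pair rises sharply and the credence for one pair falls, which matches the qualitative reasoning in the text.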
Bayes’s Theorem can be thought of as a quantitative version of the method of inference we previously called “abduction.” (Abduction places emphasis on finding the “best explanation,” rather than just fitting the data, but methodologically the ideas are quite similar.) It’s the basis of all science and other forms of empirical reasoning. It suggests a universal scheme for thinking about our degrees of belief: start with some prior credences, then update them when new information comes in, based on the likelihood of that information being compatible with each original possibility.
•
The interesting thing about Bayesian reasoning is the emphasis on those prior credences. In the case of poker hands it’s not such a challenging idea; the priors come directly from the chances of being dealt different cards. But the concept enjoys a much wider range of applicability.
You’re having coffee with a friend one afternoon, and they make one of the following three statements:
• “I saw a man bicycling by my house this morning.”
• “I saw a man riding a horse by my house this morning.”
• “I saw a headless man riding a horse by my house this morning.”
In each of these three cases, you’re given essentially the same kind of evidence: a statement uttered by your friend in a matter-of-fact tone. But the credence, or degree of belief, you would subsequently assign to each possibility is utterly different in the three cases. If you live in a city or the suburbs, you are much more likely to believe that your friend saw a bicyclist than a man on horseback (unless, perhaps, police officers in your neighborhood frequently ride horses, or there is a traveling rodeo in town). If instead you live out in the country, where horses are common and the roads aren’t paved, it might be easier to accept the horse than the bicycle. In either case, you’re going to be much more skeptical that anyone was riding anything while lacking a head.
What’s happening is simply that you have priors. Depending on where you live, the prior credence you would assign to seeing bicyclists or horseback riders will be different, and no matter what, your prior for riders having heads is much higher than your prior for riders lacking them. And that’s perfectly okay. In fact, any Bayesian will tell you, there’s no way around it. Every time we reason about the probable truth of different claims, our answers are a combination of the prior credence we assign to each claim and the likelihood of various bits of new information coming to us if that claim were true.
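The role of the priors can be seen in a small numerical sketch. Here the evidence is treated as identical in all three cases: we assume your friend would make the statement 95 percent of the time if it were true and 1 percent of the time if it were not. Those two numbers, and the priors for a city dweller, are made up for illustration only.

```python
def posterior(prior, p_said_if_true, p_said_if_false):
    """Credence that a claim is true after hearing a friend assert it."""
    numerator = prior * p_said_if_true
    return numerator / (numerator + (1.0 - prior) * p_said_if_false)


# ASSUMED reliability of the friend's matter-of-fact statement,
# taken to be the same for all three claims.
P_SAID_IF_TRUE = 0.95
P_SAID_IF_FALSE = 0.01

# ASSUMED prior credences for someone living in a city.
priors = {
    "bicyclist": 0.5,
    "horseback rider": 0.05,
    "headless horseback rider": 1e-9,
}

posteriors = {
    claim: posterior(prior, P_SAID_IF_TRUE, P_SAID_IF_FALSE)
    for claim, prior in priors.items()
}

for claim, credence in posteriors.items():
    print(f"{claim}: prior {priors[claim]:.2g}, posterior {credence:.2g}")
```

Even though the same testimony is applied to each claim, the headless rider stays wildly improbable, because the update can only multiply a prior that was tiny to begin with.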
Scientists are often in the position of judging dramatic-sounding claims. In 2012, physicists at the Large Hadron Collider announced the discovery of a new particle, most likely the long-sought-after Higgs boson. Scientists around the world were immediately ready to accept the claim, in part because they had good theoretical reasons for expecting the Higgs to be found exactly where it was; their prior was relatively high. In contrast, in 2011 a group of physicists announced that they had measured neutrinos that were apparently moving faster than the speed of light. The reaction in that case was one of universal skepticism. This was not a judgment against the abilities of the experimenters; it simply reflected the fact that the prior credence assigned by most physicists to any particle moving faster than light was extremely low. And, indeed, a few months later the original team announced that their measurement had been in error.
There is an old joke about an experimental result being “confirmed by theory,” in contrast to the conventional view that theories are confirmed or ruled out by experiments. There is a kernel of Bayesian truth to the witticism: a startling claim is more likely to be believed if there is a compelling theoretical explanation ready to hand. The existence of such an explanation increases the prior credence we would assign to the claim in the first place.
10
Updating Our Knowledge
Once we admit that we all start out with a rich set of prior credences, the crucial step is to update those credences when new information comes in. To do that, we need to describe Bayes’s Theorem in more precise terms.
Let’s return to our friendly poker game. We know what cards we have, but we don’t know our opponent’s cards. This puts us in a situation where there are various “propositions” (assertions that something is true), and we have a comprehensive list of all the possible propositions. In this case, the propositions correspond to all the various hands our opponent could start with (nothing; a pair; something better than a pair). In other cases they could be the possible interpretations of an outlandish claim a friend makes (they’re correct; they’re sincere but misguided; they’re lying), or a set of competing ontologies (naturalism; supernaturalism; something more exotic).
To every proposition we consider, we assign a prior credence. To help visualize things, we can represent our credences by dividing some grains of sand among a collection of jars. Each jar stands for a different proposition, and the number of grains of sand in each jar is proportional to the credence assigned to that proposition. The credence for proposition X is just the fraction, out of the grains in all the jars, that are in the jar labeled X:
$$\text{Credence in } X = \frac{\text{Grains in jar } X}{\text{Grains in all jars}}$$
Call this the grains-of-sand rule.
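As a quick illustration of the grains-of-sand rule, the sketch below turns a set of jars into credences. The grain counts are hypothetical, chosen to roughly echo the poker priors mentioned earlier, and the function name `credences` is ours.

```python
def credences(jars):
    """Apply the grains-of-sand rule: grains in jar X over grains in all jars."""
    total = sum(jars.values())
    if total <= 0:
        raise ValueError("there must be at least one grain of sand")
    return {proposition: grains / total for proposition, grains in jars.items()}


# HYPOTHETICAL grain counts, one jar per proposition about the opponent's hand.
jars = {
    "nothing": 50,
    "a pair": 42,
    "something better than a pair": 8,
}

result = credences(jars)
for proposition, credence in result.items():
    print(f"{proposition}: {credence:.0%}")
```

Only the proportions matter here: doubling the number of grains in every jar leaves each credence unchanged, and the credences always add up to one.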