ing it to the rest of us). Mike Walmsley has just finished working on a specialized neural network that can find the faint structures around galaxies which indicate a past merger. Looking for these faint tails of stars is important if we want to understand how normal galaxies react to collisions—it’s sort of the opposite technique to finding the bulgeless galaxies I discussed in Chapter 4. Bigger collisions (those with more massive galaxies) leave more debris, so at least in theory there’s also the chance of reconstructing the crash that led to stars being scattered out of the main galaxy itself, if only we can find them. The trouble is there are very few surveys where experts have done the painstaking work of sorting through the images themselves.
Nonetheless the results, despite the handicap of a small training set, are pretty good. The network is indeed capable of finding galaxies which show signs of a merger. It’s not perfect, matching expert classifications 80 per cent of the time, but that’s a huge advance on where we were before. In the old days of 2007 or so, we’d have set up a citizen science project to gather more training data and to try and improve this figure. A few years ago we might have looked at how to combine human and machine classification, like we did in the supernova project. But in this machine-optimistic scenario, another year or so’s work will break the back of the problem, and we can expect the robots to win before too long. If neural networks really can be adapted to deal with such small training sets, then we won’t need large numbers of classifications from volunteers.
Progress might come from more of Mike’s work, which uses a new kind of neural network introduced to us by a colleague in computer science, Yarin Gal. This network not only classifies things, but can tell us how certain it is about its classifications. It’s thus producing data which is of the same kind as that produced, collectively, by Galaxy Zoo volunteers. By the time you read this, we’ll be running it alongside the main project, and incorporating its results into our decisions about galaxies.
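For the curious, here is a minimal sketch of how a network can report its own confidence, using Monte Carlo dropout, the approach associated with Yarin Gal’s research. The architecture, layer sizes, and names are illustrative assumptions, not the actual Galaxy Zoo model.

```python
# Sketch: Monte Carlo dropout yields a classifier that reports uncertainty.
# Sizes and names here are illustrative, not the real Galaxy Zoo network.
import torch
import torch.nn as nn

class MCDropoutClassifier(nn.Module):
    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128),
            nn.ReLU(),
            nn.Dropout(p=0.5),  # deliberately left active at prediction time
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def predict_with_uncertainty(model, x, n_samples=50):
    """Average many stochastic forward passes; the spread between passes
    measures how certain the network is about its classification."""
    model.train()  # keeps dropout switched on, unlike ordinary evaluation
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)

# Example: two classes ('merger' vs 'not'), one 64-number feature vector.
mean, spread = predict_with_uncertainty(
    MCDropoutClassifier(64, 2), torch.randn(1, 64)
)
```

A galaxy with a large spread could then be routed to human classifiers, which is how uncertainty-aware output can sit naturally alongside the collective judgement of volunteers.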
Another major area of research in machine learning is in finding unusual objects. Actually, that’s not quite true. Finding unusual objects—the images in the original Galaxy Zoo data set of nearly a million galaxies that look least like the others, for example—is not a hugely difficult problem. As I wrote earlier, the difficult bit is finding unusual objects which are actually interesting. It’s one thing to pick out the images where the camera malfunctioned, where a bright star overwhelmed the chip or where someone turned a light on by mistake, but quite another to find the peas and the Voorwerp among that pile of images which are occasionally visually interesting but mostly scientifically junk.
Still, progress is being made. Techniques which use ‘clustering’—sorting similar images into piles—look promising. If you end up with many piles with a few images in, it’s not a huge amount of effort to decide which of these outliers are truly interesting. Future surveys might do this as a matter of course, with their professional astronomers presented with a few representative objects from each class for consideration.
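As a rough illustration of the clustering idea, the sketch below sorts feature vectors into piles and flags the small piles for a human to look at. The features, thresholds, and numbers are invented purely for illustration; a real pipeline would derive its features from the survey images themselves.

```python
# Sketch of clustering for oddity-hunting: group feature vectors into piles,
# then surface the small piles (and unclustered points) for human inspection.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Simulate mostly 'ordinary' objects in three big piles, plus a few oddballs.
features, _ = make_blobs(n_samples=1000, centers=3, n_features=8,
                         random_state=0)
rng = np.random.default_rng(1)
features = np.vstack([features, rng.uniform(-30, 30, size=(5, 8))])

labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(features)

# Small piles, and DBSCAN's 'noise' label -1, are the candidates worth a look.
for label in np.unique(labels):
    members = np.flatnonzero(labels == label)
    if label == -1 or len(members) < 10:
        print(f"pile {label}: {len(members)} candidates, e.g. {members[:3]}")
```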
Perhaps this focus on the unusual is in any case wrong-headed. If astrophysics is heading for a future where we produce truly enormous data sets then we might have no choice but, like Dr Strangelove, to stop worrying and learn to love the algorithm. Maybe we can get more insight from things that occur often than from the odd weird exception. Particle physicists at the LHC are, for the most part, already living in this future; as mentioned in Chapter 1, if some completely unexpected cascade of particles happens in this most massive and sophisticated of experiments, it will be discarded by a system looking for specific triggers. The LHC detectors simply couldn’t operate any other way without being completely overwhelmed by noise.
Cosmologists, too, seeking to discover Type Ia supernovae so as to measure the effect of dark energy on the acceleration of the Universe’s expansion (see Chapter 6) may not mind if explosions that don’t fit the expected pattern are discarded. If you can find enough supernovae of the right type, you may even get better results by assembling a nice, well-behaved group rather than including anything odd. For predictable science, where we’re testing well-defined hypotheses—something that would fit well into the science fair I described in Chapter 1—trusting the machines and hoping we end up in this future might well be a sensible way to go.
A second possible future is one in which, though machine learning continues to improve, we never really break free from the tyranny of the training set. The techniques that are driving the artificial intelligence revolution are, like an easily distracted student, simply dependent on being walked through example after example after example.
There are some ways of dealing with this. Techniques like transfer learning, where a neural network or other solution is trained on one survey before most of its guts are used to construct a new network capable of dealing with a different data set, do make things easier. A network trained to recognize animals in the Serengeti will do pretty well when deployed on images of wildlife in the US; though the species are different, the layers of the network that identify the animal amid the background will be shared between the two problems.*

* This sort of work is being led for the Zooniverse by Lucy Fortson’s group at the University of Minnesota.
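A minimal sketch of this transfer-learning recipe, assuming a PyTorch-style workflow: keep the trained ‘guts’ of one network and retrain only a new final layer for the new data set. The choice of backbone and the ten-species output are illustrative assumptions, not a description of the actual Serengeti models.

```python
# Sketch: transfer learning. Reuse a pre-trained network's feature-extracting
# layers and retrain only the final layer for the new survey's categories.
import torch
import torch.nn as nn
from torchvision import models

# Start from a network already trained on a large, unrelated image collection.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the 'guts': the layers that learned to pick an animal out of
# the background, which are shared between the two problems.
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final classification layer for the new problem
# (say, ten animal species in the new survey).
backbone.fc = nn.Linear(backbone.fc.in_features, 10)

# Only the new layer's parameters are trained on the new data set.
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
```

Because only the small final layer is trained from scratch, far fewer labelled examples from the new survey are needed, which is exactly why the technique ‘makes things easier’.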
For a project like the LSST survey, where there are a thousand different scientific investigations that all need access to the same, consistent data set and where rare objects matter, it’s less clear what the solution is. After all, finding unusual and unexpected objects is part of the reason we build telescopes like this; whenever we’ve done something fundamentally new, in this case monitoring such a large area of sky this frequently with such a powerful instrument, we have found new things.
And if LSST is going to challenge machine learning, then once the data from the radio astronomers’ new toy, the SKA, starts flooding in then we’ll really be in trouble. In this scenario, the problems faced (and caused) by scientists in general and by astronomers in particular are odd enough that whatever Silicon Valley gets up to we’ll need help ourselves.† This means that well into the next decade, we’ll need plenty of classifications from humans and their expert pattern recognition systems. Indeed, looking at what’s coming, the existing effort across all Zooniverse projects won’t be enough to cope.

† This isn’t completely unrealistic; there aren’t too many cases where the most important things are the rarest objects, or where such precisely accurate classifications are required. If Facebook identifies the wrong friend in a photo, it’s at worst slightly embarrassing, and is unlikely to lead you to predict the wrong future for the Universe.
We need to get smarter if, in this reality, we’re going to preserve a space for citizen science. Probably the easiest way to do this is to recruit more volunteers to help. (Despite this being a vision of the future I’m making up, let’s assume that even in this universe it’s not the case that millions of people have read this far so as to be inspired to rush to the keyboard and contribute.) I’m sure there’s more we could do,* but to really tackle the bulk of LSST, let alone SKA, data we’ll need an enormous increase in the amount of effort available.

* Have you considered buying a copy of this book for a friend? Or three? Or for everyone you know?
The answer may be staring us in the face. If human beings are game-playing creatures, then maybe we should build games rather than citizen science projects. Indeed, the first moves in this direction have already been made. Eyewire is a project run by researchers at MIT, who want volunteers to help map the complex structure of neurons in the brain. Volunteers see slices of the complex tangle of cells and are asked to separate the structures visible in the images from the background; additional help and complication is provided by the fact that these are in fact three-dimensional objects. It sounds complicated, but the team have provided an engaging and interesting interface that has attracted tens of thousands of volunteers to help, producing results that, in a preliminary study, were impressively accurate.
Eyewire participants also chat to each other, and to helpful chatbots which offer advice, in real time while they’re classifying. It’s a much less isolated experience than our Zooniverse projects, where the act of classification is performed in sacred solitude so as to prevent groupthink (as we’ve seen, discussion and collaboration through our forums happens after the initial classification is recorded). Eyewire volunteers also score points for their participation, and an ever-growing set of challenges and competitions aims to make the game more engaging, and to bring classifiers back for more.
A recent email newsletter sent to me and the worldwide network of my fellow Eyewire volunteers gives you the idea. During the summer of 2018, alongside the real thing in Russia there was an Eyewire World Cup. Participants representing a country had their effort counted towards their team’s total, and could win ‘buckets of points, six new badges and speciality swag [they’d] only be able to get if [they] participate’.
These are the techniques of modern software development and game design, being used here to drive people towards taking part in a scientific project. I’m an enthusiastic participant in the project, so please don’t think that I consider the idea of point collecting and competitions beneath me. The reality is quite the opposite; their techniques work especially well on me!*

* I am, in fact, a sucker for this sort of thing. I have an enormous pile of coffee shop loyalty cards from places I will never again visit, and have used the Foursquare app to check in everywhere I’ve been since 2011.
Others have gone further, and made a game of the science itself. Probably the best known of these projects is an old one, predating even Galaxy Zoo. Fold.it asked volunteers to investigate the three-dimensional structures of proteins. In many cases, we know the basic chemistry of these important biological molecules in the sense of being able to write down what connects to what. However, secondary effects as the atoms bond together will cause the protein to twist and buckle in a way that is currently very hard to predict; it’s impossible to calculate, and any automated search for a likely solution runs the risk of getting stuck in a local minimum, a possible solution that looks plausible (technically, it’s likely better than any solution that is similar to it) but which has not been tested sufficiently to find out whether it is overall the best.
Exploring a vast range of possibilities to find a good solution to a problem like this is another type of task that humans have evolved to be good at, just like the more basic pattern recognition that we in Zooniverse have been using all this time. Once a structure is proposed, it is easy to calculate its energy, based on the interactions between the various components. The game is to look for the lowest energy structure, as we trust nature to have found a way to fold proteins efficiently. All this effort is important because it is the three-dimensional shape of a protein that determines how it interacts with other molecules, particularly in the complex and not fully understood dance that is molecular biochemistry.
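The local-minimum trap, and why exploring widely helps, can be seen in a toy one-dimensional ‘energy landscape’ standing in for the vastly larger space of protein shapes; the function and numbers below are invented purely for illustration.

```python
# Toy illustration of the local-minimum problem in an energy landscape.
import numpy as np

def energy(x):
    # Two dips: a deep one near x = -1 and a shallower one near x = 2.
    return 0.5 * (x + 1) ** 2 * (x - 2) ** 2 + x

def greedy_descent(x, step=0.01, iters=10_000):
    """Follow the slope downhill; it stops in whichever dip it falls into."""
    for _ in range(iters):
        grad = (energy(x + 1e-6) - energy(x - 1e-6)) / 2e-6
        x -= step * grad
    return x

print(greedy_descent(1.0))   # gets stuck in the shallow dip near x = 2
print(greedy_descent(-2.0))  # a different start finds the deeper dip

# Restarting from many places -- roughly what a crowd of players exploring
# different shapes provides -- makes finding the best solution far more likely.
starts = np.linspace(-3, 4, 15)
best = min((greedy_descent(s) for s in starts), key=energy)
print(best, energy(best))
```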
The results from Fold.it have been great, with players often outperforming the best computer science efforts at attacking the same problem in large competitions and challenges designed to test protein-folding methods. Sometimes the best players turn out to be those with some sort of relevant expertise, but more often the game finds people who turn out to have an instinct for how to play. Because the ‘rules’—things like the angle at which hydrogen atoms can be placed—are encoded in the game itself, Fold.it players don’t need to know any chemistry at all.
It’s a neat solution, and the game is actually quite fun to play, even if I can’t get past the first few levels. I’ve never been patient with puzzles, but it seems I’m not that typical. A few years ago, when I visited the Fold.it team at the University of Washington, they told me that at any one time a few people are deep enough into the game that they’re providing real and useful results, while most players are still learning. If the number of useful players drops too far, the team will run competitions or advertise to encourage a new cohort of Fold.it players to work their way deeper into the system. The entire structure of the game is a conveyor belt designed to carry the best players onward to the point where they’re working on scientifically useful data.
It would be possible to play Fold.it without realizing it had a scientific purpose at all, though I doubt anyone does so. Other teams have gone even further, disguising citizen science projects within existing games. Probably the most ambitious example is a Swiss project that created a mission within the science fiction-themed online multiplayer game, Eve Online. Players of the game can choose to review data from Kepler in the hope of finding a planet, but also in order to receive rewards in the form of the game’s internal, online currency. The experience is noticeably a bit odd, but in essentials indistinguishable from the experience of completing one of the other missions within the game world itself.
With millions of people taking part in such games, here, perhaps, is the crowd we need in order to cope with the data sets of the future. In this imagined future, projects like those hosted on the Zooniverse will become both more ubiquitous and almost completely invisible. In fact, the more invisible they become the better, as the more seamlessly they can be integrated into the games we’re playing anyway the more people will take part. Instead of having to make the choice to participate in science, something which many people find intimidating, it will just happen.
Will this work? Maybe. Half a million people took part in the Eve Online planet hunting experiment, though I haven’t seen any discoveries come from it yet. That’s not too surprising, as these things take time, but it will be the acid test of whether the project has succeeded. (A similar effort, which involved more than 300,000 players in the task of labelling features in high-resolution images of cells, has recently produced a paper which shows that the technique works, at least in this one case.) Even our modest experiments with gamification in the original Old Weather project (described in Chapter 4) seemed to work well. All we did was give people a rank when they started transcribing records from a ship, and yet it seems to have encouraged some people to work very hard indeed. One ‘ship’ in the project was, I’m pretty sure, a building—a training facility given, as is normal in naval tradition, a ship’s name. Despite the fact that it didn’t go anywhere, people dutifully worked their way through the log book. (I haven’t followed this up, because the implications of being able to inspire people to work their way through the log of a building pretending to be a ship scare me a little.) With the help of games designers, maybe we can hide enough tasks that citizen science even at the scale needed for these big surveys will become possible, and all without anyone knowing they are participating.
This second future reality is efficient, and science gets done, but I’m not sure I like it. Actually, I’m certain that I don’t. I’ve