here. They’re an exceptionally thoughtful and careful bunch, and
Phil Marshall in particular—one of the three leading scientists
alongside Aprajita Verma and Anupreeta More—is one of the
nicest people you could ever hope to meet. As a result, the idea of
labelling volunteers, in all their human complexity, with a rating
derived from nothing more than a few clicks on a website was
anathema to Phil. In all the team’s papers, therefore, they set up a
system where they represent each volunteer by an ‘agent’. An
agent is a representation of the volunteer, but necessarily an
imperfect one, as the agent knows only about the volunteer’s
behaviour within the project. We can then label the agent, know-
ing they are a poor reflection of their human counterpart. I’m
less fastidious, and am happy to trust that you know I’m not
really reducing people in all their glorious complexity to their
performance in one project.
This sort of analysis is useful for checking on the progress of
the project; looking at the distribution of skill one sees that the
average volunteer is pretty good. Both the highly skilled and the more confused are found among those who contribute only a few classifications, but those who go on to contribute tens of thousands are all highly skilled. This data alone doesn’t tell you whether people
are learning as they go, so that their skill inevitably improves
over time, or whether those who are struggling are simply giving
up, but it does show that we’re not wasting people’s time.
The real power comes when we move beyond this simple,
single value. The SpaceWarps model sets up what’s known as a
confusion matrix for each volunteer, keeping track of four key
numbers. For each contributor, we estimate first the probability
that they will say that there is a lens when there is indeed one
there; second, the probability that they will say there is a lens
when there isn’t; third, the probability that they’ll say there is
nothing there when there isn’t; and finally (deep breath) the
probability that they’ll say there’s nothing there when there is
indeed a lens.
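
In code, the bookkeeping is simple. What follows is a minimal sketch in Python, not the SpaceWarps team’s actual software; the class and function names, and the choice to start each count at one, are my own assumptions. It shows how an agent’s four probabilities can be estimated from the simulated images whose answers we already know, and how a single classification then nudges our belief about a real image.

class Agent:
    """Tracks a volunteer's answers on simulated images whose truth is known."""

    def __init__(self):
        # counts[truth][answer], with truth and answer each 'lens' or 'not'.
        # Starting every count at 1 treats a brand-new agent as no better
        # than a coin flip until evidence accumulates.
        self.counts = {"lens": {"lens": 1, "not": 1},
                       "not": {"lens": 1, "not": 1}}

    def record(self, truth, answer):
        """Update the tallies after a training image with a known answer."""
        self.counts[truth][answer] += 1

    def p_answer_given_truth(self, answer, truth):
        """One of the four key numbers: the estimated probability that this
        volunteer gives `answer` when the image is truly `truth`."""
        return self.counts[truth][answer] / sum(self.counts[truth].values())


def updated_lens_probability(prior, agent, answer):
    """Bayes' rule: combine the prior probability that an image contains a
    lens with one volunteer's answer, weighted by their reliability."""
    p_if_lens = agent.p_answer_given_truth(answer, "lens")
    p_if_not = agent.p_answer_given_truth(answer, "not")
    return (p_if_lens * prior) / (p_if_lens * prior + p_if_not * (1 - prior))

A volunteer who has spotted nine out of ten simulated lenses will pull the probability of a real image upwards when they click on it; a volunteer who guesses at random will leave it exactly where it was.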
Armed with this information, we can find ways to get more
knowledge out of the system. There are four kinds of volunteer
to consider. There are those who are always, or nearly always,
right; the SpaceWarps team called these ‘astute’ volunteers,
and they are very welcome in any project. There are also those
who are always wrong, who miss lenses when they’re there
and who see them when they’re not. These people are just as
useful—someone who is wrong all the time provides just as much
information as someone who is right all the time, as long as you
know that they’re wrong.* So because we’re able to use the simulations to measure how people are doing, we can increase the amount of useful information we can get from the project.
* You may find this a useful strategy for life in general.
There are two more categories of people. There are optimists,
who see a lens where there isn’t one but are reliable when they say
there’s nothing there, and pessimists, who miss lenses but are
accurate when they do identify one. Once we’ve spotted someone’s
proclivities, we can work out how seriously to take their opinions,
but we can also start to play games with who gets to see what.
Before we throw away an image, confident that there’s nothing there, perhaps we should make sure to show it to an optimist,
just in case. If we think we’ve found a lens, then we should show it
to a pessimist—if even they reckon there’s something there, then
our confidence should grow sky high that we’re on to something.
Playing with task assignment in this way promises much more effi-
cient classification, and more science produced more quickly.
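
To make the routing idea concrete, here is a continuation of the earlier sketch. The function names, the 0.8 skill threshold, and the rule for picking a reviewer are my own illustrative assumptions, chosen to mirror the logic just described rather than the scheme SpaceWarps actually runs.

def categorise(agent, threshold=0.8):
    """Sort an agent into the kinds of volunteer described above.
    The 0.8 threshold is an illustrative choice."""
    hit = agent.p_answer_given_truth("lens", "lens")   # right when a lens is there
    reject = agent.p_answer_given_truth("not", "not")  # right when nothing is there
    if hit >= threshold and reject >= threshold:
        return "astute"
    if hit < threshold and reject < threshold:
        return "reliably wrong"   # just as informative, once you know it
    # Optimists see lenses everywhere but can be trusted when they say 'nothing';
    # pessimists miss lenses but can be trusted when they do report one.
    return "optimist" if reject < threshold else "pessimist"


def choose_reviewer(image_probability, agents):
    """Send likely-empty images to an optimist before retiring them, and
    likely lenses to a pessimist before promoting them."""
    wanted = "optimist" if image_probability < 0.5 else "pessimist"
    suitable = [a for a in agents if categorise(a) == wanted]
    return suitable[0] if suitable else agents[0]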
The only trouble is that this gets complicated fast. With tens of
thousands of people participating in even a small project, and
hundreds of thousands of images to view, the number of possible
solutions is unbelievably large. Even when we consider that our
choice is restricted by the fact that not everyone is online at the
same time, complex mathematics is required to work out what a
sensible path is. Work by Edwin Simpson of Oxford University’s
Department of Engineering showed quickly that clever task assign-
ment could produce results of the same accuracy with nearly one-
tenth of the classifications, an enormous acceleration and one that
is especially welcome when looking for the rarest of objects.
SpaceWarps is among the most sophisticated Zooniverse projects in how it treats its data, and in offering a faster route to science it seemed to provide a template we could apply to all of the other fields we’re working on. Plenty of work on this sort of
task assignment has been done by researchers in a field of com-
puter science known as human–computer interaction, typically
using Amazon’s Mechanical Turk system to connect researchers
with those who will complete tasks for small payments.
Yet things aren’t so easy with citizen scientists who are them-
selves volunteers, and a simple experiment with a project we ran
called Snapshot Serengeti shows why. Whenever I lecture on the
Zooniverse, one of the most common questions is whether we
really need humans given all the progress in machine learning. I’ve
hopefully dealt with this already, but the disease seems especially
acute around projects like this one, which uses motion-sensitive cameras to monitor wildlife in the Serengeti National Park.
The images the cameras produce are wonderful, beautiful, and
varied. Some would easily grace the cover of National Geographic,
while others are more quirky. The team’s favourite comes from a
camera programmed to take three photos in quick succession
once triggered. The first of this particular sequence shows a
hyena staring at the camera as the flash goes off. The second
shows the same hyena skulking innocently in the background,
but the third shows some sharp canines and the inside of the
hyena’s mouth. Apparently getting chewed by the local wildlife
was a common end for the project’s cameras (not a problem
Penguin Watch faced in Antarctica), and elephants using camera
stands as scratching posts didn’t help either.
Despite the immense variety in what the project’s cameras capture, there is something about the task of identifying animals in images that seems to convince people they can quickly write a script or produce an off-the-shelf machine-learning solution that will solve the problem. It turns out it’s harder than it looks. While we’ll share our data with anyone who wants it, no one has yet come up with a completely robust solution. I have a soft spot for the attempts of a team we worked with at the Fraunhofer Institute in Munich (home to the
inventors of the MP3, the format which encodes music on your
phone and other digital devices) who developed an especial dis-
like of ostriches, which thanks to their bendy necks and bandy
legs turn out to be able to twist into a computer-defying set of
shapes.
Nonetheless, some tasks are definitely easier than others.
Wildebeest are common enough to trigger complaints from
regular classifiers, and so building up a suitable training set for
them will be easier than, for example, doing so for the small,
skunk-like zorillas which appear in one in every three million
images (Figure 22). Easiest of all, though, is to identify the images
with precisely zero animals in them at all.* Almost three-quarters
of the data consisted of such images; either a camera would mal-
function, and take image after image of nothing until its memory
card was used up, or waving grass would do a good enough
impression of a passing lion that it too would be captured.
Figure 22 A rare image of a zorilla as captured by the Snapshot Serengeti cameras.
* Notice I do not, as I would have done once upon a time, call these ‘blank’ images. I was cured of that when speaking to a room full of plant scientists. Pointing at an image of a tree and grassland, I confidently told them there was ‘nothing there’ and saw the audience rise up as one. Apparently they call it plant blindness.
We know that volunteers care about getting science done, and
we hate wasting their time, so removing these animal-free images
was an obvious thing to do. What happened next was surprising.
As volunteers saw more and more images with animals in, the
total number of classifications the project received dropped.
People might like contributing to science, but in trying to make
it faster for them to do so we’d done something that made the
experience less pleasing, and we weren’t quite sure what it was.
One theory suggested that there was a total amount of work that
people would be willing to invest in the project. It’s faster and easier to say that there is nothing in an image than it is to distinguish a
Thomson’s from a Grant’s gazelle, and so maybe by giving them more to do we were using up people’s effort faster. I don’t think that’s the right explanation; we know that, all else being equal, encountering an animal in Snapshot Serengeti made people more, not less, likely to keep classifying, and so it seems to me that we would have been at least as likely to encourage as to discourage people from classifying.
Instead, I think we’d changed how exciting the project seemed
to people. Whereas before they’d seen nothing, then nothing,
then nothing again, nothing again, and then suddenly a zebra,
now they endured the apparent tedium of zebra followed by
zebra followed by wildebeest followed by yet another zebra.
While almost all the research on how to assign tasks for effi-
ciency uses paid subjects, who can be assumed to stay put regard-
less, our volunteers are free to walk away at any point. By trying
to make things better, we made their experience worse. The
choice between getting more science done and providing ‘fun’
online is stark, even with such a simple experiment.
This, of course, won’t be a surprise to any game designers who
are reading. Since the first computer games bleeped their way
into our collective consciousness in the 1970s and 1980s, players
have been participating in an enormous collective experiment to
find what will keep us clicking. While almost all games pay atten-
tion to this, it’s most obvious in simple phone games that occupy
so many commutes, most of which are optimized to produce
just the right level of micro-excitement to keep us clicking. I
don’t mean to sound snobbish about this, not least because I’m
currently about 500 levels into something called Two Dots. We, as
humans, are just wired to respond in this way, and we behave as
if we’re playing games even when it’s not deliberate. In the early
days of Galaxy Zoo, lots of people told us that their experience of
classifying galaxies was like eating crisps; you don’t mean to have
just one more, but you do, again and again and again, rewarded
with the next image each time you click on a galaxy.
The implications of this seem obvious. For all that Zooniverse
projects are scientific projects, they are also experienced as
games. We could, perhaps, make them much more popular by
manipulating the data so that a suitable fraction of animal-free
images were served without worrying about whether such classi-
fications were useful. We might take the most spectacular images
and make them appear more frequently, even if we already know
what they show, making further classification redundant. If
manipulating the data this way makes for more classifications
and hence more science overall, then perhaps there’s no harm.
Plenty of projects have taken this route, and walked much fur-
ther along it than we have. It feels like an obvious choice. If people like playing games, and are willing to contribute their time to do
science, then a game that lets you contribute to science feels like
the best of both worlds. But this feels like a step too far for me.
Our participants take part because they want to contribute to
science; it feels wrong to feed them images that we don’t need
help with. This kind of dilemma will only become more acute
once machines start picking up more of the slack, and we start
deciding what is really worth sending to classifiers.
At the end of the book, I want to use these ideas to talk about
where citizen science is going. First, though, I need to tell you
about what has clearly become the real strength of public
participation in Zooniverse projects—the ability to find the truly
unexpected, and to uncover stories of objects which would
otherwise remain forever hidden from view.
7
SERENDIPITY
In SpaceWarps and other projects, it’s clear that people, unlike machines, cope well with the unexpected. The example of the
red-lensed galaxy shows that nicely, but it’s a risky argument. As
training sets become larger, it’s going to become harder to sur-
prise a machine, and so taking this to its logical conclusion one’s
left with a vision of the citizen scientists of the next decade being chased from task to task as machines improve. Hunting the rarest of objects might still be a useful occupation, but the oppor-
tunities to make a real contribution will become scarce. Given all
the good that comes from projects that offer everyone a chance
to help science, that would be a shame.
It’s premature, I believe, to declare Zooniverse-style citizen
science a passing phase. There is a more interesting future in
store—one in which the line between the work done by ama-
teurs and professionals, and between the amateurs and the pro-
fessionals themselves, blurs still further. Evidence for this future
is found in stories from many projects, in discussions that spring
up around the unusual and the unexpected. I could give many
examples, but let me tell you about two that I was personally
close to. They’re stories of old-fashioned science, in which pro-
fessional and citizen astronomers used a bucketload of ingenuity
to work out the solutions to new mysteries. These stories involve
groups of people from a variety of backgrounds and with myriad
life experiences.
The first story dates back to the crazy first year of running
Galaxy Zoo. The forum had quickly become a busy place, with
posts about anything from astrophysical techniques to tea, but
among the creativity of that community there was plenty of chat
about what people were seeing on the site. A Dutch school-
teacher named Hanny van Arkel was the first to point to a blue
blob that appeared near an otherwise unremarkable galaxy in
one of the images.
The galaxy had a catalogue number—IC2497—and Hanny
named the blob the ‘Voorwerp’ (Plate 11). When, a little later, the
Galaxy Zoo team found her posts, I think we all assumed that it
was a Dutch technical term. It turns out to mean ‘object’, or
‘thingy’, but ‘Hanny’s Voorwerp’ is now the official name of the
blob, endorsed by several major journals. To be honest, at the
start the most interesting thing about the Voorwerp was prob-
ably the amusing story of the name, but Hanny wanted to know
what it was.
If I’d come across the blob while sorting through images
myself, I think I’d have ignored it, placing it to one side while getting on with more straightforward tasks, if I’d even noticed it at all. Yet Hanny, the citizen scientist, was captivated by the discovery and pressed us to find out more. It was an early lesson that experts aren’t always right; not only do highly trained
professionals occasionally make silly mistakes, they also can’t
always be trusted to focus on what is truly interesting.
There are plenty of examples scattered across the scientific lit-
erature. Take the work of a group led by Trafton Drew at Harvard
Medical School, for example, presented in the journal Psychological Science with an arresting title straight from a horror film: ‘The
invisible gorilla strikes again’. (Astronomers need to have more