The Crowd and the Cosmos: Adventures in the Zooniverse

by Chris Lintott


  here. They’re an exceptionally thoughtful and careful bunch, and

  Phil Marshall in particular—one of the three leading scientists

  alongside Aprajita Verma and Anupreeta More—is one of the

  nicest people you could ever hope to meet. As a result, the idea of

  labelling volunteers, in all their human complexity, with a rating

  derived from nothing more than a few clicks on a website was

  anathema to Phil. In all the team’s papers, therefore, they set up a

  system where they represent each volunteer by an ‘agent’. An

  agent is a representation of the volunteer, but necessarily an

  imperfect one, as the agent knows only about the volunteer’s

  behaviour within the project. We can then label the agent, knowing
  they are a poor reflection of their human counterpart. I’m

  less fastidious, and am happy to trust that you know I’m not

  really reducing people in all their glorious complexity to their

  performance in one project.

  This sort of analysis is useful for checking on the progress of

  the project; looking at the distribution of skill one sees that the

  average volunteer is pretty good. While both the highly skilled

  and the more confused contribute a few classifications, those

  who go on to contribute tens of thousands of classifications are

  all highly skilled. This data alone doesn’t tell you whether people

  are learning as they go, so that their skill inevitably improves

  over time, or whether those who are struggling are simply giving

  up, but it does show that we’re not wasting people’s time.


  The real power comes when we move beyond this simple,

  single value. The SpaceWarps model sets up what’s known as a

  confusion matrix for each volunteer, keeping track of four key

  numbers. For each contributor, we estimate first the probability

  that they will say that there is a lens when there is indeed one

  there; second, the probability that they will say there is a lens

  when there isn’t; third, the probability that they’ll say there is

  nothing there when there isn’t; and finally (deep breath) the

  probability that they’ll say there’s nothing there when there is

  indeed a lens.

  Armed with this information, we can find ways to get more

  knowledge out of the system. There are four kinds of volunteer

  to consider. There are those who are always, or nearly always,

  right; the SpaceWarps team called these ‘astute’ volunteers,

  and they are very welcome in any project. There are also those

  who are always wrong, who miss lenses when they’re there

  and who see them when they’re not. These people are just as

  useful—someone who is wrong all the time provides just as much

  information as someone who is right all the time, as long as you

  know that they’re wrong.* So because we’re able to use the

  simulations to measure how people are doing, we can increase

  the amount of useful information we can get from the project.
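  For readers who like to see the bookkeeping spelled out, here is a minimal sketch, in Python, of how an agent’s four numbers might be stored and used to update the probability that an image contains a lens. It illustrates the general idea rather than the actual SpaceWarps code; the class name, attribute names, and the simple Bayesian update are my own assumptions for the example.

```python
# A toy illustration of the 'agent' bookkeeping described above.
# This is NOT the real SpaceWarps pipeline; names and numbers are invented.

class Agent:
    """Tracks two of the four rates; the other two follow by subtraction."""

    def __init__(self, p_lens_given_lens=0.5, p_lens_given_empty=0.5):
        # Probability the volunteer says 'lens' when a lens really is there.
        self.p_lens_given_lens = p_lens_given_lens
        # Probability the volunteer says 'lens' when nothing is there.
        self.p_lens_given_empty = p_lens_given_empty
        # P(says 'nothing' | empty) = 1 - p_lens_given_empty, and
        # P(says 'nothing' | lens)  = 1 - p_lens_given_lens.

    def update(self, prior, said_lens):
        """Bayesian update of P(image contains a lens) after one classification."""
        if said_lens:
            like_lens = self.p_lens_given_lens
            like_empty = self.p_lens_given_empty
        else:
            like_lens = 1.0 - self.p_lens_given_lens
            like_empty = 1.0 - self.p_lens_given_empty
        numerator = like_lens * prior
        return numerator / (numerator + like_empty * (1.0 - prior))


# A volunteer who is almost always wrong is still informative:
contrarian = Agent(p_lens_given_lens=0.1, p_lens_given_empty=0.9)
print(contrarian.update(prior=0.5, said_lens=False))  # ~0.9
```

  Run with the ‘contrarian’ settings above, a ‘no lens’ answer pushes the probability up rather than down, which is exactly the sense in which a reliably wrong volunteer is as informative as a reliably right one.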

  There are two more categories of people. There are optimists,

  who see a lens where there isn’t one but are reliable when they say

  there’s nothing there, and pessimists, who miss lenses but are

  accurate when they do identify one. Once we’ve spotted someone’s

  proclivities, we can work out how seriously to take their opinions,

  but we can also start to play games with who gets to see what.

  Before we throw away an image, confident that there’s nothing
  there, then perhaps we should make sure to show it to an optimist,
  just in case. If we think we’ve found a lens, then we should show it
  to a pessimist—if even they reckon there’s something there, then
  our confidence should grow sky high that we’re on to something.
  Playing with task assignment in this way promises much more
  efficient classification, and more science produced more quickly.

  * You may find this a useful strategy for life in general.
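  To show how this optimist/pessimist idea might translate into task assignment, here is another small, purely illustrative sketch; the thresholds, function names, and category labels are invented for the example and are not the real Zooniverse or SpaceWarps scheduler.

```python
# Purely illustrative: routing images to volunteers using the rates above.
# Thresholds and labels are invented for the example, not the real scheduler.

def volunteer_type(hit_rate, false_alarm_rate):
    """Classify a volunteer from their confusion-matrix rates.

    hit_rate:         P(says 'lens' | a lens is there)
    false_alarm_rate: P(says 'lens' | nothing is there)
    """
    if hit_rate > 0.8 and false_alarm_rate < 0.2:
        return "astute"
    if hit_rate > 0.8 and false_alarm_rate > 0.5:
        return "optimist"   # sees lenses everywhere, so their 'no' is trustworthy
    if hit_rate < 0.5 and false_alarm_rate < 0.2:
        return "pessimist"  # misses lenses, so their 'yes' is trustworthy
    return "typical"


def who_should_see_it(p_lens):
    """Decide which kind of volunteer should classify an image next."""
    if p_lens < 0.05:
        return "optimist"   # one last look before the image is retired as empty
    if p_lens > 0.95:
        return "pessimist"  # a sceptic's 'yes' makes the detection convincing
    return "typical"


print(volunteer_type(0.9, 0.7))  # optimist
print(who_should_see_it(0.02))   # optimist
```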

  The only trouble is that this gets complicated fast. With tens of

  thousands of people participating in even a small project, and

  hundreds of thousands of images to view, the number of possible

  solutions is unbelievably large. Even when we consider that our

  choice is restricted by the fact that not everyone is online at the

  same time, complex mathematics is required to work out what a

  sensible path is. Work by Edwin Simpson of Oxford University’s

  Department of Engineering showed quickly that clever task assignment
  could produce results of the same accuracy with nearly one-tenth
  of the classifications, an enormous acceleration and one that

  is especially welcome when looking for the rarest of objects.

  SpaceWarps is among the most sophisticated Zooniverse projects
  in how it treats its data, and in offering a faster route to science it seemed to be a template which we could apply in all of the

  other fields that we’re working on. Plenty of work on this sort of

  task assignment has been done by researchers in a field of computer
  science known as human–computer interaction, typically

  using Amazon’s Mechanical Turk system to connect researchers

  with those who will complete tasks for small payments.

  Yet things aren’t so easy with citizen scientists who are them-

  selves volunteers, and a simple experiment with a project we ran

  called Snapshot Serengeti shows why. Whenever I lecture on the

  Zooniverse, one of the most common questions is whether we

  really need humans given all the progress in machine learning. I’ve

  hopefully dealt with this already, but the disease seems especially

  acute around projects like this one, which uses motion sensitive

  cameras to monitor wildlife in the Serengeti National Park.


  The images the cameras produce are wonderful, beautiful, and

  varied. Some would easily grace the cover of National Geographic,

  while others are more quirky. The team’s favourite comes from a

  camera programmed to take three photos in quick succession

  once triggered. The first of this particular sequence shows a

  hyena staring at the camera as the flash goes off. The second

  shows the same hyena skulking innocently in the background,

  but the third shows some sharp canines and the inside of the

  hyena’s mouth. Apparently getting chewed by the local wildlife

  was a common end for the project’s cameras (not a problem

  Penguin Watch faced in Antarctica), and elephants using camera

  stands as scratching posts didn’t help either.

  Despite the immense variety in what the project’s cameras

  capture, there is something about the task of identifying animals
  in images that seems to convince people they can quickly write
  a script or produce an off-the-shelf machine-learning solution
  that will solve the problem. It turns out it’s

  harder than it looks. While we’ll share our data with anyone

  who wants it, no one’s yet come up with a completely robust
  solution. I have a soft spot for the attempts of a team we
  worked with at the Fraunhofer Institute in Munich (home to the

  inventors of the MP3, the format which encodes music on your

  phone and other digital devices) who developed an especial dislike
  of ostriches, which thanks to their bendy necks and bandy

  legs turn out to be able to twist into a computer-defying set of

  shapes.

  Nonetheless, some tasks are definitely easier than others.

  Wildebeest are common enough to trigger complaints from

  regular classifiers, and so building up a suitable training set for

  them will be easier than, for example, doing so for the small,

  skunk-like zorillas which appear in one in every three million

  images (Figure 22).

  Figure 22 A rare image of a zorilla as captured by the Snapshot Serengeti cameras.

  Easiest of all, though, is to identify the images
  with precisely zero animals in them at all.* Almost three-quarters
  of the data consisted of such images; either a camera would malfunction,
  and take image after image of nothing until its memory

  card was used up, or waving grass would do a good enough

  impression of a passing lion that it too would be captured.

  We know that volunteers care about getting science done, and

  we hate wasting their time, so removing these animal-free images

  was an obvious thing to do. What happened next was surprising.

  As volunteers saw more and more images with animals in, the

  total number of classifications the project received dropped.

  People might like contributing to science, but in trying to make
  it faster for them to do so we’d done something that made the
  experience less pleasing, and we weren’t quite sure what it was.

  * Notice I do not, as I would have done once upon a time, call these ‘blank’
  images. I was cured of that when speaking to a room full of plant scientists.
  Pointing at an image of a tree and grassland, I confidently told them there was
  ‘nothing there’ and saw the audience rise up as one. Apparently they call it
  plant blindness.

  One theory suggested that there was a total amount of work that

  people would be willing to invest in the project. It’s faster and easier to say that there is nothing in an image than it is to distinguish a

  Thomson’s from a Grant’s gazelle, and so maybe by giving them
  more to do we were using up people’s effort faster. I don’t think that’s the right explanation; we know that, all else being equal, encountering an animal in Snapshot Serengeti made people more, not less

  likely to keep classifying, and so it seems to me that we would be at least as likely to encourage as discourage people from classifying.

  Instead, I think we’d changed how exciting the project seemed

  to people. Whereas before they’d seen nothing, then nothing,

  then nothing again, nothing again, and then suddenly a zebra,

  now they endured the apparent tedium of zebra followed by

  zebra followed by wildebeest followed by yet another zebra.

  While almost all the research on how to assign tasks for efficiency
  uses paid subjects, who can be assumed to stay put regardless,
  our volunteers are free to walk away at any point. By trying

  to make things better, we made their experience worse. The

  choice between getting more science done and providing ‘fun’

  online is stark, even with such a simple experiment.

  This, of course, won’t be a surprise to any game designers who

  are reading. Since the first computer games bleeped their way

  into our collective consciousness in the 1970s and 1980s, players

  have been participating in an enormous collective experiment to

  find what will keep us clicking. While almost all games pay attention
  to this, it’s most obvious in simple phone games that occupy

  so many commutes, most of which are optimized to produce

  just the right level of micro-excitement to keep us clicking. I

  don’t mean to sound snobbish about this, not least because I’m

  currently about 500 levels into something called Two Dots. We, as


  humans, are just wired to respond in this way, and we behave as

  if we’re playing games even when it’s not deliberate. In the early

  days of Galaxy Zoo, lots of people told us that their experience of

  classifying galaxies was like eating crisps; you don’t mean to have

  just one more, but you do, again and again and again, rewarded

  with the next image each time you click on a galaxy.

  The implications of this seem obvious. For all that Zooniverse

  projects are scientific projects, they are also experienced as

  games. We could, perhaps, make them much more popular by

  manipulating the data so that a suitable fraction of animal-free

  images were served without worrying about whether such
  classifications were useful. We might take the most spectacular images

  and make them appear more frequently, even if we already know

  what they show, making further classification redundant. If

  manipulating the data this way makes for more classifications

  and hence more science overall, then perhaps there’s no harm.

  Plenty of projects have taken this route, and walked much further
  along it than we have. It feels like an obvious choice. If people like playing games, and are willing to contribute their time to do

  science, then a game that lets you contribute to science feels like

  the best of both worlds. But this feels like a step too far for me.

  Our participants take part because they want to contribute to

  science; it feels wrong to feed them images that we don’t need

  help with. This kind of dilemma will only become more acute

  once machines start picking up more of the slack, and we start

  deciding what is really worth sending to classifiers.

  At the end of the book, I want to use these ideas to talk about

  where citizen science is going. First, though, I need to tell you

  about what has clearly become the real strength of public

  participation in Zooniverse projects—the ability to find the truly

  unexpected, and to uncover stories of objects which would

  otherwise remain forever hidden from view.

  7

  SERENDIPITY

  In SpaceWarps and other projects, it’s clear that people, unlike machines, cope well with the unexpected. The example of the

  red-lensed galaxy shows that nicely, but it’s a risky argument. As

  training sets become larger, it’s going to become harder to surprise
  a machine, and so taking this to its logical conclusion one’s
  left with a vision of the citizen scientists of the next decade being chased from task to task as machines improve. Hunting the rarest of objects might still be a useful occupation, but the opportunities
  to make a real contribution will become scarce. Given all

  the good that comes from projects that offer everyone a chance

  to help science, that would be a shame.

  It’s premature, I believe, to declare Zooniverse-style citizen

  science a passing phase. There is a more interesting future in

  store—one in which the line between the work done by amateurs
  and professionals, and between the amateurs and the professionals
  themselves, blurs still further. Evidence for this future
  is found in stories from many projects, in discussions that spring

  up around the unusual and the unexpected. I could give many

  examples, but let me tell you about two that I was personally

  close to. They’re stories of old-fashioned science, in which professional
  and citizen astronomers used a bucketload of ingenuity
  to work out the solutions to new mysteries. These stories involve

  groups of people from a variety of backgrounds and with myriad

  life experiences.

  The first story dates back to the crazy first year of running

  Galaxy Zoo. The forum had quickly become a busy place, with

  posts about anything from astrophysical techniques to tea, but

  among the creativity of that community there was plenty of chat

  about what people were seeing on the site. A Dutch schoolteacher
  named Hanny van Arkel was the first to point to a blue

  blob that appeared near an otherwise unremarkable galaxy in

  one of the images.

  The galaxy had a catalogue number—IC2497—and Hanny

  named the blob the ‘Voorwerp’ (Plate 11). When, a little later, the

  Galaxy Zoo team found her posts, I think we all assumed that it

  was a Dutch technical term. It turns out to mean ‘object’, or

  ‘thingy’, but ‘Hanny’s Voorwerp’ is now the official name of the

  blob, endorsed by several major journals. To be honest, at the

  start the most interesting thing about the Voorwerp was probably
  the amusing story of the name, but Hanny wanted to know

  what it was.

  If I’d come across the blob while sorting through images

  myself, I think I’d have ignored it, placing it to one side while getting on with more straightforward tasks, if I had even

  noticed it at all. Yet Hanny, the citizen scientist, was captivated by the discovery and pressed us to find out more. It was an early lesson that experts aren’t always right; not only do highly trained

  professionals occasionally make silly mistakes, they also can’t

  always be trusted to focus on what is truly interesting.

  There are plenty of examples scattered across the scientific literature.
  Take the work of a group led by Trafton Drew at Harvard

  Medical School, for example, presented in the journal Psychological Science with an arresting title straight from a horror film: ‘The


  invisible gorilla strikes again’. (Astronomers need to have more

 
