DES Uses Crowdsourcing to Improve Data Quality

Figure 1: In this image from Galaxy Zoo, users categorize the shapes and other interesting features of galaxies in images from the Sloan Digital Sky Survey (SDSS). You can try it for yourself at galaxyzoo.org.

There is power in numbers. If a large number of people apply their brainpower, creativity, or even physical effort to one problem, it will probably be solved faster and more efficiently than if a single person gave it their best effort. Crowdsourcing, or obtaining labor, ideas, or funding from a large group of people, typically over the internet, has become a popular method of problem solving. A tech company may improve the usability of an app by accepting feedback from users. A snack company may take suggestions for flavors from its own customers.

Crowdsourcing has also proved an effective tool in scientific projects (including the Dark Energy Survey) where a large volume of data analysis, not necessarily by professionals, is required. Galaxy Zoo, a popular example of crowdsourced science, came about when over 900,000 galaxies imaged by the Sloan Digital Sky Survey (SDSS) needed to be classified by eye according to their morphology (their visual appearance). With Galaxy Zoo, the public can classify SDSS galaxies as, for example, “spiral” or “elliptical” by viewing an image and answering a series of questions about the galaxy’s shape. This project led to the citizen science portal Zooniverse, which hosts crowdsourced applications that contribute to research in astronomy, biology, and climate science. The value of this work is made apparent by citations in over 80 scientific publications; the interest of amateurs, science lovers, and curious minds has resulted in real contributions to science.

Large astronomical surveys like DES collect vast amounts of data to reach their scientific goals (DES will amass around 100 TB of images). However, before these images can be used for science, the data must be reduced and processed to remove imperfections caused by occasional hardware malfunctions, satellite passes, and cosmic rays, among other things. Proper processing of the data is critical to ensuring that the science results are valid. Doing this by hand would be tedious, so data reduction and processing pipelines (computer algorithms) are employed to remove these imperfections automatically. However, perfect algorithms do not exist, and the results of these programs should therefore be checked by eye. If unwanted artifacts persist after the automatic reduction process, it may be necessary to modify the code to address them.

Figure 2: The three panels show typical issues in astronomical images. Top Panel: The long gray streak is a satellite trail. Middle Panel: Several of the longer streaks of light highlighted in blue are cosmic rays, energetic particles that constantly enter the Earth’s atmosphere. The streaks seen here come from cosmic rays that entered the camera itself. Bottom Panel: The bright star on the left creates reflected images (or “ghosts”) in the telescope, seen on the right. These are a few of the many possible issues to be identified by automated algorithms and checked by users of Exposure Checker. The blue signifies masked pixels.

For the dozen or so scientists who write the processing pipelines in DES, visually checking the results for thousands of images would be incredibly time-consuming (and simply unreasonable). Instead, the pipeline writers have developed a web application so that any of the few hundred DES scientists can spend a few minutes at a time helping to control the quality of images taken by DECam.

Figure 3: Example of the drop-down menu used to mark issues in the DES Exposure Checker.

Users of the application are presented with an image that has been automatically reduced with the pipeline. After inspecting the image, if the user sees that the pipeline has missed one of many possible flaws (cosmic rays, reflection ghosts, satellite tracks, and much more; see Figure 3), they can select the issue from a drop-down box and click the image to indicate its location. Likewise, if the user discovers that the pipeline incorrectly applied a correction, they can select “False”. If the user determines that no flaws exist, they simply select “Next” to see a new image. The creators also included an “Awesome!” button so that the user can indicate and share any interesting objects they might discover in an image. Reports are then generated based on this feedback and used by the hardware and software experts to improve the reduction algorithm.

As with many crowdsourced task applications, DES’s “Exposure Checker” incorporates elements of gamification to motivate users to participate. An element of competition is introduced with a leaderboard where users can increase their rank, see the status of other participants, and even unlock badges.
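To make the reporting workflow described above more concrete, here is a minimal Python sketch of how such reports might be tallied by problem type. It is an illustration only, not the actual Exposure Checker code: the `Report` record, its field names, the `tally_problems` function, and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One user report (hypothetical structure, for illustration only)."""
    exposure_id: int              # made-up identifier of the inspected image
    problem: str                  # issue chosen from the drop-down menu
    x: float                      # clicked pixel position on the image
    y: float
    false_positive: bool = False  # True when the user selected "False"


def tally_problems(reports):
    """Count missed flaws and incorrectly applied corrections per problem type."""
    missed = Counter()
    false_positives = Counter()
    for report in reports:
        if report.false_positive:
            false_positives[report.problem] += 1
        else:
            missed[report.problem] += 1
    return missed, false_positives


# Made-up sample reports, not real DES data.
reports = [
    Report(1001, "Cosmic ray", 512.0, 88.5),
    Report(1001, "Satellite trail", 40.0, 300.0),
    Report(1002, "Cosmic ray", 17.0, 940.0),
    Report(1003, "Ghost", 255.0, 610.0, false_positive=True),
]

missed, false_positives = tally_problems(reports)
print("Missed flaws:", dict(missed))
print("Incorrect corrections:", dict(false_positives))
```

A summary like this would let pipeline experts see at a glance which kinds of artifacts are most often missed and which corrections are most often applied in error.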

Figure 4: A sample ranking of users contributing to Exposure Checker. Elements of gamification like this are prevalent in crowdsourcing efforts.

As of May 2016, 112 users had submitted over 39,000 reports, which provide critical information about unwanted artifacts. “We made it easy to browse DES images and that really changed the way DES participants interacted with them,” says Peter Melchior, an astrophysicist at Princeton University who led the project. “We knew that participants care about the survey data, and we have given them a way to help and provide feedback.” For example, Exposure Checker users quickly detected vertical bands in images that were due to the camera’s shutter failing during certain exposures. The origin of this problem would not have been discovered as quickly without the application.

Not only does Exposure Checker help improve the quality of DES images, it is also an outlet for collaboration members to become familiar with DES data and its common flaws. The more familiar a scientist is with their data, the more effectively that scientist can use it to draw scientific conclusions. While Exposure Checker is available only to members of the DES collaboration (a much smaller crowd than that of the publicly available Galaxy Zoo), it is still a testament to effective problem solving in science, in this case through crowdsourced visual quality control. A taste of Exposure Checker can be viewed in the demo.

Even if you’re not a member of DES, there are many ways to be involved in crowdsourced citizen science projects. As mentioned, Zooniverse is an excellent portal for non-professionals to get involved in science. Your interest and efforts may lead to new discoveries and insight into the universe!


About the Paper Author


Peter Melchior is the lead scientist of this project, which can be viewed in more detail at http://arxiv.org/abs/1511.03391. Peter is an astrophysicist at Princeton University. He measures the mass of galaxy clusters through gravitational lensing, a tiny effect that can easily be wiped out by instrumental artifacts. That sensitivity explains his interest in finding such artifacts, which inspired this work.


About The DArchive Authors & Editors


Jacob Robertson is the author of this Darchive summary. He is an undergraduate physics major at Austin Peay State University in Clarksville, Tennessee. Jacob joined the Dark Energy Survey collaboration in the summer of 2016 to work with the calibrations group. In addition to his work with DES, he is a member of his university’s high-altitude scientific ballooning team. In his free time, Jacob enjoys making espresso drinks at home and repeatedly failing to create latte art.

Ross Cawthon is an astrophysics PhD candidate at the University of Chicago. He works on various projects studying the large-scale structure of the Universe using the millions of galaxies DES observes. These projects include galaxy clustering, correlations of structure with the cosmic microwave background, and using the structure of the Universe to infer redshifts of galaxies. Ross is also an active science communicator, volunteering at Chicago’s Adler Planetarium as well as writing and editing for The Darchives. He also loves observing for DES in Chile, where he has observed for more than 30 nights.