Kaggle-like competitions

A number of data analysis challenges and competitions have been announced in recent years to address difficult and important problems in statistical and computational astronomy. Most involve issues arising in cosmology. Older entries are obtained from the Cosmology meets Machine Learning Uninstitute.

Microlensing Data Challenge

WFIRST will complete our census of the planetary population by using microlensing to discover a large sample of planets between 1 and 10 AU from their host stars. We are challenging the community to develop new analysis techniques to tackle unresolved questions and maximize the science return from this mission! The first dataset is available now by joining:  Deadline: Oct 31, 2018. Newcomers to the field are encouraged!

The analysis and modeling of microlensing events has always been a computationally intensive and time-consuming task, traditionally requiring well-sampled lightcurves as well as a powerful computer cluster. While the number of interesting events with adequate data remained fairly low, it was practical to perform a careful interactive analysis of each one. Even so, a number of challenges remain, particularly concerning the analysis of triple lenses. The low event count is expected to change with next-generation surveys, especially with the launch of WFIRST, which is expected to detect thousands of microlensing events, including hundreds of planetary events. Clearly, our analysis techniques need an upgrade to fully exploit this dataset, and the currently small microlensing community needs to grow.
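To give newcomers a feel for what "modeling" an event means, here is a minimal sketch of the simplest case: the standard point-source, point-lens magnification curve, with a chi-square misfit against a lightcurve. The function names and the parameter values in the demo are illustrative assumptions, not part of the data challenge; planetary and triple-lens events need far more expensive models than this.

```python
import math

def pspl_magnification(t, t0, tE, u0):
    """Point-source point-lens magnification at time t.

    t0: time of closest approach, tE: Einstein crossing time,
    u0: impact parameter in units of the Einstein radius.
    """
    u = math.sqrt(u0 ** 2 + ((t - t0) / tE) ** 2)
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))

def chi_square(times, mags, errors, t0, tE, u0):
    """Sum of squared, error-weighted residuals for one trial model."""
    return sum(
        ((m - pspl_magnification(t, t0, tE, u0)) / e) ** 2
        for t, m, e in zip(times, mags, errors)
    )

# Hypothetical noise-free event, sampled once per day for 100 days.
times = [float(day) for day in range(100)]
truth = dict(t0=50.0, tE=20.0, u0=0.1)
mags = [pspl_magnification(t, **truth) for t in times]
errors = [0.01] * len(times)

# Coarse one-dimensional scan over t0 with the other parameters held fixed.
best_t0 = min(
    (45.0 + 0.5 * k for k in range(21)),
    key=lambda trial: chi_square(times, mags, errors, trial, 20.0, 0.1),
)
print("best-fit t0:", best_t0)
```

A real analysis would fit all parameters at once, include blending and noise, and use a proper optimizer or sampler rather than a one-dimensional scan.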

To stimulate research in this area, we are holding a series of data challenges, each based around the release of a large set of simulated WFIRST lightcurves.  The first dataset was recently released, with a submission deadline of Oct 31, 2018.  We are particularly keen to encourage participation by people from the astro-statistics and astro-informatics communities.  For more information, please visit

Observing Dark Worlds

There is more to the Universe than meets the eye. Out in the cosmos exists a form of matter that outnumbers the stuff we can see by almost 7 to 1, and we don't know what it is. What we do know is that it does not emit or absorb light, so we call it Dark Matter. Although dark, it warps and bends spacetime such that any light from a background galaxy which passes close to the Dark Matter will have its path altered. This bending causes the galaxy to appear as an ellipse in the sky. This is an official Kaggle competition, now completed, here.

Mapping Dark Matter

Mapping Dark Matter is an image analysis competition whose aim is to encourage the development of new algorithms that can be applied to the challenge of measuring the tiny distortions in galaxy images caused by dark matter. The aim is to measure the shapes of galaxies to reconstruct the gravitational lensing signal in the presence of noise and a known Point Spread Function. The signal is a very small change in the galaxies’ ellipticity: an exactly circular galaxy image would be changed into an ellipse. Real galaxies, however, are not circular. The challenge is to measure the ellipticity of 100,000 simulated galaxies. This is an official Kaggle competition, now completed, here.
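As a rough illustration of what "measuring ellipticity" involves, the sketch below computes the two ellipticity components from unweighted second moments of a pixel image. This is only a textbook starting point under strong assumptions (no noise, no Point Spread Function correction), not the method used by any competitor; the helper `gaussian_image` is a hypothetical stand-in for a simulated galaxy.

```python
import math

def gaussian_image(size, sigma_x, sigma_y):
    """Noise-free elliptical Gaussian blob centred on a size x size grid."""
    c = (size - 1) / 2.0
    return [
        [math.exp(-0.5 * (((x - c) / sigma_x) ** 2 + ((y - c) / sigma_y) ** 2))
         for x in range(size)]
        for y in range(size)
    ]

def ellipticity_from_moments(image):
    """Return (e1, e2) from unweighted second moments of the image.

    e1 = (Qxx - Qyy) / (Qxx + Qyy), e2 = 2 Qxy / (Qxx + Qyy),
    where Q are flux-weighted central second moments.
    """
    total = sum(sum(row) for row in image)
    x_bar = sum(v * x for row in image for x, v in enumerate(row)) / total
    y_bar = sum(v * y for y, row in enumerate(image) for v in row) / total
    qxx = qyy = qxy = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            dx, dy = x - x_bar, y - y_bar
            qxx += v * dx * dx
            qyy += v * dy * dy
            qxy += v * dx * dy
    denom = qxx + qyy
    return (qxx - qyy) / denom, 2.0 * qxy / denom

# A blob stretched along x has positive e1; a round blob has e1 close to 0.
print(ellipticity_from_moments(gaussian_image(41, 3.0, 2.0)))
print(ellipticity_from_moments(gaussian_image(41, 2.5, 2.5)))
```

With noisy images, unweighted moments diverge quickly, which is one reason the competition called for better algorithms.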

Challenges in interferometric image reconstruction

Image reconstruction in optical interferometry has gained considerable importance for astrophysical studies during the last decade. This has been mainly due to improvements in the imaging capabilities of existing interferometers and the expectation of new facilities in the coming years. However, despite the advances made so far, image synthesis in optical interferometry is still an open field of research. Since 2004, the community has organized a biennial contest to formally test the different methods and algorithms for image reconstruction. In 2016, we celebrated the 7th edition of the "Interferometric Imaging Beauty Contest". This initiative represented an open call to participate in the reconstruction of a selected set of simulated targets with a wavelength-dependent morphology as they could be observed by the 2nd generation of VLTI instruments. See details here.

Challenges in visualization:

The VisIVO Contest 2014 (Visualization for the International Virtual Observatory) is a call to the worldwide scientific community to use technologies provided by the VisIVO Science Gateway to produce amazing images and movies from multi-dimensional datasets coming either from observations or numerical simulations.  The package offers a framework for exploration of large-scale scientific datasets, particularly related to cosmological simulations.    

Challenges in exoplanet detection:

The WFIRST Coronagraph Exoplanets Community Data Challenge seeks participation from teams with spectral retrieval expertise. The Challenge will run from Aug 15 to Nov 15, 2016, and it will consist of a blind spectral retrieval exercise using simulated extracted spectra for several “known RV” and/or hypothetical “discovery” exoplanets. The data will be served via the IPAC WFIRST Science Center. For the first five teams that complete the entire retrieval challenge (all five planets, all requested SNR and spectral resolution parameters) we are offering travel expenses to an exoplanets meeting. Contact: Margaret Turnbull, SETI Institute, WFIRST Coronagraph SIT Principal Investigator.

The Nearby Earth Astrometric Telescope (a proposed satellite mission) is designed to measure the tiny positional wobble of solar-like stars due to orbiting planets.  NEAT scientists have designed a double-blind contest with realistic simulated time series with and without planetary signals.  


Challenges for weak-lensing galaxy image analysis:

The GRavitational lEnsing Accuracy Testing (GREAT3) challenge is underway to test methods of weak lensing data analysis. This is similar to strong lensing (see below) but the background objects are galaxies rather than quasars, and the statistical problem involves measuring subtle shearing of the galaxy shapes. Details are available here and here.

Mapping Dark Matter Kaggle Challenge (a more accessible version of GREAT10)

GREAT10 PASCAL Challenge contains a spatially varying kernel, and a kernel estimation challenge

GREAT08 PASCAL Challenge was the first shear measurement challenge aimed at machine learning researchers


Challenges for galaxy morphology classification:

Kaggle and GalaxyZoo joined to present The Galaxy Challenge for automated galaxy morphology classification. The $16,000 prize was won by graduate student Sander Dieleman, who used a 7-layer neural network with 42M parameters. The code was written in Python with Theano wrappers for GPU implementation. See Kaggle's interview here. Kaggle sponsored an earlier galaxy imaging competition in 2011.

Challenges for photometric redshift estimation:

The PHAT (PHoto-z Accuracy Testing) challenge is described here and here.


Challenges for strong gravitational lensing time delay:

Strong Lens Time Delay Challenge is now open for competition.  Based on simulated LSST data of gravitational lensing of quasars lying behind foreground galaxies, the challenge is to accurately establish delays between the stochastic variations of two lensed quasar images from sparse, irregularly sampled time series.
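The core statistical task can be sketched as follows: shift one lightcurve by a trial delay, interpolate the other onto the shifted times, and keep the delay that minimizes the mean squared mismatch. This is a deliberately simple illustration, not the challenge's reference method. The smooth two-sinusoid signal, the 12-day delay, the sampling gaps, and all function names are assumptions made for the demo, and the sketch ignores noise, microlensing, and the flux ratio between the two images.

```python
import bisect
import math
import random

def signal(t):
    """Hypothetical intrinsic quasar variability (smooth stand-in)."""
    return math.sin(2 * math.pi * t / 40.0) + 0.5 * math.sin(2 * math.pi * t / 13.0)

def irregular_times(rng, span, min_gap, max_gap):
    """Sorted observation times with random gaps, starting at t = 0."""
    times, t = [], 0.0
    while t < span:
        times.append(t)
        t += rng.uniform(min_gap, max_gap)
    return times

def interpolate(t, times, values):
    """Linear interpolation; caller ensures times[0] <= t <= times[-1]."""
    i = bisect.bisect_right(times, t)
    if i == len(times):
        return values[-1]
    w = (t - times[i - 1]) / (times[i] - times[i - 1])
    return values[i - 1] + w * (values[i] - values[i - 1])

def mismatch(delay, times_a, flux_a, times_b, flux_b):
    """Mean squared difference between image A and image B shifted by delay."""
    diffs = [
        (interpolate(t - delay, times_a, flux_a) - f) ** 2
        for t, f in zip(times_b, flux_b)
        if times_a[0] <= t - delay <= times_a[-1]
    ]
    return sum(diffs) / len(diffs)

def estimate_delay(times_a, flux_a, times_b, flux_b, trial_delays):
    """Grid search: return the trial delay with the smallest mismatch."""
    return min(trial_delays,
               key=lambda d: mismatch(d, times_a, flux_a, times_b, flux_b))

rng = random.Random(0)
true_delay = 12.0
times_a = irregular_times(rng, 200.0, 0.5, 2.5)
times_b = irregular_times(rng, 200.0, 0.5, 2.5)
flux_a = [signal(t) for t in times_a]
flux_b = [signal(t - true_delay) for t in times_b]  # image B lags image A
trial_delays = [0.5 * k for k in range(61)]  # 0 to 30 days
print("estimated delay:", estimate_delay(times_a, flux_a, times_b, flux_b, trial_delays))
```

Real submissions also need an uncertainty on the delay, which a bare grid search does not provide.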