Established October 2, 2014
Astronomical observations produce some of the biggest “big data” today, both through a new generation of telescopes and through projects involving next-generation telescopes planned to be launched in less than a decade. While these data sets exhibit the usual challenges associated with big data (high dimensionality, high complexity, disparate variables, immense data volumes, etc.), they also pose new problems, such as data types of an entirely new nature and level of complexity. Aspects specific to information extraction from astronomical observations also warrant specific focus. For example, classification, data mining and pattern discovery need to work with imagery of extremely low signal-to-noise ratio and must produce very precise estimates (more precise than, for example, in terrestrial remote sensing). Another example is the need to deliver some results from multi-terabyte data in (near-)real time to guide the next day’s observation of interesting objects and to best exploit short windows of observing opportunity. A distinguishing factor of astronomical data sets is that, unlike in medical or social domains, for example, there are strict laws of physics behind the data production, and these can often be assimilated into machine learning mechanisms to improve on general off-the-shelf state-of-the-art methods. A surge of discussion of these specific problems, and of the need for collaboration with computational and data experts, has started in the astronomy community. It seems timely to create a DMTC TF to contribute to this effort.
The primary objective of the TF on Mining Complex Astronomical Data is to contribute to solving the problems modern astronomy faces in turning the inexhaustible stream of data from the sky into reliable knowledge at an accelerated rate. Machine learning and data mining, and computational intelligence approaches in general, are in high demand but as yet not sufficiently exploited. The astronomical community is reaching out to engage expertise in these areas. We want to take advantage of this climate, assert our presence early and work diligently on applying our capabilities in support of astronomy’s science goals. Meaningful contributions by experts in the relevant computational areas can only be achieved in tight collaboration with astronomers. Therefore, the members of this TF will strive to be directly involved in astronomical data analysis projects. The initial selection of the founding members reflects this philosophy.
Some of the major needs are improved clustering, classification, and detection performance, as well as visualization, under the specific constraints of astronomical data. Non-linear approaches, nontraditional measures and statistics, methods to characterize the fidelity of prior labeling, and new knowledge representations will most probably play a significant role. Improving algorithmic speed (lowering computational complexity) is also of major interest, even within distributed HPC frameworks. More specific goals will have to be identified with respect to specific projects. Similarly, the value of any contribution will have to be measured by its utility to the targeted science goal. While the scope of CI technologies can easily be defined at a certain level (such as above), in this TF we want the science needs to drive and shape the scope of the activities.
This TF is also envisioned to have a broader scope reaching beyond the computational aspects. Showing the astronomical community successful results obtained from astronomical data by advanced CI methods is critical because, at present, astronomers in general are not familiar with the methods developed and used in the CI community. Convincing them of the advantages is most effectively done through joint data analyses. It is equally important to make CI experts (engineers, computer scientists, statisticians, …) aware of interesting scientific problems that require expert application of state-of-the-art techniques and algorithms developed by the CI community, or that motivate the development of new methods. The mission of this task force is to widen the bridge between the astronomical and CI worlds, to mutual benefit.