Scalable Optimization of Neighbor Embedding for Visualization

You are here: Home / Submitted Papers / 2015 / Scalable Optimization of Neighbor Embedding for Visualization

Abstract

Neighbor embedding (NE) methods have found their use in data visualization but are limited in big data analysis tasks due to their O(n2) complexity for n data sam- ples. We demonstrate that the obvious ap- proach of subsampling produces inferior re- sults and propose a generic approximated op- timization technique that reduces the NE op- timization cost to O(n log n). The technique is based on realizing that in visualization the embedding space is necessarily very low- dimensional (2D or 3D), and hence efficient approximations developed for n-body force calculations can be applied. In gradient- based NE algorithms the gradient for an in- dividual point decomposes into “forces” ex- erted by the other points. The contributions of close-by points need to be computed indi- vidually but far-away points can be approx- imated by their “center of mass”, rapidly computable by applying a recursive decom- position of the visualization space into quad- rants. The new algorithm brings a significant speed-up for medium-size data, and brings “big data” within reach of visualization.

Author

Zhirong Yang, Jaakko Peltonen, Samuel Kaski

Journal

Proceedings of the 30th International Conference on Machine Learning

Paper Publication Date

2013

Paper Type

Astroinformatics