Testing over/underdensities in point distributions

 Posted by Krzysztof Findeisen at May 16. 2014

Apologies if this is a repeat question; I haven’t been able to find a search function for the forum.

I’m currently looking at a distribution of sources in a two-dimensional parameter space. The distribution looks largely uniform, but there appears to be a region that sources tend to avoid. Said region was not chosen a priori; it is noteworthy only because I think it has a lower density of points.

How do I test whether an apparently underdense region is significant, if I have no prior constraints on the region’s location? I assume I have to somehow allow for the number of equally sized regions that could have had fewer points than average, but this seems like an uncountably infinite set.

 Posted by Cameron, Ewan at May 17. 2014

Hi Krzysztof,

This sounds similar to the problem of assessing the significance of voids in cosmology. There are some early analytical solutions for the case where the underlying distribution can be treated as an inhomogeneous point process whose density fluctuations are a realisation of a Gaussian random field (e.g. Otto et al. 1986). Later results focus on the case where clusters form only at the local maxima of the random field, which complicates things.

But for your specific case, where the null hypothesis seems to be a uniform distribution within some fixed boundaries, I imagine it would be quickest and easiest simply to compute a p-value for your (apparently) underdense region via Monte Carlo simulations. E.g. draw 10000 realisations of the uniform Poisson point process on the domain in question, and for each compute some one-dimensional summary statistic, like the radius of the largest circle that can be placed fully inside the domain while enclosing no more than n points (with n = 0, 1, or 5, say, as appropriate). Construct the histogram of this statistic and see where the value computed on your observed data lies in comparison. Because the statistic searches the whole domain for the largest void, the comparison already accounts for the region not having been chosen a priori.
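
A minimal sketch of this Monte Carlo test in Python with NumPy might look like the following. Several things here are assumptions for illustration and not part of the original suggestion: the data are taken to be rescaled to the unit square, candidate circle centres are restricted to a regular grid (so the radius is only an approximation to the true largest void), and the names largest_void_radius and void_p_value are hypothetical.

```python
import numpy as np


def largest_void_radius(points, n_allowed=0, grid_size=30):
    """Approximate radius of the largest circle lying fully inside the
    unit square that encloses at most n_allowed points.

    Assumption: circle centres are limited to a grid_size x grid_size grid,
    so the result slightly underestimates the true largest void.
    """
    g = (np.arange(grid_size) + 0.5) / grid_size
    centres = np.array(np.meshgrid(g, g)).reshape(2, -1).T  # shape (M, 2)
    # Distance from each centre to the nearest edge of the unit square.
    edge = np.minimum(centres, 1.0 - centres).min(axis=1)
    if len(points) <= n_allowed:
        # Too few points to constrain any circle; only the boundary matters.
        return edge.max()
    # Pairwise distances between grid centres and data points, shape (M, N).
    d = np.linalg.norm(centres[:, None, :] - points[None, :, :], axis=2)
    # Distance to the (n_allowed + 1)-th nearest point: the open circle of
    # that radius holds at most n_allowed points.
    kth = np.partition(d, n_allowed, axis=1)[:, n_allowed]
    return np.minimum(kth, edge).max()


def void_p_value(observed, n_allowed=0, n_sims=1000, grid_size=30, seed=0):
    """One-sided Monte Carlo p-value for the observed largest void under a
    uniform Poisson point process on the unit square.

    Assumption: the Poisson intensity is set to the observed point count.
    """
    rng = np.random.default_rng(seed)
    obs_stat = largest_void_radius(observed, n_allowed, grid_size)
    sims = np.empty(n_sims)
    for i in range(n_sims):
        n = rng.poisson(len(observed))
        sims[i] = largest_void_radius(rng.random((n, 2)), n_allowed, grid_size)
    # Fraction of simulations with a void at least as large as observed,
    # with the usual +1 correction so the p-value is never exactly zero.
    p = (1 + np.sum(sims >= obs_stat)) / (n_sims + 1)
    return obs_stat, p, sims
```

Calling void_p_value(xy) on an (N, 2) array of rescaled coordinates returns the observed statistic, the p-value, and the simulated values, which can be passed to a histogram routine for the visual comparison. The default n_sims is kept small for speed; raising it towards 10000 gives finer p-value resolution at a proportional cost in run time.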

Hope that helps,
Ewan.

 Posted by Krzysztof Findeisen at June 06. 2014

Thank you for the suggestion; it hadn’t occurred to me to try simulating the point distribution. Unfortunately, I’ve re-analyzed the data in the meantime and the apparent underdensity disappeared. I’ll keep it in mind for next time, though.

Thanks again!