Posted by Krzysztof Findeisen at December 11. 2014
Hello again,
I’m currently working on a problem where I need to test whether there are systematic deviations (in the sense of a locally varying mean) in a presumed-uniform set of stellar measurements in a particular area of the sky, with no prior knowledge of the size, shape, or intensity of any deviations. The data set is large and all-sky, so there are also (soft) constraints on the complexity of the analysis method. I am only interested in characterizing the nonuniformity of the measurements; the distribution of the sources themselves will clearly be highly nonuniform.
I’ve been looking at spatial interpolation methods (in particular, the Nadaraya-Watson estimator and kriging), but while these provide estimates of the mean measurement and errors on the mean, it’s not clear to me how to test the significance of any apparent deviations from uniformity (e.g., how to calculate false positive or false negative probabilities). Are there any standard tests out there for looking for local variations from an otherwise uniform field, probed only at certain points?
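For readers unfamiliar with it, the Nadaraya-Watson estimator mentioned above is just a kernel-weighted local mean. Here is a minimal sketch in Python, assuming positions given as RA/Dec in degrees and a Gaussian kernel in angular separation. The function names, the kernel choice, and the `bandwidth_deg` parameter are illustrative assumptions, not anything specified in this thread.

```python
import numpy as np

def unit_vectors(ra_deg, dec_deg):
    """Convert RA/Dec in degrees to unit vectors on the sphere."""
    ra = np.radians(ra_deg)
    dec = np.radians(dec_deg)
    return np.column_stack([np.cos(dec) * np.cos(ra),
                            np.cos(dec) * np.sin(ra),
                            np.sin(dec)])

def nadaraya_watson(xyz_data, values, xyz_query, bandwidth_deg):
    """Kernel-weighted local mean of `values` at each query position.

    Assumption: Gaussian kernel in great-circle separation.
    """
    # Angular separation between every query point and every data point.
    cos_sep = np.clip(xyz_query @ xyz_data.T, -1.0, 1.0)
    sep = np.arccos(cos_sep)
    h = np.radians(bandwidth_deg)
    weights = np.exp(-0.5 * (sep / h) ** 2)
    # Weighted mean: sum_i K_i * y_i / sum_i K_i for each query point.
    return (weights @ values) / weights.sum(axis=1)
```

This builds the full query-by-data separation matrix, so for a large all-sky catalogue it would need chunking or a neighbour search. It also returns only the local mean, which is exactly why a separate significance test is still needed.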
Posted by Cameron, Ewan at December 12. 2014
Hi Krzysztof,
My impression is that most p-value style tests for homogeneity of a Poisson process involve the computation of a summary statistic based on distances between points. I doubt there are asymptotic p-value formulae for general domains, but I could imagine there are asymptotic results for the sphere, so if your data is really ‘all sky’ you might be in luck. Otherwise you will just need to use Monte Carlo simulation to estimate the p-value for your dataset.
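As a rough illustration of the Monte Carlo route, here is a sketch adapted to the original question (measurement values at fixed source positions) rather than to the point process itself. It swaps in a permutation null: shuffling the values among the fixed positions simulates "no spatial structure in the mean" without touching the nonuniform source distribution. The between-cell statistic, the function names, and the assumption that each source already has an integer sky-cell label (`cell_ids`, e.g. from some pixelization) are all hypothetical choices for this example.

```python
import numpy as np

def between_cell_statistic(values, cell_ids, n_cells):
    """Count-weighted sum of squared deviations of cell means from the global mean."""
    counts = np.bincount(cell_ids, minlength=n_cells)
    sums = np.bincount(cell_ids, weights=values, minlength=n_cells)
    occupied = counts > 0  # ignore empty cells
    cell_means = sums[occupied] / counts[occupied]
    return np.sum(counts[occupied] * (cell_means - values.mean()) ** 2)

def permutation_p_value(values, cell_ids, n_cells, n_sim=999, seed=None):
    """Monte Carlo p-value under the null that values are exchangeable across positions."""
    rng = np.random.default_rng(seed)
    observed = between_cell_statistic(values, cell_ids, n_cells)
    exceed = 0
    for _ in range(n_sim):
        # Shuffle values among the fixed positions to simulate the null.
        shuffled = rng.permutation(values)
        if between_cell_statistic(shuffled, cell_ids, n_cells) >= observed:
            exceed += 1
    # The +1 terms count the observed data as one draw from the null.
    return (exceed + 1) / (n_sim + 1)
```

The smallest p-value this can return is 1/(n_sim + 1), so `n_sim` sets the resolution of the test. A false negative rate could be estimated the same way, by injecting a synthetic deviation of known size into the values and counting how often the test fails to flag it.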
Another way to characterise your dataset might be to use INLA to fit a Gaussian process on the sphere as the intensity of the spatial point process in the likelihood function. There are various examples of this in the INLA handbook. http://www.r-inla.org
http://www.math.ntnu.no/preprint/statistics/2010/S6-2010.pdf
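For reference, one common way to write down this kind of model is as a log-Gaussian Cox process. The notation below (mean level $\beta_0$, covariance function $k$, domain $\Omega$) is assumed for illustration and is not taken from the INLA documentation.

```latex
\begin{align}
  Z(s) &\sim \mathcal{GP}\bigl(0,\, k(s, s')\bigr), \\
  \log \lambda(s) &= \beta_0 + Z(s), \\
  \log L(\lambda \mid s_1, \dots, s_n) &= -\int_{\Omega} \lambda(s)\, \mathrm{d}s
      + \sum_{i=1}^{n} \log \lambda(s_i) \quad \text{(up to an additive constant)}.
\end{align}
```

Here $\lambda(s)$ is the intensity of sources at sky position $s$ and $s_1, \dots, s_n$ are the observed positions. Note that this describes the density of points; for measurement values attached to points, the analogous setup would presumably place the Gaussian process on the mean of the measurements instead.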
We use INLA routinely for our spatio-temporal malaria maps at work and I can swear by its speed and ease of use!
cheers,
Ewan.
Posted by Cameron, Ewan at December 12. 2014
- I should also mention that kriging has gone way out of favour in geospatial statistics these days: it’s not necessarily intrinsically bad, it’s just very easy to do wrong or misuse.
Posted by Krzysztof Findeisen at December 15. 2014
Thanks for the quick reply. It sounds like Monte Carlo simulations are the best route forward for me. I’ve tried doing Gaussian process modeling before (albeit for a very different problem), and I wasn’t very impressed with the result: even when I drew simulated data from the same type of Gaussian process as the fitted model, the results tended to be quite inaccurate.