Hypothesis testing – data on a sphere

 Posted by Appleby, Stephen at November 27. 2013
Hello everybody,
I have a problem regarding hypothesis testing for data distributed on a sphere. The question is basically which of two quantities is better for detecting a signal. I will describe the problem briefly below; if someone is willing to offer an opinion or point me to some literature on the topic, I would be very grateful. If anyone requires more information than given below, I would be happy to expand. Thank you!
We have a set of residuals indexed by i,
q_i = (data_i - best-fit theoretical model)/error_i
The data are distributed on the two-sphere, and the best-fit model is independent of position (theta,phi). We construct the so-called Q function on the sphere, which is the sum of the residuals weighted by a Gaussian window function,
Q(theta,phi) = Sum_{i} q_{i} W(theta,phi,theta_{i},phi_{i})
where W is a weight function,
W \propto exp[-L(theta,phi,theta_i,phi_i)^2/const]
and here L is the shortest distance on the sphere between the two points (theta,phi) and (theta_i,phi_i). We use this Q function to construct a statistical measure: a large Q value indicates a significant deviation from the best-fit model at a particular point on the sky. We create many mock simulations to test our null hypothesis, which is that the residuals are drawn from a Gaussian distribution, independent of position. The signal, if present, is expected to have a non-trivial angular dependence.
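To make the construction concrete, here is a minimal sketch in Python of what I mean; the names and the smoothing scale sigma are placeholders of mine, not our actual pipeline.

import numpy as np

def great_circle_distance(theta1, phi1, theta2, phi2):
    # Shortest angular distance on the sphere between (theta1, phi1) and (theta2, phi2),
    # with theta the polar angle (colatitude) and phi the azimuth, both in radians.
    cos_L = (np.cos(theta1) * np.cos(theta2)
             + np.sin(theta1) * np.sin(theta2) * np.cos(phi1 - phi2))
    return np.arccos(np.clip(cos_L, -1.0, 1.0))

def Q_statistic(theta, phi, q, theta_i, phi_i, sigma=0.2):
    # Sum of the residuals q_i weighted by a Gaussian window in angular distance.
    # The smoothing scale sigma (in radians) is an arbitrary illustrative choice.
    L = great_circle_distance(theta, phi, theta_i, phi_i)
    W = np.exp(-L**2 / (2.0 * sigma**2))
    return np.sum(q * W)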
In some works, a modified function is used,
\bar{Q}(theta,phi) = Sum_{i} q_{i} W(theta,phi,theta_{i},phi_{i})/Sum_j W(theta,phi,theta_{j},phi_{j})
That is, they take the original Q function and normalise it by the sum of the weights (which is essentially the effective number of points in a given patch on the sphere).
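In the same sketch notation as above (reusing the helper functions, with the same caveat that the smoothing scale is an arbitrary choice), the normalised statistic would be:

def Q_bar_statistic(theta, phi, q, theta_i, phi_i, sigma=0.2):
    # As Q_statistic, but divided by the sum of the weights at (theta, phi).
    L = great_circle_distance(theta, phi, theta_i, phi_i)
    W = np.exp(-L**2 / (2.0 * sigma**2))
    return np.sum(q * W) / np.sum(W)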
The residuals q_i contain a component of Gaussian noise and (perhaps) a signal. My argument is that if you normalise the Q function at each point in this way, you will essentially wash out the signal and decrease the significance of any detection. In the original method the signal is cumulative with the number of points, whereas with \bar{Q} you gain no increase in signal if you have more points in a direction where the signal is large.
One might be concerned that if you have more points in a region where there is no signal, the original method will artificially select this region as containing a signal. However, I do not believe this to be the case. In such regions the noise will have a larger variance, but its mean will still be zero. Moreover, since we are performing a hypothesis test by creating mock data with the same distribution of points on the sky, a high Q value in such a region will not be assigned a high statistical significance.
We have shown that the original method is more sensitive to an underlying signal for some particular mock data sets; however, we were wondering whether there is a general argument that the original method will always work better than normalising the residuals as in \bar{Q}. I guess the underlying question here is whether normalising the weighted residuals is beneficial or not.
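For concreteness, the kind of mock-based comparison we have in mind looks roughly like the following, reusing the sketches above; the number of points, the injected dipole-like signal and every other number here are illustrative choices of mine, not our actual data or analysis.

rng = np.random.default_rng(0)

n_points, n_mocks = 500, 200
theta_i = np.arccos(rng.uniform(-1.0, 1.0, n_points))   # isotropic point positions
phi_i = rng.uniform(0.0, 2.0 * np.pi, n_points)

# coarse grid of sky directions over which the maximum of each statistic is taken
grid = [(t, p) for t in np.linspace(0.1, np.pi - 0.1, 10)
               for p in np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)]

def sky_max(statistic, q):
    # Maximum |statistic| over the grid of sky directions for one realisation q.
    return max(abs(statistic(t, p, q, theta_i, phi_i)) for t, p in grid)

# null distributions of the sky maximum, from mocks of pure Gaussian noise
null_Q = np.array([sky_max(Q_statistic, rng.standard_normal(n_points))
                   for _ in range(n_mocks)])
null_Qbar = np.array([sky_max(Q_bar_statistic, rng.standard_normal(n_points))
                      for _ in range(n_mocks)])

# one 'observed' realisation: Gaussian noise plus an injected dipole-like signal
q_obs = rng.standard_normal(n_points) + 0.3 * np.cos(theta_i)

p_Q = np.mean(null_Q >= sky_max(Q_statistic, q_obs))
p_Qbar = np.mean(null_Qbar >= sky_max(Q_bar_statistic, q_obs))
print(f"p-value using Q: {p_Q:.3f},  using Q_bar: {p_Qbar:.3f}")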
If anyone has any thoughts I would be much obliged!
Best,
Stephen

 Posted by Cameron, Ewan at December 04. 2013

Hi Stephen,

I would expect that \bar{Q} is a noisier statistic than Q, since it carries the variance of both Q and of the normalisation of the weights, and for this reason it is less powerful for testing the null hypothesis. The first analogy that occurred to me was the standard importance sampling estimate, which can be computed with or without normalisation of the weights. Although normalisation intuitively seems like it should be better, it is easy to show that its variance is typically greater, as in e.g. Hesterberg 1992/4/5 (“Weighted Average Importance Sampling and Defensive …”).
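If it helps, a quick toy check of that point in Python is below; the target p, proposal g and integrand f are arbitrary choices of mine, not anything taken from those papers.

import numpy as np

rng = np.random.default_rng(1)

def norm_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def one_pair(n=2000):
    # One unnormalised and one self-normalised importance sampling estimate of
    # E_p[f(X)] from the same draw, with p = N(0,1), g = N(0,1.5^2), f(x) = x^2.
    x = rng.normal(0.0, 1.5, n)                        # draws from the proposal g
    w = norm_pdf(x, 0.0, 1.0) / norm_pdf(x, 0.0, 1.5)  # importance weights p/g
    fx = x ** 2                                        # true value is E_p[f] = 1
    return np.mean(w * fx), np.sum(w * fx) / np.sum(w)

pairs = np.array([one_pair() for _ in range(3000)])
print("empirical variance, unnormalised:   ", pairs[:, 0].var())
print("empirical variance, self-normalised:", pairs[:, 1].var())

In this particular toy setup the self-normalised estimate comes out roughly a factor of two noisier in variance, though the comparison does depend on the integrand and the proposal.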

Hope that steers you in a useful direction,

cheers,

Ewan.

 Posted by Appleby, Stephen at December 09. 2013

Hello Ewan,

Thank you for pointing me in the right direction! I will take a look at the suggested paper.

Best,

Stephen