

I recently stumbled upon the Efron and Petrosian 1992 paper (ApJ 399, 345), which describes a method to obtain unbiased correlations from a data set that is truncated by observational biases (e.g. flux-limited samples).
The method is not entirely clear to me, as it looks like they are able to correct a distribution using information they don’t have, and I could not figure out what the underlying statistical theory is. Is there anyone familiar with the paper who could give me an idea of whether this is a robust approach that is widely used in astronomy, or whether it has some limitations?
It seems it has been applied to GRB correlations as well.
Thanks,
Maurizio

Hi Maurizio,
I’ve not used this particular test before, but the basic idea is familiar to me; I’m also basing my reply on a simplified description of the test in Chapter 1 of Jun S Liu’s (excellent) book “Monte Carlo Strategies in Scientific Computing”.
So the basic idea behind the permutation test proposed by Efron & Petrosian is to test the null hypothesis that the two random variables are independent by comparing a given test statistic of the observed data against the distribution of that test statistic expected in the long run under the null hypothesis. It’s not a Bayesian analysis, so we don’t begin with any strong assumptions about the distributional shapes of the two random variables; instead we use our observed data, under possible permutations of the x’s with the y’s, as a proxy for this null distribution. If there were no truncation of the data points, then we could use the standard permutation test and compute the null distribution of the test statistic from simple (uniform) random permutations of the x’s with the y’s (this is the independence assumption of the null). But since our observed data are truncated, we must make sure our permutations also satisfy the same truncation conditions as the observed data, whilst maintaining the uniformity of the permutation process: this part is the ‘trick’ of the Efron & Petrosian algorithm. After defining this permutation process and choosing the optimal test statistic (see the discussion in Efron & Petrosian), the final details are just those of standard frequentist hypothesis testing with p-values.
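To make the idea concrete, here is a minimal Python sketch of a permutation test with and without truncation. Several things in it are my own assumptions for illustration rather than anything taken from the paper: the function names, the per-point limits `lower` and `upper` (each y value must lie inside the limits attached to the x it is paired with), the use of the absolute Pearson correlation as a stand-in test statistic (Efron & Petrosian discuss which statistic to prefer), and the swap-based random walk used to draw permutations that respect the limits. The walk only accepts swaps that keep every point inside its limits, so it never leaves the set of admissible permutations; I have not checked that it can reach every admissible permutation for arbitrary limits, so treat it as a sketch, not a reference implementation.

```python
import numpy as np


def admissible_swap_chain(y, lower, upper, n_steps, rng):
    """Random walk over admissible re-pairings of y with the x slots.

    A proposed swap of y[i] and y[j] is accepted only if each value lands
    inside the limits of its new slot. The proposal is symmetric, so the
    uniform distribution over admissible permutations is stationary.
    """
    y = y.copy()
    n = len(y)
    for _ in range(n_steps):
        i, j = rng.integers(0, n, size=2)
        if lower[i] <= y[j] <= upper[i] and lower[j] <= y[i] <= upper[j]:
            y[i], y[j] = y[j], y[i]
    return y


def permutation_pvalue(x, y, lower=None, upper=None,
                       n_perm=2000, steps_per_draw=None, seed=0):
    """Permutation p-value for independence of x and y.

    Without limits this is the standard permutation test (uniform random
    shuffles of y). With limits, shuffles are drawn by the swap chain so
    that every permuted data set obeys the same truncation as the data.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Stand-in test statistic (an assumption, not the paper's choice).
    def statistic(yy):
        return abs(np.corrcoef(x, yy)[0, 1])

    observed = statistic(y)
    truncated = lower is not None or upper is not None
    if truncated:
        lower = np.full(n, -np.inf) if lower is None else np.asarray(lower, float)
        upper = np.full(n, np.inf) if upper is None else np.asarray(upper, float)
        if steps_per_draw is None:
            steps_per_draw = 10 * n  # heuristic thinning between draws

    current = y.copy()
    exceed = 0
    for _ in range(n_perm):
        if truncated:
            current = admissible_swap_chain(current, lower, upper,
                                            steps_per_draw, rng)
        else:
            current = rng.permutation(y)
        if statistic(current) >= observed:
            exceed += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)
```

For a flux-limited sample one might call `permutation_pvalue(x, y, lower=y_min)`, where `y_min` (a hypothetical name) holds the smallest y that could have been observed at each x. A small p-value then suggests that the observed correlation is not explained by the truncation alone, under the assumptions above.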
Hope that helps,
Ewan.