Comparison of two datasets put through same selection process – Astrostatistics and Astroinformatics Portal

Posted by Reid, Neill at January 09, 2013

I’d like advice on the appropriate statistical test to use for a comparison that I’m working on. I’m reviewing HST proposal success rates for two groups. For most groups, I have two numbers: the number of accepted proposals and the number of rejected proposals, giving the success rate for each group. For a subset of the groups, I have three numbers: accepted, rejected and triaged. I’d like to use a non-parametric test to determine the statistical significance of any differences between the success rates/triage rates for the two groups. If I had a rank-ordered list, I could use a K-S test, or even a Mann-Whitney U test, but I don’t have a consistent rank ordering of the full dataset.

Any suggestions would be appreciated.

thanks

Neill Reid

Posted by Cameron, Ewan at January 15, 2013

Hi Neill,

This is one of those (rare) cases where the Bayesian model selection framework is surprisingly simple to implement. And, it solves the exact problem you’re interested in (rather than some subtly different Neyman-Pearson formulation thereof): that is, without specifying a particular null success rate it asks “given the available data, are the intrinsic success rates of these two populations more likely to be the same or different?”. Lee & Pope give a rather comprehensive discussion: http://www.socsci.uci.edu/~mdlee/lee_pope_rate.pdf .
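As a rough illustration of the kind of calculation involved, here is a minimal Python sketch of a Bayes factor comparing "both groups share one success rate" against "each group has its own rate". It assumes binomial counts and independent Beta(a, b) priors (uniform by default); those prior choices, the function names, and the example counts are assumptions for illustration and are not necessarily the exact formulation used in the linked paper.

```python
import math


def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_bf_same_vs_different(k1, n1, k2, n2, a=1.0, b=1.0):
    """Log Bayes factor for 'same rate' over 'different rates'.

    k1, k2: accepted proposals in each group; n1, n2: total proposals.
    a, b: Beta prior parameters (assumed here; a = b = 1 is a uniform prior).
    The binomial coefficients are identical under both models and cancel.
    """
    # Marginal likelihood when a single rate generates both groups' counts.
    log_m_same = log_beta(k1 + k2 + a, (n1 - k1) + (n2 - k2) + b) - log_beta(a, b)
    # Marginal likelihood when each group has its own independent rate.
    log_m_diff = (
        log_beta(k1 + a, n1 - k1 + b)
        + log_beta(k2 + a, n2 - k2 + b)
        - 2.0 * log_beta(a, b)
    )
    return log_m_same - log_m_diff


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration (not real HST data).
    log_bf = log_bf_same_vs_different(k1=30, n1=120, k2=45, n2=150)
    print("Bayes factor (same vs different):", math.exp(log_bf))
```

A Bayes factor above 1 favours a common success rate and a value below 1 favours different rates; how strongly depends on the prior chosen. The three-outcome case (accepted, rejected, triaged) could be handled along the same lines with multinomial counts and Dirichlet priors, which is not shown here.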

cheers, Ewan Cameron