Best method to quantify the agreement between simulated and observed 1D and 2D distributions

 Posted by Pierbattista, Marco at November 17. 2015

Dear experts,

I need to quantify the agreement between simulated and observed 1D and 2D histograms in order to select the model distribution (1D or 2D) that best describes the observed distribution. For the 1D histograms I know that I can estimate the probability that the two 1D distributions are drawn from the same parent distribution by applying the 2-sample KS test. But this does not appear to hold for the 2D distributions, as suggested in the article https://asaip.psu.edu/Articles/beware-the-kolmogorov-smirnov-test (basically because there are infinitely many ways to sort a 2D histogram). Eric Feigelson explained to me that the 2-sample KS statistic gives a reasonable estimate of the “distance” between the two 2D samples, but that it cannot be used to give the probability that the samples are drawn from the same parent population without bootstrap resampling to find its distribution for the datasets at hand.
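To make the resampling idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the procedure from the linked article: the statistic is a simplified quadrant-based 2D distance (in the spirit of Fasano-Franceschini), the null distribution is built by permuting the pooled sample rather than by bootstrapping, and the names `ks2d_statistic` and `permutation_pvalue` are hypothetical.

```python
import numpy as np

def ks2d_statistic(a, b):
    """Largest difference in quadrant fractions between two 2D samples.

    a, b: arrays of shape (n, 2) and (m, 2). Every point of either sample
    is used in turn as the origin of four open quadrants.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = 0.0
    for x0, y0 in np.vstack([a, b]):
        for sx in (1, -1):
            for sy in (1, -1):
                fa = np.mean((sx * (a[:, 0] - x0) > 0) & (sy * (a[:, 1] - y0) > 0))
                fb = np.mean((sx * (b[:, 0] - x0) > 0) & (sy * (b[:, 1] - y0) > 0))
                d = max(d, abs(fa - fb))
    return d

def permutation_pvalue(a, b, n_perm=200, seed=0):
    """Calibrate the statistic by reshuffling the pooled sample (assumed scheme)."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, float), np.asarray(b, float)
    observed = ks2d_statistic(a, b)
    pooled = np.vstack([a, b])
    n = len(a)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        d = ks2d_statistic(pooled[idx[:n]], pooled[idx[n:]])
        count += int(d >= observed)
    # The +1 terms keep the estimated p-value away from exactly zero.
    return observed, (count + 1) / (n_perm + 1)

# Synthetic example data, invented for illustration only.
rng = np.random.default_rng(1)
obs = rng.normal(0.0, 1.0, size=(40, 2))
sim = rng.normal(0.5, 1.0, size=(40, 2))
distance, pvalue = permutation_pvalue(obs, sim)
print(f"distance = {distance:.3f}, permutation p-value = {pvalue:.3f}")
```

The point of the resampling step is that the p-value comes from the data themselves, so no tabulated KS distribution is assumed for the 2D case.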

Now, since I think that I can safely use the 2KS test to compare 1D distributions and to give the probability that they are drawn from the same parent distribution, can I use the 2KS statistic to compare my 2D distributions and identify the model that best explains the observations just by selecting the model (out of 4) that shows the smallest “distance” to the data (since I cannot use the probability here)?
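The selection rule described here can be sketched as follows. This is only an illustration: it uses the 1D case with `scipy.stats.ks_2samp`, the four models and their data are synthetic, and `rank_models` is a hypothetical helper. For 2D data the same pattern would apply with a 2D distance swapped in.

```python
import numpy as np
from scipy.stats import ks_2samp

def rank_models(observed, simulations):
    """Return (name, KS distance) pairs sorted from closest to farthest."""
    distances = {
        name: ks_2samp(observed, sim).statistic
        for name, sim in simulations.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])

# Synthetic stand-ins for the observations and the four model simulations.
rng = np.random.default_rng(0)
observed = rng.normal(0.0, 1.0, 500)
simulations = {
    "model_A": rng.normal(0.0, 1.0, 2000),
    "model_B": rng.normal(0.5, 1.0, 2000),
    "model_C": rng.normal(0.0, 2.0, 2000),
    "model_D": rng.uniform(-3.0, 3.0, 2000),
}

for name, distance in rank_models(observed, simulations):
    print(f"{name}: D = {distance:.3f}")
```

A ranking like this orders the models but does not by itself say whether the gap between the best and the second-best model is larger than what sampling noise would produce.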

Do you have any other statistical approach to suggest, for both 1D and 2D distributions, that I can use instead of the 2KS test and statistic?

Examples of 1D and 2D distributions are attached. Red points or lines are the observations; grey histograms are the simulations. Each panel represents a model, and I have to decide which one best explains the observations.

Thank you very much,

Marco

 Posted by jhilbe at November 23. 2015

I recommend the Epps-Singleton two-sample test. It’s used in econometrics and I’ve seen great applications of it in ecology. It’s definitely preferable to the Kolmogorov-Smirnov (KS) two-sample test. It’s discussed in:

Gibbons, J.D. and S. Chakraborti (2011), Nonparametric Statistical Inference, 5th edition, Chapman & Hall/CRC.

An excellent description of the test and its history and comparisons to other similar tests is given in:

Goerg, S.J. and J. Kaiser (2009), Nonparametric testing of distributions: the Epps-Singleton two-sample test using the empirical characteristic function, Stata Journal 9, 454-465. There are a number of excellent references at the end of the article.

The original article is: Epps, T.W. and K.J. Singleton (1986), An omnibus test for the two-sample problem using the empirical characteristic function, Journal of Statistical Computation and Simulation 26:177-203.

Epps wrote a very nice description of characteristic functions, which I recommend. It’s easy to obtain at almost any university library. See Epps 1993, The American Statistician 47:33-38.
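For readers working in Python, a minimal sketch of running the test is given below. It assumes SciPy is installed and uses `scipy.stats.epps_singleton_2samp`, which works on one-dimensional samples; the data are synthetic and only serve as placeholders for an observed and a simulated sample. The KS test is run alongside for comparison.

```python
import numpy as np
from scipy.stats import epps_singleton_2samp, ks_2samp

# Synthetic placeholder samples: same mean, different spread.
rng = np.random.default_rng(42)
observed = rng.normal(0.0, 1.0, 300)
simulated = rng.normal(0.0, 1.5, 300)

# Epps-Singleton compares the empirical characteristic functions
# of the two samples at a small set of points.
es = epps_singleton_2samp(observed, simulated)

# Two-sample Kolmogorov-Smirnov test on the same data, for reference.
ks = ks_2samp(observed, simulated)

print(f"Epps-Singleton: W = {es.statistic:.3f}, p = {es.pvalue:.3g}")
print(f"KS:             D = {ks.statistic:.3f}, p = {ks.pvalue:.3g}")
```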

I hope that this helps.  Joseph Hilbe