How to perform a bootstrap test to compare the means of two samples?

I have also looked at the Wilcoxon rank-sum test, but it does not give very reasonable results because the distribution is very heavily skewed (e.g. the 75th percentile equals the 95th percentile). For this reason I would like to explore the bootstrapped t-test further.

So my questions are:

Is this an appropriate methodology?
Is it appropriate to use the SE of observed data when I know it is heavily skewed?

asked Apr 4, 2014 at 13:33 by CatsLoveJazz

How large are the samples?

Commented Apr 5, 2014 at 9:46

@Michael Mayer Around 800

Commented Apr 7, 2014 at 8:39

See also stats.stackexchange.com/questions/189587

Commented Mar 9, 2017 at 10:15

1 Answer


I would just do a regular bootstrap test:

Compute the t-statistic in your data and store it.
Change the data so that the null hypothesis is true. In this case, subtract the group 1 mean from every observation in group 1 and add the overall mean, then do the same for group 2. That way the mean of each group equals the overall mean.
Take bootstrap samples from this dataset, probably on the order of 20,000.
Compute the t-statistic in each of these bootstrap samples. The distribution of these t-statistics is the bootstrap estimate of the sampling distribution of the t-statistic in your skewed data if the null hypothesis is true.
The proportion of bootstrap t-statistics that are larger than or equal to your observed t-statistic is your estimate of the $p$-value. You can do a bit better by taking (the number of bootstrap t-statistics that are larger than or equal to the observed t-statistic $+ 1$) divided by (the number of bootstrap samples $+ 1$). However, the difference is going to be small when the number of bootstrap samples is large.
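The steps above can be sketched in Python with NumPy. This is only a minimal illustration under several assumptions that the answer does not spell out: the function names `welch_t` and `bootstrap_t_test` are made up for this example, the statistic is the unequal-variance (Welch) form of the t-statistic, resampling is done with replacement within each group separately, and the `two_sided` option (comparing absolute values) is an addition; with `two_sided=False` the count matches the "larger than or equal to" rule described above.

```python
import numpy as np


def welch_t(x, y):
    """t-statistic for a difference in means (Welch form, an assumption here)."""
    se2 = x.var(ddof=1) / x.size + y.var(ddof=1) / y.size
    return (x.mean() - y.mean()) / np.sqrt(se2)


def bootstrap_t_test(x, y, n_boot=20_000, two_sided=True, seed=None):
    """Bootstrap test of equal means; returns (observed t, p-value estimate)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)

    # Step 1: t-statistic in the observed data.
    t_obs = welch_t(x, y)

    # Step 2: impose the null hypothesis by shifting each group
    # so that its mean equals the overall (pooled) mean.
    overall_mean = np.concatenate([x, y]).mean()
    x0 = x - x.mean() + overall_mean
    y0 = y - y.mean() + overall_mean

    # Steps 3 and 4: resample each shifted group with replacement
    # (assumption: within-group resampling) and recompute the statistic.
    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.choice(x0, size=x0.size, replace=True)
        yb = rng.choice(y0, size=y0.size, replace=True)
        t_boot[b] = welch_t(xb, yb)

    # Step 5: p-value estimate with the "+1" correction.
    if two_sided:
        n_extreme = np.sum(np.abs(t_boot) >= abs(t_obs))
    else:
        n_extreme = np.sum(t_boot >= t_obs)
    p_value = (n_extreme + 1) / (n_boot + 1)
    return t_obs, p_value
```

A call such as `t, p = bootstrap_t_test(group1, group2, seed=0)` returns the observed statistic and the estimated $p$-value. If a resample happens to have zero variance in both groups (possible with very heavy ties in small samples), the statistic is undefined for that resample, so this sketch is best suited to samples of a few hundred observations or more.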

You can read more on that in:

Chapter 4 of A.C. Davison and D.V. Hinkley (1997) Bootstrap Methods and their Application. Cambridge: Cambridge University Press.
Chapter 16 of Bradley Efron and Robert J. Tibshirani (1993) An Introduction to the Bootstrap. Boca Raton: Chapman & Hall/CRC.
Wikipedia entry on bootstrap hypothesis testing.

answered Apr 4, 2014 at 15:08 by Maarten Buis

This is essentially what I'm doing, but looking at the proportion of times the original/observed t-statistic is >= the bootstrapped t-statistic. Is it OK to do a t-test on heavily skewed data in the first instance, though? This is one of the reasons why I want to bootstrap.

Commented Apr 4, 2014 at 15:25

Technically, for the bootstrap test you just need a test statistic, so that is not a problem. Substantively, a t-test compares means, and in skewed data medians are often more meaningful than means. So a test comparing medians instead of means may make more sense. However, that depends on your null hypothesis, which is your choice and your choice alone.

Commented Apr 4, 2014 at 15:35

OK, thanks. It is the mean we want to test, as all our other output has been in this form.