ONESAMPLERAN
This macro is designed to test whether or not the mean of a single column of data is equal to a hypothesised value specified by the user.
RUNNING THE MACRO
Calling statement
onesampleran c1 k1 ;
nran k1 (999) ;
sums c1.
Input
C1 A single column, containing only numerical values. Missing values are allowed.
K1 A single constant, containing the hypothesised mean value.
Subcommands
nran Number of randomizations used.
sums Specify a column in which to store the sum of values for each randomized sample.
Output
- Basic statistics: Sample size, sample mean, sum of sample values, and sample standard deviation.
- Hypothesised mean value.
- Resampling details: Number of randomizations, one-sided and two-sided randomization p-values.
The two-sided randomization p-value is double the smaller of the one-sided randomization p-values.
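The p-value rule above can be sketched in Python (not the macro's own language). This is only an illustration: the function name is hypothetical, and it assumes that the observed sum is counted as one extra arrangement, which is suggested by the value 0.0020 = 1/500 for 499 randomizations in the worked example but is not stated in this document. Capping the two-sided value at 1 is also an assumption.

```python
def randomization_p_values(observed_sum, randomized_sums):
    """Return (p_lower, p_upper, p_two_sided) from a list of randomized sums.

    Assumption: the observed sum counts as one additional arrangement,
    so the denominator is len(randomized_sums) + 1.
    """
    total = len(randomized_sums) + 1
    # Arrangements at least as extreme as the observed sum in each tail.
    lower = 1 + sum(1 for s in randomized_sums if s <= observed_sum)
    upper = 1 + sum(1 for s in randomized_sums if s >= observed_sum)
    p_lower = lower / total
    p_upper = upper / total
    # Two-sided value: double the smaller one-sided value (capped at 1, an assumption).
    p_two_sided = min(1.0, 2 * min(p_lower, p_upper))
    return p_lower, p_upper, p_two_sided
```

For example, if none of 499 randomized sums is as low as the observed sum, the function returns 0.002, 1.0 and 0.004.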
Speed of macro : FAST
TECHNICAL DETAILS
Null hypothesis: The population mean is equal to the hypothesised mean value.
Test-statistic : We create a modified dataset by deducting the hypothesised mean from each data value.
The appropriate test-statistic is the sum of these modified values.
Randomization procedure : We randomize the allocation of signs to the absolute values within the modified dataset, since under the null hypothesis there should be an equal probability that any data point will have been allocated a negative or positive value once the hypothesised mean is deducted from it.
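The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the macro's code: the function name, the seed argument, and the use of a fair coin for each sign are choices made for the sketch, and missing values are not handled.

```python
import random


def one_sample_randomization(data, mu0, nran=999, seed=None):
    """Return the observed sum of (x - mu0) and a list of nran randomized sums."""
    rng = random.Random(seed)
    # Modified dataset: deduct the hypothesised mean from each value.
    modified = [x - mu0 for x in data]
    observed = sum(modified)
    # Under the null hypothesis each absolute value is equally likely
    # to carry a positive or a negative sign.
    magnitudes = [abs(d) for d in modified]
    sums = []
    for _ in range(nran):
        randomized = [m if rng.random() < 0.5 else -m for m in magnitudes]
        sums.append(sum(randomized))
    return observed, sums
```

The returned list plays the role of the column stored by the sums subcommand; the one-sided p-values then come from comparing the observed sum with these randomized sums.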
ALTERNATIVE PROCEDURES
Standard procedures
onet C1;
test k1.
Performs a one-sample t-test for the mean of the data in c1 being equal to the hypothesised mean value k1, in the situation in which the population variance is unknown.
onez C1;
sigma k1;
test k2.
Performs a one-sample normal test for the mean of the data in c1 being equal to the hypothesised mean value k2, in the situation in which the standard deviation is known to be equal to k1.
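For readers who want to see what the normal test computes, here is a small Python sketch using only the standard library. The function name is hypothetical, and the two-sided p-value shown is the usual textbook form; it is not taken from Minitab's implementation.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z(data, mu0, sigma):
    """Return (z, two-sided p-value) for H0: mean == mu0 with known sigma."""
    n = len(data)
    # Standardise the sample mean using the known standard deviation.
    z = (mean(data) - mu0) / (sigma / sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_sided
```

Here sigma corresponds to k1 and mu0 to k2 in the commands above.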
REFERENCES
Manly, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 6).
WORKED EXAMPLE FOR ONESAMPLERAN
Name of dataset
DARWIN
Description
The data refer to the heights of 15 self-fertilised offspring from the plant Zea mays. The data were originally collected by Charles Darwin, were analysed by R.A. Fisher in the 1930s (see Fisher, 1935), and are analysed by Manly (1997) using a one-sample randomization test.
Our source
Manly, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
Fisher, R.A. (1935) The design of experiments, Oliver & Boyd, Edinburgh.
Data
Number of observations = 15
Number of variables = 1
43 67 64 64 51 53 53 26 36 48 34 48 6 28 48
Worksheet
C1 Data
Aims of analysis
To test whether the population mean is equal to a hypothesised value of 56.
Minitab output : standard procedure
MTB > Retrieve "N:\resampling\Examples\Darwin.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Darwin.MTW
# Worksheet was saved on 27/07/01 14:03:05
Results for: Darwin.MTW
MTB > onet c1 ;
SUBC> test 56.
One-Sample T: Self
Test of mu = 56 vs mu not = 56
Variable          N      Mean     StDev   SE Mean
Self             15     44.60     16.41      4.24

Variable          95.0% CI            T      P
Self        (  35.51,  53.69)     -2.69  0.018
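As a cross-check on the output above, the summary statistics and the t-statistic can be recomputed from the data with a few lines of Python. This is only a sketch: it uses the standard library, so it stops at the t-statistic and does not compute the p-value or the confidence interval.

```python
from math import sqrt
from statistics import mean, stdev

# Darwin data from the Data section, and the hypothesised mean.
data = [43, 67, 64, 64, 51, 53, 53, 26, 36, 48, 34, 48, 6, 28, 48]
mu0 = 56

n = len(data)
xbar = mean(data)
s = stdev(data)        # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)       # standard error of the mean
t = (xbar - mu0) / se  # one-sample t-statistic

print(f"N={n} Mean={xbar:.2f} StDev={s:.2f} SE Mean={se:.2f} T={t:.2f}")
```

The printed values should agree with the N, Mean, StDev, SE Mean and T entries reported by Minitab.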
Minitab output : randomization procedure
MTB > % N:\resampling\library\onesampleran c1 56 ;
SUBC> nran 499 ;
SUBC> sums c3.
Executing from file: N:\resampling\library\onesampleran.MAC
One-sample randomization test
Data Display (WRITE)
Number of observations 15
Observed mean value 44.60
Hypothesised mean value 56.00
Observed sum of values 669.0
Observed standard deviation 16.41
Number of randomization samples 499
P-value for one-sided test with alternative: true mean < hypothesised mean 0.0020
P-value for one-sided test with alternative: true mean > hypothesised mean 1.0000
P-value for two-sided test 0.0040
Modified worksheet
C3 A column containing 499 sums of values, one for each randomized dataset
Discussion
The standard (two-sided) p-value is 0.018. Manly obtains a randomization p-value of 0.016, by enumeration of the full randomization distribution. Our two-sided p-value of 0.004 is substantially smaller than either of these values, but this may just be a consequence of the relatively small number of randomizations used.
The conclusion is the same in all cases: there is strong evidence that the population mean is not equal to the hypothesised mean. The one-sided p-values (and the sample mean) support the alternative hypothesis that the population mean is lower than the hypothesised mean.
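With only 15 observations there are 2^15 = 32768 possible sign allocations, so the full randomization distribution mentioned in the Discussion can be enumerated directly. The Python sketch below does this for the Darwin data; it is an illustration, not Manly's code or the macro's, and capping the two-sided value at 1 is an assumption.

```python
from itertools import product

# Darwin data from the Data section, and the hypothesised mean.
data = [43, 67, 64, 64, 51, 53, 53, 26, 36, 48, 34, 48, 6, 28, 48]
mu0 = 56

modified = [x - mu0 for x in data]
observed = sum(modified)
magnitudes = [abs(d) for d in modified]

lower = upper = total = 0
# Visit every allocation of signs to the absolute values.
for signs in product((1, -1), repeat=len(magnitudes)):
    s = sum(sign * m for sign, m in zip(signs, magnitudes))
    total += 1
    lower += s <= observed  # arrangements with a sum at least as low
    upper += s >= observed  # arrangements with a sum at least as high

p_lower = lower / total
p_upper = upper / total
p_two_sided = min(1.0, 2 * min(p_lower, p_upper))
print(total, lower, round(p_two_sided, 4))
```

For these data the enumeration gives a two-sided value of about 0.016, in line with the figure attributed to Manly above.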