ONESAMPLERAN

This macro is designed to test whether or not the mean of a single column of data is equal to a hypothesised value specified by the user.

RUNNING THE MACRO

Calling statement

onesampleran c1 k1 ;

nran k1 (999) ;

sums c1.

Input

C1 A single column, containing only numerical values. Missing values are allowed.

K1 A single constant, containing the hypothesised mean value.

Subcommands

nran Number of randomizations used.

sums Specify a column in which to store the sum of values for each randomized dataset.

Output

  • Basic statistics: Sample size, sample mean, sum of sample values, and sample standard deviation.
  • Hypothesised mean value.
  • Resampling details: Number of randomizations, one-sided and two-sided randomization p-values.

The two-sided randomization p-value is double the smaller of the one-sided randomization p-values.
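The doubling rule can be sketched in Python as follows. This is an illustration only, not the macro's own code: the function name is hypothetical, and the cap at 1 is an assumption, since the documentation does not say whether the doubled value is truncated.

```python
def two_sided_p(p_lower: float, p_upper: float) -> float:
    """Double the smaller of the two one-sided p-values.

    Assumption: the result is capped at 1.0 so that it stays a valid
    probability; the macro documentation does not state this.
    """
    return min(1.0, 2.0 * min(p_lower, p_upper))


# One-sided p-values as reported in the worked example in this document.
p_two = two_sided_p(0.0020, 1.0000)
print(f"{p_two:.4f}")
```

With the one-sided values from the worked example (0.0020 and 1.0000) this gives the reported two-sided value of 0.0040.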

Speed of macro : FAST

TECHNICAL DETAILS

Null hypothesis: The population mean is equal to the hypothesised mean value.

Test-statistic : We create a modified dataset by deducting the hypothesised mean from each data value.

The appropriate test-statistic is the sum of these modified values.
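As a small illustration of this statistic, the Python sketch below computes the modified dataset and its sum for the DARWIN data and the hypothesised mean of 56 used in the worked example. The function name is hypothetical and the code is not taken from the macro.

```python
def sum_of_deviations(data, hypothesised_mean):
    """Return the modified dataset and its sum (the test statistic)."""
    # Deduct the hypothesised mean from each data value.
    modified = [x - hypothesised_mean for x in data]
    return modified, sum(modified)


# DARWIN data from the worked example in this document.
darwin = [43, 67, 64, 64, 51, 53, 53, 26, 36, 48, 34, 48, 6, 28, 48]
modified, statistic = sum_of_deviations(darwin, 56)
print(statistic)  # 669 - 15 * 56 = -171
```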

Randomization procedure : We randomly reallocate signs to the absolute values within the modified dataset. Under the null hypothesis, and assuming that the distribution is symmetric about its mean, each data point is equally likely to give a negative or a positive value once the hypothesised mean is deducted from it.
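The procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the macro's implementation: the function name, the `seed` argument and the return layout are hypothetical. The p-values are computed as (count + 1) / (nran + 1), counting the observed dataset as one member of the reference set; the worked example's value of 0.0020 from 499 randomizations is consistent with that convention, but the macro's exact formula is not documented here. The sketch stores sums of the modified values, which may be on a different scale from the column written by the sums subcommand.

```python
import random


def one_sample_randomization_test(data, hypothesised_mean, nran=999, seed=None):
    """Sign-flip randomization test for a single mean (illustrative sketch)."""
    rng = random.Random(seed)

    # Modified dataset: each value minus the hypothesised mean.
    modified = [x - hypothesised_mean for x in data]
    observed = sum(modified)
    magnitudes = [abs(d) for d in modified]

    # Each randomized dataset keeps the absolute values and draws every
    # sign as -1 or +1 with equal probability.
    sums = [
        sum(m * rng.choice((-1, 1)) for m in magnitudes)
        for _ in range(nran)
    ]

    # Assumption: the observed dataset counts as one member of the
    # reference set, so p = (count + 1) / (nran + 1).
    p_lower = (1 + sum(s <= observed for s in sums)) / (nran + 1)
    p_upper = (1 + sum(s >= observed for s in sums)) / (nran + 1)
    p_two_sided = min(1.0, 2 * min(p_lower, p_upper))
    return observed, sums, p_lower, p_upper, p_two_sided


# DARWIN data and hypothesised mean from the worked example.
darwin = [43, 67, 64, 64, 51, 53, 53, 26, 36, 48, 34, 48, 6, 28, 48]
observed, sums, p_lower, p_upper, p_two_sided = one_sample_randomization_test(
    darwin, 56, nran=499, seed=1
)
print(observed, p_lower, p_upper, p_two_sided)
```

Because the signs are drawn at random, the p-values vary from run to run unless a seed is fixed, and they will not in general match the Minitab output exactly.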

ALTERNATIVE PROCEDURES

Standard procedures

onet C1;

test k1.

Performs a one-sample t-test of whether the mean of the data in c1 is equal to the hypothesised mean value k1, for the situation in which the population variance is unknown.

onez C1;

sigma k1;

test k2.

Performs a one-sample normal test of whether the mean of the data in c1 is equal to the hypothesised mean value k2, for the situation in which the population standard deviation is known to be equal to k1.
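For readers who want to see the arithmetic behind a one-sample normal test, a minimal Python sketch follows. It is not Minitab's implementation: the function name and the small dataset are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z(data, sigma, hypothesised_mean):
    """Return the z statistic and two-sided p-value for a known sigma."""
    n = len(data)
    # Standardize the sample mean using the known standard deviation.
    z = (mean(data) - hypothesised_mean) / (sigma / sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical data: five measurements, known sigma 0.5, hypothesised mean 5.0.
values = [4.8, 5.1, 5.3, 4.9, 5.4]
z, p = one_sample_z(values, sigma=0.5, hypothesised_mean=5.0)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```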

REFERENCES

Manly, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,

Chapman and Hall, London (Chapter 6).

WORKED EXAMPLE FOR ONESAMPLERAN

Name of dataset

DARWIN

Description

The data refer to the heights of 15 self-fertilised offspring from the plant Zea mays. The data were originally collected by Charles Darwin, were analysed by R.A. Fisher in the 1930s (see Fisher, 1935), and are analysed by Manly (1997) using a one-sample randomization test.

Our source

Manly, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,

Chapman and Hall, London.

Original source

Fisher, R.A. (1935) The design of experiments, Oliver & Boyd, Edinburgh.

Data

Number of observations = 15

Number of variables = 1

43 67 64 64 51 53 53 26 36 48 34 48 6 28 48

Worksheet

C1 Data

Aims of analysis

To test whether the population mean is equal to a hypothesised value of 56.

Minitab output : standard procedure

MTB > Retrieve "N:\resampling\Examples\Darwin.MTW".

Retrieving worksheet from file: N:\resampling\Examples\Darwin.MTW

# Worksheet was saved on 27/07/01 14:03:05

Results for: Darwin.MTW

MTB > onet c1 ;

SUBC> test 56.

One-Sample T: Self

Test of mu = 56 vs mu not = 56

Variable N Mean StDev SE Mean

Self 15 44.60 16.41 4.24

Variable 95.0% CI T P

Self ( 35.51, 53.69) -2.69 0.018
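The summary figures in this output can be checked by hand. The Python sketch below is an illustration only, not part of the Minitab session; it recomputes the mean, standard deviation, standard error and t statistic from the data listed earlier. The confidence interval and p-value need the t distribution, which is not in the Python standard library, so they are not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev

# DARWIN data and hypothesised mean from the worked example.
darwin = [43, 67, 64, 64, 51, 53, 53, 26, 36, 48, 34, 48, 6, 28, 48]
hypothesised_mean = 56

n = len(darwin)
xbar = mean(darwin)           # sample mean
s = stdev(darwin)             # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)              # standard error of the mean
t = (xbar - hypothesised_mean) / se

print(f"N={n} Mean={xbar:.2f} StDev={s:.2f} SE Mean={se:.2f} T={t:.2f}")
```

To two decimal places these values agree with the Minitab figures above (44.60, 16.41, 4.24 and -2.69).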

Minitab output : randomization procedure

MTB > % N:\resampling\library\onesampleran c1 56 ;

SUBC> nran 499 ;

SUBC> sums c3.

Executing from file: N:\resampling\library\onesampleran.MAC

One-sample randomization test

Data Display (WRITE)

Number of observations 15

Observed mean value 44.60

Hypothesised mean value 56.00

Observed sum of values 669.0

Observed standard deviation 16.41

Number of randomization samples 499

P-value for one-sided test with alternative: true mean < hypothesised mean 0.0020

P-value for one-sided test with alternative: true mean > hypothesised mean 1.0000

P-value for two-sided test 0.0040

Modified worksheet

C3 A column containing 499 sums of values, one for each randomized dataset

Discussion

The standard (two-sided) p-value is 0.018. Manly obtains a randomization p-value of 0.016, by enumeration of the full randomization distribution. Our two-sided p-value of 0.004 is substantially smaller than either of these values, but this may just be a consequence of the relatively small number of randomizations used.

The conclusion is the same in all cases: there is strong evidence that the population mean is not equal to the hypothesised mean. Looking at the one-sided p-values (and the sample mean), we see that the data support the alternative hypothesis that the population mean is lower than the hypothesised mean.
