MEASUREMENTS AND ERRORS

by

Peter Meikle

Blackett Laboratory

Imperial College

Notes to be read in association with the First Year Course on Measurements and Errors

January 2001

Contents

1. Introduction.

2. Types of Errors.

3. The Normal Distribution Model for Random Errors

3.1 The Normal (Gaussian) Distribution and the Parent Population

3.2.1 The Sample Population and the Method of Least Squares.

3.2.2 Random Error in a Single Measurement.

3.3 The Treatment of Deviant Points

3.4 The Combination of Random Errors.

3.4.1 Examples

3.5 Random Error in the Mean (Standard Error)

3.5.1 Error reduction by repeated measurement

3.5.2 How many figures should I quote?

3.5.3 Using your calculator in statistics mode

3.5.4 Significance of a result

3.5.5 Error estimation without repeated measurement

3.6 Weighted Means and Random Errors in Weighted Means

3.7 Fitting a Straight Line

3.7.1 Fitting a straight line known to intersect the origin

4. The Binomial Distribution and its special cases.

4.1 The Binomial Distribution

4.2 Special Case 1: The Normal Distribution

4.3 Special Case 2: The Poisson Distribution

5. Systematic Errors

6. Summary

7. Recommended textbooks

1. Introduction.

"Maturity of mind is the capacity to endure uncertainty" - John Finley

These notes are intended to complement the lecture course on Measurements and Errors. In particular, they will provide you with the mathematical details showing how the formulae used in data analysis are obtained. The notes deal only with the kind of data analysis problems you will encounter in the First Year Laboratory.

Do not be put off by the apparent complexity of some sections. The mathematical derivations are actually quite straightforward, although occasionally tedious. You are not expected to memorise these derivations, nor indeed most of the formulae. A few of the simpler results are worth remembering, and these are indicated in the summary at the end. For most situations, the analysis of data using these formulae is best carried out by using a calculator or computer. A good example of such a package is the curve fitting program, “Curve Expert”, which you use in the First Year Laboratory.

2. Types of Errors.

Imagine that you are doing an experiment which involves using a digital meter to measure the current through a resistor connected across a photodiode which is illuminated by a steady light source. The steady current that results can be described as a definite or true number of microamps. The more accurate the experiment is, the closer a typical measurement will be to the true value.

What limits the accuracy of the experiment? You might suggest that it is the number of digits shown on the liquid crystal display. How could you improve this accuracy? It would be technically feasible to build a meter with more digits. With just a few digits it is likely that a fixed value (number of μA) will be displayed. However, as you increase the number of digits you will discover a problem - the values of the last one or two digits will fluctuate rapidly. In other words the meter is not showing a steady value after all! It is showing small random perturbations about some mean reading. Consequently, a reading taken at one instant will differ from one taken at a later instant. The fluctuations might be due to a combination of temperature fluctuations and mechanical vibrations affecting both the mechanical and electrical components of the experiment. No matter how carefully you tried to improve the experiment, there would always be fluctuations in the current at some level. However, the actual cause of these fluctuations is not important for the discussion which follows.

The presence of these fluctuations means that any reading you make is not likely to be equal to the true value of the current. The measurement error due to the fluctuations is known as random error.

To make matters worse, unknown to you, the meter may be incorrectly calibrated. This will introduce another type of uncertainty, known as systematic error, which is constant or varies in a regular way.

Finally, you may make a few readings and then realise you have accidentally set the meter to display volts rather than microamps. This is known as a mistake or blunder!

In all real experiments we have to try to avoid mistakes and also be able to cope with both random and systematic errors. These errors may be due to the apparatus itself, or may be caused by limitations in the experimenter's ability to use the apparatus. Random errors can be analysed mathematically and most of these notes concern how we deal with this type of error. Systematic errors will be discussed briefly in Section 5. Note that while systematic errors cannot be treated mathematically, they are, nevertheless, often more important than random errors.

3. The Normal Distribution Model for Random Errors

Given the presence of uncontrollable fluctuations in the measurements performed in an experiment, we want to be able to give objective answers to two questions:

a) What is the best estimate we can make of the quantity (e.g. electric current) that we are trying to measure?

b) What is the "quality" of this estimate? i.e. How uncertain is our best estimate?

3.1 The Normal (Gaussian) Distribution and the Parent Population

Suppose you plotted the reading, x, of the digital current meter against time. It would look something like the trace shown in Figure 1a.

Figure 1a.

Now suppose you noted the value of x indicated at intervals 1 second apart over a period of, say, 1 hour, as shown in Figure 1b.

Figure 1b

If you now plotted a histogram of the number of readings in intervals 0.1 μA wide (i.e. plot the number of readings in the range 0 to 0.099 μA in the first column, 0.100 to 0.199 μA in the next column, and so on), we would obtain a distribution as shown in Figure 2.

Figure 2
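As an illustration of how such a histogram could be built up by computer, the short Python sketch below counts simulated readings into 0.1 μA bins; the mean reading (1.50 μA), the spread (0.20 μA) and the use of random.gauss to generate the fluctuations are invented purely for illustration.

    import random
    from collections import Counter

    # Simulated meter readings: random fluctuations about a mean value
    # (mean 1.50 uA, spread 0.20 uA, one reading per second for an hour).
    readings = [random.gauss(1.50, 0.20) for _ in range(3600)]

    # Count the readings falling in each 0.1 uA wide interval.
    bin_width = 0.1
    counts = Counter(int(r / bin_width) for r in readings)

    for b in sorted(counts):
        lo = b * bin_width
        print(f"{lo:4.1f} to {lo + bin_width:4.1f} uA : " + "*" * (counts[b] // 20))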

We want to use this experimental or "sample" distribution to make a best estimate of the quantity being measured together with the uncertainty in that estimate. In order to do this we proceed as follows.

First of all, rather than plotting on the y-axis the number of measurements in a given current interval in one hour, we instead plot the probability, P Δx, that any one measurement will fall into a particular current range x to x + Δx. Now imagine that we continued taking measurements for a very long time. At the same time we plot these readings in exceedingly narrow current intervals. If we let the number of measurements tend to infinity, and the current interval widths tend to zero, what will the histogram look like? (If you don't like thinking about infinity, just substitute "very large" for infinite and "very small" for infinitesimal.) It can be shown that the histogram will look increasingly like the Normal (or Gaussian) Distribution. The infinite set of measurements (which we can never actually carry out in practice) is known as the parent population of the quantity x. The mathematical model describing the distribution of the parent population is often referred to as the "parent distribution". In this case the mathematical model is the Normal Distribution (Figure 3).

Figure 3. The Normal Distribution.

The function which describes the Normal Distribution is

P(x, \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right]      ...... (1)

The value of P(x, \mu, \sigma)dx at position x gives the probability that, in an infinite number of measurements, any one measurement will lie in the range x to x+dx. The function is symmetrical about x = μ, where μ is the mean value of the measurements

i.e.   \mu = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i      ...... (2)

where n is the number of measurements. The quantity σ is known as the standard deviation of the distribution. σ characterises the spread or dispersion of the distribution about the mean μ (see figure above). The definition of σ is:

\sigma = \left[ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \right]^{1/2}      ...... (3)

The quantity 1/(σ√(2π)) in equ.(1) is a normalising factor, so that the total probability that a measurement will lie somewhere between −∞ and +∞ is 1.

The Normal Distribution falls away rapidly for |x − μ| > σ. Thus,

68.3 % of the measurements lie in the range μ − σ to μ + σ

95.4 % of the measurements lie in the range μ − 2σ to μ + 2σ

99.7 % of the measurements lie in the range μ − 3σ to μ + 3σ
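These percentages follow from integrating equ. (1) over the stated ranges; as a quick check (a sketch using only Python's standard library, since the integral of the Normal Distribution is the error function):

    import math

    # Probability that a measurement lies within k standard deviations of the mean:
    # integrating equ. (1) gives P(|x - mu| < k*sigma) = erf(k / sqrt(2)).
    for k in (1, 2, 3):
        p = math.erf(k / math.sqrt(2))
        print(f"within {k} sigma: {100 * p:.1f} %")   # prints 68.3 %, 95.4 %, 99.7 %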

The validity of the Normal Distribution model for describing the parent distribution underlying real experimental distributions has often been debated. Some understanding of its origin can be obtained as follows. Assume that the value of each measurement is the result of the existence of a true value for the quantity being measured, but upon which is superimposed an infinite number of infinitesimal perturbations or "kicks", each acting independently. For any particular measurement, each of these kicks adds to or subtracts from the true value randomly with an equal probability, resulting in the value which we observe. In other words, the net effect of the kicks at the instant of measurement is to produce an observed value which has been perturbed away from the true value. It can be shown that the resulting distribution of measured values can be described by the Normal Distribution. In Section 4 we shall use the Binomial Distribution to examine the statistical effect of these perturbations and gain further insight into the origin of the Normal Distribution. Since we assert that the perturbations act randomly, they are equally likely to perturb the result above or below the true value. Consequently the true value of the quantity being measured is given by the mean, μ, of the Normal Distribution

i.e.   true value of x = μ      ...... (4)

The perturbations may have a number of causes e.g. mechanical vibrations, thermal fluctuations. The important point is that by using the Normal Distribution model, the effects of the perturbations can be analysed statistically even though we do not know what is causing them. The general acceptance of the appropriateness of the Normal Distribution model is due to the fact that it provides a simple and successful way of describing real experimental distributions. (In practice, experimental distributions are somewhat narrower than the Normal Distribution. Also, as x tends to ±∞, real distributions do not fall away quite so quickly as does the Normal Distribution.)
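The "kicks" picture can be illustrated with a simple simulation; in the sketch below the true value, the number of kicks and their size are arbitrary choices made purely for illustration. Each simulated measurement is the true value plus the sum of many tiny, equally likely ± perturbations, and the resulting spread of values is bell-shaped:

    import random
    from collections import Counter

    def one_measurement(true_value=10.0, n_kicks=400, kick=0.01):
        """The true value plus many tiny, independent, equally likely +/- kicks."""
        return true_value + sum(random.choice((-kick, kick)) for _ in range(n_kicks))

    values = [one_measurement() for _ in range(5000)]

    # Crude text histogram: the counts peak at the true value and fall away
    # symmetrically on either side, approximating the Normal Distribution.
    counts = Counter(round(v, 1) for v in values)
    for v in sorted(counts):
        print(f"{v:5.1f} : " + "#" * (counts[v] // 25))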

In summary, we assume that associated with any measurable quantity there is a parent population of possible outcomes of a given measurement, the mean of which is the true value of the quantity. Unfortunately, in practice we have neither the time nor the money to carry out an infinite (nor even a very large) number of measurements of a quantity (e.g. a single night's observations with a large telescope costs over £10,000). Consequently, in the real world we cannot determine exactly the true value μ. The best we can do is make a finite number of measurements, and try to estimate μ from these measurements. In each such measurement, we are sampling the parent population.

The mathematical treatment of random fluctuations in measurements shows us how, given a limited amount of data, we can obtain the best estimate of μ together with an estimate of the range of values within which μ is likely to lie.

3.2.1 The sample population and the method of least squares.

(or "Why do we think mean values are near the truth?")

What is the best estimate we can make of μ, from a finite number of measurements xi? Suppose you have taken n measurements xi (i = 1, ..., n) of a quantity whose true value is μ. These n observations are a sample of the parent population corresponding to the quantity being measured. You would probably guess that the best estimate is simply the arithmetic mean of the xi.

This is actually correct. But why does taking the mean of a few sample points give the best estimate of the mean, μ, of the parent population? We are touching upon a deep and much argued topic in science viz. how do we assess the degree of credence to be accorded to a hypothesis, given a particular experimental result? A very common approach to this problem is to use the Principle of Maximum Likelihood. This states that when we are confronted with a choice of hypotheses we should choose that hypothesis which gives the greatest probability to the observed event. In the discussion which now follows, we shall adopt the Principle of Maximum Likelihood.

Suppose we measure some quantity, x, (e.g. electric current) a total of n times, yielding n data points. Suppose that the true value of the quantity is μ, so that in a single measurement the probability of obtaining a value lying in the range xi to xi + dx (i = 1, ..., n) is

P(x_i, \mu, \sigma)\,dx = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left[ -\frac{(x_i - \mu)^2}{2\sigma^2} \right] dx      ...... (5)

where σ is the standard deviation of the parent distribution corresponding to the true value μ. We want to make the best possible estimate of μ from our n data points. To do this, consider a set of parent distributions of differing mean values, μ′ (for simplicity we assume they all have the same σ). Let μ′ be able to take any value, including μ. The probability of obtaining a value in the range xi to xi + dx from any one of these parent populations, in a single measurement, is

P(x_i, \mu', \sigma)\,dx = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left[ -\frac{(x_i - \mu')^2}{2\sigma^2} \right] dx      ...... (6)

Now think of the set of n measurements which we have made. The probability of observing that particular set, given a parent distribution of mean μ′, is

P(\mu') = P(x_1, \mu', \sigma)\, P(x_2, \mu', \sigma) \cdots P(x_i, \mu', \sigma) \cdots P(x_n, \mu', \sigma)\, (dx)^n

        = \left( \frac{1}{\sigma \sqrt{2\pi}} \right)^{\!n} \exp\!\left[ -\sum_{i=1}^{n} \frac{(x_i - \mu')^2}{2\sigma^2} \right] (dx)^n      ...... (7)

By the Principle of Maximum Likelihood the best estimate of the true value μ is obtained by maximising the probability P(μ′). To maximise P(μ′) we follow the usual procedure of differentiating, and equating to zero, i.e. when

\frac{dP(\mu')}{d\mu'} = 0

then by the Principle of Maximum Likelihood, μ′ = μ (the true value).

Since the factor multiplying the exponential in equ. (7) does not depend on μ′, this occurs when

\frac{d}{d\mu'} \sum_{i=1}^{n} \frac{(x_i - \mu')^2}{2\sigma^2} = 0      i.e. when      \sum_{i=1}^{n} (x_i - \mu') = 0      i.e.      \sum_{i=1}^{n} x_i = n\mu'

i.e.

\mu' = \frac{1}{n} \sum_{i=1}^{n} x_i = best estimate of μ      ...... (8)

Thus the best estimate of the true value μ is obtained by taking the arithmetic mean of our sample points. This is known as the sample mean and it is equal to the mean of the parent distribution which has the greatest probability of giving rise to the observed data set. This will usually not be the same as the true value - it is only a best estimate of the true value. To make this clear we denote the sample mean by x̄. Note that in applying the Principle of Maximum Likelihood to estimate μ, we have minimised the quantity Σ(xi − μ′)². For this reason the technique is known as the "Method of Least Squares".
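As a quick numerical check of this result (the five readings below are invented values), scanning trial values of the parent mean shows that the sum of squared deviations is smallest at the arithmetic mean of the sample:

    # Invented sample of five current readings (microamps).
    x = [1.42, 1.51, 1.47, 1.55, 1.44]
    sample_mean = sum(x) / len(x)

    def sum_of_squares(mu_trial):
        """The quantity minimised by the Method of Least Squares."""
        return sum((xi - mu_trial) ** 2 for xi in x)

    # Scan trial values of the parent mean; the smallest sum of squares
    # occurs at (the grid point nearest to) the sample mean.
    trials = [1.40 + 0.01 * i for i in range(16)]
    best = min(trials, key=sum_of_squares)
    print(f"sample mean = {sample_mean:.3f}, least-squares minimum near {best:.2f}")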

3.2.2 Random error, s, in a single measurement.

It was stated above that 68.3% of the points in a Normal parent population lie in the range μ − σ to μ + σ. Another way of expressing this is to say that if we make a single measurement, xi, of a quantity with a parent distribution then there is a 68.3% probability that the measurement will lie within 1σ of the true value μ. Thus, σ can give us a measure of the quality, or precision, or random error of a single measurement. However, we do not know the values of μ or σ, since we would need an infinite number of observations to obtain them (see equs. 2 & 3). As with μ, we can only estimate σ using our n data points. If we again apply the Principle of Maximum Likelihood and follow a similar procedure to that used above to get the best estimate of μ, we can show that:

Best estimate of σ = s = \left[ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^{1/2}      ...... (9)

This is known as the sample standard deviation, denoted s. It is usually not the same as the parent standard deviation σ - it is only our best estimate of σ. The random error in a single measurement xi is conventionally taken to be ±1s. Since there is a 68.3% probability that xi is within ±1σ of the true value, μ, we can say that the probability that xi lies within ±1s of μ is approximately 68%. (Remember, s is only an estimate of σ - see subsection 3.5.2). For a quick estimate of s, a useful rule is that 2/3 of the measurements lie within the range x̄ − s to x̄ + s.

You will notice that in equ.(9) an "n − 1" appears, rather than the "n" in the expression for the parent standard deviation (equ. 3). The reason for this is that we have already used the n measurements to calculate x̄, and in determining s we use this value of x̄. Consequently, the number of independent observations left available to calculate s is reduced from n to n − 1. One way of understanding this is to consider a case where just one measurement is made. Clearly this cannot give you any information about the spread. Yet if we retained n (rather than n − 1) in equ. 9 it would suggest that the spread is zero, which is nonsense. However, with n − 1, the resulting indeterminate value (0/0) is much more appropriate.
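In practice x̄ and s are usually found with a calculator or computer. As an example (the readings below are invented), Python's standard statistics module uses the n − 1 definition of equ. (9) for the sample standard deviation:

    import statistics

    # Invented repeat readings of a current (microamps).
    x = [1.42, 1.51, 1.47, 1.55, 1.44]

    x_bar = statistics.mean(x)    # sample mean, equ. (8)
    s = statistics.stdev(x)       # sample standard deviation with the n - 1 divisor, equ. (9)

    # The same s written out explicitly:
    s_explicit = (sum((xi - x_bar) ** 2 for xi in x) / (len(x) - 1)) ** 0.5

    print(f"x_bar = {x_bar:.3f}, s = {s:.3f} (explicit: {s_explicit:.3f})")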

3.3 The Treatment of Deviant Points

Suppose you carry out an experimental run, where you repeat the measurement of a quantity about 10 times, and so find values for x̄ and s. You notice that one of your measurements is more than 3s different from x̄. From the behaviour of the Normal Distribution we can say that the chance of a single measurement being more than 3s different from the true value is about 330 to 1 against, and so the chance of getting such a measurement in a run of 10 measurements is about 33 to 1 against. Clearly, therefore, it is unlikely that this could be due to random fluctuations. It is much more probable that something went wrong with that particular measurement.

If you encounter a situation like this, the first thing you should do is stop and think. Did anything happen during that particular measurement which might have caused the deviant point? Was there a momentary power cut? Was your attention distracted by something? Did you get more than one deviant point? By stopping to think, you may be able to identify the cause of the deviant point, and eliminate it. If you fail to find any obvious cause, it would probably be best to reject the point, and then recalculate x̄ and s. However, be very careful about the criterion you use for data rejection. If you had made 500 measurements, it would be quite probable that simple random fluctuations would result in one measurement being more than 3s away from the mean. A rejection threshold of, say, 5s would be wiser here. The more measurements you take, the higher your threshold should be for identifying deviant points.
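One way such a check might be automated is sketched below; the readings and the 3s threshold are illustrative only (as discussed above, the threshold should be raised for longer runs), and rejection should only follow after looking for a physical cause:

    import statistics

    # Invented run of 20 readings (microamps); 1.62 looks suspiciously far from the rest.
    readings = [1.44, 1.47, 1.45, 1.49, 1.46, 1.48, 1.43, 1.47, 1.50, 1.46,
                1.45, 1.48, 1.44, 1.47, 1.49, 1.46, 1.45, 1.48, 1.47, 1.62]

    x_bar = statistics.mean(readings)
    s = statistics.stdev(readings)

    # Flag points more than 3s from the mean for inspection - do not discard blindly.
    suspect = [r for r in readings if abs(r - x_bar) > 3 * s]
    kept = [r for r in readings if abs(r - x_bar) <= 3 * s]
    print("suspect points:", suspect)

    if suspect:
        # If no cause is found and the points are rejected, recalculate the statistics.
        print(f"after rejection: x_bar = {statistics.mean(kept):.3f}, s = {statistics.stdev(kept):.3f}")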

3.4 The Combination of Random Errors.

Suppose we want to determine the sample standard deviation of a quantity f which is a function, f(x, y), of two other variables x and y, where x and y are measured, e.g. f could be the area of a field with x and y being the length and breadth respectively, i.e. f = xy.

Let the parent population of f have a mean μf and a standard deviation σf.

Let the parent population of x have a mean μx and a standard deviation σx.

Let the parent population of y have a mean μy and a standard deviation σy.

If x and y are independent of each other then the parent population of f is made up of all combinations of the individual measurements xj and yk. Thus if there were nx measurements, xj, of x and ny measurements, yk, of y, the total number of measurements, fi, of f would be nf = nx × ny.
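As a small illustration of this counting argument (the lengths and breadths below are invented values), every pairing of an x measurement with a y measurement gives one member of the population of f = xy:

    from itertools import product

    x_measurements = [10.1, 10.3, 10.2]   # nx = 3 measured lengths (m)
    y_measurements = [5.0, 5.2]           # ny = 2 measured breadths (m)

    # Every combination (xj, yk) yields one value of f = x * y, so nf = nx * ny.
    f_values = [xj * yk for xj, yk in product(x_measurements, y_measurements)]
    print(len(f_values), "values of f:", f_values)    # 6 values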