Means and Variances of Random Variables
Recall the following example from last time:
Let X denote the number of occupants in a randomly selected automobile. For simplicity, we assume that no automobile has more than 5 occupants. Suppose that at a certain time and at a certain point on the road, X has the following probability distribution:
X / 1 / 2 / 3 / 4 / 5
p / 0.62 / 0.23 / 0.09 / 0.05 / 0.01
What is the average number of occupants per vehicle?
Reason this way: Suppose we had a random sample of 100 automobiles. We’d expect (on average) that
1. 62 of them only have the driver
2. 23 of them have 2 occupants
3. 9 of them have 3 occupants
4. 5 of them have 4 occupants
5. 1 of them has 5 occupants
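Then to find the average number of occupants, we just total up the occupants and divide by 100:
$$\text{Average} = \frac{1(62) + 2(23) + 3(9) + 4(5) + 5(1)}{100}$$
This simplifies to
$$\text{Average} = 1(0.62) + 2(0.23) + 3(0.09) + 4(0.05) + 5(0.01) = \sum_i x_i p_i$$
where $x_i$ is a possible value and $p_i$ is its probability.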
X / 1 / 2 / 3 / 4 / 5
P / 0.62 / 0.23 / 0.09 / 0.05 / 0.01
The above sum comes out to be 0.62 + 0.46 + 0.27 + 0.20 + 0.05 = 1.60. That is, we expect that in 100 automobiles, we’d have 160 total occupants, for an average of 1.60 occupants per automobile.
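The weighted-average reasoning above can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`values`, `probs`, `counts`, `mean`) are chosen here for illustration and do not come from the original notes.

```python
# Occupant distribution from the table above.
values = [1, 2, 3, 4, 5]
probs = [0.62, 0.23, 0.09, 0.05, 0.01]

# Approach 1: total the occupants in an idealized sample of 100 cars,
# then divide by 100.
counts = [round(100 * p) for p in probs]  # cars expected in each category
average_from_counts = sum(x * n for x, n in zip(values, counts)) / 100

# Approach 2: weight each possible value directly by its probability.
mean = sum(x * p for x, p in zip(values, probs))

print("average from counts:", round(average_from_counts, 4))
print("sum of x_i * p_i:   ", round(mean, 4))
```

Both approaches give the same number, which is the point of the simplification: dividing the counts by 100 just turns them back into probabilities.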
This procedure is used to define the mean (or expected value) of a discrete random variable X.
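Possible values of X / $x_1$ / $x_2$ / $x_3$ / … / $x_k$
Probabilities / $p_1$ / $p_2$ / $p_3$ / … / $p_k$
The mean of X is defined by
$$\mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_i x_i p_i$$
Similarly, the variance of X is defined by
$$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + \cdots + (x_k - \mu_X)^2 p_k = \sum_i (x_i - \mu_X)^2 p_i$$
and the standard deviation $\sigma_X = \sqrt{\sigma_X^2}$ is simply the positive square root of the variance.
A simpler form for the variance:
$$\sigma_X^2 = \sum_i x_i^2 p_i - \mu_X^2$$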
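For readers who want to see why the simpler form agrees with the definition, here is a short derivation. It assumes the usual notation ($x_i$ for the possible values, $p_i$ for their probabilities, $\mu_X$ for the mean) and uses only two facts: the probabilities sum to 1, and $\sum_i x_i p_i = \mu_X$.

```latex
\begin{aligned}
\sigma_X^2 &= \sum_i (x_i - \mu_X)^2 \, p_i \\
           &= \sum_i x_i^2 \, p_i \;-\; 2\mu_X \sum_i x_i \, p_i \;+\; \mu_X^2 \sum_i p_i \\
           &= \sum_i x_i^2 \, p_i \;-\; 2\mu_X^2 \;+\; \mu_X^2 \\
           &= \sum_i x_i^2 \, p_i \;-\; \mu_X^2
\end{aligned}
```

The simpler form is usually quicker by hand because it avoids subtracting the mean from every value before squaring.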
Example: Find the mean, variance, and standard deviation of the following distribution:
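Possible values of X / -3 / 0 / 1 / 2 / 3
Probabilities / 0.50 / 0.25 / 0.10 / 0.05 / 0.10
Then the mean is
$$\mu_X = (-3)(0.50) + 0(0.25) + 1(0.10) + 2(0.05) + 3(0.10) = -1.0$$
The variance is
$$\sigma_X^2 = 9(0.50) + 0(0.25) + 1(0.10) + 4(0.05) + 9(0.10) - (-1.0)^2 = 5.7 - 1.0 = 4.7$$
The standard deviation is
$$\sigma_X = \sqrt{4.7} \approx 2.17$$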
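As a check on this example, the following sketch computes the variance two ways, once from the definition and once from the simpler form, and confirms that they agree. The variable names are illustrative only.

```python
import math

# Distribution from the example above.
values = [-3, 0, 1, 2, 3]
probs = [0.50, 0.25, 0.10, 0.05, 0.10]

# Mean: probability-weighted sum of the possible values.
mean = sum(x * p for x, p in zip(values, probs))

# Variance from the definition: weighted squared deviations from the mean.
var_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Variance from the simpler form: weighted sum of squares minus squared mean.
var_shortcut = sum(x * x * p for x, p in zip(values, probs)) - mean ** 2

# Standard deviation: positive square root of the variance.
sd = math.sqrt(var_definition)

print(f"mean               = {mean:.3f}")
print(f"variance (defn)    = {var_definition:.3f}")
print(f"variance (simpler) = {var_shortcut:.3f}")
print(f"standard deviation = {sd:.3f}")
```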
Rules for means and for variances:
For linear shifts:
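Let X be a random variable. Define $Y = a + bX$, where $a, b$ are real numbers.
In other words, we multiply X by a real number $b$, and then add $a$.
This is sometimes called a linear shift of X.
Fact: Let $\mu_X, \sigma_X^2, \sigma_X$ denote the mean, variance and standard deviation of X.
Let $\mu_Y, \sigma_Y^2, \sigma_Y$ be defined correspondingly for the linear shift Y.
Then:
$$\mu_Y = a + b\mu_X$$
$$\sigma_Y^2 = b^2 \sigma_X^2$$
$$\sigma_Y = |b| \, \sigma_X$$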
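The rules for a linear shift can be verified numerically. The sketch below assumes the shift is written Y = a + bX, reuses the occupant distribution from the start of these notes as X, and picks arbitrary illustrative constants a = 2 and b = -3 (these constants are not from the notes; b is negative on purpose, to show why the standard deviation rule needs |b|).

```python
import math

def mean_var(values, probs):
    """Return (mean, variance) of a discrete distribution."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var

# X: the occupant distribution from earlier in these notes.
x_values = [1, 2, 3, 4, 5]
probs = [0.62, 0.23, 0.09, 0.05, 0.01]

# Arbitrary illustrative constants (an assumption, not from the notes).
a, b = 2.0, -3.0

# Y takes the value a + b*x with the same probability that X takes x.
y_values = [a + b * x for x in x_values]

mu_x, var_x = mean_var(x_values, probs)
mu_y, var_y = mean_var(y_values, probs)

# Left: computed directly from the distribution of Y. Right: from the rules.
print("mean:    ", round(mu_y, 4), "vs", round(a + b * mu_x, 4))
print("variance:", round(var_y, 4), "vs", round(b ** 2 * var_x, 4))
print("std dev: ", round(math.sqrt(var_y), 4), "vs",
      round(abs(b) * math.sqrt(var_x), 4))
```

Note that the added constant a moves the mean but has no effect on the variance or the standard deviation.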
For two random variables:
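If X and Y are independent,
$$\mu_{X+Y} = \mu_X + \mu_Y$$
$$\mu_{X-Y} = \mu_X - \mu_Y$$
$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$$
$$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2$$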
In words, 1) the mean of the sum is the sum of the means
2) the mean of the difference is the difference of the means
3) the variance of the sum is the sum of the variances
4) the variance of the difference is the sum of the variances.
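These four rules can be checked exactly by building the distribution of X + Y and X - Y from two independent variables. The sketch below reuses two distributions that appear earlier in these notes (the occupant table and the five-value example); treating them as independent, and the helper names `mean_var` and `combine`, are assumptions made here purely for illustration.

```python
from itertools import product

def mean_var(dist):
    """dist maps value -> probability; return (mean, variance)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y) when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = op(x, y)
        # Independence: P(X = x and Y = y) = P(X = x) * P(Y = y).
        out[value] = out.get(value, 0.0) + px * py
    return out

dist_x = {1: 0.62, 2: 0.23, 3: 0.09, 4: 0.05, 5: 0.01}
dist_y = {-3: 0.50, 0: 0.25, 1: 0.10, 2: 0.05, 3: 0.10}

mu_x, var_x = mean_var(dist_x)
mu_y, var_y = mean_var(dist_y)
mu_sum, var_sum = mean_var(combine(dist_x, dist_y, lambda x, y: x + y))
mu_diff, var_diff = mean_var(combine(dist_x, dist_y, lambda x, y: x - y))

print("mean of sum:       ", round(mu_sum, 4), "vs", round(mu_x + mu_y, 4))
print("mean of difference:", round(mu_diff, 4), "vs", round(mu_x - mu_y, 4))
print("variance of sum:   ", round(var_sum, 4), "vs", round(var_x + var_y, 4))
print("variance of diff:  ", round(var_diff, 4), "vs", round(var_x + var_y, 4))
```

The last two lines print the same right-hand side: subtracting Y does not remove its variability, so the variances still add.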
Example: Nituna is a sales associate at an auto dealership. She is also a manager, so she gets $50 salary each day. In addition, she expects to earn (on commission) $350 for each car she sells and $400 for each SUV she sells. The following are the estimates of her daily car and SUV sales:
Cars sold / 0 / 1 / 2 / 3
Probabilities / 0.3 / 0.4 / 0.2 / 0.1
SUVs sold / 0 / 1 / 2
Probabilities / 0.4 / 0.5 / 0.1
The means of these random variables are
0(0.3) + 1(0.4) + 2(0.2) + 3(0.1) = 1.1 cars
0(0.4) + 1(0.5) + 2(0.1) = 0.7 SUVs
The variances of these random variables are
0(0.3) + 1(0.4) + 4(0.2) + 9(0.1) - 1.21 = 0.89 cars squared
0(0.4) + 1(0.5) + 4(0.1) - 0.49 = 0.41 SUVs squared
The standard deviations of these random variables are
$$\sqrt{0.89} \approx 0.943 \text{ cars}$$
$$\sqrt{0.41} \approx 0.640 \text{ SUVs}$$
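Nituna’s earnings Z are given by
$$Z = 50 + 350X + 400Y$$
where X is the number of cars and Y is the number of SUVs she sells in a day.
So her mean earnings are
$$\mu_Z = 50 + 350(1.1) + 400(0.7) = 715 \text{ dollars}$$
Assuming car sales and SUV sales are independent, her variance of earnings is
$$\sigma_Z^2 = 350^2(0.89) + 400^2(0.41) = 109{,}025 + 65{,}600 = 174{,}625 \text{ dollars squared}$$
(note that the $50 salary is not used in the calculation of variance!)
Her standard deviation of earnings is
$$\sigma_Z = \sqrt{174{,}625} \approx 418 \text{ dollars}$$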
Therefore Nituna’s best estimate of her daily earnings is $715. The standard deviation is quite large: $418. This shows that her earnings fluctuate greatly from day to day.
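The figures in this example can be double-checked in two ways: by applying the rules for means and variances, and by listing every possible (cars, SUVs) combination and working from the resulting distribution of earnings. The sketch below does both. It assumes, as the variance rule requires, that car sales and SUV sales are independent; the helper `mean_var` and the variable names are illustrative.

```python
import math
from itertools import product

# Daily sales distributions from the tables above (value -> probability).
cars = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}
suvs = {0: 0.4, 1: 0.5, 2: 0.1}

def mean_var(dist):
    """Return (mean, variance) of a discrete distribution."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

mu_x, var_x = mean_var(cars)
mu_y, var_y = mean_var(suvs)

# Method 1: rules for linear shifts and for sums of independent variables.
mean_rules = 50 + 350 * mu_x + 400 * mu_y
var_rules = 350 ** 2 * var_x + 400 ** 2 * var_y  # the constant 50 drops out

# Method 2: enumerate every combination of car and SUV sales.
earnings = {}
for (x, px), (y, py) in product(cars.items(), suvs.items()):
    z = 50 + 350 * x + 400 * y
    earnings[z] = earnings.get(z, 0.0) + px * py  # independence assumed
mean_enum, var_enum = mean_var(earnings)

print(f"mean:     rules {mean_rules:.2f}, enumeration {mean_enum:.2f}")
print(f"variance: rules {var_rules:.2f}, enumeration {var_enum:.2f}")
print(f"std dev:  {math.sqrt(var_rules):.2f}")
```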
The Law of Large Numbers
Draw independent observations at random from a population with mean $\mu$. Decide how accurately you’d like to estimate $\mu$. As the number of observations increases, the sample mean $\bar{x}$ of the observed values eventually approaches the mean $\mu$ as closely as you specified and then stays that close.
In the Nituna example, this means that if she wanted to estimate her mean daily earnings to within $0.50, she could compute the sample mean by averaging her actual earnings over many days. (We would have to assume the daily sales were independent of each other.) She could try averaging over a year, 2 years, 10 years, and so on. Assuming that the distribution of her sales stayed the same, her actual average earnings would eventually stay between $714.50 and $715.50.
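A small simulation can illustrate the idea. The sketch below draws car and SUV sales independently from the two tables given for Nituna, computes each day's earnings, and records the running average at a few checkpoints. The seed, the number of simulated days, and the checkpoint values are arbitrary choices made here for illustration, and the function name `simulate_day` is hypothetical.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

car_counts, car_probs = [0, 1, 2, 3], [0.3, 0.4, 0.2, 0.1]
suv_counts, suv_probs = [0, 1, 2], [0.4, 0.5, 0.1]

def simulate_day():
    """One day's earnings, drawing car and SUV sales independently."""
    x = random.choices(car_counts, weights=car_probs)[0]
    y = random.choices(suv_counts, weights=suv_probs)[0]
    return 50 + 350 * x + 400 * y

n_days = 100_000
checkpoints = {10, 100, 1_000, 10_000, 100_000}

total = 0.0
running_means = {}
for day in range(1, n_days + 1):
    total += simulate_day()
    if day in checkpoints:
        running_means[day] = total / day  # sample mean of the first `day` days

for day in sorted(running_means):
    print(f"{day:>7} days: average earnings = {running_means[day]:.2f}")
```

In a typical run the early averages can be far from the mean, while the later ones settle closer to it. Because the day-to-day standard deviation is large, the average may need a great many days before it stays within a margin as tight as $0.50.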
Example: In taking measurements it is common practice to report the measured value, plus or minus a margin of error that gives the uncertainty in the measurement. Measurements are never perfectly precise. A measurement apparatus may report different values when measuring the same phenomenon repeatedly! If the standard deviation of the measurements is known, this is reported as the uncertainty.
Suppose a mass, one which is known to weigh exactly 100 grams, is weighed on two different scales, both of which are calibrated.
Assume the reading X on the first scale is normally distributed with mean 100 grams and standard deviation 0.2 gram. In lay terms, we say that the scale reports $100 \pm 0.2$ grams. The reading Y on the second scale has the same distribution, but this reading is independent of X.
Let Z denote the difference between the two measurements: Z = X – Y
What is the mean of Z?
Easy: by the rules above the mean is 100 - 100 = 0 grams.
What is the variance of Z?
By the rules above the variance is
$$\sigma_Z^2 = \sigma_X^2 + \sigma_Y^2 = (0.2)^2 + (0.2)^2 = 0.08 \text{ grams squared}$$
(note: the variances add… they do not subtract!)
What is the standard deviation of Z?
$$\sigma_Z = \sqrt{0.08} \approx 0.283 \text{ grams}$$
The most common error is to add the standard deviations and say the standard deviation is 0.2 + 0.2 = 0.4. This is incorrect. Recall that variances add. Standard deviations do not.
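A quick simulation makes the point concrete. The sketch below computes the standard deviation of Z = X - Y from the variance rule and then compares it with the spread of simulated differences between two independent normal readings. The seed and the sample size are arbitrary illustrative choices.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

sigma_x = sigma_y = 0.2  # standard deviation of each scale, in grams

# Rule: variances add, then take the square root.
var_z = sigma_x ** 2 + sigma_y ** 2
sd_z = math.sqrt(var_z)

# Simulation: difference between two independent readings of a 100 g mass.
n = 100_000
diffs = [random.gauss(100, sigma_x) - random.gauss(100, sigma_y)
         for _ in range(n)]
mean_sim = sum(diffs) / n
sd_sim = math.sqrt(sum((d - mean_sim) ** 2 for d in diffs) / n)

print(f"sd from the variance rule:      {sd_z:.4f} grams")
print(f"sd of simulated differences:    {sd_sim:.4f} grams")
print(f"incorrect sum of the two sds:   {sigma_x + sigma_y:.4f} grams")
```

The simulated spread should land near the value from the variance rule and well below the incorrect sum of the two standard deviations.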