251var2 4/19/06
Roger Even Bove
FORMULAS FOR FUNCTIONS OF RANDOM VARIABLES
I. Basic Computational Formulas for Descriptive Statistics
Consider the following set of observations:
Observation number / $x$ / $y$
1 / 7 / 3
2 / 15 / 6
3 / 2 / 9
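You can easily verify that $\bar{x} = 8$ and $\bar{y} = 6$. The formula for the sample variance is $s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1}$.
So that $s_x^2 = \frac{278 - 3(8)^2}{2} = \frac{86}{2} = 43$
and $s_x = \sqrt{43} = 6.557$.
Similarly $s_y^2 = \frac{126 - 3(6)^2}{2} = 9$ and $s_y = \sqrt{9} = 3$.
The formula for the sample covariance is $s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1}$.
So that, for the numbers above, $s_{xy} = \frac{129 - 3(8)(6)}{2} = \frac{-15}{2} = -7.5$.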
The only thing that we can usually learn from a covariance is whether the variables $x$ and $y$ move together or in opposite directions. If a covariance is positive, the two variables tend to move together. If the covariance is negative, the two variables tend to move in opposite directions. To find out about the strength of the relationship we compute the correlation. The correlation can only have values between 1.0 and -1.0. The sign of the correlation means the same thing as the sign of the covariance. A correlation close to 1.0 is referred to as a strong positive correlation. A correlation close to -1.0 is a strong negative correlation. If the correlation is 1.0, $x$ and $y$ will tend to move proportionally, that is, when $x$ rises, $y$ will rise and when $x$ falls, $y$ will fall. When $x$ takes a big jump, $y$ will take a big jump. When $x$ takes a small jump, $y$ takes a small jump. If the correlation is -1 we have the same proportionality, but now if $x$ jumps, $y$ will jump in the opposite direction. If the correlation is zero or close to zero, it is weak, which means there is not much tendency of $y$ to do anything in particular if $x$ moves.
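The formula for sample correlation is $r_{xy} = \frac{s_{xy}}{s_x s_y}$ and we know that $s_{xy} = -7.5$, $s_x = 6.557$ and $s_y = 3$, so that, for the numbers above, $r_{xy} = \frac{-7.5}{(6.557)(3)} = -0.381$.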
The negative covariance tells us that $x$ and $y$ have a tendency to move in opposite directions. The negative correlation tells us the same thing, but the fact that it is closer to zero than to -1 leads us to feel that the correlation is weak. Actually statisticians tend to measure strength on a zero to one scale by squaring the correlation. In this case $r_{xy}^2 = (-0.381)^2 = 0.145$, which appears quite weak, though far from nonexistent. The sample covariance is regarded as an estimate of the true or population covariance, just as the sample correlation is regarded as an estimate of the population correlation. Formulas for computing these from a population all of whose points are known are not given here. The next section will deal with computing population covariances and correlations when probabilities are known.
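The arithmetic of this section can be checked with a short script. This is only a sketch: it assumes the computational forms of the variance and covariance formulas (raw sums minus $n$ times the product of the means, divided by $n - 1$), and the variable names are illustrative, not part of the handout.

```python
from math import sqrt

# Observations from the table in Section I.
x = [7, 15, 2]
y = [3, 6, 9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Computational formulas: (raw sum - n * product of means) / (n - 1).
var_x = (sum(v * v for v in x) - n * x_bar ** 2) / (n - 1)
var_y = (sum(v * v for v in y) - n * y_bar ** 2) / (n - 1)
cov_xy = (sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar) / (n - 1)

# Correlation: covariance divided by the product of the standard deviations.
r_xy = cov_xy / (sqrt(var_x) * sqrt(var_y))

print("means:", x_bar, y_bar)
print("variances:", var_x, var_y)
print("covariance:", cov_xy)
print("correlation and its square:", round(r_xy, 3), round(r_xy ** 2, 3))
```

Squaring the correlation, as in the last line, puts the strength of the relationship on a zero to one scale.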
II. FORMULAS FROM PROBABILITY
Let the following table describe the joint probabilities of $x$ and $y$:
$y$ \ $x$ / 7 / 15 / 2 / $P(y)$ / $y$ / $P(y)$ / $yP(y)$ / $y^2P(y)$
3 / 0.1 / 0.2 / 0.0 / 0.3 / 3 / 0.3 / 0.9 / 2.7
6 / 0.1 / 0.0 / 0.3 / 0.4 / 6 / 0.4 / 2.4 / 14.4
9 / 0.0 / 0.1 / 0.2 / 0.3 / 9 / 0.3 / 2.7 / 24.3
$P(x)$ / 0.2 / 0.3 / 0.5 / 1.0 / sum / 1.0 / 6.0 / 41.4
$xP(x)$ / 1.4 / 4.5 / 1.0 / 6.9
$x^2P(x)$ / 9.8 / 67.5 / 2.0 / 79.3
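Note that $\mu_y = E(y) = \sum yP(y) = 6.0$ and $E(y^2) = \sum y^2P(y) = 41.4$,
and similarly $\mu_x = E(x) = \sum xP(x) = 6.9$ and $E(x^2) = \sum x^2P(x) = 79.3$.
This implies that $\sigma_x^2 = E(x^2) - \mu_x^2 = 79.3 - (6.9)^2 = 31.69$ and that $\sigma_y^2 = E(y^2) - \mu_y^2 = 41.4 - (6.0)^2 = 5.4$.
The formula for the covariance is $\sigma_{xy} = \mathrm{Cov}(x, y) = E(xy) - \mu_x\mu_y$.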
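We call this a population covariance, since the probabilities presumably refer to all values of $x$ and $y$. The most difficult part of this formula is the evaluation of the expected value of $xy$, $E(xy) = \sum\sum xy\,P(x, y)$. The idea here is to multiply each possible pair of values of $x$ and $y$ by the joint probability of the pair. One way to do this is to take the joint probability table and to add the values of $x$ and $y$ to it. Notice that in the table below the probabilities (like 0.1) are in exactly the same place as in the joint probability table above and are followed by the corresponding $x$ and $y$.
0.1(7)(3) / 0.2(15)(3) / 0.0(2)(3)
0.1(7)(6) / 0.0(15)(6) / 0.3(2)(6)
0.0(7)(9) / 0.1(15)(9) / 0.2(2)(9)
So, $E(xy) = 2.1 + 9.0 + 0 + 4.2 + 0 + 3.6 + 0 + 13.5 + 3.6 = 36.0$ and $\sigma_{xy} = E(xy) - \mu_x\mu_y = 36.0 - (6.9)(6.0) = -5.4$.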
Once again we find a negative covariance, indicating a tendency of $x$ and $y$ to move in opposite directions.
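To measure the strength of the relationship, we must compute the correlation. As with the sample correlation, the population correlation is computed by dividing the covariance by the standard deviations of $x$ and $y$. This time the formula for the correlation reads: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$. From above, we know that $\sigma_x = \sqrt{31.69} = 5.629$ and $\sigma_y = \sqrt{5.4} = 2.324$. Thus $\rho_{xy} = \frac{-5.4}{(5.629)(2.324)} = -0.413$.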
As with the sample correlation this can only take values between negative and positive one. Since, if we square -0.41 we get 0.17, this too is a weak correlation.
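The population calculations can be reproduced directly from the joint probability table. The following is a minimal sketch, assuming the table is stored as a list of rows (one row per value of $y$, one column per value of $x$); the names are illustrative.

```python
from math import sqrt

x_values = [7, 15, 2]
y_values = [3, 6, 9]
# joint[i][j] = P(x = x_values[j] and y = y_values[i]), as in the table.
joint = [
    [0.1, 0.2, 0.0],
    [0.1, 0.0, 0.3],
    [0.0, 0.1, 0.2],
]

# Marginal probabilities: row sums give P(y), column sums give P(x).
p_y = [sum(row) for row in joint]
p_x = [sum(joint[i][j] for i in range(3)) for j in range(3)]

mean_x = sum(x * p for x, p in zip(x_values, p_x))
mean_y = sum(y * p for y, p in zip(y_values, p_y))
var_x = sum(x * x * p for x, p in zip(x_values, p_x)) - mean_x ** 2
var_y = sum(y * y * p for y, p in zip(y_values, p_y)) - mean_y ** 2

# E(xy): each pair of values times its joint probability.
e_xy = sum(
    joint[i][j] * x_values[j] * y_values[i]
    for i in range(3)
    for j in range(3)
)
cov_xy = e_xy - mean_x * mean_y
rho_xy = cov_xy / (sqrt(var_x) * sqrt(var_y))

print(round(mean_x, 2), round(mean_y, 2))
print(round(var_x, 2), round(var_y, 2))
print(round(e_xy, 2), round(cov_xy, 2), round(rho_xy, 2))
```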
In many situations, especially with population correlations, we are likely to need the covariance and know the correlation. The formula for population correlation can be rewritten as $\sigma_{xy} = \rho_{xy}\sigma_x\sigma_y$.
Thus, if we know $\rho_{xy}$, $\sigma_x$ and $\sigma_y$, we can compute the covariance.
The corresponding formula for the sample covariance is $s_{xy} = r_{xy}s_xs_y$.
III. FUNCTIONS OF RANDOM VARIABLES
A. Functions of a Single Random Variable.
1. The Mean.
If we know the mean of the distribution of a random variable, we can easily find the mean of a linear function of the same random variable. For example, if we know the mean of $x$ we can find the mean of $a + bx$. In the following let $a$ and $b$ be constants that either multiply or are added to $x$. Of course, $E(x) = \mu_x$, but these formulas apply to $\bar{x}$, the sample mean, as well.
a) If $a$ is a constant, then $E(a) = a$. The mean of a constant is the constant itself.
b) If $b$ is a constant, then $E(bx) = bE(x)$, so that if the mean of $x$ is $\mu_x$, then the mean of $bx$ will be $b\mu_x$.
c) If $a$ is a constant, then $E(x + a) = E(x) + a$, so that if the mean of $x$ is $\mu_x$, then the mean of $x + a$ will be $\mu_x + a$.
d) If $a$ and $b$ are both constants, then $E(a + bx) = a + bE(x)$, so that if the mean of $x$ is $\mu_x$, then the mean of $a + bx$ will be $a + b\mu_x$.
Note that rules a), b), and c) are really special cases of rule d). Rule a) is rule d) with $b$ set equal to zero. Rule b) is rule d) with $a$ set equal to zero. Rule c) is rule d) with $b$ set equal to one.
2. The Variance.
If we know the variance of a random variable, we can find the variance of a linear function of the same variable. For example, if we know the variance of $x$, we can find the variance of $a + bx$. These formulas are stated in terms of the population variance, $\sigma_x^2 = \mathrm{Var}(x)$, but can also be used for the sample variance $s_x^2$.
a) If $a$ is a constant, then $\mathrm{Var}(a) = 0$. This makes perfect sense. A constant does not vary, so its variance is zero.
b) If $b$ is a constant, then $\mathrm{Var}(bx) = b^2\,\mathrm{Var}(x)$, so that if the variance of $x$ is $\sigma_x^2$, then the variance of $bx$ will be $b^2\sigma_x^2$.
c) If $a$ is a constant, then $\mathrm{Var}(x + a) = \mathrm{Var}(x)$, so that if the variance of $x$ is $\sigma_x^2$, then the variance of $x + a$ is also $\sigma_x^2$. Again this is something like common sense. Adding a constant to $x$ doesn't affect how much it varies, so it doesn't affect its variance.
d) If $a$ and $b$ are both constants, then $\mathrm{Var}(a + bx) = b^2\,\mathrm{Var}(x)$, so that if the variance of $x$ is $\sigma_x^2$, then $\mathrm{Var}(a + bx) = b^2\sigma_x^2$.
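The rules for the mean and variance of $a + bx$ can be checked numerically. The sketch below borrows the distribution of $y$ from Section II (values 3, 6, 9 with probabilities 0.3, 0.4, 0.3) as the random variable; the constants $a = 10$ and $b = 2$ are hypothetical values chosen only for illustration.

```python
# Distribution borrowed from Section II: values and their probabilities.
values = [3, 6, 9]
probs = [0.3, 0.4, 0.3]

def mean(vals, ps):
    """Expected value: sum of value times probability."""
    return sum(v * p for v, p in zip(vals, ps))

def variance(vals, ps):
    """Population variance: expected squared deviation from the mean."""
    m = mean(vals, ps)
    return sum(p * (v - m) ** 2 for v, p in zip(vals, ps))

a, b = 10, 2  # hypothetical constants, not taken from the handout

# Transform every value; the probabilities stay attached to the same outcomes.
w_values = [a + b * v for v in values]

mean_x, var_x = mean(values, probs), variance(values, probs)
mean_w, var_w = mean(w_values, probs), variance(w_values, probs)

print("direct:", round(mean_w, 4), round(var_w, 4))
print("by rule d):", round(a + b * mean_x, 4), round(b ** 2 * var_x, 4))
```

The added constant shifts the mean but drops out of the variance, while the multiplier enters the variance squared.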
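We can summarize this in the table below:
If $w$ is / then $E(w)$ is / and $\mathrm{Var}(w)$ is
$a$ / $a$ / $0$
$bx$ / $bE(x)$ / $b^2\,\mathrm{Var}(x)$
$x + a$ / $E(x) + a$ / $\mathrm{Var}(x)$
$a + bx$ / $a + bE(x)$ / $b^2\,\mathrm{Var}(x)$
B. The Mean and Variance of Sums of Random Variables.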
There are two important rules about sums of random variables. The first one seems to be intuitively obvious, the second one much less so.
1. The Mean.
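If $x_1, x_2, \ldots, x_n$ are random variables, then $E(x_1 + x_2 + \cdots + x_n) = E(x_1) + E(x_2) + \cdots + E(x_n)$.
For example, if $E(x_1) = \mu_1$ and $E(x_2) = \mu_2$, then $E(x_1 + x_2) = \mu_1 + \mu_2$.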
2. The Variance.
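If $x_1, x_2, \ldots, x_n$ are independent random variables, then $\mathrm{Var}(x_1 + x_2 + \cdots + x_n) = \mathrm{Var}(x_1) + \mathrm{Var}(x_2) + \cdots + \mathrm{Var}(x_n)$.
For example, if $\mathrm{Var}(x_1) = \sigma_1^2$ and $\mathrm{Var}(x_2) = \sigma_2^2$ and these variables are independent, then $\mathrm{Var}(x_1 + x_2) = \sigma_1^2 + \sigma_2^2$. This means, of course, that the standard deviation of the sum is $\sqrt{\sigma_1^2 + \sigma_2^2}$, and that you cannot add standard deviations.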
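A small numerical check makes the point about standard deviations concrete. The two distributions below are hypothetical (they do not come from the handout); the sketch builds the distribution of the sum under independence and compares variances and standard deviations.

```python
from itertools import product
from math import sqrt

# Two hypothetical independent random variables, given as {value: probability}.
x1 = {0: 0.5, 2: 0.5}
x2 = {0: 0.5, 4: 0.5}

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum(p * (v - m) ** 2 for v, p in dist.items())

# Under independence, P(x1 = u and x2 = v) = P(x1 = u) * P(x2 = v).
total = {}
for (u, pu), (v, pv) in product(x1.items(), x2.items()):
    total[u + v] = total.get(u + v, 0.0) + pu * pv

print("means:", mean(x1), mean(x2), mean(total))
print("variances:", variance(x1), variance(x2), variance(total))
print("sd of sum:", round(sqrt(variance(total)), 3))
print("sum of sds:", sqrt(variance(x1)) + sqrt(variance(x2)))
```

The variance of the sum equals the sum of the variances, but the standard deviation of the sum is smaller than the sum of the standard deviations.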
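C. Functions of Two Random Variables
Since these rules work for sample variances and covariances as well as population variances and covariances, we will use $\mathrm{Var}(x)$ in place of $\sigma_x^2$ or $s_x^2$, and $\mathrm{Cov}(x, y)$ in place of $\sigma_{xy}$ or $s_{xy}$.
Let us assume that we have two variables, $x$ and $y$, with known means $E(x)$ and $E(y)$, variances $\mathrm{Var}(x)$ and $\mathrm{Var}(y)$, and covariance $\mathrm{Cov}(x, y)$, so that $\mathrm{Corr}(x, y) = \frac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}}$.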
1. Linear Functions of Two Random Variables.
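Let us introduce two new variables, $w$ and $v$, so that $w = ax + b$, and $v = cy + d$, where $a$, $b$, $c$ and $d$ are constants. From the earlier part of this section we know the following:
$E(w) = aE(x) + b$, $E(v) = cE(y) + d$, $\mathrm{Var}(w) = a^2\,\mathrm{Var}(x)$ and $\mathrm{Var}(v) = c^2\,\mathrm{Var}(y)$.
To this we now add a new rule:
$\mathrm{Cov}(w, v) = \mathrm{Cov}(ax + b, cy + d) = ac\,\mathrm{Cov}(x, y)$.
To find the correlation between $w$ and $v$, recall that $\mathrm{Corr}(w, v) = \frac{\mathrm{Cov}(w, v)}{\sqrt{\mathrm{Var}(w)\,\mathrm{Var}(v)}}$. But since $\mathrm{Cov}(w, v) = ac\,\mathrm{Cov}(x, y)$ and $\sqrt{\mathrm{Var}(w)\,\mathrm{Var}(v)} = |a||c|\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}$, then
$\mathrm{Corr}(w, v) = \frac{ac}{|a||c|}\,\mathrm{Corr}(x, y) = \mathrm{sign}(ac)\,\mathrm{Corr}(x, y)$.
Note that, because the $ac$ in the numerator cancels the $|a||c|$ in the denominator, the only thing that $ac$ contributes to the result is its sign. If the product $ac$ is negative we reverse the sign of $\mathrm{Corr}(x, y)$. Thus $\mathrm{sign}(ac)$ takes the values $+1$ or $-1$.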
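With $w = ax + b$ and $v = cy + d$, the correlation between $w$ and $v$ can thus be found in two ways. First, we can compute $\mathrm{Cov}(w, v) = ac\,\mathrm{Cov}(x, y)$ from the covariance that we already know, remember that $\mathrm{Var}(w) = a^2\,\mathrm{Var}(x)$ and $\mathrm{Var}(v) = c^2\,\mathrm{Var}(y)$, and divide:
$\mathrm{Corr}(w, v) = \frac{\mathrm{Cov}(w, v)}{\sqrt{\mathrm{Var}(w)\,\mathrm{Var}(v)}}$ or, applying the rule directly, $\mathrm{Corr}(w, v) = \mathrm{sign}(ac)\,\mathrm{Corr}(x, y)$.
Again, remember that these rules hold for sample data too. That is,
$s_w^2 = a^2s_x^2$, $s_v^2 = c^2s_y^2$, $s_{wv} = ac\,s_{xy}$, and $r_{wv} = \mathrm{sign}(ac)\,r_{xy}$.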
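The rules for linear functions of two variables can be checked on the sample data from Section I. This is a sketch only: it assumes $w = ax + b$ and $v = cy + d$, and the constants $a = 2$, $b = 1$, $c = -3$, $d = 5$ are hypothetical values picked so that $ac$ is negative.

```python
from math import sqrt

# Sample data from Section I.
x = [7, 15, 2]
y = [3, 6, 9]

def sample_cov(u, v):
    """Sample covariance with divisor n - 1; sample_cov(u, u) is the variance."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / (n - 1)

def sample_corr(u, v):
    return sample_cov(u, v) / sqrt(sample_cov(u, u) * sample_cov(v, v))

a, b, c, d = 2, 1, -3, 5  # hypothetical constants, not from the handout
w = [a * xi + b for xi in x]
v = [c * yi + d for yi in y]
sign_ac = 1 if a * c > 0 else -1

print("Cov(w, v) directly:", sample_cov(w, v))
print("ac * Cov(x, y):    ", a * c * sample_cov(x, y))
print("Corr(w, v) directly:", round(sample_corr(w, v), 4))
print("sign(ac) * Corr(x, y):", round(sign_ac * sample_corr(x, y), 4))
```

The added constants $b$ and $d$ play no role in either result, and the multipliers affect the correlation only through the sign of their product.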
2. Sums of Random Variables.
The question now is what happens if we add together two random variables. We learned in Section B above that you can add means. This means that if $w = x + y$, the mean of $w$ will be the sum of the mean of $x$ and the mean of $y$.
Formally $E(x + y) = E(x) + E(y)$, or, since this also applies to samples, $\bar{w} = \bar{x} + \bar{y}$. For example, if $E(x) = \mu_x$ and $E(y) = \mu_y$, then $E(x + y) = \mu_x + \mu_y$.
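But the situation with variances is not so simple. We learned in Section B that we can add variances only if the variables are independent. If the variables are independent their covariance and correlation will both be zero. But if the covariance and correlation are not zero, we must take the value of the covariance into account. Often the covariance will not be available and we must compute $\sigma_{xy} = \rho_{xy}\sigma_x\sigma_y$. Then, if $w = x + y$, we can use the formula: $\sigma_w^2 = \mathrm{Var}(x + y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}$.
Or, if we are working with sample data: $s_w^2 = s_x^2 + s_y^2 + 2s_{xy}$, where $s_{xy} = r_{xy}s_xs_y$ if only the correlation is available.
The standard deviation of the sum is then the square root of this variance, $s_w = \sqrt{s_w^2}$.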
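The two-step calculation described above (covariance from the correlation, then the variance of the sum) can be sketched as a small function. The standard deviations 3 and 4 and the correlation 0.5 are hypothetical inputs chosen for illustration, not values from the handout.

```python
from math import sqrt

def variance_of_sum(sd_x, sd_y, corr):
    """Var(x + y) = Var(x) + Var(y) + 2 Cov(x, y), with Cov = corr * sd_x * sd_y."""
    cov = corr * sd_x * sd_y
    return sd_x ** 2 + sd_y ** 2 + 2 * cov

# Hypothetical inputs: standard deviations 3 and 4.
correlated = variance_of_sum(3, 4, 0.5)
uncorrelated = variance_of_sum(3, 4, 0.0)

print("correlation 0.5:", correlated, round(sqrt(correlated), 3))
print("correlation 0.0:", uncorrelated, sqrt(uncorrelated))
```

With zero correlation the covariance term vanishes and the variances simply add, which is the independent case from Section B.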
3. Sums of Functions of Random Variables
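By combining the information from the last two sections, we can look at the situation that occurs when we deal with a sum of functions of two random variables. To keep things simple, let $w = ax$, and $v = by$, so that $w + v = ax + by$. As far as the mean is concerned, we can say that $E(w + v) = E(w) + E(v)$. But $E(w) = aE(x)$ and $E(v) = bE(y)$, so that $E(ax + by) = aE(x) + bE(y)$. Alternately, for sample data, the mean of $ax + by$ is $a\bar{x} + b\bar{y}$.
Also $\mathrm{Var}(w + v) = \mathrm{Var}(w) + \mathrm{Var}(v) + 2\,\mathrm{Cov}(w, v)$. But since $w = ax$ and $v = by$, $\mathrm{Var}(w) = a^2\,\mathrm{Var}(x)$, $\mathrm{Var}(v) = b^2\,\mathrm{Var}(y)$, and $\mathrm{Cov}(w, v) = ab\,\mathrm{Cov}(x, y)$, we can write
$\mathrm{Var}(ax + by) = a^2\,\mathrm{Var}(x) + b^2\,\mathrm{Var}(y) + 2ab\,\mathrm{Cov}(x, y)$ or $\sigma_{ax+by}^2 = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\sigma_{xy}$.
To summarize then, $E(ax + by) = aE(x) + bE(y)$ and $\mathrm{Var}(ax + by) = a^2\,\mathrm{Var}(x) + b^2\,\mathrm{Var}(y) + 2ab\,\mathrm{Cov}(x, y)$.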
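As an illustration, the formulas for the mean and variance of $ax + by$ can be compared with a direct calculation on the sample data from Section I. The constants $a = 2$ and $b = -1$ are hypothetical; this is a sketch, not a reproduction of the handout's own example.

```python
# Sample data from Section I.
x = [7, 15, 2]
y = [3, 6, 9]

def sample_mean(u):
    return sum(u) / len(u)

def sample_cov(u, v):
    """Sample covariance with divisor n - 1; sample_cov(u, u) is the variance."""
    u_bar, v_bar = sample_mean(u), sample_mean(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / (len(u) - 1)

a, b = 2, -1  # hypothetical constants, not from the handout
combo = [a * xi + b * yi for xi, yi in zip(x, y)]

# Direct calculation on the combined observations.
mean_direct = sample_mean(combo)
var_direct = sample_cov(combo, combo)

# The same quantities from the formulas.
mean_formula = a * sample_mean(x) + b * sample_mean(y)
var_formula = (
    a ** 2 * sample_cov(x, x)
    + b ** 2 * sample_cov(y, y)
    + 2 * a * b * sample_cov(x, y)
)

print(mean_direct, mean_formula)
print(var_direct, var_formula)
```

Because $b$ is negative here and the covariance is also negative, the cross term $2ab\,\mathrm{Cov}(x, y)$ adds to the variance rather than reducing it.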
IV. APPLICATION TO PORTFOLIO ANALYSIS
We can now use the formulas above to find the mean and variance of a portfolio of stocks. To keep things simple, assume that we are offered only two stocks, and that the return of the first stock is $R_1$, while the return of the second stock is $R_2$. Let us assume that each stock sells for $1.00 a share, and that we have exactly $1 to invest. Since we can buy fractional shares, we divide our dollar into two parts, $P_1$ and $P_2$, where $P_1 + P_2 = 1$. Our total return is $R = P_1R_1 + P_2R_2$.
A. Mean Return
We know that $E(ax + by) = aE(x) + bE(y)$, so that $E(R) = P_1E(R_1) + P_2E(R_2)$. For example, if we split our money equally between two stocks, $P_1$ and $P_2$ equal .50. Then the expected return is $E(R) = 0.5E(R_1) + 0.5E(R_2)$.
B. Variance of the Return
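We know that $\mathrm{Var}(ax + by) = a^2\,\mathrm{Var}(x) + b^2\,\mathrm{Var}(y) + 2ab\,\mathrm{Cov}(x, y)$, so that
$\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\sigma_{12}$ is the variance of the return, where $\sigma_1^2$ and $\sigma_2^2$ are the variances of $R_1$ and $R_2$ and $\sigma_{12}$ is their covariance. Thus if $\sigma_{12} = \rho_{12}\sigma_1\sigma_2$, we can say
$\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\rho_{12}\sigma_1\sigma_2$.
For example, assume that $\sigma_1$ and $\sigma_2$ are known, but $\rho_{12}$ is unknown. Then the formula immediately above gives $\sigma_R^2$ as a linear function of $\rho_{12}$:
$\sigma_R^2 = (P_1^2\sigma_1^2 + P_2^2\sigma_2^2) + (2P_1P_2\sigma_1\sigma_2)\rho_{12}$. Now we can see the effect various values of $\rho_{12}$ will have on $\sigma_R^2$ and $\sigma_R$.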
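The effect of the correlation on portfolio risk can be tabulated with a few lines of code. In this sketch the expected returns (0.10 and 0.06), the standard deviations (0.3 and 0.2) and the equal split are all hypothetical inputs, and the function names are illustrative.

```python
from math import sqrt

def portfolio_mean(p1, mean1, mean2):
    """Expected return when a share p1 goes to stock 1 and 1 - p1 to stock 2."""
    return p1 * mean1 + (1 - p1) * mean2

def portfolio_variance(p1, sd1, sd2, corr):
    """Variance of the return, with the covariance written as corr * sd1 * sd2."""
    p2 = 1 - p1
    return p1 ** 2 * sd1 ** 2 + p2 ** 2 * sd2 ** 2 + 2 * p1 * p2 * corr * sd1 * sd2

# Hypothetical inputs, not taken from the handout.
p1, mean1, mean2, sd1, sd2 = 0.5, 0.10, 0.06, 0.3, 0.2

print("expected return:", round(portfolio_mean(p1, mean1, mean2), 4))
for corr in (-1.0, -0.5, 0.0, 0.5, 1.0):
    var = portfolio_variance(p1, sd1, sd2, corr)
    # max() guards against a tiny negative value caused by rounding error.
    print(corr, round(var, 4), round(sqrt(max(var, 0.0)), 4))
```

The expected return does not depend on the correlation at all, while the variance rises steadily as the correlation moves from -1 toward 1.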
C. Variance Minimization.
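The purpose of this section is to show how to find the minimum value for $\sigma_R^2$. Since variance is a measure of risk, minimizing variance minimizes risk, though actually, the best measure of risk is probably the coefficient of variation, the standard deviation divided by the mean, in this case $\frac{\sigma_R}{E(R)}$.
Remember that $\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\sigma_{12}$.
Also recall that, since $P_1$ and $P_2$ are shares of $1.00, $P_1 + P_2 = 1$, then $P_2 = 1 - P_1$.
Remember too, that $\sigma_{12} = \rho_{12}\sigma_1\sigma_2$. If we put all this together,
$\sigma_R^2 = P_1^2\sigma_1^2 + (1 - P_1)^2\sigma_2^2 + 2P_1(1 - P_1)\rho_{12}\sigma_1\sigma_2$.
Now let us assume some values for the standard deviations and the correlation and substitute them into this expression.
If we collect terms in $P_1$, we get, in general,
$\sigma_R^2 = (\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2)P_1^2 - 2(\sigma_2^2 - \rho_{12}\sigma_1\sigma_2)P_1 + \sigma_2^2$.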
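In order to minimize risk we pick our value of $P_1$ to give us a minimum variance. The way that we find this minimum variance is by taking the first derivative of $\sigma_R^2$ with respect to $P_1$ and setting it equal to zero. Here $\sigma_1$ and $\sigma_2$ are the standard deviations of the two returns and $\rho_{12}$ is their correlation.
Since $\frac{d\sigma_R^2}{dP_1} = 2(\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2)P_1 - 2(\sigma_2^2 - \rho_{12}\sigma_1\sigma_2)$, if we set the derivative equal to zero we get
$P_1 = \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}$, which for the values assumed above implies that $P_1 \approx 0.23$. Now since
$P_2 = 1 - P_1$, we set $P_2 \approx 0.77$. That is, to minimize risk, we put about 23% of our money in stock 1 and 77% in stock 2.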
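The minimization can also be sketched in code. The closed-form share below comes from setting the first derivative of the portfolio variance to zero; the inputs (standard deviations 0.3 and 0.2, correlation 0.25) are hypothetical and are not the values behind the 23% and 77% split above, so the resulting shares differ.

```python
def portfolio_variance(p1, sd1, sd2, corr):
    """Variance of the return with shares p1 and 1 - p1."""
    p2 = 1 - p1
    return p1 ** 2 * sd1 ** 2 + p2 ** 2 * sd2 ** 2 + 2 * p1 * p2 * corr * sd1 * sd2

def min_variance_share(sd1, sd2, corr):
    """Share of stock 1 at which the derivative of the variance is zero."""
    cov = corr * sd1 * sd2
    denom = sd1 ** 2 + sd2 ** 2 - 2 * cov
    if denom <= 0:
        # Happens only when corr = 1 and sd1 = sd2: every split has the same variance.
        raise ValueError("variance does not depend on the split")
    return (sd2 ** 2 - cov) / denom

# Hypothetical inputs, not taken from the handout.
sd1, sd2, corr = 0.3, 0.2, 0.25
p1 = min_variance_share(sd1, sd2, corr)

print("share in stock 1:", round(p1, 4), "share in stock 2:", round(1 - p1, 4))
print("minimum variance:", round(portfolio_variance(p1, sd1, sd2, corr), 5))

# Nearby splits should not do better than the minimizing one.
for trial in (p1 - 0.05, p1 + 0.05):
    print(round(trial, 2), round(portfolio_variance(trial, sd1, sd2, corr), 5))
```

Note that the formula does not restrict the share to lie between 0 and 1; for some inputs the unconstrained minimum would require a negative holding of one stock, a case this sketch does not handle.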