Statistics 510: Notes 22
Reading: Section 7.5.
I. Write-up of example from last class
Example 3: Let $X$ be a hypergeometric random variable – i.e., $X$ is the number of white balls drawn in $n$ random draws, without replacement, from an urn that originally consists of $m$ white balls and $N-m$ black balls. Find the variance of $X$.
Let $X_i = 1$ if the $i$th ball selected is white, and $X_i = 0$ otherwise.
Then $X = \sum_{i=1}^{n} X_i$ and
$\mathrm{Var}(X) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{Cov}(X_i, X_j)$.
To calculate $\mathrm{Var}(X_i)$, note $E[X_i] = E[X_i^2] = P(i\text{th ball is white}) = \frac{m}{N}$ and so $\mathrm{Var}(X_i) = \frac{m}{N} - \left(\frac{m}{N}\right)^2 = \frac{m}{N}\left(1 - \frac{m}{N}\right)$.
To calculate $\mathrm{Cov}(X_i, X_j)$, we use the formula $\mathrm{Cov}(X_i, X_j) = E[X_i X_j] - E[X_i]E[X_j]$. Note that $X_i X_j = 1$ if both the $i$th and $j$th balls are white, and 0 otherwise. Thus, $E[X_i X_j] = P(i\text{th and } j\text{th balls are white})$. By considering the sequence of experiments look at the $i$th ball, look at the $j$th ball, then look at the 1st ball, look at the 2nd ball, ..., look at the $(i-1)$th ball, look at the $(i+1)$th ball, ..., look at the $(j-1)$th ball, look at the $(j+1)$th ball, ..., look at the $n$th ball, we see that
$P(i\text{th and } j\text{th balls are white}) = \frac{m}{N} \cdot \frac{m-1}{N-1}$.
Thus,
$\mathrm{Cov}(X_i, X_j) = \frac{m(m-1)}{N(N-1)} - \left(\frac{m}{N}\right)^2 = -\frac{m(N-m)}{N^2(N-1)}$
and
$\mathrm{Var}(X) = n\,\frac{m}{N}\left(1 - \frac{m}{N}\right) + 2\binom{n}{2}\left(-\frac{m(N-m)}{N^2(N-1)}\right) = n\,\frac{m}{N}\left(1 - \frac{m}{N}\right)\left(1 - \frac{n-1}{N-1}\right)$.
Note that the variance of a binomial random variable with $n$ trials and probability of success $p = m/N$ for each trial is $np(1-p)$, so the variance for the hypergeometric is smaller by a factor of $1 - \frac{n-1}{N-1}$; this is due to the negative covariance between $X_i$ and $X_j$ for the hypergeometric.
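As a quick numerical check of this formula (the parameter values below are arbitrary illustrations), one can compare it against scipy.stats.hypergeom:

    from scipy import stats

    N, m, n = 20, 7, 5  # total balls, white balls, number of draws (illustrative values)
    p = m / N

    # Variance from the formula derived above
    formula_var = n * p * (1 - p) * (1 - (n - 1) / (N - 1))

    # scipy parameterizes hypergeom as (M = population size, n = # of successes, N = sample size)
    scipy_var = stats.hypergeom(M=N, n=m, N=n).var()

    print(formula_var, scipy_var)  # both print approximately 0.8980 -- they agree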
II. Correlation (Chapter 7.4)
Correlation: The magnitude of the covariance depends on the variance of X and the variance of Y. A dimensionless measure of the relationship between X and Y is the correlation $\rho(X, Y)$:
$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$.
The correlation is always between $-1$ and $1$. If X and Y are independent, then $\rho(X, Y) = 0$, but the converse is not true.
Generally, the correlation is a measure of the degree of linear dependence between X and Y.
Note that for $a, b > 0$,
$\rho(aX, bY) = \rho(X, Y)$
(this is what is meant by saying that the correlation is dimensionless – if X and Y are measured in certain units, and the units are changed so that X becomes aX and Y becomes bY, the correlation is not changed).
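This follows from one line of algebra, using $\mathrm{Cov}(aX, bY) = ab\,\mathrm{Cov}(X, Y)$ and $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$:
$\rho(aX, bY) = \frac{ab\,\mathrm{Cov}(X, Y)}{\sqrt{a^2\,\mathrm{Var}(X)\, b^2\,\mathrm{Var}(Y)}} = \frac{ab}{|a||b|}\,\rho(X, Y) = \rho(X, Y) \quad \text{for } a, b > 0$.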
Example 1: From the Excel file stockdata.xls, we find the correlations of the monthly log stock returns, in percentages, of Merck & Company, Johnson & Johnson, General Electric, General Motors and Ford Motor Company from January 1990 to December 1999. [Correlation matrix not reproduced in these notes.]
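A correlation matrix like this can be computed in a few lines of pandas. This sketch assumes each column of stockdata.xls already holds one company's monthly log returns in percent; the actual file layout may differ:

    import numpy as np
    import pandas as pd

    # Assumption: one column of monthly log returns (in percent) per company.
    # If the file holds prices instead, compute returns first with
    # 100 * np.log(prices).diff().
    returns = pd.read_excel("stockdata.xls")

    print(returns.corr())  # pairwise sample correlation matrix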
III. Conditional Expectation (Chapter 7.5)
Recall that if X and Y are jointly discrete random variables, the conditional probability mass function of X, given that $Y = y$, is defined for all y such that $P(Y = y) > 0$ by
$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{p(x, y)}{p_Y(y)}$.
It is natural to define, in this case, the conditional expectation of X, given that $Y = y$, for all values of y such that $p_Y(y) > 0$, by
$E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x \mid y)$.
The conditional expectation of X, given that $Y = y$, represents the long-run mean value of X in many independent repetitions of experiments in which $Y = y$ is observed.
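To illustrate this long-run interpretation, here is a minimal simulation sketch (the coin-flip setup is an illustration, not from the notes): let X be the number of heads in two fair coin flips and Y the result of the first flip, so that $E[X \mid Y = 1] = 1.5$.

    import numpy as np

    rng = np.random.default_rng(0)
    flips = rng.integers(0, 2, size=(100_000, 2))  # many repetitions of two fair coin flips
    X = flips.sum(axis=1)  # X = total number of heads
    Y = flips[:, 0]        # Y = result of the first flip

    # Average X over only those repetitions in which Y = 1:
    print(X[Y == 1].mean())  # converges to E[X | Y = 1] = 1.5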
Example 2: Suppose that events occur according to a Poisson process with rate $\lambda$, i.e.,
(a) the probability of exactly one event occurring in a given small time period of length $h$ is approximately $\lambda h$;
(b) the probability of two or more events occurring in a given small time period of length $h$ is much smaller than $\lambda h$;
(c) the numbers of events occurring in two non-overlapping time periods are independent.
Let N be the number of events occurring in the time period $[0, 1]$. For $0 < t < 1$, let X be the number of events occurring in the time period $[0, t]$. Find the conditional probability mass function and the conditional expectation of X given that $N = n$.
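A sketch of the solution: by (c), the counts in $[0, t]$ and $(t, 1]$ are independent Poisson random variables with means $\lambda t$ and $\lambda(1-t)$, so for $0 \le k \le n$,
$P(X = k \mid N = n) = \frac{P(X = k)\,P(N - X = n - k)}{P(N = n)} = \frac{e^{-\lambda t}\frac{(\lambda t)^k}{k!}\; e^{-\lambda(1-t)}\frac{(\lambda(1-t))^{n-k}}{(n-k)!}}{e^{-\lambda}\frac{\lambda^n}{n!}} = \binom{n}{k} t^k (1-t)^{n-k}.$
That is, $X \mid N = n$ is binomial with parameters $n$ and $t$, so $E[X \mid N = n] = nt$.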
Conditional expectation for continuous random variables:
If X and Y are continuous random variables, the conditional probability density function of X, given that $Y = y$, is defined for all values of y such that $f_Y(y) > 0$ by
$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}$.
It is natural, in this case, to define the conditional expectation of X, given that $Y = y$, by
$E[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\, dx$,
provided that $f_Y(y) > 0$.
Example 3: Let X and Y have the joint pdf $f(x, y)$. [The specific density was given in class but is not reproduced in these notes.]
Find the conditional expectation $E[X \mid Y = y]$.
IV. Computing Expectations and Probabilities by Conditioning (Sections 7.5.2-7.5.3)
Let us denote by $E[X \mid Y]$ that function of the random variable $Y$ whose value at $Y = y$ is $E[X \mid Y = y]$. Note that $E[X \mid Y]$ is itself a random variable. An important property of conditional expectations is the following proposition:
Proposition 7.5.1:
(1) $E[X] = E\big[E[X \mid Y]\big]$
If Y is a discrete random variable, then equation (1) states that
(2) $E[X] = \sum_y E[X \mid Y = y]\, P(Y = y)$
If Y is a continuous random variable with density $f_Y(y)$, then equation (1) states that
$E[X] = \int_{-\infty}^{\infty} E[X \mid Y = y]\, f_Y(y)\, dy$
One way to understand equation (2) is to interpret it as follows: To calculate $E[X]$, we may take a weighted average of the conditional expected value of X, given that $Y = y$, each of the terms being weighted by the probability of the event on which it is conditioned. Equation (1) is a “law of total expectation” that is analogous to the law of total probability (Section 3.3, notes 6).
Example 4: A miner is trapped in a mine containing 3 doors. The first door leads to a tunnel that will take him to safety after 3 hours of travel. The second door leads to a tunnel that will return him to the mine after 5 hours of travel. The third door leads to a tunnel that will return him to the mine after 7 hours. If we assume that the miner is at all times equally likely to choose any one of the doors, what is the expected length of time until he reaches safety?
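A sketch of the solution by conditioning: let X be the time until the miner reaches safety and let Y be the door he initially chooses. By equation (2),
$E[X] = \frac{1}{3}(3) + \frac{1}{3}\big(5 + E[X]\big) + \frac{1}{3}\big(7 + E[X]\big) = 5 + \frac{2}{3}E[X]$,
so $E[X] = 15$ hours. (The key step is that after returning to the mine, the problem starts afresh, so the expected additional time to safety is again $E[X]$.)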
Example 5: A random rectangle is formed in the following way: The base, X, is a uniform $[0, 1]$ random variable and, after having generated the base, the height is chosen to be uniform on $[0, X]$. Find the expected area of the rectangle.
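A sketch of the computation using Proposition 7.5.1: the area is $A = XY$, where $Y \mid X = x$ is uniform on $[0, x]$, so $E[Y \mid X] = X/2$ and
$E[A] = E\big[E[XY \mid X]\big] = E\big[X \cdot E[Y \mid X]\big] = E\left[\frac{X^2}{2}\right] = \frac{1}{2}\int_0^1 x^2\, dx = \frac{1}{6}$.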
V. Conditional Variance (Section 7.5.4)
The conditional variance of $X$, given that $Y = y$, is the expected squared difference between the random variable X and its conditional mean, conditioning on the event that $Y = y$:
$\mathrm{Var}(X \mid Y = y) = E\big[(X - E[X \mid Y = y])^2 \mid Y = y\big]$.
As with $E[X \mid Y]$, we write $\mathrm{Var}(X \mid Y)$ for this quantity viewed as a function of the random variable $Y$.
There is a very useful formula for the variance of a random variable X in terms of the conditional mean and conditional variance of $X$ given $Y$:
Proposition 7.5.2:
$\mathrm{Var}(X) = E\big[\mathrm{Var}(X \mid Y)\big] + \mathrm{Var}\big(E[X \mid Y]\big)$.
Proof:
By the same reasoning that yields $\mathrm{Var}(X) = E[X^2] - (E[X])^2$, we have that
$\mathrm{Var}(X \mid Y) = E[X^2 \mid Y] - (E[X \mid Y])^2$. Thus,
(3) $E\big[\mathrm{Var}(X \mid Y)\big] = E\big[E[X^2 \mid Y]\big] - E\big[(E[X \mid Y])^2\big] = E[X^2] - E\big[(E[X \mid Y])^2\big]$
(using equation (1) for the last step). Also, as $E\big[E[X \mid Y]\big] = E[X]$, we have that
(4) $\mathrm{Var}\big(E[X \mid Y]\big) = E\big[(E[X \mid Y])^2\big] - (E[X])^2$
Hence, by adding Equations (3) and (4), we have that
$E\big[\mathrm{Var}(X \mid Y)\big] + \mathrm{Var}\big(E[X \mid Y]\big) = E[X^2] - (E[X])^2 = \mathrm{Var}(X)$.
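As a check on Proposition 7.5.2, here is a minimal simulation sketch using the random rectangle of Example 5, conditioning on the base X (so that $\mathrm{Var}(A \mid X) = X^4/12$ and $E[A \mid X] = X^2/2$):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.uniform(0, 1, size=1_000_000)  # base of the rectangle (Example 5)
    Y = rng.uniform(0, X)                  # height, uniform on [0, X] given X
    A = X * Y                              # area

    # Direct simulation estimate of Var(A)
    print(A.var())  # approximately 7/180 = 0.0389

    # Law of total variance, conditioning on X:
    #   Var(A | X) = X^2 * Var(Y | X) = X^4 / 12,   E[A | X] = X^2 / 2
    print((X**4 / 12).mean() + (X**2 / 2).var())  # also approximately 7/180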
Example 6: Suppose that by any time t, the number of people that have arrived at a train depot is a Poisson random variable with mean $\lambda t$. If the initial train arrives at the depot at a time (independent of when the passengers arrive) that is uniformly distributed over $(0, T)$, what are the mean and variance of the number of passengers that enter the train?
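A sketch of the solution using Propositions 7.5.1 and 7.5.2: let $Y \sim \text{Uniform}(0, T)$ be the train's arrival time and $N(Y)$ the number of passengers that enter. Given $Y = t$, the count is Poisson with mean $\lambda t$, so $E[N(Y) \mid Y] = \lambda Y$ and $\mathrm{Var}(N(Y) \mid Y) = \lambda Y$. Hence
$E[N(Y)] = E[\lambda Y] = \frac{\lambda T}{2}$
and
$\mathrm{Var}(N(Y)) = E\big[\mathrm{Var}(N(Y) \mid Y)\big] + \mathrm{Var}\big(E[N(Y) \mid Y]\big) = E[\lambda Y] + \mathrm{Var}(\lambda Y) = \frac{\lambda T}{2} + \frac{\lambda^2 T^2}{12}$.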