AMS 572 Lecture Notes

Sept. 26, 2006.

M.G.F.: If X is a continuous R.V. with pdf $f(x)$, the M.G.F. of X is

$$M_X(t) = E\left(e^{tX}\right) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx,$$

where $t$ is a variable taking values in a small neighborhood of 0.

Note: if X is a discrete R.V. with pdf $p(x)$ (also called pmf, probability mass function, for discrete R.V.'s), then $M_X(t) = E\left(e^{tX}\right) = \sum_{x} e^{tx} p(x)$.

Theorem. If $X_1, X_2, \ldots, X_n$ are independent, then $M_{X_1 + \cdots + X_n}(t) = \prod_{i=1}^{n} M_{X_i}(t)$.

Proof. $X_1, \ldots, X_n$ are independent iff (if and only if) the joint pdf factors as $f(x_1, \ldots, x_n) = f_1(x_1) \cdots f_n(x_n)$. Therefore, when $X_1, \ldots, X_n$ are continuous,

$$M_{X_1+\cdots+X_n}(t) = E\left(e^{t(X_1+\cdots+X_n)}\right) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{t x_1} \cdots e^{t x_n}\, f_1(x_1) \cdots f_n(x_n)\, dx_1 \cdots dx_n = \prod_{i=1}^{n} \int_{-\infty}^{\infty} e^{t x_i} f_i(x_i)\, dx_i = \prod_{i=1}^{n} M_{X_i}(t).$$

Note: When $X_1, \ldots, X_n$ are discrete, they are independent iff $p(x_1, \ldots, x_n) = p_1(x_1) \cdots p_n(x_n)$. The above proof holds with the integration signs replaced by summation signs.

Note: There is a 1-1 correspondence between the pdf and the mgf. That is, $M_X(t) = M_Y(t)$ for all $t$ in a neighborhood of 0 iff X and Y have the same distribution.

Note: If $(X, Y)$ follow the bivariate normal distribution, then X and Y are independent iff $\rho = 0$ (equivalently, $\operatorname{Cov}(X, Y) = 0$).

Theorem. If $X \sim N(\mu, \sigma^2)$, then $M_X(t) = e^{\mu t + \sigma^2 t^2/2}$.

Proof:

$$M_X(t) = E\left(e^{tX}\right) = \int_{-\infty}^{\infty} e^{tx}\,\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = e^{\mu t + \sigma^2 t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{\left(x - (\mu + \sigma^2 t)\right)^2}{2\sigma^2}}\,dx = e^{\mu t + \sigma^2 t^2/2},$$

where the second equality follows by completing the square in the exponent, and the remaining integrand is the pdf of $N(\mu + \sigma^2 t, \sigma^2)$, which integrates to 1.
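As a quick numerical sanity check of this formula, the sketch below compares the integral $E(e^{tX})$ with the closed form $e^{\mu t + \sigma^2 t^2/2}$ using SciPy; the parameter values $\mu = 1$, $\sigma = 2$ are chosen arbitrarily for illustration.

```python
# Numerical sanity check (illustration only): compare E(e^{tX}) computed by
# integration with the closed form exp(mu*t + sigma^2*t^2/2).
import numpy as np
from scipy import integrate, stats

mu, sigma = 1.0, 2.0                      # arbitrary example parameters
for t in (0.1, 0.5, 1.0):
    integral, _ = integrate.quad(
        lambda x: np.exp(t * x) * stats.norm.pdf(x, loc=mu, scale=sigma),
        -np.inf, np.inf)
    closed_form = np.exp(mu * t + sigma ** 2 * t ** 2 / 2)
    print(t, integral, closed_form)       # the last two columns should agree
```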

Example. $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent; what is the distribution of $X_1 + X_2$? Prove it.

Answer: By the two theorems above, $M_{X_1+X_2}(t) = M_{X_1}(t)\, M_{X_2}(t) = e^{\mu_1 t + \sigma_1^2 t^2/2}\, e^{\mu_2 t + \sigma_2^2 t^2/2} = e^{(\mu_1+\mu_2)t + (\sigma_1^2+\sigma_2^2)t^2/2}$, which is the mgf of $N(\mu_1+\mu_2,\ \sigma_1^2+\sigma_2^2)$. By the 1-1 correspondence between the mgf and the distribution, $X_1 + X_2 \sim N(\mu_1+\mu_2,\ \sigma_1^2+\sigma_2^2)$.
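A short Monte Carlo cross-check of this result is sketched below; the parameter values ($\mu_1 = 1$, $\sigma_1^2 = 4$, $\mu_2 = 2$, $\sigma_2^2 = 9$) are chosen only for illustration.

```python
# Monte Carlo cross-check (illustration only): with X1 ~ N(1, 4) and X2 ~ N(2, 9)
# independent, the result above says X1 + X2 ~ N(3, 13).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x1 = rng.normal(1.0, 2.0, size=n)   # N(1, 4): standard deviation 2
x2 = rng.normal(2.0, 3.0, size=n)   # N(2, 9): standard deviation 3
s = x1 + x2
print(s.mean(), s.var())            # should be close to 3 and 13
```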

Note: if $X_1, \ldots, X_n$ are independent, then $e^{tX_1}, \ldots, e^{tX_n}$ are independent (this fact was used in the proof above, and is justified by the following theorem).

Theorem. If X and Y are independent, then $g(X)$ and $h(Y)$ are independent, where $g$ and $h$ are any real-valued functions.

Two important statistical inference techniques:

(1) point estimator

(2) confidence interval

Example: I am 95% sure that the average height of adult U.S. males is between 5'7" and 5'8"; that is, a 95% CI for $\mu$ is [5'7", 5'8"].

The 95% CI for $\mu$ is $\left[\bar{X} - z_{0.025}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{0.025}\,\frac{\sigma}{\sqrt{n}}\right]$.

General case: the 100(1-α)% CI for $\mu$ is $\left[\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$.

Sept. 28, 2006.

The joint moment generating function for random variables X and Y is

$$M_{X,Y}(t_1, t_2) = E\left(e^{t_1 X + t_2 Y}\right) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{t_1 x + t_2 y}\, f(x, y)\,dx\,dy$$

if X and Y are continuous, and

$$M_{X,Y}(t_1, t_2) = E\left(e^{t_1 X + t_2 Y}\right) = \sum_{x}\sum_{y} e^{t_1 x + t_2 y}\, p(x, y)$$

if X and Y are discrete.

Theorem. X and Y are independent

iff $f(x, y) = f_X(x)\, f_Y(y)$ for all $x, y$,

iff $M_{X,Y}(t_1, t_2) = M_X(t_1)\, M_Y(t_2)$ for all $t_1, t_2$.

This holds for virtually all distributions.

Theorem. If X and Y are independent, then $\operatorname{Cov}(X, Y) = 0$ (equivalently, $E(XY) = E(X)E(Y)$). This holds for virtually any distribution.

Theorem. X and Y are independent iff $\operatorname{Cov}(X, Y) = 0$ (equivalently, $\rho = 0$).

This holds if the joint distribution of X and Y is bivariate normal. That is, if

$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left(\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},\ \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}\right),$$

then X and Y are independent iff $\rho = 0$.

Note: The notation and mgf for the multivariate normal distribution are $\mathbf{X} \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, and

$$M_{\mathbf{X}}(\mathbf{t}) = E\left(e^{\mathbf{t}'\mathbf{X}}\right) = \exp\left(\mathbf{t}'\boldsymbol{\mu} + \tfrac{1}{2}\,\mathbf{t}'\boldsymbol{\Sigma}\,\mathbf{t}\right).$$

Theorem. If X and Y are independent, then $g(X)$ and $h(Y)$ are independent, for any real-valued functions $g$ and $h$.

For example, if X and Y are independent, then X and 2Y are independent; likewise, functions such as $X^2$ and $e^Y$ are also independent.

c.d.f.: the cumulative distribution function for a R.V. X is defined as $F(x) = P(X \le x)$.

The p.d.f. of a continuous R.V. X is $f(x) = F'(x) = \frac{dF(x)}{dx}$.

Thus we have $F(x) = \int_{-\infty}^{x} f(t)\,dt$ and $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx = F(b) - F(a)$.

Note: When X is continuous, we have $P(X = a) = 0$ for any real number a. Therefore, we have $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b) = F(b) - F(a)$.
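The following small sketch illustrates these relations numerically for the standard normal distribution, using SciPy: the area under the pdf between $a$ and $b$ equals $F(b) - F(a)$.

```python
# Illustration: for a continuous r.v., P(a <= X <= b) equals the area under the
# pdf between a and b, which equals F(b) - F(a); standard normal as an example.
from scipy import integrate, stats

a, b = -1.0, 1.5
area, _ = integrate.quad(stats.norm.pdf, a, b)        # integral of the pdf over [a, b]
print(area, stats.norm.cdf(b) - stats.norm.cdf(a))    # the two values agree
```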

Confidence interval for the population mean $\mu$, when the population is normal and the population variance $\sigma^2$ is known.

What we have: a random sample of size n, $X_1, X_2, \ldots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$.

Point estimator: $\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

Theorem: The 100(1-α)% CI for $\mu$ is $\left[\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$.

Note 1: This means that $P\left(\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$. That is, the above interval has probability 1-α of covering the population mean $\mu$.

Note 2: The most popular CI is the 95% CI for $\mu$: $\bar{X} \pm z_{0.025}\,\frac{\sigma}{\sqrt{n}} = \bar{X} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$. We can find $z_{0.025} = 1.96$ from the standard normal probability table.

Note 3: In real life, you will have only one value of $\bar{X}$. For example, $\bar{x} = 5'7'' = 67$ in, $\sigma = 4$ in, $n = 100$. The 95% CI for $\mu$ is $67 \pm 1.96 \times \frac{4}{\sqrt{100}} = 67 \pm 0.784$, i.e., approximately $[66.2,\ 67.8]$ inches.
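A minimal sketch of this calculation, assuming the numbers above ($\bar{x} = 67$ in, $\sigma = 4$ in, $n = 100$) and using SciPy for the normal quantile:

```python
# Sketch of the Note 3 calculation (sigma assumed known).
from scipy import stats

xbar, sigma, n, alpha = 67.0, 4.0, 100, 0.05
z = stats.norm.ppf(1 - alpha / 2)            # z_{alpha/2}, approximately 1.96
half_width = z * sigma / n ** 0.5
print(xbar - half_width, xbar + half_width)  # approximately (66.22, 67.78) inches
```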

Theorem. If $X \sim N(\mu, \sigma^2)$, and $Y = a + bX$, where a and b are constants, then $Y \sim N(a + b\mu,\ b^2\sigma^2)$.

Proof:

$$M_Y(t) = E\left(e^{t(a+bX)}\right) = e^{at}\, E\left(e^{(bt)X}\right) = e^{at}\, M_X(bt) = e^{at}\, e^{\mu b t + \sigma^2 b^2 t^2/2} = e^{(a + b\mu)t + b^2\sigma^2 t^2/2},$$

which is the mgf of $N(a + b\mu,\ b^2\sigma^2)$. By the 1-1 correspondence between the mgf and the distribution, $Y \sim N(a + b\mu,\ b^2\sigma^2)$.

Example. If $X \sim N(-2, 4)$, $Y \sim N(3, 7)$, and X and Y are independent, what is the distribution of $3X - 2Y$?

Solution: Apply the above theorem twice: $3X \sim N(-6, 36)$ (with $a = 0$, $b = 3$) and $-2Y \sim N(-6, 28)$ (with $a = 0$, $b = -2$). Since 3X and -2Y are independent, their sum is normal with means and variances added, so $3X - 2Y \sim N(-12, 64)$. You can also prove this directly using the mgf.
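A short simulation that cross-checks the answer $3X - 2Y \sim N(-12, 64)$, included here purely as an illustration:

```python
# Monte Carlo cross-check (illustration only): X ~ N(-2, 4), Y ~ N(3, 7) independent,
# so 3X - 2Y should have mean -12 and variance 64.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.normal(-2.0, 2.0, size=n)        # variance 4  -> standard deviation 2
y = rng.normal(3.0, 7.0 ** 0.5, size=n)  # variance 7  -> standard deviation sqrt(7)
w = 3 * x - 2 * y
print(w.mean(), w.var())                 # should be close to -12 and 64
```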

Theorem. If $X_1, \ldots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$, then $\bar{X} \sim N\left(\mu,\ \frac{\sigma^2}{n}\right)$.


Now we are ready to present the proof of the Theorem: The 100(1-α)% CI for $\mu$ is $\left[\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$.

Proof: Since $\bar{X} \sim N(\mu, \sigma^2/n)$, we have $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. Therefore

$$1 - \alpha = P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = P\left(\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right).$$

Therefore $\left[\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$ is a 100(1-α)% CI for $\mu$, by the definition of the confidence interval.

Derivation of the 100(1-α)% CI for $\mu$:

In the above proof, we assumed we already knew the form of the CI. In reality, we often have to derive the CI ourselves. As illustrated in Figure 1, we start with the distribution of the point estimator, $\bar{X} \sim N(\mu, \sigma^2/n)$, of the model parameter of interest, $\mu$.

From the Z-score transformation, $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$, we know that when the left tail area is $\alpha/2$, the corresponding Z-value is $-z_{\alpha/2}$. Therefore we can derive the $\bar{X}$-value, $\bar{X}_L$, such that the tail area to its left is $\alpha/2$:

$$\frac{\bar{X}_L - \mu}{\sigma/\sqrt{n}} = -z_{\alpha/2} \;\Longrightarrow\; \bar{X}_L = \mu - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$

Similarly, we can derive the $\bar{X}$-value, $\bar{X}_U$, such that the tail area to its right is $\alpha/2$:

$$\frac{\bar{X}_U - \mu}{\sigma/\sqrt{n}} = z_{\alpha/2} \;\Longrightarrow\; \bar{X}_U = \mu + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$

Now according to Figure 1, we have

$$1 - \alpha = P\left(\mu - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) = P\left(\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right).$$

Thus, by the definition of the CI, $\left[\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$ is a 100(1-α)% CI for $\mu$.

Figure 1. Illustration of the derivation of the 100(1-α)% CI for $\mu$.
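To see the coverage statement in action, the following small simulation repeatedly draws samples from a normal population with known $\sigma$ and counts how often the interval covers $\mu$; the fraction should be close to $1 - \alpha = 0.95$. The population values below ($\mu = 67$, $\sigma = 4$, $n = 100$) reuse the numbers from Note 3 purely as an illustration.

```python
# Coverage simulation (illustration only): repeat the sampling many times and count
# how often the interval xbar +/- z_{alpha/2} * sigma / sqrt(n) contains mu.
import numpy as np
from scipy import stats

mu, sigma, n, alpha, reps = 67.0, 4.0, 100, 0.05, 10_000
z = stats.norm.ppf(1 - alpha / 2)
rng = np.random.default_rng(2)

covered = 0
for _ in range(reps):
    xbar = rng.normal(mu, sigma, size=n).mean()
    half_width = z * sigma / n ** 0.5
    covered += xbar - half_width <= mu <= xbar + half_width
print(covered / reps)                    # should be close to 0.95
```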