Lecture #3 (9/8/04)

Random Vectors

When more than two random variables are of interest, the random vector X = [X1, …, Xm] can be formed.

Joint PDF:

X(x) (x1, …, xm) =

The vector elements are independent if and only if:

fX1,…,Xm(x1, …, xm) = fX1(x1) ⋯ fXm(xm)

Generalized Bayes's Theorem:

f(x | y) = f(x, y) / f(y)
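As a quick numerical illustration (a sketch, not part of the formal development; the 2×2 table is an arbitrary choice, built here to be independent), both the factorization and Bayes's theorem can be checked on a discrete joint distribution:

```python
import numpy as np

# Hypothetical 2x2 joint PMF of (X, Y); the values are an arbitrary choice.
p_xy = np.array([[0.10, 0.30],
                 [0.15, 0.45]])          # rows: x in {0,1}, cols: y in {0,1}

p_x = p_xy.sum(axis=1)                   # marginal of X
p_y = p_xy.sum(axis=0)                   # marginal of Y

# Independence check: does p(x, y) = p(x) p(y) hold in every cell?
print(np.allclose(p_xy, np.outer(p_x, p_y)))   # True for this table

# Bayes's theorem: p(x | y) = p(x, y) / p(y); each column must sum to 1.
p_x_given_y = p_xy / p_y
print(p_x_given_y.sum(axis=0))           # [1. 1.]
```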

Transformation of Random Variables:

  • Function of one Random Variable: Suppose that X is a random variable and g(x) is a function of the real variable x, and form the new random variable:

Y = g(X)

The distribution function (CDF) FY(y) of the Y random variable is:

FY(y) = P{Y ≤ y} = P{g(X) ≤ y}

If we denote the real roots of the equation y = g(x) by x1, x2, …, xn,

y = g(x1) = g(x2) = … = g(xn)

then fY(y) (the probability density function) for a specific y is given by

fY(y) = Σi fX(xi) / |g′(xi)|,  the sum extending over all real roots xi
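For instance (a standard worked case, added for illustration), take Y = X². For y > 0 the equation y = x² has the two real roots x1,2 = ±√y, and |g′(x)| = 2|x| = 2√y, so

fY(y) = [fX(√y) + fX(−√y)] / (2√y),  y > 0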

If a monotonic, one-to-one relationship exists between the random variables being transformed (e.g., Y = ln X; Y = a + bX; etc.), then for a specific y there is a single root x = g⁻¹(y), and the PDF of Y is:

fY(y) = fX(g⁻¹(y)) · |dg⁻¹(y)/dy| = fX(x) / |g′(x)|,  with x = g⁻¹(y)

Examples:

1) Y = ln X  (y = ln x and x = e^y)

fY(y) = fX(e^y) · e^y,  −∞ < y < +∞  (X must be a positive random variable)

2) Y = aX + b  (y = ax + b and x = (y − b)/a)

fY(y) = fX((y − b)/a) · 1/|a|

  • Function of two Random Variables: Suppose that Z is a random variable formed as a function g(X, Y) of the two random variables X and Y:

Z = g(X, Y)

Its CDF is FZ(z) = P{g(X, Y) ≤ z} = ∬ fXY(x, y) dx dy, the integration extending over the region g(x, y) ≤ z. If the variables X and Y are independent, the joint PDF factors, fXY(x, y) = fX(x) fY(y), so that:

FZ(z) = ∬_{g(x,y)≤z} fX(x) fY(y) dx dy
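For the common special case Z = X + Y with X and Y independent (a standard result, added here for completeness), differentiating the CDF gives the convolution:

fZ(z) = ∫_{−∞}^{+∞} fX(x) fY(z − x) dx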

Trivia#4

Shrewd Prisoner’s Dilemma

Because of a prisoner’s constant supplication, the King grants him this favor: he is given 2N balls which differ from each other only in that half of them are green and half are red. The King instructs the prisoner to divide the balls between two identical urns (the division need not be equal). One of the urns will then be selected at random, and the prisoner will be asked to draw a ball at random from the chosen urn. If the ball turns out to be green, he will be freed; otherwise (if it is red), he will be fed to the crocodiles. How should he distribute the balls between the urns to maximize his chances of freedom?

Comment: Just attempt to propose the solution. The full solution is somewhat long-drawn mathematically and should not be attempted unless one is bored. Suggestion: a computer solution may in fact be easier.
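In the spirit of that suggestion, a brute-force search is short (a sketch; N = 10 is an arbitrary choice, and treating an empty urn as a certain loss is an assumption):

```python
# Brute-force search over every split of N green and N red balls between
# the two urns; reports the best split found.

def p_green(g1, r1, N):
    """P(drawing green) if urn 1 holds g1 green and r1 red; urn 2 gets the rest."""
    g2, r2 = N - g1, N - r1
    p1 = g1 / (g1 + r1) if g1 + r1 else 0.0   # empty urn: no green ball
    p2 = g2 / (g2 + r2) if g2 + r2 else 0.0
    return 0.5 * p1 + 0.5 * p2                # urns chosen with equal probability

N = 10
best = max(((p_green(g1, r1, N), g1, r1)
            for g1 in range(N + 1) for r1 in range(N + 1)),
           key=lambda t: t[0])
print(best)   # (best probability, greens in urn 1, reds in urn 1)
```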

Expectation and Moments

Expected Value of a Nonlinear Function g(X)

  • For continuous random variables: E[g(X)] = ∫_{−∞}^{+∞} g(x) fX(x) dx, with E[|g(X)|] finite (so that the integral exists).
  • For discrete random variables: E[g(X)] = Σi g(xi) P[X = xi]

Note: The Expectation operator is a linear operator, i.e., E[aX+b] = aE[X] + b

Moments of Random Variables

If g(X) = X^n, one obtains the moments of the random variable X; with g(X) = (X − m)^n one obtains the central moments.

Thus, denoting by m the expected value of X, m = E[X], one obtains:

n=1E[X-m] = 0

n=2E[(X-m)2] = X2 = E[X2] - m2

If m0, the coefficient of variation VX defines the degree of uncertainty: VX = /m

Coefficient of Skewness (1):1 =

Coefficient of Kyrtosis or Flatness (2):2 =

Other characteristic values are the median and the mode of a random variable. These are defined below.

Median: Value of x corresponding to F(x) = 0.5

Mode: Value of x for which PDF is maximum.
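The moment-based measures above (VX, γ1, γ2) are easy to check numerically; the sketch below uses an Exponential(λ = 1) sample (an arbitrary choice), for which the theoretical values are VX = 1, γ1 = 2 and γ2 = 9:

```python
import numpy as np

# Sample-moment check of the definitions above for an Exponential(1) sample.
rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)

m = x.mean()
sigma = x.std()
V = sigma / m                            # coefficient of variation (theory: 1)
g1 = np.mean((x - m) ** 3) / sigma ** 3  # skewness (theory: 2)
g2 = np.mean((x - m) ** 4) / sigma ** 4  # kurtosis/flatness (theory: 9)
print(V, g1, g2)
```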

Expectation of Two Random Variables

Conditional Expectation

Consider the random variables X1, X2, with conditional PDF f(x1 | x2). The conditional expectation of g(X1) given X2 is:

E[g(X1) | X2 = x2] = ∫_{−∞}^{+∞} g(x1) f(x1 | x2) dx1

Examples of functions g(·):

Conditional mean: E[X1 | X2]  (g(x1) = x1)

Conditional variance: Var[X1 | X2] = E[(X1 − E[X1 | X2])² | X2]

These conditional statistics are themselves random variables, as they depend on the random variable X2. If the random variable X2 takes a certain value x2, the conditional statistics become constants.

Important Properties:

E [ E{g(X1) | X2} ] = E [ g(X1) ]

For X1 and X2 independent, it holds that: E[g(X1) | X2] = E[g(X1)]

In the previous formulation one can use random vectors as well.
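The first property can be illustrated by simulation (a sketch; the bivariate model below, X2 uniform and X1 conditionally normal around X2, is an arbitrary choice; with g(x) = x², E[g(X1) | X2] = X2² + 1 in closed form):

```python
import numpy as np

# Illustrating E[E{g(X1) | X2}] = E[g(X1)] with g(x) = x**2.
# Model (arbitrary choice): X2 ~ Uniform(0, 1) and X1 | X2 ~ N(X2, 1),
# so E[g(X1) | X2] = X2**2 + 1 exactly.
rng = np.random.default_rng(2)
x2 = rng.uniform(size=500_000)
x1 = rng.normal(loc=x2, scale=1.0)

lhs = np.mean(x2 ** 2 + 1)   # E over X2 of the conditional expectation
rhs = np.mean(x1 ** 2)       # direct E[g(X1)]
print(lhs, rhs)              # both close to 1/3 + 1 = 1.333...
```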

Covariance of X1, X2

B12 = Cov [X1, X2] = E[(X1-m1)(X2-m2)] = E[X1 X2] - m1m2

Where m1 and m2 are the mean values of variables X1 and X2, respectively.

Coefficient of Correlation (linear correlation)

ρ12 = B12 / (σ1 σ2)

It follows: B12 = B21, and ρ12 = ρ21, and |ρ12| ≤ 1, or |B12| ≤ σ1σ2

Uncorrelated X1, X2 implies that ρ12 = 0.

Perfectly correlated X1, X2 implies ρ12 = ±1.

Independence implies lack of correlation (the opposite is not true).

Note on Correlation: IMPORTANT! The coefficient of correlation is a measure of LINEAR dependence only, NOT nonlinear dependence. Do not use it to characterize a nonlinear relationship (as is often done, even at scientific meetings and in publications!). There are other measures, such as RANK correlation, for identifying a nonlinear relationship between two variables.

Linear Dependence Example:

Y = aX + b,  a, b constants, a ≠ 0

σY² = a²σX²

Cov[X, Y] = E[X(aX + b)] − mX(a mX + b) = a[E[X²] − mX²] = a σX²

ρXY = a σX² / (|a| σX · σX) = a/|a| = ±1

Thus, if a linear dependence exists, X,Y are perfectly correlated.
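Both points, ρ = ±1 under a linear relationship and ρ ≈ 0 for a purely nonlinear one, can be seen numerically in a short sketch (the particular choices a = −2, b = 3 and Y = X² are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)

# Linear relationship: correlation is exactly +/-1 (here a < 0, so -1).
y_lin = -2.0 * x + 3.0
print(np.corrcoef(x, y_lin)[0, 1])   # ~ -1.0

# Purely nonlinear dependence: Y = X**2 is fully determined by X, yet for
# symmetric X the (linear) correlation coefficient is ~ 0.
y_sq = x ** 2
print(np.corrcoef(x, y_sq)[0, 1])    # ~ 0.0
```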

For independent random variables: E[X1X2…Xm] = E[X1] E[X2] … E[Xm]

Commonly Used Distribution Functions

Uniform Distribution

PDF:(x) = 1/(b-a)with a  x  b

Note: a and b are the lower and upper bounds of the random variable X

GAUSSIAN DISTRIBUTION (NORMAL DISTRIBUTION)

PDF:,- x  +

X is N(m, 2), indicates that the random variable X is a Gaussian random variable with parameters m and .

Moments:

mX = m : mean

X2 = 2 : variance

E[ (X-m)2k+1 ] = 0;k=0,1,…

E[ (X-m)2k ] = 13…(2k-1)2k;k=1,2,…

It is noted that any random variable for which the last two sets of equations hold is a Gaussian random variable.

A standard normal random variable, U, has zero mean and variance equal to one: mU = 0, σU = 1.

The standard normal distribution function is widely tabulated, so evaluation of any Gaussian CDF is possible:

P(X ≤ x) = FX(x) = FU((x − m)/σ) = P[U ≤ (x − m)/σ]

Furthermore, the standard normal PDF is symmetric about 0, so FU(−u) = 1 − FU(u).

For u  0:

P[-u  U  u] = 1-2FU(-u) = 2FU(u)-1

For relatively large values of u (say u ≥ 3):

1 − FU(u) ≈ (φ(u)/u)·[1 − 1/u² + 1·3/u⁴ − 1·3·5/u⁶ + …]

with φ(u) = (1/√(2π)) e^(−u²/2) the standard normal PDF.

The error is less than the magnitude of the last term used.
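A sketch of both computations (the values m = 10, σ = 2, x = 13 and u = 4 are arbitrary; Φ is built from the error function rather than from tables):

```python
import math

def Phi(u):
    """Standard normal CDF via the error function: F_U(u)."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

# Evaluating a general Gaussian CDF by standardization: P(X <= x) = F_U((x-m)/s).
m, s, x = 10.0, 2.0, 13.0
print(Phi((x - m) / s))            # P(X <= 13) for X ~ N(10, 4), ~ 0.9332

# Asymptotic tail approximation: 1 - F_U(u) ~ (phi(u)/u) * (1 - 1/u**2).
u = 4.0
phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
print(1.0 - Phi(u), phi / u * (1.0 - 1.0 / u**2))   # should nearly agree
```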

Conditional Gaussian Distribution

Consider the random vectors X1 and X2, of dimensions M1 and M2 and with means m1 and m2, respectively. Also, consider the covariance matrices B11, B22, and the cross-covariance matrix B12.

x = [x1; x2],  m = [m1; m2],  B = [B11 B12; B21 B22]  (partitioned into the blocks above)

The conditional PDF of X1 | X2 can be found by substituting the joint PDF f(x1, x2) = f(x) and the marginal PDF f(x2) into Bayes's theorem.

The result is Gaussian, with conditional mean

m1|2 = m1 + B12 B22⁻¹ (x2 − m2)

where the conditional covariance is

B11|2 = B11 − B12 B22⁻¹ B21

It is noted that the conditional covariance B11|2 does not depend on X2 and can be evaluated a priori.

Implications:

One-dimensional case: m1|2 = m1 + ρ(σ1/σ2)(x2 − m2),  σ²1|2 = σ1²(1 − ρ²)

Reduction of variance: since 0 ≤ ρ² ≤ 1, σ²1|2 ≤ σ1²; conditioning on a correlated X2 never increases the variance.
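A numerical sketch of these formulas (all matrices and the observed x2 below are arbitrary example values):

```python
import numpy as np

# Conditional Gaussian update: m_{1|2} = m1 + B12 B22^{-1} (x2 - m2),
# B_{11|2} = B11 - B12 B22^{-1} B21. All numbers are arbitrary examples.
m1 = np.array([1.0, 2.0])
m2 = np.array([0.0])
B11 = np.array([[2.0, 0.5],
                [0.5, 1.0]])
B12 = np.array([[0.8],
                [0.3]])
B22 = np.array([[1.5]])
x2 = np.array([1.0])                      # observed value of X2

gain = B12 @ np.linalg.inv(B22)           # B12 B22^{-1}
m_cond = m1 + gain @ (x2 - m2)            # conditional mean (depends on x2)
B_cond = B11 - gain @ B12.T               # conditional covariance (does not)
print(m_cond, B_cond, sep="\n")
```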

LOGNORMAL DISTRIBUTION

It characterizes the product of many small random effects.

If X is N(m,2), Y=eX is said to be lognormally distributed:X= ln Y

PDF:,y0

mX: mean of X = ln Y (e^(mX) is the median of Y)

σX: standard deviation of X, or of ln Y

E[Y] = E[e^X] = gX(1/j) = exp{mX + σX²/2}  (j = √−1)

Higher-order moments can be obtained via the characteristic function:

E[Y^k] = E[e^(kX)] = gX(k/j) = exp{k mX + k²σX²/2}

  • Relationships between moments of X and moments of Y:

Variance: σY² = mY²(e^(σX²) − 1),  where mY = E[Y] = exp{mX + σX²/2}

Coefficient of variation: VY = σY/mY = √(e^(σX²) − 1)

Also, σX² = ln(1 + VY²)  (if the dispersion of Y is small: σX ≈ VY)

Skewness coefficient: γ1 = 3VY + VY³

  • For two random variables, X1 ~ N(m1, σ1²) and X2 ~ N(m2, σ2²):

E[Y1Y2] = E[e^(X1+X2)] = gX(1/j, 1/j) = exp{m1 + m2 + ½(σ1² + σ2² + 2B12)}
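A quick simulation check of the mean, coefficient-of-variation, and median relations (the values mX = 0.5 and σX = 0.4 are arbitrary):

```python
import numpy as np

# Checking E[Y] = exp(m_X + s_X**2 / 2), V_Y = sqrt(exp(s_X**2) - 1), and
# median(Y) = exp(m_X) for Y = exp(X), X ~ N(m_X, s_X**2).
m_X, s_X = 0.5, 0.4
rng = np.random.default_rng(4)
y = rng.lognormal(mean=m_X, sigma=s_X, size=1_000_000)

print(y.mean(), np.exp(m_X + 0.5 * s_X**2))             # sample vs analytic mean
print(y.std() / y.mean(), np.sqrt(np.exp(s_X**2) - 1))  # sample vs analytic V_Y
print(np.median(y), np.exp(m_X))                        # sample vs analytic median
```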

CENTRAL LIMIT THEOREM

The probability distribution of the sum of m independent, identically distributed random variables tends to become Gaussian as m increases. In the limit m → ∞, the PDF of the sum is Gaussian.

If X1, …, Xm are not identically distributed, an additional requirement must hold:

σXi²/σY² → 0 as m → ∞, for every i  (where σY² is the variance of the sum Y = X1 + … + Xm)

That is, the fractional contribution of any one of the random variables to the total variance (σY²) must vanish. Experience has shown that complete independence among the random variables is not required.

In practice the key requirement is that the sum represents the aggregation of many weakly correlated random effects and that no single effect accounts for a dominant fraction of the total variance.
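A simulation sketch of the theorem (exponential summands are an arbitrary choice; their standardized sums should show skewness approaching 0 and flatness approaching 3 as m grows):

```python
import numpy as np

# CLT sketch: standardized sums of m i.i.d. Exponential(1) variables.
# For the exponential, skewness = 2 and flatness = 9 at m = 1; both move
# toward the Gaussian values 0 and 3 as m increases.
rng = np.random.default_rng(5)
for m in (1, 2, 10, 50):                         # m = number of summands
    s = rng.exponential(size=(200_000, m)).sum(axis=1)
    z = (s - s.mean()) / s.std()                 # standardize the sum
    g1 = np.mean(z ** 3)                         # skewness (Gaussian: 0)
    g2 = np.mean(z ** 4)                         # flatness (Gaussian: 3)
    print(m, round(g1, 3), round(g2, 3))
```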

GAMMA DISTRIBUTION

It is used to model the waiting time for m random successes to occur.

PDF: f(x) = λ(λx)^(m−1) e^(−λx) / Γ(m),  x ≥ 0, where Γ(·) is the Gamma function.

EXPONENTIAL DISTRIBUTION

Specific case of the Gamma distribution where m = 1. It is used to model the time to failure of a device, or the time between arrivals of events (e.g., rainfall events).

PDF: f(x) = λe^(−λx),  x ≥ 0

CDF: F(x) = 1 − e^(−λx),  x ≥ 0

Moments

mX = E[X] = 1/λ,  σX² = 1/λ²

No-memory (memoryless) property:

P[X > a + b | X > a] = P[X > b],  for a > 0 and b > 0
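The property follows in one line from the CDF above:

P[X > a + b | X > a] = P[X > a + b] / P[X > a] = e^(−λ(a+b)) / e^(−λa) = e^(−λb) = P[X > b]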