Module II

Random Variables

As pointed out earlier, variability is omnipresent in the business world. To model this variability statistically, we introduce the concept of a random variable.

A random variable is a numerically valued variable that takes on its different possible values with specified probabilities.

Examples:

The return on an investment in a one year period

The price of an equity

The number of customers entering a store

The sales volume of a store on a particular day

The turnover rate at your organization next year


In order to illustrate a random variable, consider the simple experiment of flipping a fair coin 4 times in a row. The 16 possible outcomes, each with probability 1/16 = .0625, are listed below:

HHHH  HHHT  HHTH  HHTT
HTHH  HTHT  HTTH  HTTT
THHH  THHT  THTH  THTT
TTHH  TTHT  TTTH  TTTT

Since this is a list of all possible outcomes of this “experiment”, the list is called the “Sample Space” of the experiment.

Notice that, so far, no random variable has been defined.


Let x = the number of heads in the four flips. Each outcome in the sample space determines a value of x; for example, HHTH gives x = 3 while TTTT gives x = 0. By counting the number of outcomes on which each value of x occurs, we obtain the following table:

x    Pr(x)
0    1/16 = .0625
1    4/16 = .2500
2    6/16 = .3750
3    4/16 = .2500
4    1/16 = .0625
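
These counts are easy to verify by direct enumeration, for example with a few lines of Python (a minimal sketch, not a required part of the course):

```python
from itertools import product
from collections import Counter

# All 2^4 = 16 equally likely sequences of four flips.
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]

# x = number of heads in each sequence.
counts = Counter(outcome.count("H") for outcome in outcomes)

for x in sorted(counts):
    print(x, counts[x], counts[x] / 16)   # value of x, frequency, Pr(x)
```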

We can portray these same data graphically by plotting Pr(x) against x. The resulting bar chart is a representation of the probability distribution of the random variable x.

It looks very much like a histogram, but it is very different since no sample has been taken. However, we can generalize the ideas we explored for sample data to random variables.

Consider the formula for finding the sample mean from grouped data, specifically:

x̄ = Σ x (f/n),

where f/n is the relative frequency with which the value x occurs in the sample. By analogy, replacing the relative frequency f/n with the probability Pr(x), one then has the expected value of the random variable x:

E(x) = Σ x Pr(x).
In our case we get:

E(x) = (0 x .0625) + (1 x .25) + (2 x .375) + (3 x .25) + (4 x .0625) = 2.


By a similar argument, one can show that the standard deviation of a random variable can be computed using the formula:

SD(x) = √[ Σ (x − E(x))² Pr(x) ].

By formal mathematical manipulation, the formula can be simplified to:

SD(x) = √[ E(x²) − (E(x))² ],  where E(x²) = Σ x² Pr(x).

EXCEL does not automatically compute the expected value and standard deviation of a random variable, but it is extremely easy to do by laying out a small worksheet with a column for x, a column for Pr(x), and columns for the products required by the formulas above.

Adding up the entries in the third column, we see that

E(x) = 2.

Further, the sum of the entries in the fourth column leads, via the formula above, to

SD(x) = 1.
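
The same sums can be accumulated in a few lines of code; here is a minimal Python sketch using the simplified formula for SD(x):

```python
# Probability distribution of x = number of heads in four flips.
dist_x = {0: .0625, 1: .25, 2: .375, 3: .25, 4: .0625}

E_x  = sum(x * p for x, p in dist_x.items())        # expected value, Σ x Pr(x)
E_x2 = sum(x**2 * p for x, p in dist_x.items())     # E(x²) = Σ x² Pr(x)
SD_x = (E_x2 - E_x**2) ** 0.5                       # √[E(x²) − (E(x))²]

print(E_x, SD_x)   # 2.0  1.0
```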


All of the concepts we introduced for samples also apply to random variables. For example, Chebyshev’s inequality continues to hold for random variables just as it did for sample data.

The mound rule is also applicable.

Finally, the concept of “t” scores also applies, except that since we are now using theoretical rather than sample values we shall use “z” scores, where z is defined as:

z = (x − μ)/σ,

with μ = E(x) and σ = SD(x).
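
For example, in the coin-flipping experiment an outcome with x = 4 heads has z = (4 − 2)/1 = 2; it lies two standard deviations above the expected value.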

It is possible to define another random variable on the same sample space.

Let y equal the number of heads which occur before the first tail when you flip a fair coin four times (for the outcome HHHH, which contains no tail, y = 4). For example, HHTH gives y = 2, while every sequence that starts with a tail gives y = 0.

Again, since we are assuming a fair coin, each sequence of four flips has probability 1/16 = .0625. By counting the outcomes for each distinct value of y, we can construct the following table for the probability distribution of the random variable y:

y    Pr(y)
0    8/16 = .5000
1    4/16 = .2500
2    2/16 = .1250
3    1/16 = .0625
4    1/16 = .0625


As before, we can compute the expected value and standard deviation of y by accumulating the necessary sums (in EXCEL or by hand). We then have:

E(y) = .9375 and SD(y) = 1.1973278.


Notice that in this case the expected value is not one of the actual values of y that can occur.
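
As a check, the distribution of y and its summary measures can also be obtained by enumeration; a minimal Python sketch (treating HHHH as y = 4):

```python
from itertools import product
from collections import Counter

outcomes = ["".join(seq) for seq in product("HT", repeat=4)]

def heads_before_first_tail(seq):
    # Count the leading H's; equals 4 for HHHH, which has no tail.
    return len(seq) - len(seq.lstrip("H"))

counts = Counter(heads_before_first_tail(s) for s in outcomes)
probs = {y: n / 16 for y, n in sorted(counts.items())}

E_y  = sum(y * p for y, p in probs.items())
SD_y = (sum(y**2 * p for y, p in probs.items()) - E_y**2) ** 0.5

print(probs)        # {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.0625, 4: 0.0625}
print(E_y, SD_y)    # 0.9375  1.1973...
```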

Since for each of the outcomes in our sample space we have both an x and a y value, it is possible to array the data in two dimensions simultaneously to obtain what is called the joint probability distribution of x and y. The table below gives the numerical values of P(x, y) for the possible x and y values in our experiment:

            y = 0    y = 1    y = 2    y = 3    y = 4    P(x)
x = 0       .0625    .0000    .0000    .0000    .0000    .0625
x = 1       .1875    .0625    .0000    .0000    .0000    .2500
x = 2       .1875    .1250    .0625    .0000    .0000    .3750
x = 3       .0625    .0625    .0625    .0625    .0000    .2500
x = 4       .0000    .0000    .0000    .0000    .0625    .0625
P(y)        .5000    .2500    .1250    .0625    .0625   1.0000

The row totals on the right and the column totals along the bottom are called the marginal probabilities of x and y respectively, and they agree with the probability distributions computed for x and y earlier.
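
The joint table can be generated by the same brute-force enumeration used earlier; a minimal Python sketch:

```python
from itertools import product
from collections import Counter

outcomes = ["".join(seq) for seq in product("HT", repeat=4)]

def x_of(seq):   # x = total number of heads
    return seq.count("H")

def y_of(seq):   # y = heads before the first tail (4 for HHHH)
    return len(seq) - len(seq.lstrip("H"))

joint = Counter((x_of(s), y_of(s)) for s in outcomes)

# One row of joint probabilities per value of x (columns are y = 0, ..., 4).
for x in range(5):
    print([joint[(x, y)] / 16 for y in range(5)])
```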

Graphically, the joint probability distribution can be illustrated with a three-dimensional bar chart of P(x, y) plotted over the grid of (x, y) values.


It is possible to construct the conditional distribution of y given specific values of x. For example, suppose that x = 3 (which has probability .25 from the table). By the basic rules of probability:

P(y | x = 3) = P(y and x = 3) / P(x = 3).

Therefore,

P(y = 0 | x = 3) = .0625 / .25 = .25
P(y = 1 | x = 3) = .0625 / .25 = .25
P(y = 2 | x = 3) = .0625 / .25 = .25
P(y = 3 | x = 3) = .0625 / .25 = .25
P(y = 4 | x = 3) = .0000 / .25 = .00.

The conditional distribution of y given that x = 2 would be:

P(y = 0 | x = 2) = .1875 / .375 = .5000
P(y = 1 | x = 2) = .1250 / .375 = .3333
P(y = 2 | x = 2) = .0625 / .375 = .1667
P(y = 3 | x = 2) = .0000 / .375 = .0000
P(y = 4 | x = 2) = .0000 / .375 = .0000.
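
These conditional probabilities come directly from the joint table: each entry in a row is divided by that row’s marginal probability. A minimal Python sketch with the joint probabilities hard-coded from the table above:

```python
# Joint probabilities P(x, y), keyed by the pair (x, y); omitted pairs have probability 0.
joint = {(0, 0): .0625,
         (1, 0): .1875, (1, 1): .0625,
         (2, 0): .1875, (2, 1): .1250, (2, 2): .0625,
         (3, 0): .0625, (3, 1): .0625, (3, 2): .0625, (3, 3): .0625,
         (4, 4): .0625}

def conditional_y_given_x(x_value):
    p_x = sum(p for (x, _y), p in joint.items() if x == x_value)   # marginal P(x)
    return {y: joint.get((x_value, y), 0.0) / p_x for y in range(5)}

print(conditional_y_given_x(3))   # {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25, 4: 0.0}
print(conditional_y_given_x(2))   # {0: 0.5, 1: 0.333..., 2: 0.166..., 3: 0.0, 4: 0.0}
```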


The table below shows the conditional distribution of y for each value of x, together with the conditional expected value E(y | x):

            y = 0    y = 1    y = 2    y = 3    y = 4    E(y | x)
x = 0      1.0000    .0000    .0000    .0000    .0000     0.0000
x = 1       .7500    .2500    .0000    .0000    .0000     0.2500
x = 2       .5000    .3333    .1667    .0000    .0000     0.6667
x = 3       .2500    .2500    .2500    .2500    .0000     1.5000
x = 4       .0000    .0000    .0000    .0000   1.0000     4.0000

The expected value of y given the value of x is computed from the formula:

E(y | x) = Σ y P(y | x), the sum being taken over the possible values of y.

As can be seen in the table above, the expected value of y given x changes as the value of x changes. This implies that there is some sort of relationship between y and x.
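
Applying the formula row by row reproduces the E(y | x) column; a minimal Python sketch using the conditional probabilities from the table:

```python
# Conditional distributions P(y | x), one dictionary of {y: probability} per value of x.
cond = {0: {0: 1.0},
        1: {0: .75, 1: .25},
        2: {0: .5, 1: 1/3, 2: 1/6},
        3: {0: .25, 1: .25, 2: .25, 3: .25},
        4: {4: 1.0}}

for x_value, dist in cond.items():
    e_y_given_x = sum(y * p for y, p in dist.items())   # E(y | x) = Σ y P(y | x)
    print(x_value, round(e_y_given_x, 4))                # 0.0, 0.25, 0.6667, 1.5, 4.0
```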


The relationship between x and the expected value of y given x can be seen by plotting E(y | x) against x.

Since the expected value of y changes with the value of x, y and x are related; that is, they are not independent. When we studied sample data we measured the degree of dependency between y and x by using the correlation coefficient r. This concept can be generalized to measure the dependency between two random variables.


The correlation, ρ, between two random variables x and y is defined as:

ρ = Covariance(x, y) / ( SD(x) SD(y) ).

We already know the formulas for SD(x) and SD(y); the Covariance(x, y) is defined as:

Covariance(x, y) = E(xy) − E(x) E(y),

where:

E(xy) = Σ Σ x y P(x, y), the double sum being taken over all possible pairs (x, y).

We can compute E(xy) by returning to the joint probability distribution of x and y given above.

Then, for each cell, multiply the probability in the cell by the row value of x and by the column value of y. For example, the value of x y P(x, y) when x = 2 and y = 2 is

2 x 2 x (.0625) = .25.

The table below performs the above computation, giving x y P(x, y) for each cell of the joint probability distribution:

            y = 0    y = 1    y = 2    y = 3    y = 4
x = 0       .0000    .0000    .0000    .0000    .0000
x = 1       .0000    .0625    .0000    .0000    .0000
x = 2       .0000    .2500    .2500    .0000    .0000
x = 3       .0000    .1875    .3750    .5625    .0000
x = 4       .0000    .0000    .0000    .0000   1.0000

The sum of all the values is

E(xy) = 2.6875.

Therefore, we can compute ρ as:

ρ = [E(xy) − E(x)E(y)] / [SD(x) SD(y)]

  = [2.6875 − (2)(.9375)] / [(1)(1.1973278)]

  = .8125 / 1.1973278 = .6785945.
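
The entire calculation, from the joint distribution through ρ, can be verified with a short Python sketch:

```python
# Joint probabilities P(x, y) for the four-flip experiment; omitted pairs have probability 0.
joint = {(0, 0): .0625,
         (1, 0): .1875, (1, 1): .0625,
         (2, 0): .1875, (2, 1): .1250, (2, 2): .0625,
         (3, 0): .0625, (3, 1): .0625, (3, 2): .0625, (3, 3): .0625,
         (4, 4): .0625}

E_x  = sum(x * p for (x, y), p in joint.items())
E_y  = sum(y * p for (x, y), p in joint.items())
SD_x = (sum(x**2 * p for (x, y), p in joint.items()) - E_x**2) ** 0.5
SD_y = (sum(y**2 * p for (x, y), p in joint.items()) - E_y**2) ** 0.5
E_xy = sum(x * y * p for (x, y), p in joint.items())

cov = E_xy - E_x * E_y         # covariance = E(xy) − E(x)E(y) = .8125
rho = cov / (SD_x * SD_y)      # correlation ρ ≈ .6786

print(E_xy, cov, rho, rho**2)  # 2.6875  0.8125  0.6786...  0.4605...
```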

One interprets ρ in exactly the same way as we interpret the sample correlation coefficient r, that is, we square it:

ρ² = (.6785945)² = .4605.

This implies that if we use x as a predictor of y, we can eliminate approximately 46.05% of the variability in y.

Notice that this example makes it clear that no “causality” is implied by the fact that the theoretical correlation coefficient is relatively large: both x and y are “caused” by the underlying experiment of flipping a fair coin four times.