Conditional Distributions and Bayesian Updating

A great deal of the analytic accounting literature focuses upon how information is processed, communicated, and employed. Modeling such activities, however, requires a formal assumption about how individuals process information. Individuals are generally assumed to process information like good students of statistics: they update their beliefs in a Bayesian fashion when they obtain a new piece of information. Hence, to follow along with the technical details of the models, it is important that you refresh your memory regarding Bayes’ rule.

Within the context of the formal models we will consider, individuals are typically endowed with prior beliefs about a variable of interest, they receive some new information, and they then form posterior beliefs based upon the prior beliefs and the new information. The prior beliefs are formalized by assuming that the variable of interest is represented by a random variable, say $\tilde{x}$, and the new information is represented by another random variable, say $\tilde{y}$.[1] Bayes’ rule applies, so the probability density for $\tilde{x}$ conditional upon any realization y, f(x|y), satisfies

$$f(x \mid y) \;=\; \frac{f(y \mid x)\, f(x)}{f(y)} \;=\; \frac{f(x,y)}{f(y)} \qquad (1)$$

for all x, where f(y|x) is the density function for $\tilde{y}$ evaluated at y conditional upon $\tilde{x} = x$, f(y) is the unconditional density function for $\tilde{y}$ evaluated at y, and f(x,y) is the joint density function for $\tilde{x}$ and $\tilde{y}$ evaluated at x and y. Thus, given prior beliefs about $\tilde{x}$ and $\tilde{y}$, the posterior beliefs about $\tilde{x}$ given $\tilde{y} = y$ are represented by the probability density function f(x|y).

This might seem a bit obvious and, at the same time, too abstract to be of any applied use. Hence, we (you) will do some illustrations with distributions commonly employed in the literature we will be covering.
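Before turning to those distributions, a minimal numerical sketch of equation (1) may help. The probabilities below are purely illustrative (they are not part of the notes); the point is simply that the posterior can be computed either from the likelihood and the prior or from the joint distribution.

```python
# A minimal sketch of equation (1) with made-up numbers: a two-outcome variable
# of interest x in {s, f} and a two-outcome signal y in {g, b}.

prior = {"s": 0.6, "f": 0.4}                       # f(x)
likelihood = {("g", "s"): 0.8, ("b", "s"): 0.2,    # f(y | x)
              ("g", "f"): 0.3, ("b", "f"): 0.7}

def posterior(y):
    """Return f(x | y) = f(y | x) f(x) / f(y) = f(x, y) / f(y)."""
    joint = {x: likelihood[(y, x)] * prior[x] for x in prior}   # f(x, y)
    f_y = sum(joint.values())                                   # f(y)
    return {x: joint[x] / f_y for x in joint}

print(posterior("g"))   # approximately {'s': 0.8, 'f': 0.2} with these numbers
```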

Bernoulli

Many models employ a simple structure in which the variable of interest has two possible outcomes and the information variable has two possible outcomes (i.e., both are characterized by Bernoulli distributions). Assume the variable of interest $\tilde{x}$ has two possible realizations, s or f, where $s \neq f$. Let p denote the prior probability of s. Assume that the information received by the decision maker is represented by the random variable $\tilde{y}$, which has two possible outcomes, g or b. Assume that the probability of g conditional upon $\tilde{x} = s$ is $q_g \geq .5$ and the probability of b conditional upon $\tilde{x} = f$ is $q_b \geq .5$. Characterize the posterior distribution for $\tilde{x}$ given the following realizations for $\tilde{y}$.

a) $\tilde{y} = g$

b) $\tilde{y} = b$

c) In a) and b), how do $q_g$ and $q_b$ capture the quality of the information?
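If you would like to check your answers to a) and b) numerically, the following Monte Carlo sketch (with arbitrarily assumed values for p, $q_g$, and $q_b$; it is not part of the original problem) simulates the structure above and estimates the posterior probability of s by conditioning on the simulated signal.

```python
import numpy as np

# Monte Carlo check for the Bernoulli exercise (illustrative parameter values).
rng = np.random.default_rng(0)
p, q_g, q_b = 0.4, 0.9, 0.7    # prior P(x = s), P(y = g | x = s), P(y = b | x = f)
n = 1_000_000

x_is_s = rng.random(n) < p     # True where the realization of x is s
# Signal: g with probability q_g when x = s, and g with probability 1 - q_b when x = f.
y_is_g = np.where(x_is_s, rng.random(n) < q_g, rng.random(n) < 1 - q_b)

print("P(x = s | y = g) is approximately", x_is_s[y_is_g].mean())
print("P(x = s | y = b) is approximately", x_is_s[~y_is_g].mean())
# Compare these estimates with the closed-form posteriors you derive in a) and b).
```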

Uniform

Another distribution that is often employed for the variable of interest is the uniform distribution. Assume the variable of interest $\tilde{x}$ is uniformly distributed over the range [0,1]. Assume that the information received by the decision maker is represented by the random variable $\tilde{y}$, which has two possible outcomes, b or g.

a) Assume that the probability $\tilde{y} = b$ is 1 if $\tilde{x} \in [0,k]$ and the probability $\tilde{y} = g$ is 1 if $\tilde{x} \in (k,1]$, where $k \in (0,1)$. Characterize the posterior distribution for $\tilde{x}$ if $\tilde{y} = g$ and the posterior distribution for $\tilde{x}$ if $\tilde{y} = b$.

b) Repeat question a) assuming that the probability $\tilde{y} = b$ is $q \in (.5,1]$ if $\tilde{x} \in [0,.5]$ and the probability $\tilde{y} = g$ is $q$ if $\tilde{x} \in (.5,1]$.
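A similar simulation can be used to check the uniform case. The sketch below (with an arbitrarily chosen k, again my own illustration) handles part a); part b) can be checked the same way after changing how the signal is generated.

```python
import numpy as np

# Monte Carlo check for the uniform exercise, part a) (k chosen arbitrarily).
rng = np.random.default_rng(0)
k, n = 0.3, 1_000_000

x = rng.random(n)        # x is uniformly distributed on [0, 1]
y_is_b = x <= k          # the signal is b exactly when x falls in [0, k]

x_given_g = x[~y_is_b]
print("support of x given y = g:", x_given_g.min(), "to", x_given_g.max())
print("E[x | y = g] is approximately", x_given_g.mean())
# For part b), draw the signal so that it reports the correct region only with
# probability q, then condition on the simulated signal in the same way.
```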

Normal

A final distribution that is employed heavily in the literature is the normal distribution. The normal distribution is employed because its parameters have nice intuitive interpretations (e.g., it is characterized by a mean and a variance, and the variance serves as a natural measure of uncertainty and/or the quality of information). We will consider the bivariate normal case first and then turn attention to the multivariate case.

Bivariate Normal

Assume prior beliefs about a variable of interest, $\tilde{x}$, and a forthcoming piece of information, $\tilde{y}$, are represented as follows: $\tilde{x}$ is normally distributed with mean $m_x$ and variance $s_x$, $\tilde{y}$ is normally distributed with mean $m_y$ and variance $s_y$, and $\tilde{x}$ and $\tilde{y}$ have covariance $c_{xy}$. Assume that an economic decision maker observes realization y and updates beliefs about $\tilde{x}$. The decision maker’s posterior beliefs are that $\tilde{x}$ is normally distributed with mean

$$m_x + \frac{c_{xy}}{s_y}\,(y - m_y),$$

and variance

$$s_x - \frac{c_{xy}^2}{s_y}.$$

To prove this result, note first that, by definition, the joint density function is

$$f(x,y) = \frac{1}{2\pi\sqrt{s_x s_y - c_{xy}^2}} \exp\left\{ -\frac{s_y (x - m_x)^2 - 2 c_{xy} (x - m_x)(y - m_y) + s_x (y - m_y)^2}{2\,(s_x s_y - c_{xy}^2)} \right\}.$$

Furthermore, by Bayes’ rule we know $f(x \mid y) \propto f(x,y)$, where the factor of proportionality, $f(y)^{-1}$, is not a function of x. Therefore, we can work with f(x,y) to derive the conditional density function. Specifically, the conditional density function must be proportional to

$$\begin{aligned}
f(x \mid y) &\propto \exp\left\{ -\frac{s_y (x - m_x)^2 - 2 c_{xy} (x - m_x)(y - m_y)}{2\,(s_x s_y - c_{xy}^2)} \right\} \\
&\propto \exp\left\{ -\frac{\left[\, x - m_x - \frac{c_{xy}}{s_y}(y - m_y) \,\right]^2}{2\left( s_x - \frac{c_{xy}^2}{s_y} \right)} \right\},
\end{aligned}$$

where every factor that does not involve x has been absorbed into the constant of proportionality: the first line drops the term in $(y - m_y)^2$ from the exponent, and the second line completes the square in x.
Note that this last line is, up to a constant of proportionality, the density function for a normally distributed random variable with the asserted mean and variance.
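Before turning to the exercises, here is a quick numerical sanity check of the stated posterior mean and variance. It is only a sketch with arbitrarily chosen parameter values: it draws from the joint distribution, keeps the draws whose y falls near a particular realization, and compares the empirical moments of x with the formulas.

```python
import numpy as np

# Numerical sanity check of the bivariate normal updating formulas
# (all parameter values are arbitrary illustrations).
rng = np.random.default_rng(0)
m_x, m_y = 1.0, 2.0
s_x, s_y, c_xy = 2.0, 1.5, 0.8                 # variances and covariance
cov = np.array([[s_x, c_xy], [c_xy, s_y]])

draws = rng.multivariate_normal([m_x, m_y], cov, size=2_000_000)
y0 = 2.5                                        # a particular realization of y
near = np.abs(draws[:, 1] - y0) < 0.01          # keep draws with y close to y0
x_near = draws[near, 0]

print("empirical mean:", x_near.mean(), " formula:", m_x + (c_xy / s_y) * (y0 - m_y))
print("empirical var: ", x_near.var(),  " formula:", s_x - c_xy**2 / s_y)
```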

To see if you can apply this result, consider the following structures and derive the posterior density for $\tilde{x}$ conditional upon $\tilde{y} = y$.

a) $\tilde{x}$ is normally distributed with mean $m_x$ and variance $s_x$. $\tilde{y} = \tilde{x} + \tilde{\varepsilon}$, where $\tilde{\varepsilon}$ is normally distributed with mean 0 and variance $s_\varepsilon$. Also, $\tilde{x}$ and $\tilde{\varepsilon}$ are independent.

b) $\tilde{x}$ is normally distributed with mean $m_x$ and precision $h_x$. $\tilde{y} = \tilde{x} + \tilde{\varepsilon}$, where $\tilde{\varepsilon}$ is normally distributed with mean 0 and precision $h_\varepsilon$. (Note: the precision is the inverse of the variance, $h = s^{-1}$.)

c) $\tilde{x} = \tilde{y} + \tilde{z}$, where $\tilde{y}$ is normally distributed with mean $m_y$ and variance $s_y$, $\tilde{z}$ is normally distributed with mean $m_z$ and variance $s_z$, and $\tilde{y}$ and $\tilde{z}$ are independent.

Multivariate Normal

We can extend the bivariate normal case to multivariate normal distributions. Assume that $\tilde{x}$ is an m-variate normal with mean vector $M_x$ and $\tilde{y}$ is an n-variate normal with mean vector $M_y$. Let the covariance matrix for $\tilde{x}$ and $\tilde{y}$ be denoted as

$$\begin{pmatrix} S_x & C_{xy} \\ C_{yx} & S_y \end{pmatrix},$$
where $S_x$ is the covariance matrix for $\tilde{x}$, $S_y$ is the covariance matrix for $\tilde{y}$, $C_{xy}$ is the covariance matrix across the elements of $\tilde{x}$ and $\tilde{y}$, and $C_{yx}$ is the transpose of $C_{xy}$. The decision maker’s posterior distribution for $\tilde{x}$ given $\tilde{y} = Y$ is that $\tilde{x}$ is an m-variate normal with mean

$$M_x + C_{xy} S_y^{-1} (Y - M_y),$$

and variance

$$S_x - C_{xy} S_y^{-1} C_{yx},$$

where the superscript $-1$ denotes the inverse of the matrix.
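For readers who want to experiment, here is a hedged implementation sketch of this formula in Python/NumPy. The function name and the example vectors and matrices are my own illustrations, not part of the notes.

```python
import numpy as np

def normal_posterior(M_x, M_y, S_x, S_y, C_xy, Y):
    """Posterior mean and covariance of x given y = Y when (x, y) are jointly normal.

    Implements M_x + C_xy S_y^{-1} (Y - M_y) and S_x - C_xy S_y^{-1} C_yx,
    using a linear solve rather than an explicit matrix inverse.
    """
    gain = np.linalg.solve(S_y.T, C_xy.T).T     # equals C_xy S_y^{-1}
    mean = M_x + gain @ (Y - M_y)
    cov = S_x - gain @ C_xy.T                   # C_yx is the transpose of C_xy
    return mean, cov

# Illustrative (made-up) numbers: a scalar variable of interest and a two-dimensional signal.
M_x = np.array([1.0])
M_y = np.array([0.0, 0.0])
S_x = np.array([[2.0]])
S_y = np.array([[1.0, 0.3], [0.3, 1.5]])
C_xy = np.array([[0.5, 0.4]])
print(normal_posterior(M_x, M_y, S_x, S_y, C_xy, np.array([0.2, -0.1])))
```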

The proof is analogous to the proof in the bivariate case and is omitted. To make the statement more concrete, however, we will provide a specific example. Consider a case where the variable of interest is represented by the random variable $\tilde{x}$ and the decision maker obtains two pieces of information represented by the random variables $\tilde{y}$ and $\tilde{z}$. Denote the mean and variance of random variable i as $m_i$ and $s_i$, respectively. Denote the covariance between variables i and j as $c_{ij}$. The covariance matrix is

$$\begin{pmatrix} s_x & c_{xy} & c_{xz} \\ c_{xy} & s_y & c_{yz} \\ c_{xz} & c_{yz} & s_z \end{pmatrix}.$$
To see the mapping between this matrix and the generic matrix, substitute $s_x$ in for $S_x$,

$$\begin{pmatrix} c_{xy} & c_{xz} \end{pmatrix}$$

in for $C_{xy}$, and

$$\begin{pmatrix} s_y & c_{yz} \\ c_{yz} & s_z \end{pmatrix}$$

in for $S_y$. Applying the formula, the distribution for the variable of interest conditional on realizations y and z is normal with mean

$$m_x + \begin{pmatrix} c_{xy} & c_{xz} \end{pmatrix} \begin{pmatrix} s_y & c_{yz} \\ c_{yz} & s_z \end{pmatrix}^{-1} \begin{pmatrix} y - m_y \\ z - m_z \end{pmatrix} = m_x + \frac{(c_{xy} s_z - c_{xz} c_{yz})(y - m_y) + (c_{xz} s_y - c_{xy} c_{yz})(z - m_z)}{s_y s_z - c_{yz}^2},$$

and variance

$$s_x - \begin{pmatrix} c_{xy} & c_{xz} \end{pmatrix} \begin{pmatrix} s_y & c_{yz} \\ c_{yz} & s_z \end{pmatrix}^{-1} \begin{pmatrix} c_{xy} \\ c_{xz} \end{pmatrix} = s_x - \frac{c_{xy}^2 s_z - 2\, c_{xy} c_{xz} c_{yz} + c_{xz}^2 s_y}{s_y s_z - c_{yz}^2}.$$
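As a quick check on this mapping, the sketch below (again with made-up numbers) plugs the three-variable covariance structure into the general matrix formula and confirms that it agrees with the scalar expressions above.

```python
import numpy as np

# Check, with arbitrary numbers, that the scalar two-signal formulas match the
# general matrix formula M_x + C_xy S_y^{-1} (Y - M_y) and S_x - C_xy S_y^{-1} C_yx.
m_x, m_y, m_z = 1.0, 0.5, -0.2
s_x, s_y, s_z = 2.0, 1.0, 1.5
c_xy, c_xz, c_yz = 0.6, 0.4, 0.3
y, z = 0.8, 0.1

C = np.array([c_xy, c_xz])                  # plays the role of C_xy
S = np.array([[s_y, c_yz], [c_yz, s_z]])    # plays the role of S_y
dev = np.array([y - m_y, z - m_z])          # plays the role of Y - M_y

matrix_mean = m_x + C @ np.linalg.solve(S, dev)
matrix_var = s_x - C @ np.linalg.solve(S, C)

det = s_y * s_z - c_yz**2
scalar_mean = m_x + ((c_xy * s_z - c_xz * c_yz) * (y - m_y)
                     + (c_xz * s_y - c_xy * c_yz) * (z - m_z)) / det
scalar_var = s_x - (c_xy**2 * s_z - 2 * c_xy * c_xz * c_yz + c_xz**2 * s_y) / det

print(matrix_mean, scalar_mean)   # the two means agree
print(matrix_var, scalar_var)     # the two variances agree
```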

To test your ability to apply the formula, derive the posterior expectation and variance for $\tilde{x}$ conditional on realizations y and z.

a) $\tilde{x} = \tilde{y} + \tilde{z}$, where $\tilde{y}$ is normally distributed with mean $m_y$ and variance $s_y$, and $\tilde{z}$ is normally distributed with mean $m_z$ and variance $s_z$.

b) $\tilde{y} = \tilde{x} + \tilde{\varepsilon}$ and $\tilde{z} = \tilde{x} + \tilde{\delta}$, where $\tilde{x}$ is normally distributed with mean $m_x$ and variance $s_x$, $\tilde{\varepsilon}$ is normally distributed with mean 0 and variance $s_\varepsilon$, $\tilde{\delta}$ is normally distributed with mean 0 and variance $s_\delta$, and $\tilde{x}$, $\tilde{\varepsilon}$, and $\tilde{\delta}$ are mutually independent.

c) $\tilde{y} = \tilde{x} + \tilde{\varepsilon}$ and $\tilde{z} = \tilde{y} + \tilde{\delta}$, where $\tilde{x}$ is normally distributed with mean $m_x$ and variance $s_x$, $\tilde{\varepsilon}$ is normally distributed with mean 0 and variance $s_\varepsilon$, $\tilde{\delta}$ is normally distributed with mean 0 and variance $s_\delta$, and $\tilde{x}$, $\tilde{\varepsilon}$, and $\tilde{\delta}$ are mutually independent.

Summary

Bayesian updating is a common assumption employed in most of the literature we will consider. By forcing you to work through these notes, I hope to have refreshed your memory regarding Bayesian updating. Furthermore, by focusing your attention on some commonly used distributions, these notes should give you a head start on, and a reference for, the papers we will cover subsequently.


[1] Note that $\tilde{x}$ and $\tilde{y}$ may have more than one dimension (i.e., they may be vectors).