Statistics 512 Notes 18:

Multiparameter maximum likelihood estimation

We consider $X_1, \dots, X_n$ iid with pdf $f(x;\theta)$, where $\theta = (\theta_1, \dots, \theta_p)$ is $p$-dimensional.

As before, the log likelihood is $l(\theta) = \sum_{i=1}^n \log f(X_i;\theta)$.

The maximum likelihood estimate is $\hat{\theta} = \arg\max_{\theta} l(\theta)$.

We can find critical points of the likelihood function by solving the vector equation $\frac{\partial l(\theta)}{\partial\theta} = \left(\frac{\partial l}{\partial\theta_1}, \dots, \frac{\partial l}{\partial\theta_p}\right)^T = \mathbf{0}$.

We then need to verify that the critical point is a global maximum.

Example 1: Normal distribution

$X_1, \dots, X_n$ iid $N(\mu, \sigma^2)$, so that

$l(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (X_i - \mu)^2$.

The partials with respect to $\mu$ and $\sigma^2$ are

$\frac{\partial l}{\partial\mu} = \frac{1}{\sigma^2}\sum_{i=1}^n (X_i - \mu)$, $\quad \frac{\partial l}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (X_i - \mu)^2$.

Setting the first partial equal to zero and solving for the mle, we obtain $\hat{\mu} = \bar{X}$.

Setting the second partial equal to zero and substituting the mle for $\mu$, we find that the mle for $\sigma^2$ is

$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2$.

To verify that this critical point is a maximum, we need to check the following second derivative conditions:

(1) The two second-order partial derivatives are negative at the critical point:

$\frac{\partial^2 l}{\partial\mu^2} < 0$ and $\frac{\partial^2 l}{\partial(\sigma^2)^2} < 0$.

(2) The determinant of the Hessian matrix of second-order partial derivatives is positive,

$\frac{\partial^2 l}{\partial\mu^2}\,\frac{\partial^2 l}{\partial(\sigma^2)^2} - \left(\frac{\partial^2 l}{\partial\mu\,\partial\sigma^2}\right)^2 > 0$.

See additional sheet for verification of (1) and (2) for normal distribution.
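For reference, the verification can be sketched directly (this is an independent computation, not a transcription of the additional sheet). At the critical point, $\sum_i (X_i - \hat{\mu}) = 0$ and $\sum_i (X_i - \hat{\mu})^2 = n\hat{\sigma}^2$, which gives:

```latex
% Second derivatives of the normal log likelihood at (\hat{\mu}, \hat{\sigma}^2):
\frac{\partial^2 l}{\partial\mu^2} = -\frac{n}{\hat{\sigma}^2} < 0,
\qquad
\frac{\partial^2 l}{\partial(\sigma^2)^2}
  = \frac{n}{2\hat{\sigma}^4} - \frac{n\hat{\sigma}^2}{\hat{\sigma}^6}
  = -\frac{n}{2\hat{\sigma}^4} < 0,
\qquad
\frac{\partial^2 l}{\partial\mu\,\partial\sigma^2}
  = -\frac{\sum_i (X_i-\hat{\mu})}{\hat{\sigma}^4} = 0.
% Hence the determinant condition (2) also holds:
\left(-\frac{n}{\hat{\sigma}^2}\right)\left(-\frac{n}{2\hat{\sigma}^4}\right) - 0^2
  = \frac{n^2}{2\hat{\sigma}^6} > 0.
```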

Example 2: Gamma distribution

$X_1, \dots, X_n$ iid Gamma$(\alpha, \beta)$ with pdf $f(x;\alpha,\beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}x^{\alpha-1}e^{-x/\beta}$, $x > 0$, so that

$l(\alpha, \beta) = -n\log\Gamma(\alpha) - n\alpha\log\beta + (\alpha-1)\sum_{i=1}^n \log X_i - \frac{1}{\beta}\sum_{i=1}^n X_i$.

The partial derivatives are

$\frac{\partial l}{\partial\alpha} = -n\psi(\alpha) - n\log\beta + \sum_{i=1}^n \log X_i$, $\quad \frac{\partial l}{\partial\beta} = -\frac{n\alpha}{\beta} + \frac{1}{\beta^2}\sum_{i=1}^n X_i$,

where $\psi(\alpha) = \frac{d}{d\alpha}\log\Gamma(\alpha)$ is the digamma function.

Setting the second partial derivative equal to zero, we find $\hat{\beta} = \bar{X}/\alpha$.

When this solution is substituted into the first partial derivative, we obtain a nonlinear equation for the MLE of $\alpha$:

$-n\psi(\alpha) - n\log\bar{X} + n\log\alpha + \sum_{i=1}^n \log X_i = 0$.

This equation cannot be solved in closed form. Newton’s method or another iterative method can be used.
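As an illustration (not taken from the notes), one Newton iteration for this equation, after dividing through by $n$ and writing $s = \log\bar{X} - \frac{1}{n}\sum_i \log X_i$, is:

```latex
g(\alpha) = \log\alpha - \psi(\alpha) - s, \qquad
g'(\alpha) = \frac{1}{\alpha} - \psi'(\alpha), \qquad
\alpha_{t+1} = \alpha_t - \frac{g(\alpha_t)}{g'(\alpha_t)},
```

where $\psi'$ is the trigamma function; the iteration stops when $|g(\alpha_t)|$ is sufficiently small.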

digamma(x) = function in R that computes $\psi(x)$, the derivative of the log of the gamma function at x.

uniroot(f,interval) = function in R that finds an approximate zero of a function in the interval. There should be only one zero in the interval, and the values of f at the lower and upper endpoints of the interval should have opposite signs.

alphahatfunc=function(alpha,xvec){
  # Profile likelihood equation for alpha, with betahat=mean(xvec)/alpha substituted in
  n=length(xvec);
  eq=-n*digamma(alpha)-n*log(mean(xvec))+n*log(alpha)+sum(log(xvec));
  eq;
}

> alphahatfunc(.3779155,illinoisrainfall)

[1] 65.25308

> alphahatfunc(.5,illinoisrainfall)

[1] -45.27781

> alpharoot=uniroot(alphahatfunc,interval=c(.377,.5),xvec=illinoisrainfall)

> alpharoot

$root

[1] 0.4407967

$f.root

[1] -0.004515694

$iter

[1] 4

$estim.prec

[1] 6.103516e-05

> betahatmle=mean(illinoisrainfall)/.4407967
> betahatmle
[1] 0.5090602
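As a cross-check of the same computation outside R, here is a minimal sketch in Python. Since the Illinois rainfall values are not reproduced in these notes, it uses simulated gamma data (the parameters 0.44 and 0.51 are hypothetical, chosen to resemble the fitted values above); the digamma function is hand-rolled from the standard recurrence and asymptotic expansion, and bisection stands in for uniroot:

```python
import math
import random

def digamma(x):
    # psi(x) via the recurrence psi(x) = psi(x+1) - 1/x, then the standard
    # asymptotic expansion once the argument is >= 10.
    s = 0.0
    while x < 10:
        s -= 1.0 / x
        x += 1.0
    return s + math.log(x) - 1/(2*x) - 1/(12*x**2) + 1/(120*x**4)

def alphahat_equation(alpha, xs):
    # Same profile likelihood equation as alphahatfunc in the R code,
    # divided by n: log(alpha) - psi(alpha) - (log(xbar) - mean(log x)).
    n = len(xs)
    xbar = sum(xs) / n
    return (math.log(alpha) - digamma(alpha)
            - math.log(xbar) + sum(math.log(x) for x in xs) / n)

def solve_alpha(xs, lo=1e-3, hi=100.0, tol=1e-10):
    # Bisection: the equation is decreasing in alpha, positive near 0 and
    # negative for large alpha, so [lo, hi] brackets the unique root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if alphahat_equation(mid, xs) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(1)
true_alpha, true_beta = 0.44, 0.51   # hypothetical, not the rainfall MLEs
xs = [random.gammavariate(true_alpha, true_beta) for _ in range(20000)]
alphahat = solve_alpha(xs)
betahat = (sum(xs) / len(xs)) / alphahat   # betahat = xbar / alphahat
print(alphahat, betahat)
```

With 20,000 simulated observations the estimates land close to the true parameters, mirroring what uniroot does for the rainfall data.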

Consistency, asymptotic distribution, and optimality of the MLE for multiparameter estimation

Theorem 6.4.1: Let $X_1, \dots, X_n$ be iid with pdf $f(x;\theta)$ for $\theta \in \Omega$. Assume the regularity conditions (R6)-(R9) hold [similar to (R0)-(R5), assumptions that the log likelihood is smooth]. Then

(a) the likelihood equation has a solution $\hat{\theta}_n$ such that $\hat{\theta}_n \xrightarrow{P} \theta_0$;

(b) for any such consistent sequence of solutions, $\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{D} N_p(\mathbf{0},\, I^{-1}(\theta_0))$,

where $I(\theta)$ is the Fisher information matrix of $\theta$, with $(j,k)$ entry

$I_{jk}(\theta) = E\left[\frac{\partial \log f(X;\theta)}{\partial\theta_j}\,\frac{\partial \log f(X;\theta)}{\partial\theta_k}\right]$.

As in the univariate case, the Fisher information matrix can be expressed in terms of the second derivatives of the log likelihood function under the regularity conditions:

$I_{jk}(\theta) = -E\left[\frac{\partial^2 \log f(X;\theta)}{\partial\theta_j\,\partial\theta_k}\right]$.

Corollary 6.4.1: Let $X_1, \dots, X_n$ be iid with pdf $f(x;\theta)$ for $\theta \in \Omega$. Assume the regularity conditions (R6)-(R9) hold. Then $\hat{\theta}_n$ is an asymptotically efficient estimate in the sense that the asymptotic covariance matrix of any other consistent estimate is "at least as large" -- the difference of the two covariance matrices is positive semidefinite (in particular, the asymptotic variance of each component of the other estimate is at least as large as that of the corresponding component of $\hat{\theta}_n$).

Note on practical use of the theorem:

It is also true that $I(\hat{\theta}_n) \xrightarrow{P} I(\theta_0)$.

Thus, $\hat{\theta}_n$ is approximately $N_p\!\left(\theta_0,\ \frac{1}{n}I^{-1}(\hat{\theta}_n)\right)$, which can be used to form approximate confidence intervals.

Example 1: $X_1, \dots, X_n$ iid $N(\mu, \sigma^2)$.

$\log f(x;\mu,\sigma^2) = -\frac{1}{2}\log(2\pi\sigma^2) - \frac{(x-\mu)^2}{2\sigma^2}$

The second partial derivatives are

$\frac{\partial^2 \log f}{\partial\mu^2} = -\frac{1}{\sigma^2}$, $\quad \frac{\partial^2 \log f}{\partial\mu\,\partial\sigma^2} = -\frac{x-\mu}{\sigma^4}$, $\quad \frac{\partial^2 \log f}{\partial(\sigma^2)^2} = \frac{1}{2\sigma^4} - \frac{(x-\mu)^2}{\sigma^6}$.

Taking $-E[\cdot]$ of each (using $E[X-\mu]=0$ and $E[(X-\mu)^2]=\sigma^2$), we obtain

$I(\mu,\sigma^2) = \begin{pmatrix} 1/\sigma^2 & 0 \\ 0 & 1/(2\sigma^4) \end{pmatrix}$.

Thus,

$I^{-1}(\mu,\sigma^2) = \begin{pmatrix} \sigma^2 & 0 \\ 0 & 2\sigma^4 \end{pmatrix}$.

To form approximate confidence intervals in practice, we can substitute the MLE estimates into the covariance matrix:

$\widehat{\mathrm{Var}}(\hat{\mu}) = \frac{\hat{\sigma}^2}{n}$, $\quad \widehat{\mathrm{Var}}(\hat{\sigma}^2) = \frac{2\hat{\sigma}^4}{n}$.

Thus, an approximate 95% confidence interval for $\mu$ is $\bar{X} \pm 1.96\,\hat{\sigma}/\sqrt{n}$ and an approximate 95% confidence interval for $\sigma^2$ is $\hat{\sigma}^2 \pm 1.96\,\hat{\sigma}^2\sqrt{2/n}$.

Example 2:

Gamma distribution: from the second partial derivatives of $\log f(x;\alpha,\beta)$,

$I(\alpha,\beta) = \begin{pmatrix} \psi'(\alpha) & 1/\beta \\ 1/\beta & \alpha/\beta^2 \end{pmatrix}$,

where $\psi'(\alpha)$ is the trigamma function (trigamma in R).

For the Illinois rainfall data, $\hat{\alpha} = 0.4407967$ and $\hat{\beta} = 0.5090602$.

Thus,

$I(\hat{\alpha},\hat{\beta}) \approx \begin{pmatrix} 6.133 & 1.964 \\ 1.964 & 1.704 \end{pmatrix}$.

infmat=matrix(c(6.133,1.964,1.964,1.704),ncol=2)

> invinfmat=solve(infmat)

> invinfmat

[,1] [,2]

[1,] 0.2584428 -0.2978765

[2,] -0.2978765 0.9301816

Thus,

$\frac{1}{n}I^{-1}(\hat{\alpha},\hat{\beta}) \approx \frac{1}{n}\begin{pmatrix} 0.2584 & -0.2979 \\ -0.2979 & 0.9302 \end{pmatrix}$.

Thus, approximate 95% confidence intervals for $\alpha$ and $\beta$ are

$\hat{\alpha} \pm 1.96\sqrt{0.2584/n}$ and $\hat{\beta} \pm 1.96\sqrt{0.9302/n}$,

where $n$ is the sample size of the rainfall data.
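The matrix computations above can be cross-checked with a short calculation. This is only a sketch: the trigamma value is approximated by a truncated series plus an asymptotic tail, and the 2x2 inverse comes from the cofactor formula:

```python
import math

def trigamma(x):
    # psi'(x) = sum_{k>=0} 1/(x+k)^2; sum the first 20 terms directly and
    # close with the standard asymptotic tail 1/z + 1/(2z^2) + 1/(6z^3).
    s = sum(1.0 / (x + k) ** 2 for k in range(20))
    z = x + 20
    return s + 1/z + 1/(2*z**2) + 1/(6*z**3)

alphahat, betahat = 0.4407967, 0.5090602
# Fisher information matrix entries for the gamma distribution:
a = trigamma(alphahat)      # ~ 6.133, matching the notes
b = 1 / betahat             # ~ 1.964
d = alphahat / betahat**2   # ~ 1.701 (the notes show 1.704, a small rounding gap)
det = a * d - b * b
inv = [[d / det, -b / det], [-b / det, a / det]]  # cofactor formula for 2x2 inverse
print(a, b, d)
print(inv)
```

The resulting inverse agrees with the invinfmat output from solve() to about two decimal places.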

Note: We can also use the observed Fisher information to form confidence intervals based on maximum likelihood estimates, where in place of the information matrix, we use the observed information matrix O with entries

$O_{jk} = -\frac{\partial^2 l(\theta)}{\partial\theta_j\,\partial\theta_k}\bigg|_{\theta=\hat{\theta}}$,

and approximate the covariance matrix of $\hat{\theta}$ by $O^{-1}$ (no extra factor of $1/n$, since $l$ is the full-sample log likelihood).

We could also use the parametric bootstrap to form confidence intervals based on maximum likelihood estimates, where we resample from $f(x;\hat{\theta})$.
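A minimal sketch of the parametric bootstrap for the gamma model: fit the MLE, repeatedly resample from the fitted distribution, re-fit, and take percentiles of the bootstrap estimates. Since the rainfall data are not reproduced here, the "observed" sample below is simulated (parameters 0.44 and 0.51 are hypothetical stand-ins):

```python
import math
import random

def digamma(x):
    # psi(x) via the recurrence psi(x) = psi(x+1) - 1/x plus an asymptotic tail.
    s = 0.0
    while x < 10:
        s -= 1.0 / x
        x += 1.0
    return s + math.log(x) - 1/(2*x) - 1/(12*x**2) + 1/(120*x**4)

def gamma_mle(xs):
    # Solve log(alpha) - psi(alpha) = log(xbar) - mean(log x) by bisection,
    # then set betahat = xbar / alphahat (the profile MLE from the notes).
    n = len(xs)
    xbar = sum(xs) / n
    s = math.log(xbar) - sum(math.log(x) for x in xs) / n
    lo, hi = 1e-3, 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.log(mid) - digamma(mid) - s > 0:
            lo = mid
        else:
            hi = mid
    alphahat = 0.5 * (lo + hi)
    return alphahat, xbar / alphahat

random.seed(2)
# Hypothetical "observed" sample, standing in for the rainfall data.
data = [random.gammavariate(0.44, 0.51) for _ in range(500)]
alphahat, betahat = gamma_mle(data)

# Parametric bootstrap: resample from the fitted Gamma(alphahat, betahat),
# re-fit, and take percentiles of the bootstrap MLEs of alpha.
boot = []
for _ in range(500):
    resample = [random.gammavariate(alphahat, betahat) for _ in range(len(data))]
    boot.append(gamma_mle(resample)[0])
boot.sort()
ci_lo, ci_hi = boot[int(0.025 * 500)], boot[int(0.975 * 500) - 1]
print(alphahat, (ci_lo, ci_hi))
```

The percentile interval (ci_lo, ci_hi) is an alternative to the Wald intervals above and requires no Fisher information calculation.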