Statistics 512 Notes 18:
Multiparameter maximum likelihood estimation
We consider $X_1,\ldots,X_n$ iid with pdf $f(x;\theta)$, where $\theta=(\theta_1,\ldots,\theta_p)$ is $p$-dimensional.
As before, the log likelihood is $l(\theta)=\sum_{i=1}^n \log f(x_i;\theta)$.
The maximum likelihood estimate is $\hat{\theta}=\arg\max_{\theta} l(\theta)$.
We can find critical points of the likelihood function by solving the vector equation
$$\frac{\partial l(\theta)}{\partial\theta} = \left(\frac{\partial l}{\partial\theta_1},\ldots,\frac{\partial l}{\partial\theta_p}\right)^T = \mathbf{0}.$$
We need to then verify that the critical point is a global maximum.
Example 1: Normal distribution
$X_1,\ldots,X_n$ iid $N(\mu,\sigma^2)$. The log likelihood is
$$l(\mu,\sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i-\mu)^2.$$
The partials with respect to $\mu$ and $\sigma^2$ are
$$\frac{\partial l}{\partial\mu} = \frac{1}{\sigma^2}\sum_{i=1}^n (x_i-\mu), \qquad \frac{\partial l}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (x_i-\mu)^2.$$
Setting the first partial equal to zero and solving for the MLE, we obtain $\hat{\mu}=\bar{x}$.
Setting the second partial equal to zero and substituting the MLE for $\mu$, we find that the MLE for $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2.$$
To verify that this critical point is a maximum, we need to check the following second derivative conditions:
(1) The two second-order partial derivatives with respect to each parameter are negative at the critical point:
$$\frac{\partial^2 l}{\partial\mu^2}\bigg|_{(\hat{\mu},\hat{\sigma}^2)} < 0 \qquad \text{and} \qquad \frac{\partial^2 l}{\partial(\sigma^2)^2}\bigg|_{(\hat{\mu},\hat{\sigma}^2)} < 0.$$
(2) The determinant of the Hessian matrix of second-order partial derivatives is positive at the critical point:
$$\left[\frac{\partial^2 l}{\partial\mu^2}\cdot\frac{\partial^2 l}{\partial(\sigma^2)^2} - \left(\frac{\partial^2 l}{\partial\mu\,\partial\sigma^2}\right)^2\right]\bigg|_{(\hat{\mu},\hat{\sigma}^2)} > 0.$$
See additional sheet for verification of (1) and (2) for normal distribution.
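A sketch of that verification (the second-order partials of the normal log likelihood, evaluated at the critical point):

```latex
\frac{\partial^2 l}{\partial \mu^2} = -\frac{n}{\sigma^2}, \qquad
\frac{\partial^2 l}{\partial \mu \, \partial \sigma^2} = -\frac{1}{\sigma^4}\sum_{i=1}^n (x_i-\mu), \qquad
\frac{\partial^2 l}{\partial (\sigma^2)^2} = \frac{n}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{i=1}^n (x_i-\mu)^2.
% At (\hat{\mu},\hat{\sigma}^2) = (\bar{x}, \tfrac{1}{n}\sum_i (x_i-\bar{x})^2),
% where \sum_i (x_i-\bar{x}) = 0 and \sum_i (x_i-\bar{x})^2 = n\hat{\sigma}^2:
\frac{\partial^2 l}{\partial \mu^2}\bigg|_{(\hat{\mu},\hat{\sigma}^2)} = -\frac{n}{\hat{\sigma}^2} < 0, \qquad
\frac{\partial^2 l}{\partial \mu\,\partial\sigma^2}\bigg|_{(\hat{\mu},\hat{\sigma}^2)} = 0, \qquad
\frac{\partial^2 l}{\partial (\sigma^2)^2}\bigg|_{(\hat{\mu},\hat{\sigma}^2)}
  = \frac{n}{2\hat{\sigma}^4} - \frac{n\hat{\sigma}^2}{\hat{\sigma}^6} = -\frac{n}{2\hat{\sigma}^4} < 0,
% so the determinant condition also holds:
\left(-\frac{n}{\hat{\sigma}^2}\right)\left(-\frac{n}{2\hat{\sigma}^4}\right) - 0^2 = \frac{n^2}{2\hat{\sigma}^6} > 0.
```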
Example 2: Gamma distribution
$X_1,\ldots,X_n$ iid with pdf
$$f(x;\alpha,\beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\,x^{\alpha-1}e^{-x/\beta}, \quad x>0,$$
so the log likelihood is
$$l(\alpha,\beta) = -n\log\Gamma(\alpha) - n\alpha\log\beta + (\alpha-1)\sum_{i=1}^n \log x_i - \frac{1}{\beta}\sum_{i=1}^n x_i.$$
The partial derivatives are
$$\frac{\partial l}{\partial\alpha} = -n\psi(\alpha) - n\log\beta + \sum_{i=1}^n \log x_i, \qquad \frac{\partial l}{\partial\beta} = -\frac{n\alpha}{\beta} + \frac{1}{\beta^2}\sum_{i=1}^n x_i,$$
where $\psi(\alpha)=\frac{d}{d\alpha}\log\Gamma(\alpha)$ is the digamma function.
Setting the second partial derivative equal to zero, we find $\hat{\beta} = \bar{x}/\alpha$.
When this solution is substituted into the first partial derivative, we obtain a nonlinear equation for the MLE of $\alpha$:
$$-n\psi(\alpha) - n\log\bar{x} + n\log\alpha + \sum_{i=1}^n \log x_i = 0.$$
This equation cannot be solved in closed form. Newton’s method or another iterative method can be used.
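For instance, writing the left-hand side as $g(\alpha) = -n\psi(\alpha) - n\log\bar{x} + n\log\alpha + \sum_i \log x_i$, Newton's method would iterate:

```latex
\alpha^{(t+1)} = \alpha^{(t)} - \frac{g\!\left(\alpha^{(t)}\right)}{g'\!\left(\alpha^{(t)}\right)},
\qquad \text{where } g'(\alpha) = -n\,\psi'(\alpha) + \frac{n}{\alpha}
```

and $\psi'$ is the trigamma function. Below, the notes instead use R's uniroot, which does not require the derivative.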
digamma(x) = function in R that computes the derivative of the log of the gamma function of x, $\psi(x) = \frac{d}{dx}\log\Gamma(x)$.
uniroot(f,interval) = function in R that finds an approximate zero of the function f in the interval. There should be only one zero in the interval, and the values of f at the lower and upper endpoints of the interval should have opposite signs.
alphahatfunc=function(alpha,xvec){
# score for alpha with betahat=mean(xvec)/alpha substituted in
n=length(xvec);
eq=-n*digamma(alpha)-n*log(mean(xvec))+n*log(alpha)+sum(log(xvec));
eq;
}
> alphahatfunc(.3779155,illinoisrainfall)
[1] 65.25308
> alphahatfunc(.5,illinoisrainfall)
[1] -45.27781
> alpharoot=uniroot(alphahatfunc,interval=c(.377,.5),xvec=illinoisrainfall)
> alpharoot
$root
[1] 0.4407967
$f.root
[1] -0.004515694
$iter
[1] 4
$estim.prec
[1] 6.103516e-05
> betahatmle=mean(illinoisrainfall)/.4407967
> betahatmle
[1] 0.5090602
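The same profile-score root finding can be sketched outside R, for instance in Python using only the standard library. This is a hypothetical illustration: simulated gamma data stand in for the Illinois rainfall series (which is not reproduced here), and the digamma function is approximated by a central difference of `math.lgamma`.

```python
import math
import random

def digamma(x, h=1e-6):
    """Central-difference approximation to psi(x) = d/dx log Gamma(x)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def score_alpha(alpha, xs):
    """Profile score for alpha with betahat = mean(xs)/alpha substituted in
    (same expression as alphahatfunc in the R code above)."""
    n = len(xs)
    return (-n * digamma(alpha) - n * math.log(sum(xs) / n)
            + n * math.log(alpha) + sum(math.log(x) for x in xs))

def bisect(f, lo, hi, tol=1e-10):
    """Bisection root finder; f(lo) and f(hi) must have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if flo * f(mid) <= 0:
            hi = mid
        else:
            lo, flo = mid, f(mid)
    return 0.5 * (lo + hi)

random.seed(1)
# Simulated stand-in for the rainfall data: Gamma(shape=0.44, scale=0.51)
xs = [random.gammavariate(0.44, 0.51) for _ in range(200)]
alpha_hat = bisect(lambda a: score_alpha(a, xs), 0.05, 5.0)
beta_hat = (sum(xs) / len(xs)) / alpha_hat  # betahat = xbar / alphahat
```

The bisection plays the role of uniroot: the profile score is strictly decreasing in $\alpha$, so any bracketing interval with a sign change contains the unique root.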
Consistency, asymptotic distribution and optimality of MLE for multiparameter estimation
Theorem 6.4.1: Let $X_1,\ldots,X_n$ be iid with pdf $f(x;\theta)$ for $\theta\in\Omega$. Assume the regularity conditions (R6)-(R9) hold [similar to (R0)-(R5), assumptions that the log likelihood is smooth]. Then
(a) $\hat{\theta} \xrightarrow{P} \theta$;
(b) $\sqrt{n}\,(\hat{\theta} - \theta) \xrightarrow{D} N_p\!\left(\mathbf{0}, \mathbf{I}(\theta)^{-1}\right)$,
where $\mathbf{I}(\theta)$ is the Fisher information matrix of $\theta$,
$$\mathbf{I}(\theta)_{jk} = E\left[\frac{\partial \log f(X;\theta)}{\partial\theta_j}\cdot\frac{\partial \log f(X;\theta)}{\partial\theta_k}\right], \qquad j,k=1,\ldots,p.$$
As in the univariate case, the Fisher information matrix can be expressed in terms of the second derivatives of the log likelihood function under the regularity conditions:
$$\mathbf{I}(\theta)_{jk} = -E\left[\frac{\partial^2 \log f(X;\theta)}{\partial\theta_j\,\partial\theta_k}\right].$$
Corollary 6.4.1: Let $X_1,\ldots,X_n$ be iid with pdf $f(x;\theta)$ for $\theta\in\Omega$. Assume the regularity conditions (R6)-(R9) hold. Then $\hat{\theta}$ is an asymptotically efficient estimate in the sense that the asymptotic covariance matrix of any other consistent estimate is "at least as large" (in particular, the asymptotic variance of each component of any other consistent estimate is at least as large as that of the corresponding component of $\hat{\theta}$).
Note on practical use of theorem:
It is also true that, approximately for large $n$,
$$\hat{\theta} \approx N_p\!\left(\theta, \frac{1}{n}\mathbf{I}(\theta)^{-1}\right).$$
Thus, $SE(\hat{\theta}_j) \approx \sqrt{\left[\mathbf{I}(\hat{\theta})^{-1}\right]_{jj}/n}$, which can be used to form approximate confidence intervals $\hat{\theta}_j \pm 1.96\sqrt{\left[\mathbf{I}(\hat{\theta})^{-1}\right]_{jj}/n}$.
Example 1: $X_1,\ldots,X_n$ iid $N(\mu,\sigma^2)$.
Taking negative expected values of the second-order partial derivatives of $\log f(x;\mu,\sigma^2)$,
$$\mathbf{I}(\mu,\sigma^2) = \begin{pmatrix} \frac{1}{\sigma^2} & 0 \\ 0 & \frac{1}{2\sigma^4} \end{pmatrix}.$$
Thus,
$$\mathbf{I}(\mu,\sigma^2)^{-1} = \begin{pmatrix} \sigma^2 & 0 \\ 0 & 2\sigma^4 \end{pmatrix}.$$
Thus, approximately for large $n$, $\hat{\mu} \approx N(\mu, \sigma^2/n)$ and $\hat{\sigma}^2 \approx N(\sigma^2, 2\sigma^4/n)$.
To form approximate confidence intervals in practice, we can substitute the MLE estimates into the covariance matrix: $\widehat{\mathrm{Var}}(\hat{\mu}) = \hat{\sigma}^2/n$ and $\widehat{\mathrm{Var}}(\hat{\sigma}^2) = 2\hat{\sigma}^4/n$.
Thus, an approximate 95% confidence interval for $\mu$ is $\hat{\mu} \pm 1.96\,\hat{\sigma}/\sqrt{n}$, and an approximate 95% confidence interval for $\sigma^2$ is $\hat{\sigma}^2 \pm 1.96\,\sqrt{2}\,\hat{\sigma}^2/\sqrt{n}$.
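As an illustration, these intervals can be computed directly; the data below are simulated (an $N(5,4)$ sample with an arbitrary seed), not from any data set in these notes.

```python
import math
import random
import statistics

random.seed(2)
xs = [random.gauss(5.0, 2.0) for _ in range(500)]  # simulated N(5, 4) sample
n = len(xs)

mu_hat = statistics.fmean(xs)                        # MLE of mu: the sample mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n  # MLE of sigma^2 (divisor n)

# Approximate 95% Wald intervals from the inverse Fisher information
ci_mu = (mu_hat - 1.96 * math.sqrt(sigma2_hat / n),
         mu_hat + 1.96 * math.sqrt(sigma2_hat / n))
ci_sigma2 = (sigma2_hat - 1.96 * math.sqrt(2) * sigma2_hat / math.sqrt(n),
             sigma2_hat + 1.96 * math.sqrt(2) * sigma2_hat / math.sqrt(n))
```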
Example 2:
Gamma distribution: taking second derivatives of
$$\log f(x;\alpha,\beta) = -\log\Gamma(\alpha) - \alpha\log\beta + (\alpha-1)\log x - \frac{x}{\beta}$$
and negative expectations (using $E[X]=\alpha\beta$), the Fisher information matrix is
$$\mathbf{I}(\alpha,\beta) = \begin{pmatrix} \psi'(\alpha) & \frac{1}{\beta} \\ \frac{1}{\beta} & \frac{\alpha}{\beta^2} \end{pmatrix},$$
where $\psi'$ is the trigamma function (trigamma in R).
For the Illinois rainfall data, substituting $\hat{\alpha}=0.4408$ and $\hat{\beta}=0.5091$,
$$\mathbf{I}(\hat{\alpha},\hat{\beta}) \approx \begin{pmatrix} 6.133 & 1.964 \\ 1.964 & 1.704 \end{pmatrix}.$$
Thus, we invert the estimated information matrix in R:
infmat=matrix(c(6.133,1.964,1.964,1.704),ncol=2)
> invinfmat=solve(infmat)
> invinfmat
[,1] [,2]
[1,] 0.2584428 -0.2978765
[2,] -0.2978765 0.9301816
Thus, the approximate covariance matrix of $(\hat{\alpha},\hat{\beta})$ is
$$\frac{1}{n}\,\mathbf{I}(\hat{\alpha},\hat{\beta})^{-1} = \frac{1}{n}\begin{pmatrix} 0.2584 & -0.2979 \\ -0.2979 & 0.9302 \end{pmatrix}.$$
Thus, approximate 95% confidence intervals for $\alpha$ and $\beta$ are
$$\hat{\alpha} \pm 1.96\sqrt{0.2584/n} \qquad \text{and} \qquad \hat{\beta} \pm 1.96\sqrt{0.9302/n}.$$
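A sketch of this final step in Python: the estimates and the inverse-information entries are copied from the R output above, but the sample size `n` is a hypothetical placeholder, since the rainfall sample size is not stated in this section.

```python
import math

# MLEs and inverse Fisher information copied from the R output above
alpha_hat, beta_hat = 0.4407967, 0.5090602
inv_info = [[0.2584428, -0.2978765],
            [-0.2978765, 0.9301816]]

def wald_ci(est, inv_info_jj, n, z=1.96):
    # Approximate 95% CI: est +/- z * sqrt([I(thetahat)^{-1}]_jj / n)
    se = math.sqrt(inv_info_jj / n)
    return (est - z * se, est + z * se)

n = 100  # hypothetical placeholder for the rainfall sample size
ci_alpha = wald_ci(alpha_hat, inv_info[0][0], n)
ci_beta = wald_ci(beta_hat, inv_info[1][1], n)
```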
Note: We can also use the observed Fisher information to form confidence intervals based on maximum likelihood estimates, where in place of $n\,\mathbf{I}(\hat{\theta})$ we use the observed information matrix $\mathbf{O}$, where
$$\mathbf{O}_{jk} = -\frac{\partial^2 l(\theta)}{\partial\theta_j\,\partial\theta_k}\bigg|_{\theta=\hat{\theta}} = -\sum_{i=1}^n \frac{\partial^2 \log f(x_i;\theta)}{\partial\theta_j\,\partial\theta_k}\bigg|_{\theta=\hat{\theta}}.$$
We could also use the parametric bootstrap to form confidence intervals based on maximum likelihood estimates, where we resample from the fitted model $f(x;\hat{\theta})$.
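A minimal sketch of the parametric bootstrap, assuming a normal model so the MLE at each resample has a closed form (simulated data and percentile intervals; the gamma/rainfall case would work the same way but needs the iterative MLE at every resample):

```python
import random
import statistics

random.seed(3)
xs = [random.gauss(0.0, 1.0) for _ in range(200)]  # "observed" sample (simulated here)
n = len(xs)
mu_hat = statistics.fmean(xs)
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n  # MLE of sigma^2

B = 1000
boot = []
for _ in range(B):
    # Parametric bootstrap: resample from the fitted model N(mu_hat, sigma2_hat)
    ys = [random.gauss(mu_hat, sigma2_hat ** 0.5) for _ in range(n)]
    m = statistics.fmean(ys)
    boot.append(sum((y - m) ** 2 for y in ys) / n)  # sigma^2 MLE on each resample
boot.sort()
# Percentile 95% confidence interval for sigma^2
ci_sigma2 = (boot[int(0.025 * B)], boot[int(0.975 * B)])
```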