Norm and Condition Number of a Matrix

Norm and condition number of a matrix

If A.x = b then A(x+dx) = b+db. Since we are dealing with a set of linear equations, A. dx = db.

If db is an error term then the corresponding error in x, dx, is given by dx = A-1.db. The magnitude and direction of this error obviously depend on the inverse of A and will be large (for some db) when A is nearly singular. Suppose that A is symmetric with positive eigenvalues then it is possible to expand the inverse of A in terms of the eigenvectors and eigenvalues of A as follows:

A.xi = li xi

ci are coefficients to be determined.

The last equation holds provided the vectors xi are complete and orthonormal and so ci li =1 for all i and ci =1/li.

Now consider dx = A-1.db. Rewrite this as

and we see that the error term, db, is projected onto xi with coefficient 1/li xiT.db. The largest contribution to dx will come from eigenvectors with the smallest eigenvalues and those for which db and xi are nearly parallel. Nearly singular matrices will be most sensitive since they have the smallest eigenvalues.

Relative error

The error dx will depend on both the direction and magnitude of db. It makes more sense therefore to compare the relative change ||db||/||b|| with the relative error ||dx ||/||x||. For a positive definite matrix (all eigenvalues positive) the solution vector and error term always satisfy:

||x|| ≥ ||b||/lmax and ||dx|| ≤ ||db||/lmin

Therefore the relative error is bounded by ||dx||/||x|| ≤ ||db||/||b|| (lmax/lmin)

The ratio (lmax/lmin) is known as the condition number of A.

Norm of A

For a symmetric matrix the condition number can be expressed in terms of its maximum and minimum eigenvalues. When A is not symmetric then we make use of the norm of the matrix defined by:

||A|| = max (x≠0) ||A.x||/||x||.

This is a measure of the amplifying power of A and replaces ||dx||/||x|| when A is not symmetric. In this case the condition number is

c = ||A|| ||A-1||

To see why this is an appropriate definition for the condition number, replace A-1 by its expansion in eigenvectors and eigenvalues above. The relative error satisfies

||dx||/||x|| ≤ c ||db||/||b||

as before.

If we consider the effect of an error in the elements of A instead of the vector b then we find, perhaps surprisingly, that the same definition of condition number satisfies

||dx||/||x|| ≤ c ||dA||/||A||

(A+dA).(x+dx) = b

A.dx+dA.x+dA.dx = 0

dx = -A-1.dA.(x+.dx) = 0

Multiplying by dA amplifies a vector by no more than the norm, ||dA||, and multiplying by A-1amplifies by no more than ||A-1||. Therefore

||dx|| = ||A-1||.||dA||.||(x+.dx)||

||dx||/||x+dx || ≤ ||A-1||.||dA|| = c ||dA||/||A||

Round off error therefore comes from: natural sensitivity, c, and the actual errors in A and b.

For the case of an eigenvalue problem the relevant condition number is that of the eigenvector matrix, S, not A. For the eigenvalue problem A.S = S.L with an error dA in A

(A+dA).S = S.(L+ dL)

dA.S = S.dL

dL = S-1.dA.S

||dL|| = ||S-1||.||dA||.||S|| = c ||dA||

NB When S is an orthogonal matrix (A real symmetric, unitary if A Hermitian) then c = 1 since there is no amplification by an orthogonal matrix.

IX Special Functions

Special functions such as the gamma function, Legendre polynomials or Bessel functions appear in many aspects of computational physics. For example, the incomplete gamma function arises in methods for rapidly and absolutely converging lattice sums in crystal physics. They are available in many scientific computing libraries such as the Gnu Scientific library (GSL) www.gnu.org/software/gsl/

Gamma function

See any text in mathematical physics, e.g. Mathematical methods for physicists, Arfken and Weber.

Euler definition (Infinite limit)

G(z+1) = z G(z)

G(1) = 1.G(0) = 1

G(2) = 1.G(1) = 1

G(3) = 2.G(2) = 2.1 = 2

G(4) = 3.G(3) = 3.2.1 = 6

G(5) = 4.G(4) = 4.3.2.1 = 24

G(n+1) = n.G(n) = (n-1)!

Euler definition (Definite Integral)

With the change of variable t = t1/2 this becomes

erf(x) is known as the error function.

Incomplete gamma function

The incomplete gamma function is defined by generalising the Euler definite integral definition of the gamma function to

The utility of the incomplete gamma function lies in the fact that multiplying an expression by (G(z,x)+g(z,x))/G(z) leaves it unchanged and the two parts generated in this way may be, rapidly convergent, e.g., when the original expression is not. Functions such as the gamma functions are generally computed numerically using either asymptotic series (which may converge rapidly for large or small arguments of the function) and recursion relations.

By expanding the exponential in the incomplete gamma function and integrating term by term we obtain

For the particular case z=1/2

erfc(x) is known as the complementary error function and for all arguments

erf(x) + erfc(x) = 1

Plot of erf(x) (rising from -1 to 1), erfc(x) (falling from 2 to zero) and erf(x) + erfc(x) (constant value of 1).

The series expansions for erf(x) and erfc(x) are

The first expansion is obtained simply by expanding e-t^2 and integrating erf(x) term by term. The second is obtained by writing

and integrating by parts. The first expansion is good for small x and the second for large x.

Recurrence relations for the incomplete gamma function

The incomplete gamma functions satisfy the following upwards and downwards recursion relations

The first two are upwards and the last two downwards and the relations can be obtained using integration by parts. The starting points for the downwards recurrences are

See paper by Guseinov and Mamedov for further details.

Use of gamma incomplete functions for calculating lattice sums

The Madelung constant is a familiar concept from crystal physics.

A real space lattice is defined by

and the corresponding reciprocal lattice is defined by

The Madelung constant is defined by

where the primed sum indicates that the self term i = j = k = 0 is omitted. Write this formally as an integral over all space. Define

Sturm-Liouville theory and orthogonal functions

The Sturm-Liouville equation is a second order ordinary differential equation (ode) of the form

w(x) is a weight (or density) function. Many physical problems described by an eigenvalue equation belong to the class of Sturm-Liouville equations. Some examples are

equation / p(x) / q(x) / l / w(x)
Legendre / 1-x2 / 0 / ℓ(ℓ+1) / 1
Bessel / x / -n2/x / a2 / x
Hermite / e-x2 / 0 / 2a / e-x2
SHO / 1 / 0 / n2 / 1

We will look at Bessel’s equation illustrated by vibration of a drumhead and Legendre’s equation illustrated by Laplace’s equation in 3-D. In particular we will look at recursive methods for calculating Bessel functions and Legendre polynomials. For further details on the Sturm-Liouville equation and properties its solutions see e.g. Arfken and Weber.

Bessel’s equation and the wave equation in 2-D

Recurrence relations and zeroes of Bessel functions

Jn and Yn are Bessel functions of the first and second kinds, respectively. Jn is finite at the origin (Jn(0) = 1) while Yn is divergent. For a vibrating drumhead, we impose the boundary conditions, R(0) = finite, R(1) = 0 (assume drumhead has unit radius). Then B = 0 and the solution to the ode is a linear combination of Bessel functions of the first kind. The allowed solutions will have Bessel function zeroes at r = 1. Let the mth zero of the nth Bessel function be denoted kmn.

The first three Bessel functions of the first kind are plotted below. The function which takes the value one at the origin is J0[x]. J1(x) and J2(x) have value zero at the origin and J1 is more tightly curved than J2.

Zeroes of the first two Bessel functions of the first kind

m/n / 0 / 1
1 / 2.4048 / 3.8318
2 / 5.5201 / 7.0516
3 / 8.6537 / 10.1735

Radial and angular nodes for modes tabulated above.

n = 0 n = 1

Radial nodal positions for corresponding modes on drumhead

m/n / 0 / 1
1 / 1.0000 / 1.0000
2 / 0.4356 / 0.5461
3 / 0.2778, 0.6378 / 0.3766, 0.6895

The generating function for Bessel functions of the first kind is

Legendre polynomials and Laplace’s equation in spherical polar coordinates

The Laplacian operator in spherical polar coordinates is

Recurrence relations for Legendre polynomials

The generating function for Legendre polynomials is

The first two Legendre polynomials are P0(x) = 1 and P1(x) =x so these may be used to calculate higher Legendre polynomials using the recurrence relation.

Example of a recursive function

double ftuvn(int t, int u, int v, int n, double *f, VECTOR_DOUBLE x)

{

double result ;

if (t < 0 || u < 0 || v < 0 ) return 0.0 ;

if (t >= u & t >= v) {

if (t == 0 & u == 0 & v == 0) return f[n] ;

result = (t > 0) ? x.comp1*ftuvn(t-1,u,v,n+1,f,x) + (t-1)*ftuvn(t-2,u,v,n+1,f,x) : 0.0 ;

}

else if (u > t & u >= v) {

result = (u > 0) ? x.comp2*ftuvn(t,u-1,v,n+1,f,x) + (u-1)*ftuvn(t,u-2,v,n+1,f,x) : 0.0 ;

}

else {

result = (v > 0) ? x.comp3*ftuvn(t,u,v-1,n+1,f,x) + (v-1)*ftuvn(t,u,v-2,n+1,f,x) : 0.0 ;

}

return result ;

}