Supplementary Online Material

Value of Information for Earth Observing Systems

Roger M. Cooke

Resources for the Future, Dept Math, TU Delft

Feb. 28 2013

This appendix explains the analysis in Leroy, Anderson and Ohring (2008), "Climate Signal Detection Times and Constraints on Accuracy Requirements", J. Climate, vol. 21, 841–846, and provides some background on the theory of linear models.

Linear Model (M. Kendall and A. Stuart, The Advanced Theory of Statistics, Volume II, chap. 19):

Y = Xθ + ε;

Y ∈ (n × 1) [i.e. Y is an n by 1 matrix, observations of the dependent variable],

X ∈ (n × k) [matrix of observed covariate values, independent variables];

θ ∈ (k × 1) [vector of regression coefficients];

ε ∈ (n × 1) [vector of errors, random variables], E(ε) = (0,…,0)', E(εε') = σ²I_n (I_n = identity matrix of order n).

In block form the equations read Y = X × θ + ε.

X'X is a square (k × k) matrix. If X'X is invertible, then premultiplying by (X'X)⁻¹X' puts the block form into the shape of the normal equations:

(X'X)⁻¹X'Y = θ + (X'X)⁻¹X'ε.

The least squares estimate θ* of θ is found by taking expectations of both sides and solving for θ:

θ* = (X'X)⁻¹X'Y = (X'X)⁻¹X'(Xθ + ε). (1)
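As a sketch of eq. (1) in code (NumPy; the design matrix, coefficients and noise level are arbitrary illustrative choices), the normal-equations solution agrees with NumPy's standard least-squares solver:

```python
import numpy as np

# Illustrative check of eq. (1): theta* = (X'X)^{-1} X'Y.
# All sizes and parameter values below are arbitrary choices for the demo.
rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))
theta = np.array([1.0, -2.0, 0.5])
eps = rng.normal(scale=0.1, size=n)
Y = X @ theta + eps

theta_star = np.linalg.solve(X.T @ X, X.T @ Y)     # normal equations
theta_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]  # standard LS solver
print(np.allclose(theta_star, theta_lstsq))         # True
```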

The residual is:

Y − Xθ* = (Xθ + ε) − X(X'X)⁻¹X'(Xθ + ε) = (I_n − X(X'X)⁻¹X')ε =: (I_n − Q)ε.

One verifies that Q ∈ (n × n), Rank(Q) = k, Q'Q = Q. Also (I_n − Q)'(I_n − Q) = I_n − Q.
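These properties of the projection matrix Q can be spot-checked numerically (again with an arbitrary random design):

```python
import numpy as np

# Numerical check of the stated properties of Q = X (X'X)^{-1} X'.
# Dimensions are arbitrary illustrative choices.
rng = np.random.default_rng(1)
n, k = 20, 4
X = rng.normal(size=(n, k))
Q = X @ np.linalg.solve(X.T @ X, X.T)

assert np.allclose(Q.T @ Q, Q)          # Q'Q = Q (symmetric idempotent)
assert np.linalg.matrix_rank(Q) == k    # Rank(Q) = k
M = np.eye(n) - Q
assert np.allclose(M.T @ M, M)          # (I_n - Q)'(I_n - Q) = I_n - Q
```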

For any B ∈ (n × n), E(ε'Bε) = σ² tr B, since E(εiεj) = 0 for i ≠ j. Therefore E(Y − Xθ*)'(Y − Xθ*) = E(ε'(I_n − Q)ε) = σ²(tr I_n − tr X(X'X)⁻¹X'). We may commute X(X'X)⁻¹ and X' under the trace operator, converting the product from an (n × n) to a (k × k) matrix with 1's on the diagonal, whose trace is k. Hence

E(Y − Xθ*)'(Y − Xθ*) = σ²(n − k). (2)

Therefore, the error variance is estimated as:

s² ~ (Y − Xθ*)'(Y − Xθ*) / (n − k). (3)
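A Monte Carlo sketch of (2)–(3): the residual sum of squares divided by (n − k) averages to σ². Parameter values are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check that E[RSS] = sigma^2 (n - k), i.e. that
# RSS / (n - k) is an unbiased estimate of sigma^2.
rng = np.random.default_rng(2)
n, k, sigma = 30, 3, 0.7
X = rng.normal(size=(n, k))
theta = rng.normal(size=k)
H = X @ np.linalg.solve(X.T @ X, X.T)   # projection onto col(X)

Y = X @ theta + rng.normal(scale=sigma, size=(20000, n))
resid = Y - Y @ H                        # H is symmetric: projects each row
est = np.sum(resid**2, axis=1) / (n - k)
print(np.mean(est))                      # close to sigma^2 = 0.49
```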

To estimate the variance of θ*, note that E(θ*) = θ, so

Var(θ*) = E{(θ* − θ)(θ* − θ)'} = E{[(X'X)⁻¹X'ε][(X'X)⁻¹X'ε]'}

= (X'X)⁻¹X' E(εε') X(X'X)⁻¹ = σ²(X'X)⁻¹. (4)
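Eq. (4) can likewise be checked by simulation: the empirical covariance of the estimates matches σ²(X'X)⁻¹. All parameter values are illustrative.

```python
import numpy as np

# Monte Carlo check of eq. (4): Cov(theta*) = sigma^2 (X'X)^{-1}.
rng = np.random.default_rng(3)
n, k, sigma = 40, 2, 1.0
X = rng.normal(size=(n, k))
A = np.linalg.solve(X.T @ X, X.T)       # (X'X)^{-1} X'
theta = np.array([2.0, -1.0])

E = rng.normal(scale=sigma, size=(50000, n))
draws = (X @ theta + E) @ A.T           # 50000 realizations of theta*
emp_cov = np.cov(draws, rowvar=False)
theory = sigma**2 * np.linalg.inv(X.T @ X)
print(np.max(np.abs(emp_cov - theory)))  # small
```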

Suppose we standardize X and Y by subtracting their averages, so that Y and the columns of X have mean zero. Then X'Y is the vector of covariances of the columns Xi with Y, where each Xi is an n-component column vector (here cov denotes the centered sum of products, without a 1/n normalization, consistent with (6) below):

X'Y = [ cov(X1,Y) … cov(Xk,Y) ]'.

Likewise X'X is the variance–covariance matrix of X.

θ* is the slope of the best linear estimate of Y based on X; since this slope is invariant under shifts of location, it is given by VAR(X)⁻¹ COV(X,Y). By (4), the variance of the slope based on the observations (Y, X) is s² VAR(X)⁻¹, where s² is estimated as (Y − Xθ*)'(Y − Xθ*) / (n − k).
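The equivalence of the two expressions for the slope can be checked numerically. The sketch below uses NumPy's biased covariance (dividing by n), which cancels on both sides of the linear system; data are arbitrary.

```python
import numpy as np

# After centering, the least-squares coefficients equal
# VAR(X)^{-1} COV(X, Y); the 1/n normalization in np.cov cancels.
rng = np.random.default_rng(5)
n, k = 200, 3
X = rng.normal(size=(n, k)) + 1.0
Y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(size=n)

Xc = X - X.mean(axis=0)
Yc = Y - Y.mean()
theta_ls = np.linalg.solve(Xc.T @ Xc, Xc.T @ Yc)   # normal equations

C = np.cov(np.column_stack([X, Y]), rowvar=False, bias=True)
theta_cov = np.linalg.solve(C[:k, :k], C[:k, k])   # VAR(X)^{-1} COV(X,Y)
print(np.allclose(theta_ls, theta_cov))            # True
```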

Note that if X is univariate (k = 1), then s² is estimated as Σi=1..n (Yi − Xiθ*)² / (n − 1), and

Var(θ*) = s² / s²(X). (5)

To interpret this, note that if the X values are close together, s²(X) is small and the uncertainty in θ is large; conversely, if the X values are spread out, we gain more certainty about θ.
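This interpretation is easy to see by simulation: with the same noise level, a spread-out design gives a much smaller slope variance than a clustered one. The true slope (2.0) and the two designs are arbitrary illustrative choices.

```python
import numpy as np

# Illustration of eq. (5): spreading the X values out shrinks the
# variance of the estimated slope, at a fixed noise level.
rng = np.random.default_rng(4)
sigma, n, reps = 1.0, 25, 20000

def slope_var(x):
    x = x - x.mean()
    noise = rng.normal(scale=sigma, size=(reps, n))
    slopes = ((2.0 * x + noise) @ x) / (x @ x)   # LS slope, true slope 2.0
    return slopes.var()

x_tight = np.linspace(-0.5, 0.5, n)   # clustered design
x_wide = np.linspace(-5.0, 5.0, n)    # spread-out design
print(slope_var(x_tight) > slope_var(x_wide))   # True
```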


Application to Earth Observing Systems

This is the basis of the uncertainty accounting in Leroy et al. 2008. X is a column vector of observation times at evenly spaced time points, with time interval d. However, the errors are not independent, so we do not have E(εiεj) = 0 for i ≠ j. We will have covariance terms in the estimate of Var(θ*).

Switching to Leroy's notation, we have an N-element set of data di at times ti. The trend m is covariance over variance, where t* is the mean of the ti's:

m = Σi=1..N di (ti − t*) / Σi=1..N (ti − t*)². (6)

The square of m is

Σi=1..N Σk=1..N dk (tk − t*) di (ti − t*) / [ Σi=1..N (ti − t*)² ]². (7)

Considering the di's as random variables, fluctuations around the trend line are correlated; if σ²var is the "zero lag variance associated with natural variability", and there are no other sources of variation, then

Cov(di, dk) = σ²var Corri−k(var), (8)

where Corri−k(var) is the correlation function of natural variability for a time lag |i − k|d. NOTE: this equation assumes that, after subtracting the trend line, the series is stationary: σ²var is invariant and Corri−k depends only on the difference |i − k|. If there is also stationary measurement uncertainty with a time lag, the last equation becomes

Cov(di, dk) = σ²var Corri−k(var) + σ²meas Corri−k(meas) = ξ(|i − k|). (9)

The expectation of (7) is then:

Σi=1..N Σk=1..N (tk − t*) (ti − t*) ξ(|i − k|) / [ Σi=1..N (ti − t*)² ]². (10)
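Formula (10) can be evaluated directly once ξ is specified. The sketch below assumes, purely for illustration, AR(1) correlation functions Corru = φ^|u| for both terms in (9); none of the parameter values come from Leroy et al.

```python
import numpy as np

# Direct evaluation of eq. (10), assuming illustrative AR(1) correlation
# functions Corr_u = phi**|u|; all parameter values are arbitrary.
N, d = 120, 1.0
t = d * np.arange(N)
tc = t - t.mean()                       # t_i - t*
s2_var, phi_var = 1.0, 0.6
s2_meas, phi_meas = 0.3, 0.3

lag = np.abs(np.arange(N)[:, None] - np.arange(N)[None, :])   # |i - k|
xi = s2_var * phi_var**lag + s2_meas * phi_meas**lag          # eq. (9)
var_m = (tc[:, None] * tc[None, :] * xi).sum() / (tc @ tc)**2
print(var_m)
```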

To simplify (10), put i + u = k and replace the sum over k with a sum over u; if N is large and correlations are local, we approximate the numerator as:

Σi=1..N (ti − t*) Σu=−∞..∞ (ti+u − t*) ξ(|u|).

Since the ti's are evenly spaced, the contribution to the sum over u for u = b and u = −b will be

( (ti+b − t*) + (ti−b − t*) ) ξ(|b|) = 2(ti − t*) ξ(|b|).

The numerator in (10) becomes approximately

Σi=1..N (ti − t*)² Σu=−∞..∞ ξ(|u|).

The variance approximation becomes

Var(m) ~ [ σ²var Σu Corru(var) + σ²meas Σu Corru(meas) ] / Σi=1..N (ti − t*)². (11)

We define characteristic lengths, where d is the length of time between measurements, as

τvar = d Σk=−∞..∞ Corrk(var),

τmeas = d Σk=−∞..∞ Corrk(meas). (12)
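For an AR(1) correlation function (an illustrative assumption, as above), the two-sided sum in (12) has the closed form (1 + φ)/(1 − φ), which a truncated numerical sum reproduces:

```python
import numpy as np

# For Corr_k = phi**|k| (illustrative AR(1) choice), the two-sided sum
# in eq. (12) equals (1 + phi) / (1 - phi), so tau = d (1 + phi) / (1 - phi).
phi, d = 0.6, 1.0
ks = np.arange(-500, 501)               # truncation stands in for +-infinity
tau_numeric = d * np.sum(phi**np.abs(ks))
tau_closed = d * (1 + phi) / (1 - phi)
print(np.isclose(tau_numeric, tau_closed))   # True
```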

We now use a combinatorial trick, noting that (N even)

Σi=1..N ( (ti − t*)/d )² ≈ 2 Σi=1..N/2 i².

Approximating the sum by the integral ∫0..N/2 x² dx = (1/3)(N/2)³, we find that

Σi=1..N (ti − t*)² ~ d²N³/12. (13)
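Eq. (13) is easy to verify numerically; in fact, for evenly spaced times the centered sum of squares is d²(N³ − N)/12 exactly, so the relative error of d²N³/12 vanishes as N grows:

```python
import numpy as np

# Check of eq. (13): for evenly spaced times the centered sum of squares
# is d^2 (N^3 - N) / 12 exactly, hence ~ d^2 N^3 / 12 for large N.
d = 2.5   # arbitrary spacing
for N in (10, 100, 1000):
    t = d * np.arange(N)
    ss = np.sum((t - t.mean())**2)
    assert np.isclose(ss, d**2 * (N**3 - N) / 12)
    print(ss / (d**2 * N**3 / 12))   # ratio tends to 1 as N grows
```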

Writing ΔT = Nd and substituting from (12), with the 'extra d' term, we find:

Var(m) ~ 12(ΔT)⁻³ ( σ²var τvar + σ²meas τmeas ). (14)

This is eq. (1) in the text and in Leroy et al. (2008).
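Putting the pieces together, the approximation (14) can be compared against the exact double sum (10), again assuming AR(1) correlation functions for illustration; for a long record with local correlations the two agree closely:

```python
import numpy as np

# Sanity check of eq. (14) against the exact double sum (10), assuming
# illustrative AR(1) correlations and a long record (all values arbitrary).
N, d = 2000, 0.1
t = d * np.arange(N)
tc = t - t.mean()
s2_var, phi_var = 1.0, 0.5
s2_meas, phi_meas = 0.2, 0.3

# Exact variance via the double sum (10)
lag = np.abs(np.arange(N)[:, None] - np.arange(N)[None, :])
xi = s2_var * phi_var**lag + s2_meas * phi_meas**lag
var_exact = (tc[:, None] * tc[None, :] * xi).sum() / (tc @ tc)**2

# Approximation (14) with tau from the AR(1) closed form
tau_var = d * (1 + phi_var) / (1 - phi_var)
tau_meas = d * (1 + phi_meas) / (1 - phi_meas)
DT = N * d
var_approx = 12 * DT**-3 * (s2_var * tau_var + s2_meas * tau_meas)
print(var_approx / var_exact)   # close to 1
```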