Interpolation and Extrapolation Lecture
- Thought question
- Readings:
- Required:
- Arnell, N.W. 1995. Grid mapping of river discharge. Journal of Hydrology 167:39-56.
- Optional:
- Algorithms from Numerical Recipes (Press et al)
- Point Interpolation Methods handout from Dan Brown
- System Methodology Chapter
- What are interpolation and extrapolation?
- Interpolation-estimating numerical values between known points
- Extrapolation-estimating numerical values beyond range of known points
- When would we use interpolation and extrapolation?
- Interpolation
- Some statistical functions are difficult= to compute directly, so we use a table
- Data are often collected at discrete intervals, but we may want to predict values in between these intervals, but we may not have an explicit function to do this
- One method of data smoothing
- Extrapolation
- Making predictions beyond range of data
- We will see that it is sometimes useful when using a digital computer to make continuous time computations
- Interpolation methods
- Polynomial Interpolation
- Concept-if you have N data points, you can draw a polynomial curve of order N-1 exactly through the points. This is known as a Lagrange polynomial. Example: 2 points can be connected by a line (first degree polynomial), 3 points by a parabola (quadratic curve)Lagrange Polynomial- y=a1 + a2x + a3x2 + ...an+1xn
- Begin with linear interpolation
- Quadratic interpolation
- Higher order-difficult to program properly-use Neville=s algorithm (see Press et al for computer implementation). Neville=s algorithm begins with linear polynomial-then based on this, you compute the quadratic polynomial, then 3rd degree.
- Using Neville=s algorithm-can get estimate of error (see Press et al. Routine polint. Approximation Error y = P(x) + R(x) where P(x) is Lagrange and R(x) is error term R(x) = (x-x0) (x-x1) (x-x2) ... (x-xn) ((fn+1ξ) / (n=1)!) This is an approximate error, and can give the upper bound on errors of well behaved functions
- Principles and concepts
- More closely spaced points-better approximation with lower order
- Higher order approximations are better up to a point (~~4th order)
- Extrapolation beyond Δx is risky
- Error depends on higher derivatives-thus more rapidly changing functions are more difficult to interpolate
- Cubic Spline
- Note that linear (piecewise) interpolation has a zero second derivative in the interval between points, and an indefinite derivative at the points them selves. The cubic spline method attempts to get an interpolation that has a smooth first derivative, and has a continuous second derivative. Cubic spline assumes that the second derivative varies linearly between two adjacent points
- Spline-polynomial between two points whose derivatives depends on surrounding points
- with cubic spline-second derivative for y(i) depends only on y(i-1) and y(i+1), more stable than Lagrange polynomials where the derivatives depend on all points used in interpolation. This results in a local= interpolation
(1)Another, statistical local smoothing method, is the Locally Weighted Regression also known as loess
- Programming cubic spline tricky-see Press et al for algorithm. Note that to get this to work, need to use a routine to estimate 2nd derivative. In doing this, points at the end are a problem, so have to either (1) assume 2nd derivative =0 (natural spline) or (2) assume some other value
- Multidimensional Interpolation
- Briefly, several methods
- Bilinear interpolation
- Higher order for accuracy (for smooth functions): polynomial interpolation: do separate interpolations across each dimension
- Higher order for smoothness:
(1)Bicubic Interpolation-need to specify or compute gradients (eg dy/dx1 dy/dx2 and dy/dx1dx2
(2)Bicubic Spline-here compute second derivatives and follow spline logic
- Triangular Irregular Networks (TIN)
- Kriging
- Homework goals
- Real-life examples (When do I use interpolation?)
- Which method to use?
- How closely do the data need to be spaced (When is enough data enough?) -also how to evaluate this
Formulae
Linear Interpolation
Quadratic Interpolation
General Lagrange Polynomial
Approximation Error
y = P(x) + R(x) where P(x) is Lagrange and R(x) is error term
Cubic Spline-
Homework FW853
Due Friday Jan 17
Lagrange Interpolation Polynomials
1.Given the following data, compute the 6th order Lagrange Interpolation Polynomial, and plot the results for values of x from 1 to 15 in steps of 0.2 or less (i.e., compute y for x=1.0, 1.2 1.4 ...). Compute and display on the same graph the results of a stepwise linear interpolation (i.e., connect the dots with a straight line). What insight do you gain from this comparison?
X Y
210 4 8
6 6
8 4
10 6
12 8
1410
2.Given the following data on the abundance of a cohort of plant seedlings, use linear interpolation to estimate the value of y at time=5.5 . Begin this interpolation using only the endpoints of the data set (i.e., the first and last observations). Next, do the interpolation with the second and next to last data points, etc. Assume that the actual value at time 5.5 = 12.28. What is the observed interpolation error for each value of Δx. Approximately at what value of Δx does the interpolation error become less than 2% of the actual value?
Density of Cohort
Time Number per hectare
2.3158.82
3.9 44.16
4.7 23.28
5.1 16.91
5.3 14.41
5.4 13.30
5.45 12.78
5.55 11.80
5.6 11.33
5.7 10.46
5.9 8.92
6.3 6.47
7.1 3.41
8.7 0.95
Now, instead of using linear interpolation, use a 3rd order polynomial to see when the interpolation error becomes less than 2%.
3. Consider the data on river discharge contained in the Excel spreadsheet AFlow data for interpolation homework.xls@. The data are currently being collected on a daily basis, and the measurement at the time of collection is assumed to be taken without error. Your supervisor would like to know what the implications of sampling every other day, every three days, and every five days would be for the accuracy of the sampling program.
From these data, compute a quadratic (2nd order) Lagrange polynomial to interpolate for the known points collected on a daily basis. From this, determine the error rate for the different subsampling periods. Note there are many ways of computing error. Some examples include: 1) mean difference from actual 2) mean squared differences between predicted and actual (similar to variance) 3) mean absolute value of differences 4) maximum difference 5) mean (or maximum) percent difference. Think about which measure may be Abest@ to present to your supervisor to help guide the decision of whether sampling every other day or less frequently is acceptable.