Modeling of Atmospheric Chemistry
Page 25
The interpolation process consists of determining a function Ψ(r) called interpolant whose value is strictly equal to the true function F(r) for specified values ri of the independent variable r, and approximates this function in the intervals between the different nodes ri. Thus, we impose at N + 1 points
Ψ(ri) = F(ri) ≡ Fi (i = 0, N)
(4.366)
The interpolation error e(r) at point r is defined as the difference between the true function F(r) and the interpolating function Ψ(r)
e(r) = F(r) − Ψ(r)
(4.367)
The fitting process adopted for the interpolation often consists of minimizing the mean squared error.
Interpolation schemes must model F(r) by some plausible functional form. In 1-D problems, for example, the function F(x) is approximated by polynomials passing through the nodes where the value of the function is known. In global polynomial interpolation, a single polynomial Ψ(M)(x) of degree M passing through M + 1 points is defined for the entire domain under consideration. In piecewise methods, a different polynomial function is defined in each of the M specified intervals [xi, xi+1] covering the entire domain. Conditions are imposed at the breakpoints xi between intervals to ensure continuity and smoothness of the resulting interpolant.
4.16.1 Global Polynomial Interpolation
If we know the value of a function F(x) at M + 1 distinct points (x0, x1, …, xM) in a specified domain [a, b], we can define a single polynomial Ψ(M)(x) of degree M
Ψ(M)(x) = c0 + c1x + c2x2 + . . . . . + cMxM
(4.368)
that approximates F(x) and passes through all these points. The M + 1 coefficients ci (i = 0, …, M) are determined by expressing that the polynomial verifies the known values Fi of the function at each point xi:
(4.369)
We solve therefore the system of M + 1 equations
(4.370)
where the square matrix of dimension M + 1 is referred to as the Vandermonde matrix. This matrix is nonsingular if all points xi are distinct. The solution of this system can be obtained by standard techniques such as the LU decomposition (see Box 6.2). This technique, however, is computationally expensive since it requires of the order of M3 operations. More efficient numerical methods requiring only O(M2) operations are available (Press et al., 2007). Vandermonde matrices are notoriously ill-conditioned, so the method should be used only if the data points are well-spaced and the values of the function well-behaved. The Newton or Lagrange interpolations described below are often preferred.
Newton interpolation
A polynomial of degree M can be expressed by the Newton expression
(4.371)
or equivalently
(4.372)
with
(4.373)
and φ–1(x) = 1. The coefficients ai, called divided differences, are computed from
a0 = F(x0) a1 = F[x1, x0] a2 = F[x2, x1, x0] aM = F[xM, xM − 1, ….. , x1, x0]
(4.374)
where the bracket functions are defined by
(4.375)
(4.376)
…….
(4.377)
An illustrative example (M = 2) is given by the quadratic interpolation polynomial Ψ(2)(x),
Ψ(2)(x) = a0 + a1(x − x0) + a2(x − x0)(x − x1)
(4.378)
whose value is equal to F0, F1, and F2 at the three points x0, x1, and x2, respectively. By application of (4.374), one finds
(4.379)
An advantage of the method is that the Newton interpolation algorithm can be expressed as a recursive process since
Ψ(M + 1)(x) = Ψ(M)(x) + aM + 1 φM(x)
(4.380)
if
(4.381)
and
(4.382)
Thus, one can easily calculate Ψ(M+1)(x) without having to re-compute all coefficients if the polynomial Ψ(M)(x) is known.
Lagrange interpolation
In the Lagrange formulation, one expresses the interpolation function by
(4.383)
where
(4.384)
In this expression, each of the M + 1 terms is of degree M and is equal to zero at each node xj except at one of them (denoted xi), where it is equal to Fi. For example, if the value of the function F is known at three points x0, x1, and x2, (M = 2), the functional form of the second-order polynomial given by the Lagrange formula is
(4.385)
The Lagrange formulation is not recursive.
As shown by Figure 4.31, the use of a single interpolating polynomial Ψ(M)(x) for the global domain captures broad features, but often produces excessive variations (oscillatory artifacts) in the intervals between data points and specifically in the first and last intervals of the domain (Runge’s phenomenon). Very inaccurate approximations can be found if the interpolant Ψ(M)(x) is used to extrapolate data beyond the limits of the data point domain. High-order polynomial interpolation is often ill-conditioned as small changes in the data lead to large differences in the values derived in the interval between nodes (overfitting). Finally, the errors resulting from local outliers (e.g., measurement error at a given station) propagate to the entire polynomial domain.
Figure 4.31 Comparison of different interpolation methods: piecewise linear, global polynomial, piecewise spline, and piecewise cubic Hermite.
Reproduced From Moler (2004).
Rational function interpolation
It is often preferable to approximate the true function F(x) that passes through M + 1 points by a rational function R(x) that is the quotient of two polynomials, one of degree n and the second of degree m, with n + m = M:
(4.386)
The use of rational polynomials often leads to much better approximations than the use of ordinary polynomials, especially for a large number of nodes. The main drawback is that there is no control over the occurrence of poles (zero denominator) in the domain of interpolation. This problem can be avoided by using rational polynomials of higher degrees and, for example, by making the degree of the numerator and the denominator equal to M. An example is the barycentric interpolant
(4.387)
If the weights w0, w1, …, wN are chosen as
(4.388)
the existence of poles is avoided. An alternative solution, that also prevents poles, is to simply specify (Press et al., 2007)
wi = (−1)i (i = 0, M)
(4.389)
4.16.2 Piecewise Interpolation
Rather than defining a single interpolation function for the entire domain of interest [x0, xN], it is often preferable to define a different interpolation function for each individual interval [xi, xi+1], and express that the junction of these functions and potentially of their derivatives is continuous at each breakpoint (or partition point) xi in the domain. To describe such piecewise interpolation algorithms, we consider again a 1-D function F(x) representing a physical quantity whose values Fi = F(xi) are known at the data points xi. We wish to estimate the value of this function at an arbitrary point x located between two of these partition points. The most desirable methods provide piecewise polynomial functions with a high degree of smoothness at the nodes where they connect.
The simplest piecewise interpolation method, called nearest-neighbor interpolation, is to approximate the function at point x by the value Fi corresponding to the closest point xi. The method can be generalized in the 2-D case, leading to a mosaic of cells called a Voronoi diagram. In each cell i, the value of the interpolant is constant and equal to the value of the function at the data point ri. The resulting interpolant is a discontinuous function, which does not fulfill the requirement of smoothness at the junction between intervals.
A better method is piecewise linear interpolation in which the value of the function between partition points xi and xi+1 is approximated (1-D case) by
(4.390)
This linear interpolation is commonly applied in models of atmospheric composition and is often implemented in more than one dimension. Consider, for example, a function F(x, y) of
two independent variables x and y whose values Fi,j are known at selected points (xi, yj). If we define the reduced variables
(4.391)
whose values range between 0 and 1, the bilinear interpolation function Ψ(x, y) at an unsampled point (x, y) is given by
Ψ(x, y) = (1 − s)(1 − t)Fi , j + s(1 − t)Fi + 1 , j + t(1 − s)Fi , j + 1 + s t Fi + 1 , j + 1
(4.392)
Linear interpolation is characterized by discontinuities in the derivatives of the interpolant at the boundaries of the intervals, which is a major disadvantage of the method (see Figure 4.31 for the 1-D case) The difficulty can be addressed by considering higher-order piecewise interpolation methods, for example cubic interpolation algorithms. A frequently used algorithm based on cubic Hermite basis functions (Fritsch and Carlson, 1980) provides a monotone interpolant (no overshoots or undershoots). In 2-D problems, a bicubic interpolation is performed by applying successively a 1-D cubic interpolation procedure in each direction. The resulting interpolated surface is smoother than the surfaces obtained by the bilinear interpolation.
The error resulting from the approximation by a polynomial function depends on the size of the intervals between nodes xi, the location of the selected intermediate point x, and the properties of function F(x). In the case of a simple linear interpolation, the error can be derived by expanding F(x) in Taylor series about xi, evaluating the first derivative of F(x) at point xi, and expressing the interpolant Ψ(1)(x) as a linear function taking the known values Fi and Fi+1 at points xi and xi+1. Here, the superscript (1) refers to the order of the interpolant. The resulting error is
(4.393)
where the second derivative is calculated at a point within the interval [xi, xi+1]. If Δx represents the length of this interval, the maximum error, which occurs at the midpoint xm, is given by
(4.394)
For a piecewise polynomial interpolation of degree M, the error is proportional to the (M + 1)th derivative of F
(4.395)
where φn is given by (4.381).
Polynomial spline interpolation
Consider a domain [x0, xN] split into N intervals [xi, xi+1]. Spline piecewise interpolation is provided by a set of N polynomial pieces of degree p defined on each interval, so that the adjacent polynomial pieces and their p – 1 derivatives are continuous a junction points xi. In the cubic spline method, which is often adopted, we express the interpolant in each interval by a cubic polynomial
Ψi(x) = ai + bi(x − xi) + ci(x − xi)2 + di(x − xi)3 (i = 0, ….. , N − 1)
(4.396)
The 4N unknown coefficients ai, bi, ci, and di are determined as follows (Figure 4.32). First, we require that the polynomial matches the values Fi of the data at each breakpoint xi (2N conditions)
Ψi − 1(xi) = Ψi(xi) = Fi (i = 1, …….. , N − 1)
(4.397)
Ψ0(x0) = F0 and ΨN − 1(xN) = FN
(4.398)
Figure 4.32 Spline functions Ψi(x) defined in each interval [xi, xi+1] of the entire domain [x0, xN]. The value of the “true” function F(x) and of the spline function are equal to Fi at nodes xi. In the cubic spline method, continuity of the first and second derivatives is also imposed at the nodes.
Second, to make the interpolation as smooth as possible, we request that the first and second derivatives of Ψ(x) be continuous at each breakpoint xi (2N – 2 conditions):
(4.399)
Third, we must add two conditions required to solve the system of 4N unknowns. A standard choice, referred to as natural boundary conditions, is to impose that the second derivative of the interpolant is zero at both endpoints x0 and xN of the domain
(4.400)
An alternative is to prescribe the slope of the spline function at each boundary if the first derivative of the original function F(x) is known at both endpoints:
(4.401)
The 4N coefficients ai, bi, ci, and di are then derived by applying these conditions to polynomial (4.396) and to its first and second derivatives, respectively. The solution of the resulting system requires that a tridiagonal system of equations be solved, which is performed through only O(N) operations (see Box 4.4). The cubic spline interpolation method is therefore computationally efficient. Further, the interpolation function is relatively smooth (Figure 4.31) since its first and second derivatives are continuous functions. The bicubic spline algorithm generalizes the method to two dimensions.
The spline interpolation has the advantage of capturing both broad and detailed features, but in some cases it may be smoother than wished; it also has occasionally the tendency to oscillate. Experience shows that the use of spline polynomials of a degree higher than three seldom yields any real advantage.
4.16.3 Distance-Weighted Interpolation
In distance-weighted interpolation, we express the interpolant by a linear combination of radial basis functions ϕ(|r – ri|), called influence functions. These functions express the degree to which a data point situated at location ri influences its surroundings. They are expressed as a function of the radial distance d(r, ri) = |r – ri|. We write therefore
(4.402)
The weights wi(r) at location r of a given sampling point i are thus given by
(4.403)
In general, only the data points ri located close to the target point r are taken into consideration when applying (4.402). A variety of forms are used for the radial basis function ϕ(d(r, ri)) to describe the decrease in the influence of a data point with distance d from the point. An example is the Gaussian form
(4.404)
where c is an adjustable parameter. In the inverse distance weighting (IDW) method (Shepard, 1968), the radial basis function is
(4.405)
where the choice of the power parameter p (typically 1–20) defines the smoothness of the solution.
Distance-weighted methods using radial basis functions are simple to implement and computationally inexpensive. A disadvantage is that errors cannot be characterized. Another issue is that the interpolant is sensitive to the sampling configuration. When adopting weighting functions that are purely radial, observations clustered in particular directions (and often providing redundant information) carry an artificially large weight. This can be corrected by multiplying the radial basis function ϕ by an anisotropy correction factor that is a function of the angles between every pair of stations relative to the unsampled point r.
4.16.4 Kriging
The methods of interpolation discussed above are often qualified as deterministic because the variation of the physical quantity, described by a single “true” function F(r), is approximated by a single interpolation function Ψ(r). An alternative approach is to assume that F(r) is a random field with several possible realizations among an ensemble of distributions that verify the known data Fi = F(ri) at N points ri. The stochastic method introduced by South African mining engineer D. G. Krige (1951) and formalized by French mathematician G. Matheron (1962) is commonly used to interpolate spatially distributed geophysical data (Cressie, 1993). It provides the mean and variance of the ensemble of possible realizations at every point within a defined region. Rather than expressing weights as a function of the distance between sampled and unsampled data points as in the IDW method, the kriging approach accounts for the spatial correlation between data points.
We express again the interpolant Ψ(r) that approximates F(r) at any unsampled point r by a linear combination
(4.406)
where the weights wi(r) must be determined from the assumed probabilistic behavior of the random field. Several variants of the kriging method have been developed. We present here the ordinary kriging method, which is the most widely used.
The method assumes that, for each pair of variables F(ri) and F(rj), a covariance exists that depends only on the separation vector d = rj – ri. The difference between variables F(r) and F(r + d) is treated as a stationary unbiased variable with mean
(4.407)
and variance
(4.4
08)
Function γ(d) is called the semi-variance function; it is determined experimentally as a function of the norm of the separation distance d between different data points in the domain by
(4.409)
When displayed graphically (Figure 4.33), function γ(d) constitutes a so-called semi-variogram or simply variogram. The experimental data are fitted by an analytical curve, from which values γ(i, j) of the semi-variance for each pair of points (i, j) are determined. As shown below, the values of γ(i, j) will be used to derive the weights needed to calculate the interpolant.
Figure 4.33 (a) Typical variogram γ(d) and equivalent covariance function C(d) (covariogram) as a function of the distance d between data points. One can show that γ(d) = C(0) – C(d), where C(0) = Var (F(x)). The semi-variance increases with the distance d, while the covariance function decreases. From Gentile et al. (2012). (b) Graphical illustration of 1-D data interpolation by kriging and spline methods. The open squares indicate the location of the data. The kriging interpolation is shown in red, and corresponds to the means of the normally distributed confidence intervals shown in gray. The spline interpolation polynomials are shown by the blue dashed line.