μixi = 0, ∀ i = 1, …, n
(11.116)
The minimum of L with respect to x is given by
∇xL = ∇xJ + μ = 0
(11.117)
Combination of (11.116) and (11.117) gives us 2n equations to solve for the elements of x and μ in a way that forces all elements of x to be positive or zero. To see this, consider that the solution for xi must either be positive, in which case (11.115) imposes μi = 0, or it must be zero to satisfy (11.116). Application of the KKT conditions thus produces a solution for the state vector in which some elements are zero and the others are positive. Two drawbacks of this method are that (1) it does not provide a characterization of errors on the solution, and (2) the solution of zero (or any positive threshold value) for a subset of elements may not be realistic and may bias the solution for the other elements.
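The complementarity behavior described above can be illustrated numerically. The following sketch (a toy problem with made-up values, not from the text) minimizes a least-squares cost subject to x ≥ 0 by projected gradient descent, and checks that each element of the solution is either positive with a vanishing multiplier or zero, so that μixi ≈ 0 for all i:

```python
import numpy as np

# Toy problem: minimize J(x) = ||A x - b||^2 subject to x >= 0,
# solved by projected gradient descent. At convergence the KKT conditions
# hold: each x_i is either positive with mu_i = 0, or zero with mu_i >= 0,
# so the complementarity product mu_i * x_i vanishes.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

x = np.zeros(3)
step = 0.5 / np.linalg.norm(A.T @ A, 2)   # step size below stability limit
for _ in range(5000):
    grad = 2.0 * A.T @ (A @ x - b)        # gradient of J
    x = np.maximum(x - step * grad, 0.0)  # project onto the feasible set x >= 0

mu = 2.0 * A.T @ (A @ x - b)  # Lagrange multipliers at the solution
print(np.all(x >= 0), np.max(np.abs(mu * x)))
```

The projection step enforces nonnegativity at every iteration, which is why the converged solution typically has some elements exactly at zero.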
11.10 Data Assimilation
Data assimilation is sometimes used in the literature to refer to any inverse problem. Standard practice in atmospheric chemistry is to refer to chemical data assimilation as a particular kind of inverse problem where we seek to optimize a gridded time-dependent 3-D model field of concentrations based on observations of these concentrations or related variables. This is similar to meteorological data assimilation where the 3-D state of the meteorological variables is optimized for the purpose of initializing weather forecasts or creating a consistent meteorological data archive. The optimized state resulting from the assimilation of observations is called the analysis. Chemical data assimilation can be used to improve weather forecasts by accounting for chemical effects on weather (for example, aerosols affecting clouds), or by providing indirect information on meteorological variables (for example, ozone as a tracer of stratospheric motions). Other applications of chemical data assimilation include air quality forecasting, construction of chemical data archives (reanalyses), and assessment of consistency between instruments viewing different species for different scenes and with different schedules. Figure 11.8 gives as an example an assimilated field for stratospheric ozone. Data assimilation generally involves a chemical forecast to serve as the prior estimate and this will be referred to here as the forecast model. This forecast model is usually a numerical weather prediction model initialized with assimilated meteorological data and including simulation of the chemical variables to be assimilated.
Figure 11.8 Illustration of data assimilation principles. An analysis of stratospheric ozone on September 15, 2008 (b) is obtained by optimizing a stratospheric transport model (c) to fit observations (a). Observations are from the Aura MLS satellite instrument on September 15, 2008, between 9 and 15 UT, and the analysis is for 12 UT. Panel (c) shows the free-running stratospheric model initialized on April 1, 2008 with no subsequent data assimilation.
Data from the Belgian Assimilation System for Chemical ObsErvations (BASCOE; Errera et al., 2008; Errera and Ménard, 2012). Provided by Quentin Errera.
The state vector x in chemical data assimilation is the gridded 3-D field of concentrations at a given time, and it evolves with time as determined by the forecast model and by the assimilated observations. It is typically very large. Solution of the optimization problem usually requires numerical methods for minimizing the Bayesian cost function J. Such methods, involving iterative computation of ∇xJ as part of a minimization algorithm, are called variational methods. The adjoint-based inversion in Section 11.6 is an example of a variational method.
In the standard chemical data assimilation problem, the forecast model initialized with x(0) at time t0 produces a forecast x(1) at time t1 = t0 + h where h is the assimilation time step, typically of the order of a few hours to a day. Two different strategies can be used to assimilate the observations (Figure 11.9). In 3-D variational data assimilation (3DVAR), observed concentrations are collected and assimilated at fixed time intervals h. The observations y(0) at time t0 are used to optimize x(0), and the forecast model is then integrated over the time interval [t0, t1 = t0 + h] to obtain a prior estimate for x(1). The observations at time t1 are used to optimize the estimate of x(1), and so on. In some cases and with simplifying assumptions, the minimization of J can be done analytically as described in Section 11.5 instead of with a variational method; the assimilation is then called a Kalman filter. In 4-D variational data assimilation (4DVAR), the ensemble of observations spread over the time interval [t0, t1] are used to optimize x(0) by applying the adjoint of the forecast model backward in time over [t1, t0]. The forecast model is then used to obtain a prior estimate for x(1), observations over [t1, t1 + h] are used to optimize x(1), and so on.
Figure 11.9 Schematic of 3DVAR and 4DVAR data assimilation. The long tick marks indicate the assimilation time steps h, and the short tick marks indicate the internal time steps of the forecast model. In 3DVAR, observations assembled at discrete assimilation time steps are used to optimize the state vector at the corresponding times. In 4DVAR, observations spread over the forecast interval are used to optimize the state vector at the beginning of the forecast by propagating the information backward in time with the adjoint of the forecast model (see Figure 11.6).
11.10.1 3DVAR Data Assimilation and the Kalman Filter
The 3DVAR approach uses observations at discrete assimilation time steps to optimize the state vector at the corresponding times. It is called “3D” because the optimization operates on the 3-D state vector without consideration of time. Consider an ensemble of observations collected at assimilation time steps t0, t1, t2, etc. Let y(0) be the ensemble of observations collected at time t0, and x(0) the value of the state vector at that time. Starting from some prior knowledge (xA,(0), SA,(0)), we use the observations y(0) to minimize the cost function as in (11.76):
J(x(0)) = (x(0) − xA,(0))T SA,(0)−1 (x(0) − xA,(0)) + (y(0) − F(x(0)))T SO−1 (y(0) − F(x(0)))
(11.118)
Here the forward model F is not the forecast model, but instead a mapping of the state vector to the observations. If the observations are of the same quantity as the state vector and on the same grid, then F is the identity matrix. If the observations are of the same quantity as the state vector but offset from the grid in space or time, then F is an interpolation operator. If the observations are of different variables than the state vector, then F is a separate model needed to relate the two; it could be for example a 0-D chemical model relating the concentrations of different species through a chemical mechanism.
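As an illustration of F acting as an interpolation operator, the following sketch (hypothetical 1-D grid and observation sites, made-up values) builds each row of F from linear interpolation weights so that y = Fx samples the gridded state at off-grid observation locations:

```python
import numpy as np

# Hypothetical sketch: the forward model F as a linear interpolation operator.
# The state is a concentration field on a 1-D grid; each observation lies
# between two grid points, so each row of F holds two interpolation weights.
grid = np.linspace(0.0, 10.0, 11)          # model grid (11 points)
obs_locations = np.array([2.4, 5.0, 7.9])  # observation sites (off-grid)

F = np.zeros((obs_locations.size, grid.size))
for row, loc in enumerate(obs_locations):
    j = np.searchsorted(grid, loc) - 1     # index of left grid neighbor
    w = (loc - grid[j]) / (grid[j + 1] - grid[j])
    F[row, j] = 1.0 - w                    # weight on left neighbor
    F[row, j + 1] = w                      # weight on right neighbor

x = grid**2                                # an arbitrary smooth state field
y = F @ x                                  # simulated observations
print(y)                                   # → approximately [6.0, 25.0, 62.5]
```

Each row of F sums to one, as expected for an interpolation operator; for observations on the grid itself, the row reduces to a unit vector and F acts as the identity.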
The minimum for J(x(0)) is obtained by
∇x(0)J = 2SA,(0)−1 (x(0) − xA,(0)) − 2KT SO−1 (y(0) − F(x(0))) = 0
(11.119)
In 3DVAR, (11.119) is solved numerically by starting from initial guess xA,(0), computing ∇x(0)J(xA , (0)), and using a steepest-descent algorithm to iterate until convergence (Figure 11.4). The computations of J and ∇J require simplifications to keep the matrices to a manageable size and this is typically done by limiting the spatial extent of error correlation. Simplifications may also be needed in the form of the forward model.
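A minimal sketch of such a variational minimization, assuming a linear forward model K and toy made-up values, is given below; steepest descent on the cost function converges to the optimum, which for the linear case can be computed analytically for comparison:

```python
import numpy as np

# Minimal sketch (toy sizes, made-up values): minimizing the 3DVAR cost
# function J(x) = (x - xa)^T Sa^{-1} (x - xa) + (y - Kx)^T So^{-1} (y - Kx)
# by steepest descent, with a linear forward model K.
rng = np.random.default_rng(1)
n, m = 4, 6
xa = np.ones(n)                      # prior state
Sa_inv = np.eye(n)                   # inverse prior error covariance
So_inv = np.eye(m)                   # inverse observational error covariance
K = rng.standard_normal((m, n))      # linear forward model
y = K @ (xa + 0.5) + 0.1 * rng.standard_normal(m)

x = xa.copy()
step = 0.5 / np.linalg.norm(Sa_inv + K.T @ So_inv @ K, 2)  # stable step size
for _ in range(20000):
    grad = 2.0 * Sa_inv @ (x - xa) - 2.0 * K.T @ So_inv @ (y - K @ x)
    x -= step * grad                 # steepest-descent update

# Analytical optimum for the linear case, for comparison:
x_opt = np.linalg.solve(Sa_inv + K.T @ So_inv @ K,
                        Sa_inv @ xa + K.T @ So_inv @ y)
print(np.max(np.abs(x - x_opt)))
```

In practice a quasi-Newton or conjugate-gradient algorithm replaces plain steepest descent, but the structure (iterate on ∇J until convergence) is the same.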
In the case of a linear forward model (such as the identity matrix or a linear interpolator), an analytical solution to (11.119) is possible. This analytical approach is called the Kalman filter and it has the advantage of characterizing the error on the solution through computation of the posterior error covariance matrix. Starting from the prior estimate (xA,(0), SA,(0)), assimilation of observations y(0) yields the optimal estimate x̂(0) and its error covariance matrix Ŝ(0) through application of (11.79)–(11.81):
x̂(0) = xA,(0) + G(y(0) − KxA,(0))
(11.120)
G = SA,(0)KT(KSA,(0)KT + SO)−1
(11.121)
Ŝ(0) = (In − GK)SA,(0)
(11.122)
where the Jacobian matrix K = ∇xF defines the forward model. We then apply the forecast model to compute the evolution of x̂(0) over the forecasting time step [t0, t1], leading to a prior estimate (xA,(1), SA,(1)) at time t1 (the calculation of SA,(1) is described below). Assimilation of observations y(1) at time t1 is then done following the above equations to yield an optimal estimate (x̂(1), Ŝ(1)). From there, we apply the forecast model over [t1, t2], and so on.
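The analysis step above can be sketched as follows (toy dimensions and made-up values), computing the gain matrix, the optimal estimate, and the posterior error covariance:

```python
import numpy as np

# Minimal sketch (toy sizes, made-up values) of one Kalman filter analysis
# step: gain matrix G, state update, and posterior error covariance, for a
# linear forward model with Jacobian K.
rng = np.random.default_rng(2)
n, m = 5, 3
xa = rng.standard_normal(n)        # prior state
Sa = np.eye(n)                     # prior error covariance
So = 0.25 * np.eye(m)              # observational error covariance
K = rng.standard_normal((m, n))    # Jacobian of the (linear) forward model
y = K @ xa + 0.5 * rng.standard_normal(m)   # synthetic observations

G = Sa @ K.T @ np.linalg.inv(K @ Sa @ K.T + So)  # gain matrix
x_hat = xa + G @ (y - K @ xa)                    # optimal estimate
S_hat = (np.eye(n) - G @ K) @ Sa                 # posterior error covariance

print(np.trace(S_hat) < np.trace(Sa))  # assimilation reduces expected error
```

The trace of the posterior error covariance is strictly smaller than that of the prior, reflecting the information added by the observations.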
The prior error covariance matrix SA,(1) is the sum of the forecast model error covariance matrix SM = E[εMεMT] and the posterior error covariance matrix Ŝ(0) transported by the forecast model over [t0, t1]. It can be fully computed if the forecast model is linear, i.e., represented by a matrix M. In that case we have
xA,(1) − x(1) = M(x̂(0) − x(0)) + εM
(11.123)
where x(1) is the true value at time t1. Thus
SA,(1) = MŜ(0)MT + SM
(11.124)
The term MŜ(0)MT in (11.124) transports the posterior error covariance matrix from one assimilation time step to the next, thus propagating information on errors. In the trivial case where a persistence model is assumed as forecast, xA,(1) = x̂(0) and M is the identity matrix. This may be an adequate assumption if the assimilation time step is short relative to the timescale over which x evolves.
Transport of the full error covariance matrices through (11.124) is often not computationally practical and various approximations are used. One can produce an ensemble of estimates for x(1) by randomly sampling the posterior probability density function at time t0, transporting this ensemble over the assimilation time step [t0, t1], and estimating SA,(1) from the ensemble of values at time t1. This is called an ensemble Kalman filter. Other alternatives are to transport only the error variances (diagonal terms of Ŝ(0)) or to ignore the transport of prior errors altogether and just assign SA,(1) = SM. The Kalman filter is then called suboptimal.
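The ensemble approach can be sketched for a linear forecast model, where the exact covariance propagation is available for comparison (toy made-up values throughout):

```python
import numpy as np

# Hypothetical sketch: propagating the posterior error covariance through a
# linear forecast model M by an ensemble, as in the ensemble Kalman filter,
# and comparing against the exact linear result M S M^T + SM.
rng = np.random.default_rng(3)
n = 3
x_hat = np.zeros(n)                # posterior state at t0
S_hat = np.diag([1.0, 0.5, 2.0])   # posterior error covariance at t0
M = np.array([[0.9, 0.1, 0.0],     # toy linear forecast model
              [0.0, 0.8, 0.2],
              [0.1, 0.0, 0.9]])
Sm = 0.05 * np.eye(n)              # forecast model error covariance

N = 200000                         # ensemble size
ens = rng.multivariate_normal(x_hat, S_hat, size=N)  # sample the pdf at t0
ens = ens @ M.T + rng.multivariate_normal(np.zeros(n), Sm, size=N)

Sa_ens = np.cov(ens, rowvar=False)       # ensemble estimate of the covariance
Sa_exact = M @ S_hat @ M.T + Sm          # exact linear propagation
print(np.max(np.abs(Sa_ens - Sa_exact)))
```

Operational ensemble Kalman filters use far smaller ensembles (tens to hundreds of members) with localization to suppress sampling noise; the large N here just makes the agreement with the exact result visible.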
11.10.2 4DVAR Data Assimilation
The 3DVAR approach assimilates observations collected at discrete assimilation time steps, say once per day, and applies a forecast model to update the state vector over that interval. However, observations are often taken at all times of day and are therefore scattered in time over the forecast time interval. In 4DVAR data assimilation, observations scattered at times ti over the forecast time interval [t0, t1] are used to optimize the initial state x(0). This involves minimizing the cost function
J(x(0)) = (x(0) − xA,(0))T SA,(0)−1 (x(0) − xA,(0)) + Σi (y(ti) − F(x(ti)))T SO,i−1 (y(ti) − F(x(ti)))
(11.125)
and is done by applying the adjoint of the forecast model over [t0, t1]. The optimization approach is exactly as described in Section 11.6. It returns an optimized estimate x̂(0) but no associated error covariance matrix (although one can estimate Ŝ(0) from numerical construction of the Hessian; see Section 11.6). The optimized estimate is used to produce a forecast xA,(1) at time t1 with error covariance matrix SA,(1) defined by the error in the forecast model over [t0, t1]: SA,(1) = SM. Observations over the next forecasting time interval are then applied with the model adjoint to derive an optimized estimate x̂(1), and so on. The temporal discretization in the assimilation of the observations is limited solely by the internal time step of the forecast model.
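The backward-in-time gradient computation can be illustrated for a linear forecast model, where the adjoint is simply the matrix transpose (toy made-up values); a finite-difference check confirms the adjoint gradient:

```python
import numpy as np

# Minimal sketch (toy linear dynamics, made-up values): the gradient of a
# 4DVAR-type cost function with respect to the initial state x0, computed by
# a forward sweep storing the trajectory and a backward (adjoint) sweep with
# M^T, then checked against a finite-difference estimate.
rng = np.random.default_rng(4)
n = 3
M = np.array([[0.95, 0.05, 0.0],
              [0.0, 0.9, 0.1],
              [0.05, 0.0, 0.95]])   # linear forecast model per internal step
xa = np.ones(n)                     # prior initial state
Sa_inv = np.eye(n)
So_inv = np.eye(n)
steps = 4                           # one observation per internal time step
y = [rng.standard_normal(n) for _ in range(steps)]  # synthetic observations

def cost(x0):
    J = (x0 - xa) @ Sa_inv @ (x0 - xa)
    x = x0.copy()
    for i in range(steps):
        x = M @ x                   # advance to observation time t_i
        J += (y[i] - x) @ So_inv @ (y[i] - x)
    return J

def gradient(x0):
    traj = [x0]                     # forward sweep: store the trajectory
    for _ in range(steps):
        traj.append(M @ traj[-1])
    lam = np.zeros(n)               # backward (adjoint) sweep
    for i in range(steps - 1, -1, -1):
        lam += -2.0 * So_inv @ (y[i] - traj[i + 1])
        lam = M.T @ lam             # adjoint carries information backward
    return 2.0 * Sa_inv @ (x0 - xa) + lam

x0 = rng.standard_normal(n)
g = gradient(x0)
eps = 1e-6
g_fd = np.array([(cost(x0 + eps * np.eye(n)[i]) - cost(x0 - eps * np.eye(n)[i]))
                 / (2 * eps) for i in range(n)])
print(np.max(np.abs(g - g_fd)))
```

For a real CTM the adjoint is derived from the model code rather than a matrix transpose, but the forward-store/backward-accumulate structure is the same.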
11.11 Observing System Simulation Experiments
Observation system simulation experiments (OSSEs) are standard tests conducted during the design phase of a major observing program such as a satellite mission. Their purpose is to determine the ability of the proposed measurements to deliver on the scientific objectives of the program. These objectives can often be stated in terms of improving knowledge of a state vector x by using the proposed observations y in combination with a state-of-science forward model to relate x to y. For example, we might want to design a geostationary satellite mission to measure CO2 concentrations in order to constrain CO2 surface fluxes, and ask what measurement precision is needed to improve the constraints on the fluxes beyond those achievable from the existing CO2 observation network.
If the state vector is sufficiently small that an analytical solution to the inverse problem is computationally practical (Section 11.5), then a first assessment of the usefulness of the proposed observing system can be made by characterizing the error covariance matrices SA and SO, and constructing the Jacobian matrix K = ∂y/∂x of the forward model relating the state vector x to the observations y. Here SA is the prior error covariance matrix describing the current state of knowledge of x in the absence of the proposed observations, and SO is the observational error covariance matrix for the proposed observations. SO describes the properties of the proposed observing system including instrument errors, forward model errors, and observation density. From knowledge of SA, SO, and K we can calculate the averaging kernel matrix A and the posterior error covariance matrix Ŝ as described in Section 11.5. The trace of A defines the DOFS of the observing system (Section 11.5.3) and provides a simple metric for characterizing the improved knowledge of x to result from the proposed observations. Analysis of the rows of A determines the ability of the observing system to retrieve the individual components of x (Box 11.5). Results of such an OSSE will tend to be over-optimistic because of the idealized treatment of observational errors (Section 11.5.1). In particular, the defining assumptions of unbiased Gaussian error statistics and independent and identically distributed (IID) sampling of the PDFs mean that posterior error standard deviations will decrease roughly as the inverse square root of the number of observations. This will typically not hold in the actual observing system because of observation bias, non-random sampling of errors, and unrecognized error correlations.
Still, this simple approach for error characterization of the proposed observing system is very valuable for defining the potential of the proposed observations to better quantify the state variables, and for comparing the merits of different configurations of the proposed observing system (such as instrument precision and observing strategy).
A more stringent and general OSSE is to use two independent forward models simulating the same period (Figure 11.10). For an observing system targeting atmospheric composition these would be two independent CTMs: CTM 1 and CTM 2. The CTMs should both be state-of-science but otherwise be as different as possible, so that the difference between the two provides realistic statistics of the forward model error εM. In particular, they should use different assimilated meteorological data sets for the same period. We take CTM 1 as representing the “true” atmosphere, with “true” values of the state vector x generating a “true” 3-D field of atmospheric concentrations to be sampled by the observing system. This simulation is often called the Nature Run. We sample the “true” atmosphere with the current and proposed components of the observing system to produce synthetic data sets for the observed species, locations, and schedules, including random errors based on instrument precision. This generates a synthetic ensemble of current observations y and proposed observations y′ over the time period. We then use CTM 2 as the forward model over the same time period, starting from a prior estimate (xA, SA) for the state vector and assimilating the synthetic observations y from the current observing system to achieve an optimal estimate x̂. This is called the Control Run. The posterior error covariance matrix Ŝ associated with x̂ may be generated from the Control Run or may have to be estimated, as discussed in Section 11.10. We then take the prior state defined by (x̂, Ŝ) and assimilate the proposed synthetic observations y′ to obtain a new optimal estimate (x̂′, Ŝ′) reflecting the benefit of the proposed observations. This is called the Analysis Run. The difference x̂′ − x measures the departure of our knowledge from the truth after the proposed observations have been made, while x̂ − x measures the departure from the truth before the proposed observations have been made.
Comparison of x̂′ − x to x̂ − x allows us to quantify the value of the proposed observations for constraining x. Compared to the simple error characterization approach presented in the previous paragraph, this more advanced OSSE system has two critical advantages: (1) a more realistic description of forward model errors and their sampling by the observing system; and (2) the use of a variational data assimilation system to more realistically mimic the use of the observations. Figure 11.11 gives an example.
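A toy version of this OSSE workflow (entirely made-up low-dimensional values, linear forward models) can be sketched as follows: a “true” state stands in for the Nature Run, a Control Run assimilates synthetic current observations, and an Analysis Run adds synthetic proposed observations:

```python
import numpy as np

# Hypothetical toy OSSE: a "true" state (Nature Run) is sampled by a current
# and a proposed observing system; assimilating the proposed synthetic
# observations on top of the current ones shrinks the expected error.
rng = np.random.default_rng(6)
n = 5
x_true = rng.standard_normal(n)            # "truth" from model 1 (Nature Run)
xa, Sa = np.zeros(n), np.eye(n)            # prior from model 2

def assimilate(xa, Sa, K, So, y):
    """One linear analysis step; returns posterior state and covariance."""
    G = Sa @ K.T @ np.linalg.inv(K @ Sa @ K.T + So)
    return xa + G @ (y - K @ xa), (np.eye(len(xa)) - G @ K) @ Sa

# Current observing system: 3 observations (Control Run)
K1, So1 = rng.standard_normal((3, n)), 0.2 * np.eye(3)
y1 = K1 @ x_true + rng.multivariate_normal(np.zeros(3), So1)
x1, S1 = assimilate(xa, Sa, K1, So1, y1)

# Proposed observing system: 6 additional observations (Analysis Run)
K2, So2 = rng.standard_normal((6, n)), 0.2 * np.eye(6)
y2 = K2 @ x_true + rng.multivariate_normal(np.zeros(6), So2)
x2, S2 = assimilate(x1, S1, K2, So2, y2)

print(np.trace(S2) < np.trace(S1))  # expected error shrinks with proposed obs
```

A full OSSE would replace the linear operators with two independent CTMs and a variational assimilation system, but the bookkeeping (truth, Control Run, Analysis Run, paired differences against the truth) is the same.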
Figure 11.10 General structure of an observing system simulation experiment for atmospheric composition. Two independent chemical transport models (1 and 2) are used to simulate the same meteorological period. The first model is used to simulate a synthetic “true” atmosphere (Nature Run) to be sampled by the observing system. The second model assimilates current observations in a first step (Control Run), and the additional proposed observations in a second step (Analysis Run). Comparison of the paired differences (Analyzed–Nature) and (Control–Nature) measures the improvement in knowledge to be contributed by the proposed observations.
Figure 11.11 Observing system simulation experiment evaluating the utility of a proposed geostationary satellite instrument for ozone measurements (TEMPO) to detect high-ozone events in surface air over the remote western USA (Intermountain West). The OSSE structure is as shown in Figure 11.10, and the objective is to map the number of days exceeding the ozone air quality standard defined as a 70 ppb maximum daily 8-h average (MDA8). The MOZART and GEOS-Chem chemical transport models are used to simulate the same three-month meteorological period of April–June 2010. These two CTMs use different assimilated meteorological data sets and are also different in their chemical mechanisms, emissions, etc. The MOZART simulation is taken as the “truth” (Nature Run) and the GEOS-Chem model is used as the forecast model for the analysis. The top-left panel shows the “true” atmosphere simulated by MOZART, and the top-right panel shows the atmosphere simulated by GEOS-Chem without data assimilation. The two are very different. The bottom-left panel shows the GEOS-Chem atmosphere after assimilation of data from the current network of surface sites, where the data (circles) are synthetic observations of the “true” atmosphere. This defines the Control Run. The bottom-right panel shows the GEOS-Chem atmosphere after assimilation of additional TEMPO synthetic observations. This defines the Analysis Run. We see that the Analysis Run does much better than the Control Run in simulating the number of days with MDA8 ozone exceeding 70 ppb. Values inset are the average number of days with MDA8 ozone exceeding 70 ppb, and the coefficient of determination R2 relative to the “true” atmosphere.
From Zoogman et al. (2014).
Modeling of Atmospheric Chemistry Page 65