When an ensemble of K models is used to calculate a given field j, a best estimate of the predicted concentration can be derived as:
(10.36)
where Gm acts as a weight that favors the solutions obtained by the models with the best grades. The variance weighted by the model scores is given by:
(10.37)
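As a minimal sketch of this weighting, the example below assumes that (10.36) is an average of the K model predictions weighted by the grades Gm and normalized by the sum of the grades, and that (10.37) is the corresponding weighted variance about that average. The function name, the sample predictions, and the grades are hypothetical.

```python
def weighted_ensemble(values, grades):
    """Grade-weighted mean and variance of K model predictions of one field.

    Assumed forms (not taken from the text):
      mean = sum(G_m * C_m) / sum(G_m)
      var  = sum(G_m * (C_m - mean)**2) / sum(G_m)
    """
    if len(values) != len(grades) or not values:
        raise ValueError("need one grade per model and at least one model")
    total = sum(grades)
    if total <= 0:
        raise ValueError("grades must sum to a positive number")
    mean = sum(g * v for g, v in zip(grades, values)) / total
    var = sum(g * (v - mean) ** 2 for g, v in zip(grades, values)) / total
    return mean, var


# Hypothetical predictions from K = 3 models and their grades
ens_mean, ens_var = weighted_ensemble([40.0, 50.0, 60.0], [1.0, 2.0, 1.0])
print(ens_mean, ens_var)
```

A model with a grade of zero drops out of both sums, so poorly graded models contribute little to either the best estimate or its spread.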
10.5.2 The Taylor Diagram
Taylor (2001) proposed a concise graphical method for representing on a single figure several statistical indicators that describe the degree of agreement between model results and observations. The Taylor diagram indicates how model and observed patterns compare in terms of their correlation, their RMS differences, and the ratio of their variances.
Pattern similarities between the calculated and observed fields Mi and Oi can be quantified by the Pearson correlation coefficient r (10.27). However, this does not provide information about the relative amplitude of the two quantities. It is therefore useful to introduce the centered root mean square error (CRMSE):
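$$\mathrm{CRMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[\left(M_i - \bar{M}\right) - \left(O_i - \bar{O}\right)\right]^2}$$
(10.38)
where N is the number of paired values and the overbars denote means over the N values.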
This quantity tends to zero when the patterns of the two fields and the associated amplitudes are similar. However, the CRMSE does not indicate whether the error is due to a difference in the phase or in the amplitude of the signals. The two fields can additionally be compared through their standard deviations σO and σM, defined by (10.14). The Taylor diagram is constructed by recognizing that the four statistical quantities (r, CRMSE, σM, and σO) are related by
$$\mathrm{CRMSE}^2 = \sigma_O^2 + \sigma_M^2 - 2\,\sigma_O\,\sigma_M\,r$$
(10.39)
Figure 10.19 illustrates the Taylor diagram. The red arrows are a geometric representation of (10.39), which has the form of the law of cosines. The polar graph provides a rapid quantification of the four statistical parameters for any point on the diagram. The standard deviations of the observed field O and the model field M are represented by the radial distances from the origin; the point representative of the observations, called the reference point, is located on the x-axis, with the abscissa equal to the corresponding standard deviation σO. The radial distance between the origin and the location of the test point, which characterizes the simulated field, is equal to the standard deviation σM of the calculated field. The correlation coefficient r between the observed and calculated fields is shown by the azimuthal position on the diagram: the angle θ measured from the x-axis satisfies cos θ = r. The distance between the reference and test points is the CRMSE.
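The geometry of the diagram can be checked numerically. The sketch below computes r, CRMSE, σM, and σO for a pair of hypothetical series, places the test point at radius σM and angle arccos(r), and compares its distance from the reference point (σO, 0) with the CRMSE. It assumes the same 1/N normalization for the standard deviations and the CRMSE; with a different normalization in (10.14) the equality would hold only approximately.

```python
import math


def taylor_stats(model, obs):
    """Return (r, CRMSE, sigma_M, sigma_O) for paired model and observed values."""
    n = len(obs)
    m_bar = sum(model) / n
    o_bar = sum(obs) / n
    dm = [m - m_bar for m in model]  # model anomalies
    do = [o - o_bar for o in obs]    # observed anomalies
    # Assumption: 1/N normalization for both standard deviations and the CRMSE
    sigma_m = math.sqrt(sum(d * d for d in dm) / n)
    sigma_o = math.sqrt(sum(d * d for d in do) / n)
    r = sum(a * b for a, b in zip(dm, do)) / (n * sigma_m * sigma_o)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(dm, do)) / n)
    return r, crmse, sigma_m, sigma_o


# Hypothetical paired values, for illustration only
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
model = [1.5, 1.8, 3.6, 3.9, 5.7]
r, crmse, sigma_m, sigma_o = taylor_stats(model, obs)

# Test point in Cartesian coordinates: radius sigma_M, angle arccos(r)
theta = math.acos(r)
x_test = sigma_m * math.cos(theta)
y_test = sigma_m * math.sin(theta)

# Distance from the reference point (sigma_O, 0) to the test point
distance = math.hypot(x_test - sigma_o, y_test)
law_of_cosines = math.sqrt(sigma_o**2 + sigma_m**2 - 2.0 * sigma_o * sigma_m * r)
print(r, crmse, distance, law_of_cosines)
```

The last three printed values agree to rounding error, which is why σM and r alone are enough to place the test point.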
Figure 10.19 Taylor diagram summarizing the statistical comparison of a test data set (model M) to a reference data set (observations O). The observed standard deviation σO is plotted on the right horizontal axis. The model standard deviation σM is plotted as the dotted lines with values given on the left horizontal axis. The CRMSE is given by the dashed lines and the correlation coefficient r by the solid lines. The statistical fit between model and observations is given by the test point on the diagram. The reference point on the diagram indicates a perfect model. Knowledge of σM and r is sufficient to define the location of the test point, and from there the CRMSE is determined by the distance between the reference point and the test point.
An alternate form of the Taylor diagram is often used in which CRMSE and σM are normalized to the standard deviation σO of the observed field. This allows the representation of multiple data sets having different concentrations and/or units. Figure 10.20 gives an example in which comparison statistics for multiple species are shown on a single diagram. The normalized reference point is 1 on the x-axis.
Figure 10.20 Taylor diagram for a comparison of chemical transport model results to aircraft observations of ozone, NOx, isoprene, and formaldehyde (CH2O) in the Southeast USA mixed layer in summer 2013. The different symbols describe model simulations at different horizontal grid resolutions (0.25° × 0.3125°, 2° × 2.5°, 4° × 5°). The radial coordinate is the normalized standard deviation σM/σO. The angular coordinate is the Pearson correlation coefficient. The open circle represents the observations (reference point of Figure 10.19). The normalized CRMSE is shown as solid lines. The figure shows that the best model simulation is for ozone. Correlation improves when the resolution increases from 4° × 5° to 2° × 2.5° but then decreases at 0.25° × 0.3125° because fine-scale features are more difficult to capture by the model than broad synoptic-scale features.
From Yu et al. (2016).
Taylor (2001) and Brunner et al. (2003) propose different expressions to quantify the skill score of a model. These expressions assume that for a given variance, the score increases monotonically with increasing correlation. Further, for a given correlation, the score increases as the variance produced by the model approaches the variance associated with the observation. The resultant expressions for the model skill S take the form
$$S = \frac{4\,(1 + r)^n}{\left(\sigma_M/\sigma_O + \sigma_O/\sigma_M\right)^2 (1 + r_0)^n}$$
(10.40)
where r0 is the maximum attainable correlation. This model-dependent parameter, which must be estimated, accounts for the fact that the model is not expected to reproduce the details of the noise in the data (unforced variability). The value of the exponent n is chosen according to the weight given to a good correlation versus a small RMS error. Isolines for skill scores based on such expressions can be represented on the Taylor diagram.
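The behavior of such a score can be explored with a short function. This sketch assumes the form proposed by Taylor (2001), S = 4(1 + r)^n / [(σM/σO + σO/σM)^2 (1 + r0)^n]; the function name and the default values of r0 and n are illustrative choices, not recommendations.

```python
def taylor_skill(r, sigma_ratio, r0=1.0, n=1):
    """Skill score in the assumed Taylor (2001) form.

    r           : correlation between model and observations
    sigma_ratio : sigma_M / sigma_O (must be positive)
    r0          : maximum attainable correlation (to be estimated by the user)
    n           : exponent setting the weight given to the correlation
    """
    if sigma_ratio <= 0:
        raise ValueError("sigma_ratio must be positive")
    variance_term = (sigma_ratio + 1.0 / sigma_ratio) ** 2
    return 4.0 * (1.0 + r) ** n / (variance_term * (1.0 + r0) ** n)


# A perfect model (r = r0 = 1, equal variances) scores 1
print(taylor_skill(1.0, 1.0))
# Same correlation, but the model has twice the observed standard deviation
print(taylor_skill(0.9, 2.0, r0=1.0, n=4))
```

The variance term is smallest when σM = σO, so for a fixed correlation the score peaks there, and for a fixed variance ratio the score rises with r, as required of the expressions described above.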
10.5.3 The Target Diagram
The Taylor diagram does not provide information on the mean bias (BIAS) between model and observed quantities. The Target diagram (Jolliff et al., 2009; Thunis et al., 2012; Figure 10.21) provides this missing information in addition to summary information about the pattern statistics, thus yielding a broader overview of their relative contribution to the total RMSE. Again, the values of the statistical indicators are normalized to the standard deviation of the observations σO. Using Cartesian coordinates, the value of CRMSE/σO is displayed on the x-axis and the value of BIAS/σO on the y-axis. One can show that:
$$\mathrm{RMSE}^2 = \mathrm{BIAS}^2 + \mathrm{CRMSE}^2$$
(10.41)
so that the distance between the origin and any data point displayed on the diagram represents the total RMSE normalized by σO and is therefore viewed as the target indicator.
Figure 10.21 Schematic representation of the Target diagram.
Markers may be added within the diagram to better evaluate the model results. In Figure 10.21, the outermost circle corresponds to RMSE/σO = 1, so for all points inside this contour, the model data are positively correlated with observational data. A second contour corresponds to a higher performance, here RMSE/σO = 0.7. All points that represent successful model calculations are expected to appear inside this second contour. A third contour (dashed line) can be added to characterize the threshold of observational uncertainties; no meaningful improvement in the model-data agreement is obtained as the points displayed inside this circle approach the origin (target) of the diagram. Finally, the sign of the abscissa is used to provide information on standard deviations: following the convention of Jolliff et al. (2009), if the model standard deviation is larger than the observed one, the points are plotted on the right side of the diagram (positive abscissa); in the opposite case, they are plotted on the left side (negative abscissa). A weakness of the Target diagram is that it does not provide explicit information about the correlation coefficient.
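The decomposition (10.41) and the resulting placement of a point on the Target diagram can be sketched as follows. The example assumes BIAS is defined as the model mean minus the observed mean and uses a 1/N normalization throughout. It returns the unsigned abscissa CRMSE/σO and leaves the choice of sign to the plotting step. The data are hypothetical.

```python
import math


def target_point(model, obs):
    """Return (CRMSE/sigma_O, BIAS/sigma_O, RMSE/sigma_O) for paired values."""
    n = len(obs)
    m_bar = sum(model) / n
    o_bar = sum(obs) / n
    sigma_o = math.sqrt(sum((o - o_bar) ** 2 for o in obs) / n)
    bias = m_bar - o_bar  # assumed sign convention: model minus observations
    crmse = math.sqrt(
        sum(((m - m_bar) - (o - o_bar)) ** 2 for m, o in zip(model, obs)) / n
    )
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    return crmse / sigma_o, bias / sigma_o, rmse / sigma_o


# Hypothetical paired values, for illustration only
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
model = [1.5, 1.8, 3.6, 3.9, 5.7]
x, y, d = target_point(model, obs)

# The distance from the origin to (x, y) equals the normalized RMSE
print(x, y, d, math.hypot(x, y))
# Position relative to the two performance contours discussed in the text
print(d < 1.0, d < 0.7)
```

Because the point is built from the two components of the error, a model can sit inside a given contour with a large bias and small pattern error, or the reverse; the diagram makes that split visible.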
10.6 Significance in the Difference Between Two Data Sets
An important question in the comparison of two data sets (such as model vs. observations) is whether differences between the two data sets are real or the result of random noise. The statistical significance of a difference expresses the likelihood that it is real as opposed to random. Consider here the comparison between two sets of sampled data (X1, X2) of sizes (n1, n2) with distributions defined by the means ($\bar{X}_1$, $\bar{X}_2$) and unbiased estimators of their variances ($\sigma_1^2$, $\sigma_2^2$) (see Appendix E). Student's t-test provides a statistical method to estimate the likelihood that the difference between the means of two distributions is significant (Figure 10.22). It assumes that the distribution of the populations is normal (Gaussian). In Student's t-test, the t-variable is given by
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_T \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
(10.42)
where σT is the pooled standard deviation of the two samples
$$\sigma_T = \sqrt{\frac{(n_1 - 1)\,\sigma_1^2 + (n_2 - 1)\,\sigma_2^2}{n_1 + n_2 - 2}}$$
(10.43)
The value of t computed from (10.42) is compared to the critical value tc provided by a statistical table for a given value of the number of degrees of freedom (n1 + n2 – 2) and for a user-specified risk factor p (Appendix E). If the absolute value of t exceeds tc, the difference between the averages of the two distributions is considered to be significant at the specified risk level p. If p is chosen to be equal to 5%, the confidence level that the samples differ from each other is equal to 95%. Atmospheric maps showing the spatial distribution of differences between two fields usually highlight the areas where the results are statistically significant at a specified confidence level, and where they are not. Figure 10.22 gives an example. It is conventional in the literature to qualify a statistically significant result as confident, highly confident, and very highly confident if the adopted level of confidence is equal to 95%, 99%, and 99.9%, respectively.
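The procedure can be illustrated with a short calculation. This sketch assumes the standard pooled-variance form of Student's t-test, in which t is the difference of the sample means divided by σT (1/n1 + 1/n2)^(1/2). The two samples are hypothetical, and the critical value is entered by hand from a standard two-sided table (2.306 for 8 degrees of freedom at p = 5%).

```python
import math


def pooled_t(x1, x2):
    """Return (t, degrees of freedom) for a two-sample pooled-variance t-test."""
    n1, n2 = len(x1), len(x2)
    m1 = sum(x1) / n1
    m2 = sum(x2) / n2
    # Unbiased (n - 1) estimators of the two variances
    v1 = sum((x - m1) ** 2 for x in x1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in x2) / (n2 - 1)
    dof = n1 + n2 - 2
    sigma_t = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / dof)  # pooled std
    t = (m1 - m2) / (sigma_t * math.sqrt(1.0 / n1 + 1.0 / n2))
    return t, dof


# Hypothetical samples, for illustration only
sample_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_2 = [3.0, 4.0, 5.0, 6.0, 7.0]
t, dof = pooled_t(sample_1, sample_2)

# Critical value from a two-sided t table: 8 degrees of freedom, p = 5%
T_CRITICAL = 2.306
significant = abs(t) > T_CRITICAL
print(t, dof, significant)
```

Here the means differ by 2 but the samples are small and broad, so |t| falls below the critical value and the difference is not significant at the 95% confidence level.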
Figure 10.22 (a) Example of two distributions for random variables X1 (green) and X2 (blue). The mean values are identical for the cases characterized by high (top) and low (bottom) variability, but the overlap between the distributions is very different in the two cases. The significance of the difference between the averages of X1 and X2 is highest in the low-variability case (bottom panel with little overlap between distributions). (b) Application of Student’s t-test to derive the significance of the differences in the mesospheric ozone concentrations calculated by a chemistry-climate model for high and low solar activity, respectively (July conditions). Statistical significance larger than 90% (99%) is indicated by light (dark) gray shading.
Reproduced from Schmidt et al. (2006). Copyright © American Meteorological Society, used with permission.
10.7 Using Models to Interpret Observations
The model evaluation metrics described in Section 10.5 are intended to summarize the ability of a model to reproduce large ensembles of observations. They should be supplemented by more ad-hoc comparisons of temporal and spatial patterns, including relationships between species and with meteorological variables, as described in Section 10.4. Combination of these procedures is essential for establishing confidence in the model as a tool to interpret present-day atmospheric behavior and to make future projections. It provides the foundation for using the model to derive chemical budgets, conduct source–receptor analyses, infer source attribution from sensitivity simulations, etc. In this evaluation perspective, the observations are a given, and the task of the model is to reproduce them within a certain error tolerance. Here, we briefly discuss a different use of the model as a tool to explain the variability in the observations and from there to understand the processes that drive this variability. This involves a somewhat different perspective in model evaluation.
The general scientific approach for understanding the behavior of a complex system is to observe its variability and interpret it in terms of the driving variables. This interpretation requires a model as a simplification of the system. The model may be very simple and/or qualitative, and indeed such simple models (even mental models) are often presented in observational papers as a first analysis of the data. However, simple models may be flawed by omission of important processes that are not always apparent. For complex problems in atmospheric chemistry, such as those coupling chemistry and transport, access to a 3-D model is usually required for successful interpretation. Here the purpose of the model is to distill the phenomena driving the observed variability through sensitivity simulations and/or through model simplifications to highlight the essential variables. The focus is on interpreting observations to gain scientific understanding, and the model is a tool for addressing that objective.
Interpreting observed correlations between species is an important example. These correlations can point to common sources or source regions, as in the methane vs. CO2 relationships shown in Box 10.1 Figure 2. Changes in the relationships between different air masses provide insights into atmospheric processes or cause-and-effect connections. Quantitatively interpreting the correlations in terms of constraints on processes is, however, fraught with pitfalls because the factors driving the correlations are often not intuitive or easy to isolate. Simulation of the relationships with a 3-D model including a comprehensive treatment of processes can illuminate the interpretation of the observed relationships and in this manner advance knowledge.
We illustrate this point here with the interpretation of observed correlations of acetylene (C2H2) with CO, as presented by Xiao et al. (2007; Figure 10.23). Both C2H2 and CO are emitted almost exclusively by combustion, and both are removed from the atmosphere by oxidation by OH with mean lifetimes of ten days and two months, respectively. Observations taken from aircraft campaigns around the world consistently show strong correlations between C2H2 and CO, from source regions to the most remote air masses. We would like to extract the constraints that these correlations provide for improving our understanding of emissions, atmospheric transport, and OH concentrations.
Figure 10.23 Relationships between acetylene (C2H2) and CO concentrations over the western Pacific. Aircraft observations for different regions (in black) are compared to results from the GEOS-Chem global 3-D chemical transport model (in red). The top row shows linear relationships and the bottom row shows log–log relationships. Reduced-major-axis (RMA) regression lines are shown with coefficient of determination (R2) in parentheses. Errors on the regression lines are determined with the bootstrap method. Note the differences in scales between panels.
Adapted from Xiao et al. (2007). Observations are from D. R. Blake (University of California – Irvine) and G. W. Sachse (NASA).
Figure 10.23 shows aircraft observations of the C2H2–CO relationship over the Pacific just off the China coast (boundary layer outflow), in the more remote west tropical Pacific, and in the very remote south tropical Pacific. The top row shows the linear relationships ([C2H2] vs. [CO]) and the bottom row shows the log–log relationships (log[C2H2] vs. log[CO]). Also shown in the figure are the correlations simulated by a global 3-D model. We see that the model reproduces the correlations but there is significant bias in the slopes. We can then use the model to understand the meaning of the correlations and the factors driving the slopes.
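The slopes in Figure 10.23 come from reduced-major-axis (RMA) regression, which treats both variables symmetrically. As a minimal sketch, the RMA slope is the ratio of the standard deviations of the two variables, carrying the sign of their correlation. The function name and data below are hypothetical, and the bootstrap error estimate mentioned in the caption is not implemented here.

```python
import math


def rma_fit(x, y):
    """Reduced-major-axis fit: return (slope, intercept, R^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    # Slope magnitude is sigma_y / sigma_x; its sign follows the correlation
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = y_bar - slope * x_bar
    return slope, intercept, r * r


# Hypothetical, perfectly correlated values: y = 2x + 1
slope, intercept, r2 = rma_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(slope, intercept, r2)
```

Applying the same function to the logarithms of the concentrations would give the log–log slope shown in the bottom row of the figure.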
Let us first examine the linear correlations (top row). The correlation in the fresh Chinese outflow (top left panel) reflects the dilution of polluted Chinese air masses with background air. The transport time since emission is much shorter than the lifetimes of either C2H2 or CO. The C2H2:CO slope therefore reflects the Chinese emission ratio, providing a useful test of emission inventories. However, we find that the model slope of 4.7 is lower than the Chinese emission ratio used in the model (6.2), because the dilution takes place with non-zero background air. Thus, one cannot interpret the observed slope (4.0) as the emission ratio without accounting for this background correction. The marine air in which the outflow is diluting may further be different from the continental background air where the initial dilution took place. The best estimate of the emission ratio can be made by adjusting the emissions in the model to reproduce the observed slope.
The importance of characterizing the background is even more apparent in the correlations over the more remote west tropical Pacific (top row, middle panel). Here, the correlations are just as strong as in the fresh Chinese outflow, and the slopes are larger than in the fresh outflow, both in the observations and the model. This is counter-intuitive since one would expect chemical loss of C2H2 to decrease the slope, and we can turn to the model to explain this result. We find in the model that the higher slope is because the dilution is now taking place with tropical background air containing very low C2H2. In that case the C2H2–CO correlation is determined by the mixing between mid-latitude and tropical air masses, and provides little information on either emissions or chemistry. The south tropical Pacific (top row, right panel) shows lower slopes because all air masses in that case have experienced considerable chemical aging – note the differences in scale between panels. One could use the correlations over the south tropical Pacific to provide constraints on chemistry (and hence on OH concentrations), but separating chemical influence from emissions and transport is not straightforward.
Correlating the logarithms of concentrations offers a means to remove the influence of emissions. McKeen et al. (1996) proposed a simple Lagrangian mixing model for this purpose. Consider two species i and j in an aging air parcel receiving no fresh emission inputs and diluting at a constant rate in a uniform background. The evolution of the mixing ratio Ci of species i in that air parcel is given by
$$\frac{dC_i}{dt} = -L_i C_i - K_d\left(C_i - C_{i,b}\right)$$
(10.44)
Here, Li = ki[OH] is the first-order chemical loss frequency [s–1] where ki is the rate constant for reaction with OH, Kd [s–1] is a dilution rate constant, and Ci,b is the background mixing ratio. The chemical lifetime of species i is τi = 1/Li. A similar equation holds for species j. Let us assume that [OH], Kd, Ci,b, and Cj,b are constant, and let β = d ln Cj/d ln Ci denote the slope of the log–log relationship. Simple analytical solutions for β are available in three limiting cases:
(10.45)
These limiting expressions are useful for interpreting correlations when the proper conditions apply. The case of the C2H2–CO correlation is problematic because both species have relatively long chemical lifetimes and non-negligible backgrounds.
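The difficulty can be illustrated with a short calculation. This sketch assumes that (10.44) has the form dCi/dt = –Li Ci – Kd (Ci – Ci,b) implied by the definitions above, for which the solution with constant coefficients relaxes exponentially toward Kd Ci,b/(Li + Kd). With zero backgrounds the log–log slope reduces to β = (Lj + Kd)/(Li + Kd), independent of time; with non-zero backgrounds β drifts as the parcel ages. All numerical values (initial and background mixing ratios, dilution rate) are hypothetical; the lifetimes of 60 and 10 days follow the two months and ten days quoted for CO and C2H2.

```python
import math


def mixing_ratio(t, c0, loss, kd, cb):
    """Analytic solution of dC/dt = -loss*C - kd*(C - cb), constant coefficients."""
    c_inf = kd * cb / (loss + kd)  # steady state of the aging parcel
    return c_inf + (c0 - c_inf) * math.exp(-(loss + kd) * t)


def tendency(c, loss, kd, cb):
    """Right-hand side of the assumed mixing equation."""
    return -loss * c - kd * (c - cb)


def loglog_slope(t, ci0, li, cib, cj0, lj, cjb, kd):
    """beta = dlnCj/dlnCi evaluated at time t along the parcel trajectory."""
    ci = mixing_ratio(t, ci0, li, kd, cib)
    cj = mixing_ratio(t, cj0, lj, kd, cjb)
    return (tendency(cj, lj, kd, cjb) / cj) / (tendency(ci, li, kd, cib) / ci)


# Loss frequencies in day^-1: i = CO (60-day lifetime), j = C2H2 (10-day lifetime)
L_CO, L_C2H2 = 1.0 / 60.0, 1.0 / 10.0
KD = 0.2  # hypothetical dilution rate constant, day^-1

# Zero backgrounds: beta is constant in time
beta_zero = loglog_slope(2.0, 300.0, L_CO, 0.0, 1500.0, L_C2H2, 0.0, KD)
print(beta_zero, (L_C2H2 + KD) / (L_CO + KD))

# Hypothetical non-zero backgrounds: beta now depends on the age of the parcel
for age in (1.0, 5.0, 10.0):
    print(age, loglog_slope(age, 300.0, L_CO, 80.0, 1500.0, L_C2H2, 100.0, KD))
```

In the zero-background case the slope depends only on the loss frequencies and Kd, whereas with backgrounds it also depends on the initial and background mixing ratios, which is why a single limiting expression is hard to apply to C2H2 and CO.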
Ehhalt et al. (1998) proposed an alternate simple Eulerian model in which dilution with background air is represented as an eddy diffusive process. Assuming the diffusion to take place in one dimension x with eddy diffusion coefficient Kx, we have
$$\frac{\partial C_i}{\partial t} = K_x \frac{\partial^2 C_i}{\partial x^2} - L_i C_i$$
(10.46)
The steady-state solution Ci(x) subject to boundary conditions Ci(0) at the point of origin and Ci(∞) → 0 is given by
Modeling of Atmospheric Chemistry Page 58