Modeling of Atmospheric Chemistry

Home > Other > Modeling of Atmospheric Chemistry > Page 23
Modeling of Atmospheric Chemistry Page 23

by Guy P Brasseur


  If the plume is transported in the x-direction by the prevailing wind velocity u (assumed here to be constant) and is dispersed by turbulent motions in the perpendicular (y, z) plane, the continuity equation for the density ρ(x, y, z, t) [kg m–3] of a given species can be expressed in an Eulerian framework by

  (4.281)

  where Ky and Kz [m2 s–1] are eddy mixing coefficients in the y and z directions, respectively, for which semi-empirical formulations are available (Pasquill, 1971; Seinfeld and Pandis, 2006). This simplified expression ignores chemical transformations. It assumes that the air density is uniform and that eddy mixing in the direction x of the wind can be neglected relative to the advection (this is called the slender plume approximation). The source rate of the species at point r0 is expressed as a boundary condition (see below).

  An alternative approach to the above dispersion equation is to use a Lagrangian formulation. In this formulation, the density ρ(r, t) at point r and time t is determined by an ensemble of individual particles released at points r′ and previous times t′, and displaced by a randomly varying wind velocity to reach point r at time t. The variations in the wind velocity account for all scales of motion including the small-scale turbulent features. One can then write

  (4.282)

  where s(r′, t′) [kg m–3 s–1] is the source term. The Green function G(r, t; r′, t′) defines the probability that a particle released at point r′ at time t′ reaches point r at time t. It is assumed here that the initial concentration is zero but the formulation can easily be generalized (see Section 4.2.8). In most applications, turbulence is considered to be stationary and homogeneous, and the probability distribution of the velocity is assumed to be Gaussian. When considering a point source, the source s is equal to zero at all points r′ except at a single point r′ = r0.

  We now consider two specific applications: the Gaussian plume model, in which the point emission is constant in time, and the puff model, in which a given mass of material is released instantaneously. The two are related in that a Gaussian plume can be viewed as a superimposition of elementary puffs released continuously in time, but the assumption of constant emission enables a straightforward analytical solution in the Gaussian plume model, while the puff model can be generalized to variable winds and emissions.

  4.12.1 Gaussian Plume Models

  Let’s assume that the species under consideration is emitted with a constant source rate Q [kg s–1] at a single point located at r0 with coordinates x0 = 0, y0 = 0 and at height z0 = He above the surface, so that

  s(x, y, z) = Q δ(x) δ(y) δ(z − He)

  (4.283)

  Here, He [m] denotes the effective source height (which may account for buoyancy in the case of a heated source). The Dirac delta function δ(ξ) is equal to 1 for ξ = 0 and zero elsewhere, and has unit of m–1. With the assumptions mentioned above and for steady-state conditions (∂ρ/∂t = 0), the Eulerian advection–diffusion form of the continuity equation takes the form

  (4.284)

  with the following boundary conditions

  (4.285)

  An additional condition at the surface (z = 0) must be provided. One can assume total reflection (no uptake), or total or partial absorption (uptake). In the first case, the condition at the surface (zero flux) is written

  (4.286)

  It requires that the vertical gradient in the concentration be zero at the surface. The analytical solution of (4.284) can be found by separation of variables

  (4.287)

  If, in addition, we replace the independent variable x by

  (4.288)

  the PDEs for the dependent variables Φ(ry, y) and Ψ(rz, z) are expressed by

  (4.289)

  for the domain 0 ≤ ry < ∞ and − ∞ < y < ∞ with the boundary conditions

  Φ(0, y) = δ(y) Φ(∞, y) = 0 Φ(ry, ±∞) = 0

  (4.290)

  and by

  (4.291)

  for the domain 0 ≤ rz < ∞ and 0 ≤ z < ∞ with the boundary conditions

  (4.292)

  The corresponding solution is

  (4.293)

  and

  (4.294)

  and the spatial distribution of the concentration is therefore

  (4.295)

  Quantities ry and rz are provided by (4.288). If the eddy diffusion coefficients are uniform, we have

  (4.296)

  The solution of the advection–diffusion equation is then expressed as a function of the eddy diffusion coefficients by

  (4.297)

  If we adopt a Lagrangian point of view and derive the concentration field by considering the displacement of an ensemble of fluid particles continuously emitted at point (0, 0, He) at a constant rate Q, the application of relation (4.282) with a Gaussian probability distribution function for the particles leads to the following expression for the concentration at t → ∞

  (4.298)

  where σy(x) and σz(x) denote the standard deviations of the particle distribution in the y and z direction, respectively (see Figure 4.27). Transport down to the surface (σz = He) occurs at a distance xD = u He2/2Kz. Beyond that distance, the effect of the surface becomes important.

  Figure 4.27 Gaussian plume released by a continuous point source (here a stack located at point x = 0; y = 0; z = h). The wind direction is aligned with the x-direction. The concentration distribution resulting from the dispersion in the y, z plane is shown at two downwind locations.

  Reproduced from Stockie (2011). Copyright © 2011 Society for Industrial and Applied Mathematics.

  The Eulerian and Lagrangian formulations lead to equivalent results if

  (4.299)

  If Ky and Kz are uniform,

  (4.300)

  The standard deviations σy(x) and σz(x) vary therefore as the square root of the downwind distance x. Experimental studies show, however, that the exponent of this power law relationship is generally higher than 0.5. More elaborate empirical relations have therefore been established to express the standard deviations as a function of micrometeorological parameters and atmospheric stability (Pasquill, 1971).

  The formulation presented here for a continuous point source can be generalized to more complex situations (Lin and Hildemann, 1996) involving, for example, multiple point sources, line sources (e.g., roads), the presence of an inversion layer aloft, absorbing rather than reflecting surfaces, and the addition of chemical and scavenging processes.

  4.12.2 Puff Models

  We now consider the time evolution of a puff of mass QP [kg] instantaneously released at time t = 0 and at a given point (0, 0, He) in the atmosphere and subject to advection (constant wind speed u) in the x-direction and dispersion (constant eddy diffusion coefficients Kx, Ky, Kz) in the three spatial directions. We assume again total reflection of the material at the surface. The advection–diffusion equation to be solved is

  (4.301)

  for the source defined by

  (4.302)

  The solution for the mean distribution of the concentration is

  (4.303)

  where the standard deviations σx, σy, and σz measure the dispersion of the puff relative to its center of mass located at x = ut. They are related to the eddy diffusion coefficients by

  (4.304)

  The solution is readily generalized to the evolution of a plume from a time-dependent point source in a variable 3-D wind. In that case we can view the plume as resulting from a continuous suite of puffs emitted sequentially. If we consider N puffs i = 1, N of mass qiΔt [kg] emitted successively at time ti = iΔt and advected by a 3-D wind field from source point (0, 0, He) to position xi(t), yi(t), zi(t) at time t, the mean concentration field is given by

  (4.305)

  4.13 Statistical Models

  Physical models such as the Eulerian, Lagrangian, and Gaussian plume models presented in the previous sections simulate the behavior of chemical species on the basis of conservation equations that capture the effects of chemical and transport processes in the atmo
sphere. Statistical models offer an alternative approach that does not require a physical representation of the chemical and dynamical processes involved. Such models are typically based on empirical relationships between variables established from a large number of previously observed situations. Sometimes the relationships are established from a highly detailed but computationally unaffordable physical model. Once the statistical model has been constructed, it can be applied to conditions within a certain range of validity (typically the ranges of the input variables used to construct the model). Statistical models are often used to explore relationships between an output variable and different candidate input variables, to parameterize a physical model for faster computation, or to make forecasts on the basis of observations of the present state. In the latter case, the present value of the output variable is often a useful input variable for the forecast model. We describe here two types of statistical models: multiple linear regression models and artificial neural networks.

  4.13.1 Multiple Linear Regression Models

  Multiple linear (or multilinear) regression models specify a linear relation between one dependent (output) variable noted y and P independent (input, predictor, or explanatory) variables denoted xp (p =1, P). Consider a set of N successive observations i of the dependent and independent variables (yi and xi, p for i = 1, N). We write the linear expression

  yi = b0 + b1xi , 1 + b2xi , 2 + . . . + bPxi , P + ei

  (4.306)

  where ei is a residual error term, and bi are unknown parameters to be derived from the ensemble of observations. Errors are assumed to follow a normal distribution with a zero mean value and a variance σ2.

  Equation (4.306) can be rewritten in matrix form

  y = Xb + e

  (4.307)

  where y = (y1, y2, … yN)T is the N-dimensional response vector, b = (b0, b1, … bP)T is the P + 1 dimensional slope vector, and e = (e1, e2, … eN)T the N-dimensional error vector. The N × (P + 1) matrix

  is called the design matrix.

  In the presence of error, design of a reliable multilinear model requires that the number of independent observations available be much larger than the number of unknown parameters bi (N ≫ (P + 1)). In the ordinary least squares method, optimal values for the parameters are derived by minimizing a cost function J(b) defined as the sum of the square differences between the observed values and their corresponding model values

  (4.308)

  Solving dJ(b)/db = 0 yields the P + 1 “normal equations”

  (4.309)

  or in matrix notation

  (4.310)

  The solution

  (4.311)

  is unique, provided that the N rows of matrix X are linearly independent. Here XTX and (XT X)–1 are (P + 1) × (P + 1) symmetric matrices and XTy a P + 1 dimensional vector. The fitted values of the predicted quantity are then

  (4.312)

  The success of the fitting model is commonly measured by the coefficient of multiple determination R2 (more often called R-squared coefficient) defined as the ratio between the variance of the fitted values to the variance of the observed values y. Since ,

  (4.313)

  More explicitly,

  (4.314)

  where

  (4.315)

  is the mean value of the observed data. The R2 coefficient, whose value varies between 0 and 1, represents the fraction of the variance that is explained by the multilinear model. Some caution is needed in interpreting R2 as the quality of the fit because R2 will tend to increase as more predictor variables are added without actually increasing the predictive capability of the model. Adjusted coefficients of determination are used to address this problem (see Appendix E).

  4.13.2 Artificial Neural Networks

  Artificial neural networks are inspired by the functioning and learning ability of biological neural systems. They do not require assumptions on the relationships between input and output variables, which is an advantage over the multiple linear regression approach. The method can be applied to any smooth nonlinear relationships that exist between the variables. The artificial neural network is trained by using observational data to adjust its internal parameters until a usefully predictive input–output mapping is achieved. The predictive power can be continually improved through the ingestion of more observational data.

  An artificial neural network (Figure 4.28) consists of layered interconnected nodes (or neurons) including an input layer, one or more successive “hidden layers,” and an output layer. The input layer plays no computational role; it only supplies data to the first hidden layer. The output layer provides the solution. Neurons are connected to all nodes belonging to the upstream and downstream neighboring layers. The number of hidden layers is determined by the complexity of the problem. In the feed-forward network considered here, information flows only in one direction from the input to the output layer. More complex architectures include possible feedbacks between layers.

  Figure 4.28 Typical artificial neural network with interconnected neurons (or nodes) including an input layer, a “hidden” layer, and an output layer. The network must be trained; from a set of M known training patterns available from observations, it learns how the input X represented by a [M × K] matrix relates to the output Y represented by a [M × J] matrix. Here, K denotes the number of input nodes (and variables) and J the number of output nodes (and variables). In the figure, K = 3 and J = 2.

  Each node is characterized by a set of numerical inputs xi that conveys information from other upstream nodes with specific weights. The total input signal yj to node j from all upstream nodes i is given by

  (4.316)

  where xi is the output signal of node i and wi,j is the weight of the signal flowing from node i to node j. A second element is an activation or transfer function F that transforms the weighted input yj into a numerical output xj. The logistic function is frequently adopted:

  (4.317)

  The output is then passed to the next downstream hidden layer until one reaches eventually the output layer.

  The optimal values of the individual weights wi,j are determined by training the system with observations. Training is the process by which one determines a combination of weights that leads to a minimum error of the output variables. This process is equivalent to the derivation of the intercept and slope coefficients in the linear regression method described in the previous section. Different mathematical algorithms are available to determine the conditions under which the error is minimal (see, e.g., Bishop, 1995). Most of them adopt some form of gradient-descent approach in which the value of the weight parameters of the models, initially small and random, are gradually modified along a steepest-gradient direction of the error function until an absolute minimum value is reached.

  4.14 Operator Splitting

  The continuity equations for chemical species consist of a sum of terms describing different processes for which the model provides independent formulations, as described for example by (4.10). These processes are occurring simultaneously. Ideally, the numerical algorithm used to solve the equations should account for this simultaneous coupling. This is not practical in 3-D models because of the multi-dimensionality of the problem and because the numerical algorithms best suited to treat each of the terms are very different. It is therefore often advantageous to solve for the different terms individually and sequentially over a given time step Δt. This approach is called operator splitting. It enables software modularity, algorithms tailored to each operator, and better performance. For example, a stiff integrator can be employed to integrate the chemical equations, while a flux scheme can be adopted to integrate the transport equations.

  To describe the method, we consider a simple form of the continuity equation for a variable Ψ that represents, for example, the concentration of a chemical species. We assume that this variable is affected by two distinct atmospheric processes (such as transport and chemistry), represented by linear operators A and B. Thus we write

  (
4.318)

  To apply the operator splitting method over a time interval Δt, we first update the value of Ψ from time tn to time tn+1 = tn + Δt by calculating at each model grid point (i, j, k) an intermediate value (Ψ*) resulting from the application of operator A over the time interval [0, Δt]:

  (4.319)

  The resulting value of Ψ* at time Δt is then used as the initial condition for the second step involving operator B:

  (4.320)

  The value of Ψn+1 at time tn+1 is thus given by the calculated value of Ψ** at time Δt.

  In the scheme presented above and referred to here as an A–B scheme, the integration is initiated by applying operator A followed by operator B. If we reverse the order of the successive integration (B–A scheme), the solution will be somewhat different. Ideally the order should not matter, but in fact it does, and this represents the operator splitting error (Box 4.12).

  Box 4.12 Characterization of Operator Splitting Errors (Sportisse, 2010)

  Assume that the continuity equation includes two linear terms representing two distinct processes

  dΨ/dt = A1Ψ + A2Ψ

  where Ψ represents a vector of dimension n (e.g., species concentrations) and A1 and A2 are n × n matrices. If the solution at time tn is known, then the analytical (true) solution at time tn+1 = tn + Δ t is

 

‹ Prev