Conditional Stochastic Simulations of Flow and Transport with Karhunen-Loève Expansions , Stochastic Collocation , and Sequential Gaussian Simulation

We derive a newmethod of conditional Karhunen-Loève (KL) expansions for stochastic coefficients inmodels of flow and transport in the subsurface, and in particular for the heterogeneous random permeability field. Exact values of this field are never known, and thus onemust evaluate uncertainty of solutions to the flow and transportmodels.This is typically done by constructing independent realizations of the permeability field followed by numerical simulations of flow and transport for each realization and assembling statistical estimates of moments of desired quantities of interest.We follow the well-known framework of KL expansions and derive a new method that incorporates known values of the permeability at given locations so that the realizations of the permeability field honor this data exactly. Our method relies on projections to an appropriate subspace of random weights applied to the eigenfunctions of the covariance operator. We use the permeability realizations constructed with our stochastic simulation method in simulations of flow and transport and compare the results to those obtained when realizations are constructed with sequential Gaussian simulation (SGS). We also compare efficiency and stochastic convergence with that of stochastic collocation.


Introduction
Computational modeling of flow and transport in the subsurface requires detailed knowledge of coefficients of partial differential equations (PDEs), in particular of permeabilities K and porosities Φ.These are heterogeneous; that is, K = K() where  ∈  and  ⊂ R  is the domain of flow.However, the actual values of K() are never known exactly in a real subsurface reservoir; thus the values of K() can only be inferred from partial measurements at well locations, say x * = ( * 1 , . . .,  *   ) ∈ , or from some other information such as seismic and other geological analyses.Therefore the PDEs are stochastic and a typical set of simulations of flow and transport involves geostatistical realizations of K() to account for the associated uncertainty of this data, the solutions to the PDE, and any associated quantities of interest .The randomness (uncertainty) of K() is denoted, as usual, by K(, ) where  is a random element of a probability space.
The use of Karhunen-Loève (KL) expansions, a close relative of Principal Component Analysis (PCA), for parametrization of a random field, is well known in computational mathematics and engineering community [1] and has been also applied in mathematical geosciences [2].Assume that K() = K(, ) is a Gaussian log-normal field with known covariance function (, ).By design, KL expansions account for spatial variability of K() through a sequence of eigenfunctions of  smooth in .Since these correspond to rapidly decreasing eigenvalues, one typically truncates the KL expansion to  terms in K  (, ) which capture the desired proportion of the variance of the field.Finally, KL expansions are independent of the resolution intended for  in the sense that a realization of K  (, ) can be constructed with arbitrary resolution.
The framework of KL expansions discussed in [1,2] does not however take into account existing point measurement data at locations x * .Though the variability, or "noise, " around Before KL expansions became popular, other geostatistics methods, for example, sequential Gaussian simulation (SGS), were used to construct conditional or unconditional realizations of K().Thus we compare our expansions to those obtained with SGS implemented in a well known geostatistics package [3].SGS uses the same covariance model and conditioning points as our method and outputs a set of independent realizations of K(, ) of desired size and at a prescribed resolution.We use next the permeability realizations obtained with our method and with SGS as data for a deterministic saturated Darcy flow and single component advection-diffusion transport solver.While the flow results have been published by many authors for unconditional stochastic simulations, the transport results coupled to the flow are less frequently considered and bring interesting insights.
Our comparison of KL and SGS is of theoretical and practical value and we discuss various aspects of these two entirely different family of methods.Finally, we consider stochastic collocation, a highly accurate method of computing statistical moments of quantities of interest, and discuss its use in the context of conditional representations.
Our approach gives a new method of using KL expansions to simulate conditioned random fields.An alternative approach to conditioning of K() was considered in [4].In order to honor the observed values, they compute the eigenfunction and eigenvalue pairs of the conditioned process.This results in an ensemble of eigenpairs that depend on the particular locations.In this paper we project the random coefficients of the KL expansion onto the appropriate subspace, so that the resulting realizations depend on the particular locations.This allows us to work with the eigenfunctions and eigenvalues derived for the unconditioned process.It also gives a simple decomposition of the simulated process in terms of the conditional mean (kriged values) and the fluctuations around that mean.Other stochastic numerical methods for flow and transport equations include the contributions in [5][6][7][8][9]; these consider very different avenues from direct stochastic parametrizations and will not be reviewed here.
The analytical covariance models used in this paper are the ubiquitous exponential model and the less commonly considered Gaussian model.These stationary models depend only on two-point correlations.We do not address nonstationary field, experimental covariance, multipoint geostatistics or physical models beyond linear flow and transport.These important extensions will be discussed elsewhere, while we refer to [10] for some work in these directions.
The plan of the paper is as follows.In Section 2 we recall the flow and transport models which use field realizations of the random field K() as their input.In Section 3 we show how realizations of the field Y() = ln K() are generated with a known covariance function  Y (, ) together with point measurement data given at x * .Section 3.1 develops the main theoretical result, the conditional KL expansions for a generic random field Y().We also recall in Section 3.2 how SGS creates realizations based on the same information.In Section 4 we present simulation results of the numerical model using realizations of K() obtained with KL and with SGS, paying particular attention to the use of stochastic collocation method with data based on prior measurements.

Stochastic Flow and Transport Models
In this section we make precise the flow and transport models and their numerical discretizations.These are well known [11,12] but are provided for completeness together with appropriate boundary and initial conditions.The corresponding numerical discretization of these is also standard [12,13]; details for the formally defined weak formulations of flow and transport are provided in Appendix D. We follow the structure of stochastic models established, for example, in [1,2,14].
Let  ∈ R  be the open bounded domain in R  where the flow and transport take place.In principle, one can consider  = 1, 2, 3 but our examples involve  = 2.The time variable is denoted by .The boundary  of  is assumed to be smooth so that, in particular, the outer unit normal n() to  is well defined.The boundary  is split twofold into disjoint subboundaries depending on the prescribed boundary conditions for the flow  =   ∪   and transport  =   ∪   .These are made precise below.
The region  has associated permeability field K and porosity field .In this paper we assume that K = K(, ) is a random heterogeneous field and  is constant.For simplicity we also assume that K() is isotropic.Furthermore, the fluid and the medium have an associated diffusion-dispersion coefficient D, but we ignore the dispersion by setting D equal to molecular diffusivity; that is, D =  =   I. Since   correlates with porosity, it makes sense to assume D is constant.The case of random , D will not be considered in this paper.In comparisons, we also consider no diffusion; that is, D = 0.
In the flow problem we seek fluid velocity u and its pressure .Since K is random, so are  and u, as given by Doob-Dynkin Lemma [15].The stochastic model of flow combines momentum (Darcy's law) with mass conservation as follows: This flow model is complemented by boundary conditions of Dirichlet and Neumann type, respectively, prescribed on   and   as follows: In (1a) above () are the fluid sources and sinks due to, for example, the presence of wells, henceforth assumed absent.
Once the flow model is solved, the pressures  and fluxes u are known.Now the solute is subject to advection with velocity u and to diffusion.The transport model is solved for the solute concentration  and is a mass conservation combined with Fick's law, with  denoting the solute source/sink term.Consider subject to the following initial conditions The boundary conditions are imposed on the inflow and outflow boundaries with   denoting the inflow boundary where u⋅n < 0 and   the outflow boundary where u⋅n ≥ 0. These are prescribed as  (, , ) =   (, ) , (, ) ∈   × (0, ] , An important quantity of interest is the average breakthrough curve which represents the total amount of the substance leaving the region .It is best plotted against another time dependent quantity, the pore volume injected which in the case of fully saturated flow grows simply linearly with time Finally, the breakthrough times BTT are BTT (; ) := sup { : BTC (, ) = } , where  is some desired value of the breakthrough curve.

Numerical Solution for Flow and Transport.
Given K(, ) and boundary and initial data we can solve the flow ((1a), (1b), (1c), (1d)) and the transport models ((2a), (2b), (2c), (2d)) numerically.Let T ℎ be a partition of the domain  into nonoverlapping rectangular elements   , where ℎ denotes the largest diameter of elements   from T ℎ .Also let 0 =  0 <  1 < ⋅ ⋅ ⋅ <   =  be a given partition of the time interval [0, ] and Δ  =   −  −1 .(We also use the subscript  for random sequences in Section 3, but as they appear in separate contexts there is no risk of confusion.) We seek numerically the approximations  ℎ (, ), u ℎ (, ) to (, ), u(, ), and   ℎ (, ) to (,   , ).To define these approximations, we use standard numerical methods for which natural stochastic extensions have either been already established [1,2], or are straightforward.For the flow problem we employ cell-centered finite differences (CCFD).These are equivalent to mixed finite elements RT [0] of the lowest Raviart-Thomas order on rectangles [16].For the transport we employ CCFD and upwinding, as well as implicit Backward Euler time discretization.These numerical methods are well known and are first order accurate in time and space.They are also superconvergent in some norms, and can be modified to obtain higher accuracy, but we do not discuss numerical error.See [2] for a recent formal extension of CCFD to stochastic methods and error estimates for flow.See also Appendix D for the formal weak setting of these methods for flow and transport, extending [13] to the stochastic case.
See Figure 4 for typical results of flow and transport simulations.

Methods for Generating Realizations of K(𝑥, 𝜔) and Computing Moments of Quantities of Interest 𝑄(⋅, 𝜔)
In this section we describe how to generate independent realizations of the random heterogeneous permeability field K(, ),  ∈ .Here  is a random element in an underlying probability space.Most of the crucial information about K comes from measurements and its spatial structure with  indexing the uncertainty of the information.In physical simulations of flow and transport we consider  independent realizations K(,  1 ), . . ., K(,   ) with which we assess statistical moments of the simulation results; that is, some quantities of interest  which depend on the (  )'s and possibly on , .For example,  could be pointwise values of pressures or some averages of fluxes or concentrations.
In what follows we will denote this dependence by (⋅,   ) where the "⋅" accounts for any relevant nonstochastic variables or parameters.
We assume the following probabilistic information about K(, ): it is a stationary field, its logarithm has a normal distribution, with known bounded (analytical) covariance function We also assume that   measurements Y where the fixed parameters  2 and , representing respectively the stationary variance (or sill) and the correlation length, give the assumed second-order spatial properties of the field.KL expansions additionally require that the mean field, is known.In practice, it appears that (11) provides a highly variable background and a realistic looking field to which the spatially distributed "noise" constructed by unconditional KL expansions is added.However, this does not allow the realizations of the field K() to satisfy the prescribed given values exactly.Thus the unconditional expansions investigated in [1,2] could not be compared to conditional Gaussian fields.Below we describe how to get K(,  1 ), . . ., K(,   ) using KL and SGS.Next we show how the moments of (⋅, ) are calculated.

Karhunen-Loève (KL) Representation and Realizations of K(𝑥,𝜔).
First we consider the well known case when   = 0, next we develop   > 0. By Mercer's theorem [17] the spectral decomposition of the positive definite covariance matrix is where   are positive eigenvalues and   () are mutually orthogonal eigenfunctions, that is, solutions to the Fredholm integral equation of the second kind [14] which satisfy ∫    ()  () =  , with  , being the Kronecker-delta.
In what follows we denote by Λ the infinitely dimensional diagonal matrix of eigenvalues   ,  ≥ 0. By convention, the eigenvalues are arranged in Λ from the largest to smallest.Also, we denote by Φ() the vector of eigenfunctions evaluated at .One can thus rewrite (12) as   (, ) = Φ  ()ΛΦ().We denote the upper left diagonal block of Λ of size  ×  by Λ  and also define Φ  () as the finite length vector corresponding to the first  eigenfunctions.

Unconditional Representation
When   = 0.The KL expansion [18] of Y is well known where {  ()} ∞ =1 are independent standard Gaussian ((0, 1)) random variables.In other words, (14) represents the fluctuation where Ξ() = {  ()} ∞ =1 .In practice ( 14) is truncated to the first  terms is chosen so that most of the "energy" or "volatility" represented by the sum of eigenvalues in Λ  is captured to the desired accuracy; see Section 3.1.3.

Conditional Representation
When   > 0. In order for Y() to have a given measurement value Y( *  ) at every point  *  in x * , we consider the associated conditional process Ỹ().Our main idea is that in the representation of Ỹ() we use the original eigenfunctions listed in Φ() but we project the random coefficients in Ξ onto the appropriate subspace.In other words, we replace the original   by ξ in (14).The variable { ξ } has the same distribution as the original random sequence {  } after being conditioned on X * := ({Y(x * )}), the -algebra generated by the   measurements Y(x * ).
Define Σ ∈ R   ×  to be the covariance matrix corresponding to the observed locations and  to be the matrix with th row given by the values of the th eigenfunction at the observed locations This gives Σ = (Σ  ) =   Λ.We assume that Σ is full rank.Notice that for every  and  *  , Using the usual formula for conditional means and covariances of Gaussian random variables [19], we compute the mean μ and covariance M = ( , ) of the sequence of random variables Thus , and it is easy to show M = M 2 ; that is, M is a projection matrix.(See Appendix A for details.) Finally, the sequence {  } ∞ =1 conditioned on X * has the same distribution as where {  } ∞ =1 is a sequence of i.i.d.(0, 1) random variables.In particular M projects  onto the subspace that gives the Ỹ process conditional variance 0 at the locations  * 1 , . . .,  *   .We can thus state the conditional representation Ỹ() of Y() which modifies (14).The truncation at  finite of the expansion (23) written componentwise is (24) where the { ξ, }  =1 are the random coefficients for the truncated expansion constructed from the analogues μ and M  of μ and M. In particular Σ is replaced by Σ  , the covariance matrix (also assumed full rank) of Y  (x * ).(For more details, see Appendix A.) It should be noted that M  projects the  dimensional vector Ξ  onto the  −   dimensional subspace corresponding to Ỹ (x * ) = Y(x * ).In order to have a sufficient number of degrees of freedom for the projection, we must then have Obviously, the elements of the sequence { ξ }  =1 are not independent.Hence their joint probability density function, which is needed later in moment calculations, is not a product.While this can be handled as in [1], it is much more convenient to work directly with the   's which are independent and consequently have a product density.Combining (24) with the truncated version of ( 22) we get The practical expression ( 26) is used in moment calculations using stochastic collocation discussed in the sequel.We also see that the first term in ( 26) is essentially deterministic, depending on the Y  but not on the   's.It can be readily interpreted as the kriged mean of Y() which honors the known data.The second term in ( 26) is associated with M   and forces the fluctuations of Y  to be zero at the x * locations.

Truncation Error and Conditioning. The truncation Y 𝑁
of Y reduces total variability as follows: The conditioned process has, conditionally, reduced total variability in the following sense: where ) It should be stressed however that the total unconditional variability of the truncated conditioned process matches that of the truncated Y  process; in particular for any  ≥ 1, As a consequence, the conditioned process replicates the irregularity and other distributional properties of the unconditioned process; compare [17].Again, see Appendix A for details.

Example of Conditional Representation with KL (24).
A given covariance function  Y (, ), in principle, uniquely determines the spectra (  ,   ) ∞ =1 of  Y (, ) via (13).The analytical solutions of ( 13) are known only for a few selected stationary separable models of  Y (, ) [20,21], and even these require some numerical calculations.In general, a numerical solution of (13) must be obtained via, for example, collocation or Galerkin methods [22].We use the collocation method to obtain the spectra for covariances corresponding to a few variograms common in geostatistics [23,Chapter 4] and in particular, for the exponential, Gaussian, spherical, and hole variograms common in geostatistics [3].Examples of the first two are given in Figure 1 along with the plot of their eigenvalues.

Sequential Gaussian Simulation SGS. Geostatistical Software Library (GSLIB
) is an open source library of routines which allows simulation of realizations of porous media using different variograms (or covariance functions) as a proxy for assumed physical properties [3].GSLIB is well known in the geostatistical community and SGS is a relatively simple but quite robust simulation method.The realizations of Ỹ() honor input data at their locations, and the global histogram and variogram are reproduced within ergodic fluctuations.SGS follows a sequence of well defined steps including transformations to and from normal scores and establishing a random path through the locations where the new data is needed.
The major differences between Y simulated with SGS and with KL are that (i) KL realizations are grid independent while realizations with SGS are grid dependent, and (ii) the KL fields are, by construction, a linear combination of basis functions; thus they are smooth if the basis functions are smooth.On the other hand, SGS realizations are not smooth.Smoothness is important in history matching and reservoir optimization [10].

Examples of Conditional K(𝑥) Constructed with SGS and KL Expansions. Now we show examples of realizations in
with SGS and with KL expansions on a 50 × 50 grid.For both we use the Gaussian covariance model (10) with  = 0.2,  = 1.
We select as the "true" data a sample field of the same spatial structure generated without conditioning using the sgsim routine in GSLIB 2.0 [3].See Figure 3 and Appendix B for the GSLIB parameter file.Next we select   = 30 data locations x * to be used for conditioning, as shown in Figure 3.

Simulations with SGS.
We generate  = 6400 realizations of Ỹ() conditioned on the "true" data at the selected locations using SGS and follow up with (6) to obtain the  realizations K(⋅,  1 ), . . ., K(⋅,   ) of the permeability field.
Simulations with KL.To get KL expansions, we calculate the eigenfunctions and eigenvalues solving (13).The first  = 250 eigenvalues contribute 97% of the variance of the logarithm of the permeability field.We further generate  = 6400 conditional realizations using the KL expansion (24) with  = 250 terms and follow with (6) to obtain the  realizations of the permeability field K  (⋅,  1 ), . . ., K  (⋅,   ).
Comparison of SGS and KL. Figure 3 shows x * and two typical conditional realizations produced, respectively, with SGS and KL expansion.As expected, the latter looks substantially smoother than that obtained with SGS.

Computing Moments of Quantities of Interest.
The solutions to physical models of flow and transport (described below) use the realizations (, ) as their data.These physical models compute some quantities of interest (⋅, ), which may also depend on , , denoted by the dot (⋅).For brevity we skip (⋅) in (⋅) when no confusion results.The mean and variance of (⋅) can be computed using the probability density (y) of (⋅, y) via if (y) is known.Alternatively, we can approximate (31) and (32) with the Monte Carlo method.This is especially helpful when the vector y ∈ Γ is multidimensional.The approximation process is the same for both simulation methods given in Section 3.However, the choice of the method affects the accuracy of the approximation.We provide details below, focusing on the expectation (31).

Monte Carlo with SGS (SGS-MC).
If SGS is used, the realizations, are independent realizations of the random field (), as in the Monte Carlo method.Thus the moments [], Var[] are estimated by the sample mean and variance

KL Expansions with Monte Carlo (KL-MC).
For KL expansions, the randomness in  comes from the vector Ξ in (24).When we draw  independent realizations of Ξ (that is of Y) we treat this process as a Monte Carlo simulation.Since the representation of (, ) is truncated to   (, ), we generate Thus it is appropriate to denote the corresponding quantity of interest as   .We estimate (31) by the sample mean where the latter is defined in (34).Var[(⋅)] is analogously estimated by Var , [(⋅)].
The following point is important.Let Γ  =   (Ω) denote the image of   and Γ = ∏  =1 Γ  ⊂ R  .When   = 0, the joint probability density function  = ∏  =1   .In this paper we assume that each   is (0, 1); thus  is a multivariate normal density.However if   > 0 the ξ 's are not independent and their joint density is not a product of the individual densities.Via the shift expressed in (26) though, we can work with the product density of the independent   's.

KL with Stochastic Collocation (KL-SC
).An approach yielding more accuracy than that of Monte Carlo is the 2step stochastic collocation (SC) method as given in [1,14].First (y) is approximated by a (multivariate) tensor product polynomial  m of degree m := ( 1 , . . .,   ) where   is the degree of the component polynomial in   , the stochastic dimension .This is followed by a highly accurate numerical integration method which is optimal for the polynomial approximation.The polynomial degrees in m and the total number of degrees of freedom  m := ∏  =1 (  + 1) of  m (y) are chosen to yield the desired degree of accuracy.In practice we choose   =  uniform in  so that The details are somewhat involved but standard [1,2,14].We provide some here for the sake of direct comparison with (34); the remaining details are given in Appendix C. The polynomial  m () is constructed in a special way intended to make (31) easy to evaluate, while maintaining accuracy.As in [14],  m is an interpolating polynomial of  constructed with collocation points {y  } and we approximate the integral where   = ∫ Γ  m  (y)(y)y.Details on the polynomials   , weights   , and points y  are in Appendix C. In practice, only the weights   and points y  are needed for (40), since the approximation (39) is never formed explicitly.

Summary: Stochastic Simulations of the Model Problems.
Consider the formulas (34), (37), and (40).We see that the three corresponding methods, respectively, SGS-MC, KL-MC, and KL-SC, require the knowledge of (⋅) evaluated for each realization   or at each collocation point y  .While the implementation of all three methods is very similar, their accuracy differs.For the estimation of moments in (31) or (32), SGS-MC and KL-MC use similar uniform weights 1/ ≈ 1/( − 1) since all realizations are equally probable, while KL-SC uses weights   optimal for accuracy and optimal choice of realizations.
We summarize the simulations steps for each of the three methods.
(2) Define a desired number  ∈ N + of realizations of the permeability field.
( All the realizations generated in Step (4) of SGS-MC are equally probable and oscillate around the mean.

KL-MC Outline
(1) Decide on the finite number  of terms in KL expansion (24).
If finding these numerically, choose a spatial grid on  at least as fine as T ℎ chosen below.If found analytically, there is no grid dependence.
(5) Define a desired order of polynomials with multiindex m as in Section 3.4.

Results of Stochastic Simulations of Flow and Transport
Now we present numerical experiments and discuss the properties of associated moments of the quantities of interest (⋅, ).Take  = [0, 1] 2 covered with a uniform 50 × 50 spatial grid T ℎ using the Gaussian (10) and exponential (9) covariance models with K(, ) computed as discussed in Section 3. Different choices of the number   of conditioning points are used; we also vary the number of terms  in the KL expansions (24) and the number of realizations , as well as the polynomial order m in KL-SC.In all simulations we assume for simplicity that the background Y() ≡ 0. This is of course easy to modify for practical simulations.
Boundary conditions for the flow and transport models are the same for all examples, and so are the homogeneous source/sink terms and thus the flow and transport are driven only by boundary data.
For the flow model (1a)-(1c) we use the conditions These boundary conditions mean that the prevailing direction of the flow is from left to right.Thus the left boundary is also the inflow, and the right boundary is the outflow boundary, and the bottom and top boundaries are the no-flow boundaries.
For the transport model (2a), (2b)-(2d) we use porosity () = 1 and either diffusion D = 0 or D =   = 0.0252I.We also define the following initial and boundary conditions (, ) = 0, (, ) ∈ {( 1 ,  2 ) ∈  : The condition (47) imposes the no-flow condition on top and bottom boundaries, as well as the usual numerical outflow boundary for D ̸ = 0 on the right (outflow) boundary of .If D = 0, no boundary condition is imposed on the outflow boundary.Simulations are run for  sufficiently long so that BTC() ≈ 0.
An illustration of the typical behavior of the flow and transport solutions is given in Figure 4.As we show below, some of the cumulative quantities of interest associated with the transport solutions have a somewhat different behavior than the pointwise quantities for the flow.
Our emphasis in the examples below is threefold.We compare SGS-MC and KL-MC methods in Section 4.1.In Section 4.2 we consider the impact of conditioning on the KL-MC method on transport results.In Section 4.3 we evaluate stochastic convergence for KL-MC versus that for KL-SC.
In what follows we drop ℎ when referring to numerical results.

Comparison of SGS-MC and KL-MC Methods.
This section gives comparisons of the flow and transport results using  = 6400 realizations of K(,  1 ), . . ., K(,   ), that is, Monte Carlo simulations corresponding to the example in Section 3.3.Our focus is on the quality of the approximations to the moments of the quantities of interest [(⋅)], Var[(⋅)].For conditioning we use   = 30 points; see examples in Figure 3.For (⋅) we choose either pointwise values of pressure () or fluxes u() or breakthrough curves BTC().The approximations to their moments are   [(⋅)] and  , [(⋅)] as defined in (34) and (37), respectively.See results in Figure 5.
The first observation is that, as expected, the structure of results corresponding to SGS-MC and KL-MC is very similar.This is also true about unconditional realizations but we do not show these here (see [23]).
Next the flow results are examined with a bit more detail; in particular the dependence of the quality of the approximations to moments as  increases is shown.Figures 6 and 7 show the statistical moments of solution of the flow problem along the profile  2 = 0.5 approximated by unconditional and conditional simulations with SGS-MC and KL-MC, respectively.We see that, as expected, conditional Var N,M [p(x)] moments of the solutions converge faster than unconditional ones to the asymptotic values for large .More significantly, the fluxes appear to settle earlier (around  = 800) for KL-MC than for SGS-MC (around  = 1200).
Next we discuss transport results shown in Figures 8 and  9.The breakthrough time distributions are qualitatively similar for both GSLIB and KL based experiments.Unconditional breakthrough times have a more skewed distribution and a wider spread in comparison with conditional breakthrough times.As expected, the variances converge faster when   > 0 than when   = 0.However, in the averaged quantities of interest shown in Figure 8, we do not see a dramatic difference between SGS-MC and KL-MC, but the variance of  appears smaller for KL-MC than for SGS-MC.On the other hand, Figure 9 seems to suggest that both variance and expectation estimates settle down more quickly for KL-MC than for SGS-MC.
In summary, KL-MC seems to require half as many realizations than SGS-MC for pointwise quantities of interest.This is not as strongly exhibited for those transient quantities of interest that are averaged over space such as breakthrough curves.More simulations are needed over various correlation lengths and other parameters to determine the relative benefits of KL-MC versus SGS-MC.

Stochastic Convergence of Moments Depending on the Number 𝑁 𝑚 of Conditioning Points.
In this example we consider KL-MC only and assess the dependence on   of the convergence of moments of various (⋅) associated with the flow and transport.This example is motivated by [24] where experiments of this type were performed with a nonlinear flow and transport model and SGS.
We use a nonseparable exponential covariance model ( 9) with  = 0.3,  = 1, and  = 200 and generate  =  max realizations of K(,   );  = 1, . . .,  with KL-MC as described in Section 3.1.First we consider the unconditional field with   = 0.For conditional simulations we use one arbitrarily chosen unconditional realization as the "true data" providing a source of measurements.We first select a set x * of size   = 12 for conditioning locations and then Flux U 1 variance Realization Breakthrough times variance a superset x * of size   = 25 containing the smaller set.We generate  max conditional realizations for all choices of   and investigate the effect on the moments of (⋅).Table 1 presents convergence of the moments of the flow solutions to ((1a), (1b), (1c), (1d)) as  increases.The convergence is assessed experimentally by comparison of the quantity of interest  , with that for  , max ; here we use  max = 25600.We compute the norm ‖ , (⋅) −  , max (⋅)‖ appropriate for the quantity of interest.For pressures it is the discrete  2 norm at the cell centers; for fluxes it is the discrete  2 norm of the values at the midpoints of the edges.The results exhibit the usual decrease in the stochastic error which essentially follows the usual (1/ √ ) ratio.It is important to notice that the magnitude of the stochastic error seems to weakly decrease with the number   of conditioning points.
The transport model ((2a), (2b), (2c), (2d)) has a considerably larger complexity than the flow model ((1a), (1b), (1c), (1d)) since the transient solutions for all time steps of ((2a), (2b), (2c), (2d)) must be obtained with the size of the time step chosen to satisfy CFL condition.In addition, any comparisons at fixed time steps may be difficult because time stepping varies from realization to realization due to a large variability of the fluxes.Therefore the use of large  is not feasible.Instead, we use  = 1600 and present a comparison of cumulative rather than pointwise quantities of interest for   = 0 and   = 25.Figure 10 shows the breakthrough curves and breakthrough times for conditional and unconditional simulations.We see that the spread of the conditional breakthrough curves is considerably smaller than that for   = 0. We also observe a significant reduction in the variances of the breakthrough curves after conditioning.As expected, we see that the presence diffusion D ̸ = 0 when compared to the case D = 0 changes the shape of the breakthrough curves and affects the distribution of breakthrough times.

Comparison of KL-MC and KL-SC.
It is well known that KL-SC has a faster convergence rate compared to KL-MC as shown in for example [1,2] for unconditional simulations of stationary fields.We provide results showing the same behavior for conditional simulations of coupled flow and transport.
Here the Gaussian covariance model (10) with parameters  = 0.45,  = 1, and  = 6 terms is used for K  ().(In this case, the first  = 6 eigenvalues of the covariance matrix  Y contribute around 70% of the variance of Y).For conditional simulations we use one of the unconditional realizations as the "true data" and a source of measurements (see Figure 11).Then   = 3 conditioning locations x * are selected, and realizations of the permeability field are generated based on this data at the collocation points   ,  = 1, . . .,  m .Note that for KL-SC we need to use a relatively small  and thus a small number of conditioning points   .Figure 11 shows the construction.Due to the small value of  it is only possible to simulate large scale variability.This is in contrast to the scale of variability illustrated in Figure 3.
Next we perform flow and transport simulations and compute moments of quantities of interest.We provide quantitative analysis but omit plots of flow results due to the similarity with those given in Sections 4.1 and 4.2.The results in Tables 2 and 3 demonstrate stochastic convergence for KL-MC and KL-SC methods, respectively.In parentheses we put the ratios between the errors on successive levels of refinement.These are to be understood as follows.For KL-MC we use as a "true solution" the mean calculated over  max realizations, and we calculate the error.Thus for KL-MC we see a decrease of the error as  increases, as to be expected, but the decrease is rather slow.On the other hand, for KL-SC, we choose as the "true solution" (or "reference solution") that calculated with  max corresponding to 6 6 = 46656.Recall that here we are increasing the order of the polynomial used for approximations.We observe very fast decrease of the error with the polynomial order  and  ≡   .
It is then important to compare KL-MC and KL-SC to evaluate the accuracy of the moments of quantities of interest corresponding to the same order of computational effort.For example, compare the errors reported for  = 6400 and that for  m = 4 6 , that is, fourth and third rows in Tables 2  and 3 for KL-MC and KL-SC, respectively.These correspond roughly to the same order of computational effort, but the error for KL-SC is about two orders of magnitude smaller.
Next we discuss transport results illustrated in Figure 12.In each plot we compare the moments for various selected values of  against those for  max = 6400.First, we see that the statistical moments approximated by KL-MC and KL-SC methods seem to agree visually with each other and appear close to each other for large .Since the qualitative nature of expectations of breakthrough curves does not differ significantly from those reported in Section 4.2, we skip detailed discussion.We only mention that the reduction in variance is about fivefold between   = 0 and   = 3, for both KL-MC and KL-SC.In addition, KL-MC seems to overpredict the variance in BTC while KL-SC underpredicts it, at least for   = 0.These qualitative observations are consistent with the stochastic error estimates shown in Table 4.The superiority of KL-SC over KL-MC is confirmed, however it is not as dramatic as for pointwise quantities of interest for the stationary (flow) problems, as we seem to be gaining approximately only one order of magnitude accuracy between KL-MC and KL-SC.

Summary
In this paper we presented a new method of constructing conditional Karhunen-Loève (KL) expansions for the permeability field K(, ) used as data in simulations of flow and transport to compute certain pointwise and cumulative quantities of interest (⋅, ).Our new method is important due to its ability to honor given data.It has a natural decomposition giving (i) a kriged mean field agreeing with observed data and (ii) random fluctuations around the kriged mean which are zero at the observed locations; thus all realizations honor the point data.
We give details of Monte Carlo implementation KL-MC of our method as well as of its use in stochastic collocation KL-SC.We compare the results of simulations with KL-MC and KL-SC to those of SGS-MC obtained with sequential Gaussian simulation (SGS) as implemented in GSLIB [3].
All three methods rely on the same initial information: a covariance model  Y and the data K(x * ) at   conditioning locations x * .All three can generate any desired number of realizations ; those in KL-SC have a particular structure.
The comparison between SGS-MC, KL-MC, and KL-SC can be summarized as follows.When used for calculation of the mean and variance of various quantities of interest, KL-MC seems to converge about twice as fast compared to SGS-MC.In turn, KL-SC converges orders of magnitude faster than KL-MC.These observations depend on the quantities of interest, with the pointwise values seeming to be more sensitive to the method chosen than the cumulative ones.However, since  >   as in (25), in practice we can use tensor grid KL-SC implementation only with a few conditioning points due to the (  ) complexity.At this time we thus see KL-MC as the most versatile and accurate technique.
The conditional expansions proposed in our method come from simple algebraic modifications of the unconditional expansions.The additional cost associated with the conditioning has complexity equal to that of finding an inverse of a matrix of size   ×   .In contrast, the expansions in [4] require determining the spectra for each set of conditioning points.As shown in Section 3.5.2,our method can reuse unconditional expansions and vary the number and position x * of conditioning points.This is in constrast to SGS which works with a fixed set x * and a fixed spatial grid or resolution.
In order to fully understand the relative merits of the three methods, more testing is needed with different covariance models and parameters and with a variety of conditioning data sets x * .Moreover, different physical models and quantities of interest need to be evaluated.To further improve on KL-MC as presented and the use of the more efficient KL-SC variant, future plans include exploration of nontensor grids in stochastic collocation.Other current work includes numerical error analysis and the use of multipoint statistics for nonstationary covariance models.Based on the argument similar to that for the corresponding deterministic model [28], we can be sure that a solution of this problem exists and is unique.

D.2. Transport Model.
Here we follow the set up for deterministic problems in [13]  When D = 0, (D.11) is ignored and the  ,ℎ term in (D.12) is set to 0. Note that we handle advection terms explicitly in time to avoid additional numerical diffusion; thus a stability (CFL) condition on the time step Δ  is imposed.In practice, we set Δ to be uniform in  but different for each realization  or y.
we plot the realizations and illustrate how they agree with Y(x * ), and how they are decomposed into the conditioned mean Y() and the fluctuations Ỹ() − Y().
) Approximate [] by the sample average (34) and Var[] by the sample variance (35).

Figure 6 :
Figure 6: Profiles obtained with SGS-MC of moments of pressures and fluxes for unconditional field   = 0 (a) and conditional field with   = 30 (b).

Figure 7 :Figure 8 :
Figure 7: Profiles obtained with KL-MC of moments of pressures and fluxes for unconditional field   = 0 (a) and conditional field with   = 30 (b).

Figure 9 :
Figure 9: Example from Section 4.1.Convergence of expectations and variances of breakthrough times BTT depending on the number of realizations  obtained with SGS-MC and KL-MC.Each figure shows results for   = 0 and   = 30.

Figure 10 : 2 Figure 11 :
Figure 10: Example from Section 4.2.In each row we show expectations (far left) and variances (left) approximated with  = 1600 unconditional and conditional realizations and histograms of unconditional (right) and conditional (far right) breakthrough times.The rows correspond to the cases with D = 0 (a) and D ̸ = 0 (b).

Table 1 :
Stochastic convergence results for KL-MC simulations with  max = 25600 in the pressures and fluxes.Convergence ratios are given in parentheses.

Table 2 :
Stochastic convergence for KL-MC method,  max = 102,400.Convergence ratios are given in parentheses.