A Parallel Stochastic Framework for Reservoir Characterization and History Matching

1 Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX 78712, USA 2 Chevron ETC, San Ramon, CA 94583, USA 3 ConocoPhillips, Houston, TX 77252, USA 4 Institute for Computational Engineering and Sciences, Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, TX 78712, USA 5 Department of Petroleum and Geosystems Engineering, The University of Texas at Austin, Austin, TX 78712, USA


Introduction
Uncertainties associated with the measurement of subsurface medium properties such as rock permeability and porosity are widely known.Deterministic models of multiphase flow and transport through porous media require accurate estimate of such properties, since these affect the estimation and location of recoverable reserves, in addition to revealing true subsurface flow characteristics under various injection scenarios.Modern drilling instruments routinely have sensors installed at strategic locations in a reservoir.Specialized sensors are capable of measuring at a high local resolution, fluid, and rock properties 1, 2 .These advances together with 4d time-lapse seismic studies are revealing enormous potential in reducing uncertainty associated with reservoir characterization.Meanwhile new stochastic optimization and statistical learning methods are emerging as promising tools to determining nontrivial correlations between data measurements and responses in addition to developing optimal reservoir exploration plans 3 .In this work, the existence of a number of prior realizations of the unknown data i.e., estimates, e.g., from seismic studies is assumed.The estimation and sampling is then performed at a fixed resolution using a parallel version of the SPSA 4 simultaneous perturbation, stochastic approximation algorithm.A multilevel approach, coupled to neural-network engines that enhance the solution by calculating sensitivities in the vicinity of the most promising solution was presented in 5, 6 .
Suppose that a set of N realizations, x i , i 1, . . ., N is given of a reservoir property e.g., permeability where each x i ∈ R M , M being the number of grid elements.Using methods such as wavelet analysis 7 or the principle component analysis PCA method 8, 9 , it is possible to determine a reduced set of basis functions that characterize a majority of the variability.In this work, the PCA method is adopted.Then, let the set of realizations be denoted by the matrix X, given by X {x 1 , x 2 , . . ., x N }.

1.1
Further, let the empirical mean be given by Next, let Y denote the deviation of the data from the mean, given by Y X − {x, x, . . ., x}.

1.3
In the PCA method, the covariance matrix C is required of the deviation of the data from the mean.The covariance matrix is given by C E YY T .1.4 Since each of the N realizations is in general equally likely, in 1.4 , it is possible to further reduce the expression for covariance as C E YY T 1/N YY T .Thus, the covariance matrix C is symmetric and positive definite.Hence, it is possible to compute the eigen decomposition of C and the associated eigen values as follows In 1.5 , V {v i } M i 1 , the matrix of eigen vectors of C and λ i are the corresponding eigen values.Let λ i be arranged in descending order Then each realization can be expressed to a desired level of accuracy as a linear combination of the first P eigen value/vector pairs P < M .Infact, since C is symmetric in addition to being positive definite, the eigen vectors of V are orthogonal and thus, 1.5 can be expressed in terms of orthormal eigen vectors.In such a case, the eigen values coincide with the singular values.
Then, a well known result can be drawn upon from linear algebra, restated here for this special case without proof.
Then for any P < M, the matrix given by C ≡ P i 1 λ i v i v T i is the best "rank-P " approximation of C in the sense of the Frobenius norm, that is, C is the minimizer of C − Z F where Z ∈ R M×M .
In Theorem 1.1, recall that the Frobenius norm which is an operator norm of the matrix C is given by C F ≡ M i 1 λ 2 i .Also, note that v i v T i denotes the outer product sometimes referred to as the "tensor product" of the vectors v i and v T i .Therefore, the theorem serves to guarantee that the reduced basis corresponding to the subspace spanned by the first P eigen vectors is the best possible "rank-P " choice.In other words, it suffices to determine coefficients α j , such that the "true" property sought can be expressed as x ≈ x P i 1 α j v j in the subspace, span{v j } P j 1 .Hence, the dimension of the system is greatly reduced.The coefficients α j are actually determined by minimizing an objective function, using the SPSA method, described in the following section.

Simultaneous Perturbation, Stochastic Approximation (SPSA)
Consider the problem of finding the root, θ * , of the problem g θ ≡ ∇L θ 0, where L : R P → R is assumed to be a differentiable loss function that measures a weighted, time-and space-integrated error between observed measurements of the phase pressures, concentrations, fluxes, seismic travel-times and well-log data and computed solutions in some suitable norm.In this work, the expression for L takes the form Δt i .

2.1
Several terms in 2.1 which is written in concise terms, deserve mention.The functions f ∈ P denote any property of interest in the calculations, f θ which is a function of θ.The set of all such properties is denoted by P. In this work, for the case of two-phase flow and specifically the oil water model P ≡ {p, N, u, q, τ}, where p denotes the oil phase pressures, N denotes the oil phase concentration, q denotes the well-data includes wellrates, bottom-hole pressures, and oil water ratios depending on the kind of well , u denotes the phase fluxes and τ denotes seismic travel-times.Thus it can be seen that the right-hand side of 2.1 is a function of θ.The superscript d denotes measure data, while terms without the superscript denote computed solution.The index i denotes the time instant at which measurements are recorded and w f,i is a space-and time-dependent weight function on the property f, assigned based on priority and relevance at a given time instant and location in space.Finally, N T is the number of time instants in 0, T when data measurements are recorded and Δt i denotes the time interval between such measurements.
In this problem, θ ≡ {α 1 , α 2 , . . ., α P } with α i the coefficients in the expansion of the unknown permeability vector x, as described in Section 1.Let θ k denote the estimate for θ at the kth iterate.Then, the SPSA algorithm has the form where g k θ k is a simultaneous perturbation, stochastic approximation of the gradient g θ k defined as follows.Let Δ k ∈ R P be a vector consisting of {±1} values that are randomly generated using a Bernoulli distribution, that is, satisfying E Δ k,i 0, i 1, 2, . . ., P. Then g k θ k is defined by the central-differences equation component wise , i 1, 2, . . ., P.

2.3
From 2.3 , it is evident the method is very cheap because it only involves two relatively inexpensive forward realizations or simulations for every iteration to update the estimated property.Here a k , c k are monotonically, decreasing sequences of positive scalars chosen according to the method prescribed in 4 .The relevant formulae are given by where a, c, α, A, and γ are positive real numbers satisfying This ensures that some technical conditions are satisfied 4 which are in turn required for the convergence of the stochastic gradient to the steepest descent gradient.The choice of a, c, α, A and γ is to some extent case dependent and may require some experimentation.It is known that α 1 and γ 1/6 are asymptotically optimum values but choosing smaller values, for example, α 0.602 and γ 0.101 are found to be effective in practise.A common recommendation 4 is to set A equal to 5 to 10 percent of the maximum number of iterations allowed.Spall 4 has shown that the method converges and can be regarded as stochastic analogue of the steepest descent method with similar rates of convergence.Therein, proofs of convergence of the method in a "stochastic sense" i.e., in the sense of expectations are presented.Similar proofs can also be found in 10 .For completeness, the basic steps of a proof are presented here to show that the expectation of the stochastic gradient g k θ k equals the actual gradient g θ k .Thus, it can be expected that the method will converge at a rate equal to the steepest descent method.
Theorem 2.1.The expectation of the stochastic gradient g k θ k of the SPSA method equals the true gradient g θ k .
Proof.A standard Taylor-series expansion yields that and likewise Subtracting 2.7 from 2.6 , then rearranging the resulting terms and neglecting the higherorder terms, Next, define the inverse of Δ k as Then, from From 2.10 , it follows that the stochastic gradient in 2.3 can be written as Then, using 2.8 in 2.11 yields The expectation of the stochastic gradient is then given by Observe that the form of the matrix Δ k Δ T k on the right-hand side of 2.13 is given by In 2.14 , note that for any j 1, 2, . . ., P, the only value that Δ 2 k,j can take is 1.Hence, 1.

2.15
Further, because Δ k,i and Δ k,j are independent random variables when i / j , it follows that From this, it follows also that .17 where I denotes the P × P identity matrix.Applying 2.14 -2.16 to 2.13 yields which is the desired result.

A Parallel SPSA Algorithm
In this section, a parallel SPSA algorithm is described that runs several instances of the basic SPSA algorithm, one on each processor.This helps improve the convergence by widening the search space.Numerical tests were performed on various challenging problems on upto 256 processors.Sometimes, convergence is obtained in as few as 2 or 3 iterations.Each processor has its own copy of the vector θ k , i.e., the coefficients of the natural logarithm of permeability field , represented by say, θ id k .The random vectors Δ k are generated on each processor and are not the same as those on other processors.This is easily achieved by using a different seed on each processor for the Bernoulli random number generator program that can be found, for example, in 11 .Thus each processor also has a copy of the stochastic gradient and updates θ k according to 2.3 .
Figure 1 shows the flow chart of the parallel SPSA algorithm for a single SPSA iteration step.Most of the steps in the figure are self-explanatory.The superscript id in θ id k represents the processor ID and np is the total number of processors.The main step that needs to be described is the box "All Gather mean/min".Two approaches are implemented to gather θ id k from each processor and step to θ id k 1 -the "mean" and "min" approaches.In the "mean" method, as the name indicates, the updated vector, θ k 1 broadcast to all processors for the next iteration is simply the mean of the processor updates, θ id k 1 , that is, Update to θ id k+1 using equation (2.3) All Gather (mean/min) to θ k+1 .Broadcast In the "min" method, again as the name indicates, first the processor with the least objective is identified.In other words, the index of that processor, id min say, is defined first as Then the value of the vector θ id min k 1 on the processor id min is broadcast to all others for the next iteration

3.3
The mean method was observed to be more stable and robust in general while the min method, although faster, sometimes exhibited the tendency to get trapped in local minima.Each box with text "Run IPARS.." in the flow chart of Figure 1 is a forward run of the reservoir simulator IPARS 12 with the value of permeability being calculated from the current perturbation of the value of vector, θ id k i.e., left or right perturbation .It is also noted that in practise, the implementation takes the components of θ k on each processor to be the coefficients of ln x in place of x. i.e., natural log of permeability instead of permeability , since the permeabilities can vary by several orders of magnitude in most real-world problems.The permeability x can then be easily calculated from ln x using an exponential transormation.

Numerical Results
Two examples are presented in this section.In the first example, a 2d heterogeneous permeability field derived from the 10th SPE project 13 is used to test the parallel parameter estimation and history matching implementation.The second example likewise determines the heterogeneous permeability field and performs history matching for an upscaled version of the synthetic "Brugge" field 14 test case.
For both tests, basis functions are first generated using sample realizations see Section 1 .A total of 8 basis functions are used in the expansion of permeability for the first example where an isotropic permeability field is assumed.A total of 10 basis functions for each direction i.e., K xx , K yy , and K zz is assumed for the second example this is a nonisotropic case .Both examples use the two-phase hydrology oil and water , model as the base model in IPARS to perform history matching and parameter estimation.It is noted however that IPARS can handle general multiphase and compositional flow problems using various discretization methods.The reason for the choice here is that the inverse modeling framework is currently only supported with the single-phase and two-phase models of IPARS.Both problems presented here were tested on various parallel platforms including the Bevo2 and Ranger linux cluster at The University of Texas at Austin and Blue Gene cluster, IBM.

Example 1: Sensor Tests
In this example, the goal was to perform history matching and estimate the permeability field assuming a known "true" permeability field .Knowledge of the "true" permeability field only serves to validate the final answer and is not required and is in fact not known in practise.Without this assumption, the parallel SPSA implementation can guarantee that the objective function is minimized, but that does not necessarily mean that the minimizer is the unique real permeability field.In practise, the objective functions can have several local minima, hence it is important to ensure that the algorithm not only minimized the objective function, but that it is also converging to the "true" permeability field.
In this example, different objective function combinations based on different sensor combinations were tested by activating or deactivating the weight functions w f,i in 2.1 .The problem simulates an oil gas immiscible model using the hydrology model in IPARS.This is achieved by treating in the input, oil phase as the actual "gas-phase" and assigning the properties of gas to it.Likewise, the water-phase is treated in the input as the actual "oil phase".Then the oil phase actually "gas" is injected to produce water actually "oil" .While this may sound confusing, it is quite common in reservoir simulation to reuse existing two-or three-phase codes to solve problems in contexts they may not have been designed for.The full set of properties P, is available for use in the objective function but as mentioned, some may be "turned-off" by setting the weights to zero.It is also noted that only the production well was included in the objective calculations.Table 1 summarizes the problem data describing this example.Since the "true" θ * , that is, the coefficients corresponding to the actual permeability field are known, the problem is first run using the true permeability and the true solution p * , N * , q * , u * , and τ * are recorded at discrete time instants and at all the grid elements or faces in case of fluxes .Seismic travel times are recorded as the set of times it takes for seismic waves to travel from one end of the reservoir from each element face on the "source" end to the other end each element face on the "target" end .These are then recorded as data.The SPSA iterations are then performed starting with an initial guess θ 0 that is obtained by randomly perturbing the "true" θ * .A total of 1000 SPSA iterations were computed for this test, although the permeability field converged in very few early iterations.
Figure 2 shows the history matching versus time in days, of the oil phase actually, "gas" pressures in units of psi shown across a section through the middepth of the 2d reservoir.In this figure, it is noted that Figure 2 a presents the match obtained when all sensors were active i.e., the weights w f,i ≡ 1 at all element locations and time instants .Figure 2 b presents the match obtained when only the production well-sensor and the solution sensors i.e., w p,i , w u,i , and w N,i at the midsection were active i.e., at the location x 1250 m and for all time instants .Figure 2 c presents the match obtained when only the production well-sensor was active.The seismic sensor result was not included here because it was found to influence the history match in very insignicant amounts.
From the Figure 2, it is clear that sensors are not required at every grid element and potentially not at every time instant as well which is reassuring since installing such costly equipment as sensors at high areal densities can be very expensive and impractical.Figure 3 shows a similar match for the oil phase actually "gas" concentration in units of lb/cu-ft across a section passing through the middepth of the reservoir.Once again a fairly good match is obtained with fewer measurements.As a demonstration of the flux matches, Figure 4 shows the history match obtained for oil phase fluxes actually "gas" at the same section as the previous cases.It is observed that flux match is poorer when fewer sensors are used as opposed to the quality of the pressure and concentration history matches .Finally, the production well-data history matches versus time in days of oil production actually "gas" in mscf/day and oil gas actually "water-gas" ratio and permeability estimation for the case when only the midsection and production well sensors are active is shown in the Figures 5, 6, and 7, respectively.For the permeability estimate, it is pointed out that the legend spans 50 to 900 mD to allow consistent comparison.From these graphs, the convergence of the parallel SPSA method can be seen to be very effective.

Example 2: Brugge Field Test
In this example, a synthetic reservoir referred to as the Brugge field is used for the purpose of history matching and permeability estimation.This synthetic field was constructed by the Norwegian research center, TNO in February 2008 for a comparative case study on history matching and reservoir characterization.It consisted of 104 upscaled realizations of a 3-D geological model with well-log data from 30 wells with fixed spatial positions; first 10 years production history; inverted time-lapse seismic data in terms of pressures and saturations as well as economic parameters for oil, water and discount rates.A more detailed description as well as the data from the field can be found at 14 .This problem is very challenging for several reasons.The computational domain is a full 3d domain, it is irregular in geometry with a geologic fault as shown in Figure 8.The permeability field is anisotropic and very heterogeneous and hence, many more unknowns have to be estimated in this case.Finally, there are 30 wells driving the flow in this oil water problem actually oil water! .There are 10 water injection wells and 20 oil production wells that can be shut-in based on rate or bottom-hole constraints.All these combine to make this a fairly challenging problem.Table 2 summarizes the data that describe the problem that was solved.It is noted that well #19 a production well was the only bottom-hole pressure specified well without any constraint.All other wells were rate specified with a bottom-hole pressure constraint for shut-in.It is also noted that different injectors start injecting water at different times earliest at t 600 days while the producers start producing as early as t 0 days.The first challenge with this problem was in accurately modeling the geometry of the domain with the faults.A stair-stepped approximation was once again employed to treat the geometry in keeping with the stencil resulting from mixed FEM 15 .A bounding box grid of 35 × 21 × 57 was used and suitably interpolated to get values of the properties at corners of the box that intersect with the actual domain.For all other points, negative values are assigned and these are used to keyout the elements that are actually inactive in the bounding box i.e., for the corners that do not intersect the actual domain .Figure 8 shows the true and approximated geometries with the initial water saturation and mesh superimposed.The well locations are indicated by colored spheres, black for injectors and orange for producers.
A parallel SPSA history matching simulation was performed on the Brugge field for P 10 unknowns in each direction of anisotropy i.e., a total of 30 unknowns and a subset M 16 of the realizations.For proof of concept, the mean permeability was assumed to be the true permeability and data measurements derived from the solution corresponding to the mean at prescribed intervals of time at all grid elements.Again, it is noted that this is strictly not a required step, since ideally history matching is the only goal in such studies while estimated permeability is a by-product.But assuming a "true" permeability, allows one to test the effectiveness of the applied algorithm.Also, it is assumed that the history matching is performed for 20-year simulation period.Figure 10 shows the permeability estimate when the production data and solution pressures and concentrations measured at intervals of 5 grid elements in either areal direction are included in the loss function.The legend again spans 50 to 900 mD for all iterations for ease of comparison.Since the physical dimensions of the Brugge formation was huge, this was a reasonable areal density about 1 sensor every 4 square km for sensor locations.It is observed that even though the initial guess was very far from the "target" or "true" permeability, the algorithm "coverges" in as few as 5-10 iterations.To fix ideas, it is noted that the initial permeability error is more than 60% measured in an l 2 -norm and the normalized objective function value is 0.84 approximately.At the end of 10 iterations the permeability error is reduced to 25% and the objective function to 0.076!At the end of 50 iterations, the permeability error is reduced to 13% and the objective to 0.016.Finally, at the end of 200 iterations, the permeability error is reduced to 3.5% while the objective is reduced to 0.001!These numbers are very encouraging given the problem at hand.
Figure 9 shows the history match versus time in days obtained with respect to bottom-hole pressures in psi, shown for rate specified producing well #21 and cumulative production rate in lb/day, shown for bottom-hole pressure specified producing well #19 .The matches obtained for all wells were equally good.These parallel runs were performed on up to 64 processors on the Bevo2 cluster at ICES as well as on the Ranger cluster at TACC, The University of Texas at Austin on up to 512 processors.

Conclusions
This paper presents a robust and efficient parallel stochastic framework for subsurface reservoir characterization and history matching problems using sensor measurements of data from general multiphase flow scenarios.The parallel algorithm involves triggering multiple SPSA simultaneous perturbation, stochastic approximation instances, one on each processor which uses its own realization of the estimated property based on its own seed.At the end of each iteration, two options are available to compute the starting value of the Journal of Applied Mathematics estimated property for the ensuing iteration; one based on the minimum objective across all processors and the other based on the mean of the properties across all processors.It is generally observed that a very accurate history match is obtained in as few as 5 to 10 iterations even for fairly general and complex problems.If a "true" permeability is assumed, a very accurate permeability match is also attained in as few iterations.The algorithm itself is shown to converge at a rate equivalent to that of the steepest descent method in the sense of expectations.A natural extension of this work would be to support parallel domain decomposition in each SPSA "instance" of the algorithm, so that the method not only takes advantage of a wider search space but also utilizes a divide and conquer strategy within each search path.This is perhaps the optimal strategy for such problems and may form the subject of a future work.

Figure 2 :
Figure 2: Example 1: history matching of oil phase pressures based on sensor choices.

Figure 3 :
Figure 3: Example 1: history matching of oil phase concentrations based on sensor choices.

Figure 4 :
Figure 4: Example 1: history matching of oil phase fluxes based on sensor choices.

Figure 5 :
Figure 5: Example 1: history matching of well-data based on sensor choices.

Figure 6 :
Figure 6: Example 1: history matching of well-data based on sensor choices.

Figure 9 :
Figure 9: Example 2 Brugge field test : well 19 a production curves and well 21 b BHP history match.

Figure 10 :
Figure 10: Example 2 Brugge field test : estimate of permeability with SPSA iteration number.

Table 1 :
Parameter estimation: data for example 1, based on 10th SPE permeability set .

Table 2 :
Parameter estimation: data for example 2 Brugge field test .