Development and Validation of a Novel Interpretation Algorithm for Enhanced Resolution of Well Logging Signals

This work presents a novel algorithm that achieves enhanced resolution of well logging signals, e.g., from 1 ft of a pulsed neutron mineralogy tool to 0.04 ft of an imaging tool. The algorithm, denoted as “Digital Core,” combines mineralogical and sedimentological information to generate a high-resolution record of the formation mineralogy which can be consequently applied to thin bedded environments. The keystone to the philosophy of this algorithm is that the spectral information recorded by mineralogy tool is a weighted average of the mineralogy of each lithological component in the analyzed volume. Therefore, by using a high-resolution image log to determine the proportion of each lithological component, their composition can be determined from the mineralogy log data. A field case from a well located in South Australia is presented in this work, and the results validate the feasibility of an integrated core-level petrophysical analysis in a cost-effective and timely manner compared to conventional core measurements.


Introduction
Oil well logging has been known for many years and provides an oil and gas well driller with information about the particular earth formation being drilled. Accurate and detailed knowledge of earth formations that may contain reservoirs of the hydrocarbons is required for the exploration and production of hydrocarbons [1][2][3]. Typically, the recording of rock physical properties (logs) is the only information available in a borehole. These include the resistivity of the formation [4], the acoustic properties [4], the porosity (actually the interaction between neutron and hydrogen in the formation) [4], the electronic density of the formation [4], the natural radioactivity, and the gamma emission induced by neutron stimulation. These data are then interpreted in terms of mineralogy of the formation, matrix density, total and effective porosity, fraction of hydrocarbon in the fluids, etc. For example, it is important to know the lithology of the earth formations as a function of depth, particularly in thinly bedded formations. A lithology characterizing a formation may be determined using one or more of several techniques. A common technique is to retrieve core samples from an earth formation and perform intensive analysis of the core sample at the surface. Typically, this is conducted off-site at a specialized facility remote from the well site. While core samples can provide the detailed knowledge petroanalysts and geophysicists desire, obtaining the samples from deep within the earth formation and performing the analysis can be quite time intensive [2].
A neutron logging tool may be used to obtain lithology parameters by measuring radiation resulting from neutron irradiation of the earth formation. The measured radiation is indicative of the reaction of the neutrons with constituents of the formation and thus contains information about the earth formation. As one example, interactions between the neutrons and the formation may result in the emission of gamma rays with energy levels characteristic of the materials with which the neutrons have interacted [2]. Measurements are repeated at several borehole depths along the longitudinal axis of the borehole. Each measurement is associated with a borehole depth at which it is taken. Unfortunately, however, neutron logging generally provides a coarse vertical resolution, i.e., along the borehole axis, between approximately one and two feet. This resolution is insufficient to locate boundaries of thin beds (e.g., beds less than one foot in thickness). Thus, at the current level of technology, a direct measurement is not possible as these object are commonly far below the vertical resolution of logging tools. For example, a sandstone with 20% clays could correspond to a "dirty" sand with dispersed clay minerals or a "clean" sand with a few argillaceous layers. The properties of the sandstone in terms of flow dynamics, vertical connectivity, porosity, and water content will be very different depending on the actual geological makeup of the formation [3]. This can lead to by-passed pay and/or a misunderstanding of the layer properties. Getting a higher vertical resolution is therefore crucial in complex environments.
In this work, a method is proposed to integrate the sedimentological information from a borehole image and the information from a spectroscopic mineralogy log to improve the vertical resolution of the mineralogy log. The image log provides information about layering: depending on the tool used to acquire the data, a layer is marked by a contrast in resistivity or acoustic properties. Layers are commonly associated to sedimentary beds, but they can be occasionally related to the formation of nodules or the development of diagenetic cements [4]. The vertical resolution of the record is of the order of half an inch. The spectroscopic log provides information about the bulk chemistry of the formation from which the mineralogy is derived. Its vertical resolution is of the order of one foot. The first stage involves comparing the image and spectroscopic logs over the same interval to determine the level of complexity of the logged formation: if the formation displays a layering thinner than the resolution of the spectroscopic mineralogy log, the formation is deemed "thinly bedded" [4]. The algorithm proposed in this work applies to these situations. The borehole is split into intervals ranging from one to several feet in length by clustering similar regions based on the log response of both the mineralogy and the borehole image. Within each interval, lithotype facies, identified as layers of similar properties, are determined based on the image log. In the scope of this work, the term "lithotype" refers to a geological unit characterized by a set of parameters, such as, for example, specific lithology, mineralogical composition, porosity, permeability, grainsize distribution, sedimentological texture, and sedimentological structures. The term "facies" refers to a specific range of the values of these characteristics that characterize a body of rock and allow discriminating it from its surroundings either by way of measurement, observation, or both. For example, natural gamma radiation logs and neutron-induced radiation logs may provide accurate identification of lithotype facies, but with a coarse vertical resolution of approximately two feet in some tools. A high-resolution borehole image log may provide accurate vertical resolution of resistivity and changes in resistivity as small as a few millimeters but with limited ability to identify lithotype facies or minerals.
By employing a high-resolution image log, the volume of each lithological component in the considered interval can be determined. Several classes of high-resolution image logs are available for this purpose: high-resolution electromagnetic imager (e.g., resistivity imager), acoustic imager, and optical image log. From the spectroscopic log, we know the overall composition of the interval, but not the detailed composition of each layer. The mineralogical composition associated with each lithotype is determined by solving a constrained optimization problem [3] that maximizes the likelihood of the determined layer composition. For solving the optimization problem, a constrained sampling algorithm specifically generated for this purpose is utilized and presented in this work.
This work presents in detail the developed mathematical algorithm and methods for combining information obtained from a high-resolution log and low-resolution spectroscopic tool. The targeted application of this method is thin-bedded formations. An example is provided from a well in Australia where spectroscopic mineralogy, a resistivity borehole image, and a core were available. The development of this method allows to increase the resolution of the mineralogy log to the level of the image log: what can be recorded on the borehole image can be discriminated in the mineralogy log. This method will enhance our understanding of complex environments, for instance the distribution of organic matter or the distribution of argillaceous beds in reservoir formations. It allows a better understanding of the reservoir properties and their vertical distributions, a more precise location of hydrocarbon-bearing intervals, and can be used to extend core information more accurately over the logged interval. The higher resolution information could also greatly benefit well completion in cases where stimulation is required or in optimizing the completion design. More specific embodiments may also provide mineralogy logs, matrix density logs, and total porosity logs with a high vertical resolution, as well as allowing calculation of net-to-gross and net pay in thin bed formations.

Mathematical Theory and Algorithm Implementation
This section provides an overview of the mathematical theory based on which the algorithm is constructed. A lithology, e.g., sandstone, limestone, or shale, is characterized by a set of measurable parameters such as grain size distribution and composition [4]. The composition can simply be thought of as being characterized by the relative amount of different minerals contained in the lithology. As an example, sandstone contains more quartz than coal but coal contains more organic matter [4]. However, even among lithologies that are categorized into the same group, the composition can vary. Therefore, it is convenient to characterize the content of a certain mineral in a class of lithologies by a probability density functions. These probability density functions characterize the likelihood of finding a certain mineral with a given concentration within the lithology of interest.
Within this work, we are interested in inferring the mineral compositions of geological layers contained in a logging interval from two separate source of data: (1) from a mineralogical logging tool of coarse resolution; i.e., it gives the average mineralogy over all lithotypes; (2) an image tool that does not provide mineralogical information but has a much finer resolution and can be used to delineate layer 2 Journal of Sensors boundaries within the interval. For accomplishing this task, a novel mathematical optimization algorithm is developed to find the most likely combination of lithotypes assigned to the identified layers along with the mineralogical composition of each layer. The solution to the optimization problem must obey several constraints that will be detailed within this section.
The basic idea of the algorithm is to combine information from a facies analysis with the mineralogical information from a mineralogy log derived from spectral data. For each mineral indexed by m = 1, ⋯, M, the mineralogy log data provides a mineral volume fraction denoted by I m integrated over the whole borehole interval. The facies analysis provides the depth of the boundaries between different layers measured from the surface that can be used to compute the volume fractions ξ j of layers j = 1, ⋯, J. For each possible lithotype l = 1, ⋯, L, M pdfs denoted by p m,l ðx m,l Þ are available, one for each mineral describing the likelihood of finding the mineral volume fraction within this lithology as follows: From the mineralogy and image log data, it is unknown which lithotypes are present in an interval. A proposition of lithotypes is defined as an assignment of a lithotype to each layer in the interval, i.e., the map j ← l for j = 1, ⋯, J. There are a total of L!/ðL − JÞ! combinations all of which have to be analyzed. It will later be demonstrated that the vast majority of these combinations is infeasible because they cannot satisfy the constraints. Nevertheless, a smart preselection of relevant lithotypes by experienced users might be necessary if L is large.
In Section 2.1, the probability density functions p m,l ðx m,l Þ are discussed with particular focus on where the pertinent data can be obtained from. For using a MCMC method [5] to find the most likely x m,j for all m and j, we need to be able to create uniformly distributed samples over the space of all permissible x m,j . To this end, constraints and an efficient sampling algorithm are described in Sections 2.2 and 2.3, respectively.

Probability Density Functions.
The goal of the presented algorithm is to find the most likely combination of mineral volume fractions x m,l that satisfies all constraints detailed in Section 2.2. The likelihood of a combination is given by the joint probability density function pð x ! Þ, where for convenience of notation, we collect the x m,l in the vector x ! . Within this work, the joint probability function is simply the product of the single mineral probability density functions: The single probability density functions p m,l ðx m,l Þ are input to the presented algorithm and must be specified by the user. This appears to be an equally or maybe more difficult task than to infer the x m,l manually using the geophysicist's experience and knowledge. For making the developed method practical, it is necessary to create a database of p m,l ðx m,l Þ that can be reused for subsequent analysis. We decided to implement a bootstrappedlearning algorithm. At first, the geoscientist analyzing the dataset creates probability density functions following geological insight and ensures that the outcome matches the expectations. Once a sufficiently reliable database of probability density functions has been obtained, the algorithm described in Section 2 can be used. As more log data is analyzed, the process is a two-way street: existing probability density functions can be used to analyze results, and actual results can be used to enhance the probability density functions. Missing probability density functions are successively added.
For creating a useful library of probability density functions, we group sets of probability density functions by similarity with respect to general geological indicators, e.g., lithotypes. These geological indicators are known before logging commences at a particular location. A suitable set of probability density functions is retrieved from a database using the set of indicators for the logging location. Note that depending on the general geological features of a location, some minerals or lithologies may be present in one data set but absent in another. It is important to point out that the set of pdfs associated with the particular set of indicators is extended and possibly enhanced as analysis progresses by adding missing probability density functions and fixing inconsistencies of analysis results with observed mineral compositions.

The Setup of Constraints.
Both equality and inequality constraints apply to the solution of the optimization problem. The obvious inequality constraints simply state that the volume fractions need to be between zero and one: However, most pdfs are zero almost everywhere, and therefore, it makes sense to set somewhat tighter bounds on the volume fraction x m,j : There are two types of equality constraints present in the optimization problem. The first type of constraint incorporates the knowledge of the interval integrated mineralogy into the optimization problem. It states that the sum of mineral contents over all layers weighted by the volume fraction of each layer must equal the mineral content determined by the mineralogy tool:

Journal of Sensors
Secondly, the volume fractions of all minerals within each layer must sum to unity: In addition, a lithology l assigned to layer j might have a zero probability to have any traces of mineral m in it and in such cases makes sense to strictly enforce to avoid numerical issues in the solution process. There are a total of M + J equality constraints of types given by Equations (5) and (6) and P equality constraints of type Equation (7). The total number of equality constraints is We want to create uniformly distributed samples of x m,j that satisfy Equations (5)- (7). To this end, we collect the equality constraints into the matrix E: where x ! collects the x m,j , E collects the coefficients of Equations (5)-(7), and r ! collects the right hand sides of Equations (5)- (7).
The matrix E is rank deficient nominally containing more columns than rows (short and wide) and has a nullspace of rank N = M•J − K. Usually, the matrix is made square by adding N rows consisting of all zeros. As there always exists a vector x ! that satisfies Equation (8), it follows that there are infinitely many solutions to Equation (8) as opposed to none if r ! was not in the range of E.
Basis vectors spanning the null space of E can be determined by using SVD [6]. These orthonormal basis vectors are denoted by z  (8) can then be written as follows: where x ! 0 is a particular solution that can, e.g., be computed by using the Moore-Penrose inverse [7]: The vector q ! has fewer entries than the vector x ! and denotes a space of reduced dimensionality in which the solution lives. In Figure 1, a three-dimensional space, in which there are three x m,j , is depicted that has a single constraint. In this case, the null space is of dimension two permitting samples to be located on a plane. In general settings, the space of permissible samples is a N-dimensional hyperplane in a M•J dimensional space. Using Equation (9), we can use uniform samples of q ! for creating uniform samples of x ! [8]. However, these samples do not necessarily satisfy the inequality constraints, because it is unclear within which limits we need to sample q n to limit the x m,j to within α m,j and β m,j [3]. The inequality constraints Equation (4)   Journal of Sensors N-dimensional hyper-plane given by Equation (9) are constrained to. Figure 1 depicts the hyperbox and the hyperplane intersecting it. Valid samples are located on the fraction of the hyperplane that intersects the box and are colored in red.

Constrained Sampling Algorithm Development.
The simplest possible algorithm would be to sample q ! in a large enough space enclosing the hyperbox, e.g., the smallest ellipsoid that fully includes the box, and then use rejection sampling. However, as the dimensionality of the space is much larger than three (usually about 15-20), rejection sampling will be extremely inefficient. This is fairly typical for high-dimensional spaces and is usually referred to as the curse of dimensionality [3]. The algorithm described here is a novel and highly efficient algorithm that does not require rejection and therefore has great capability in addressing high dimension problems as are typical in the described mineralogy analysis.
First, let us assume that we have a seed vector x ! satisfying Equations (4) and (9). A bootstrapping algorithm for obtaining this seed vector is detailed in Section 2.4. Then, we can sample another random vector y ! satisfying the equality constraints Equation (5) through (7) but not the inequality constraints Equation (4). This can be accomplished by first randomly sampling q ! followed by using Equation (9). The difference vector d ! = x ! − y ! lies within the hyperplane defined by the inequality constraints, and all points located on the line drawn through the end points of x ! and y ! are given by where s is the step size that needs to be determined. The limits on the step size are computed by first computing s m,j,min and s m,j,max : The constraint for s are given by The simplest Algorithm 1 that yields uniformly distributed samples is to perform a random walk as follows: However, in practice, we found that the following approach performed better given the same execution time: 2.4. Obtaining the Seed Vector. Within this section, an algorithm for obtaining the first seed vector denoted by x ! in Algorithm 2 is introduced. However, a seed vector satisfying inequality and equality constraints does not always exist. It only exists if the hyperplane defined by the equality constraints intersects the box defined by the inequality con-straints. Therefore, we first check if a solution exists which is detailed in Section 2.4.1. If a solution exists, obtaining the seed vector is trivial from the previously obtained information. Nevertheless, for the sake of completeness, we describe this procedure in Section 2.4.2.

Check Solution Existence.
To facilitate the existence check, we first introduce a coordinate transformation such that limits α m,j and β m,j are mapped to zero and one, respectively. Quantities expressed in these new coordinates are denoted by a tilde to distinguish them from the original coordinate system. The coordinate transformation is given bỹ For compactness of notation, we collect all the coefficients of the first term into the matrix T and the second term of the different into the vector b ! such that we can writẽ All permissible solution, i.e., all solutions on the hyperplane, can now be expressed in this new coordinate system: The transformation transforms the hyperbox to be the unit cube around the midpoint l ! = ½1/2, ⋯, 1/2. The unit cube corresponds to the locus of points whose distance from l ! is less than or equal to 1/2 in the maximum norm: Interior of the unit cube : l The vectorx ! is the locus of all points on the hyperplane as given by Equation (9). Hence, by solving the linear optimization problem, 1. Uniformly sample q ! .

Compute y
Uniformly sample s in between s m,j,min and s m,j,max .

Compute
Algorithm 1: Random walk algorithm. 5 Journal of Sensors the condition for the hyperplane to intersect the hyperbox simply is follows: The solution to the minimization problem Equation (20) is obtained by transforming it into the corresponding linear programming problem [9,10]: Minimize t subject to into Equation (16) and apply the inverse transformation: While x ! is a valid seed vector for the random walk algorithm, it was found that it could be far away from the maximum. A better initial guess can be obtained by moving it closer to where we expect the maximum. For each one-dimensional pdf used for forming the joint pdf in Section 2.1, we can easily infer where it assumes its maximum value. Let us denote this point as x max m,j and collect all of these abscissas in the vector x ! max . Then, we estimate the maximum of the joint pdf to be close to x ! max , but we note that it does not satisfy either the equality or inequality constraints. Therefore, we project the vector onto the hyperplane described by Equation (9) and denote the projection as x ! max is guaranteed to satisfy the equality constraints, it could possibly be in violation of the inequality constraints. For approximating the best possible estimate close to x ! max ⊥ , we solve the following onedimensional maximization problem for obtaining the best step size s step : and then update the initial seed for the random walk process as follows: Figure 2 shows an overview of the mathematical solver flowchart.

Digital Core Interpretation
Model. This algorithm, presented above, has been consequently implemented into an interpretation model denoted as Digital Core, because this algorithm has the potential to replace the conventional, 1. Uniformly sample q ! .

Compute y
. Perform a line search between s m,j,min and s m,j,max to find the maximum x ! * of the joint probability density pð x ! Þ.

Set
and go to 1.

Manufactured Test Case
In this section, the Digital Core model is applied to a manufactured test case and is consequently validated as discussed below.

Manufacturing the Reference Solution and Preparing
Input Data. We selected a three-layer test case and used three lithologies, sandstone, shale, and coal (note: usually, the number of layers and the total number of tested lithologies are not identical). The pdfs for these lithologies are known. The volume fractions for the three layers are selected and listed in Table 1. Along with these volume fractions, we selected that layer 1 was sandstone, layer 2 was shale, and layer 3 was coal. From the pdfs, the most likely composition for each lithology is selected. Then, we use the volume fraction to compute the averaged mineralogy that typical pulsed neutron geochemical tool measures. This data is detailed in Table 2.

Numerical Results.
To simulate noise, we perturb the mineralogy data, the layer volume fractions (i.e., the image log data), and both the pdfs' abscissa and ordinate values using Gaussian noise with noise levels chosen as the Gaussian's standard deviation being x% of the mean value of the perturbed quantity. We selected x to be 0, 1, 2.5, 5, and 10.
Using the 3 lithologies and 3 layers, there are a total of 6 ways of assigning the lithologies to the different layers. The algorithm always found the correct permutation, i.e., 1: >sandstone, 2: >shale, and 3: >coal, for x < 10%. Moreover, all other permutations were rejected. For x = 10%, out of 5 trials, the VC returned: Note that 10% standard deviation about the mean could make the volume fractions of layer 2 and layer 3 "switch places" in which case the algorithm cannot distinguish (1: >sandstone, 2: >coal, and 3: > shale) would be the correct answer! For x = 0, 1, 2.5, and 5, selected predicted and reference mineralogies are depicted in Figure 3.

Field Data Case Study
A representative field data example is presented in this section and compared to core data to demonstrate the feasibility of the Digital Core interpretation model. This case study proves that the introduced method is readily capable of generating a mineralogical record with high vertical resolution for a formation logged with a pulsed neutron geochemistry tool and a high-resolution borehole imager.

Field Data Collection.
A well from Australia where all log information and a core section of 10 ft was collected. This section was used to test the method with the result presented in Figure 4. The mineralogy was recorded with a pulsed neutron tool and was calibrated to XRD measurements. The borehole image was recorded with a resistivity imaging tool, processed, and flattened, and a high-resolution resistivity curve was generated. At its conventional resolution, the spectroscopic mineralogy shows relatively constant weight fractions for illite (37.5%), kaolinite (10.5%), siderite (about  The resistivity image log is used to determine facies over the cored interval. It shows a series of thin beds with varying resistivities that were discriminated in three facies: less than 140 ohm·m, assumed to be more argillaceous silts identified as "shale," 140 to 180 ohm·m, expected to contain more quartz and labelled as "silt," and higher than 180 ohm·m, expected to be siderite-cemented silts identified as "cemented shale." These labels indicate which type of formation would have a similar mineralogical composition: for instance, "shale" means that the formation has a mineralogical composition similar to that of a shale. The core shows that the formation consists of a siltstone with variations of mineralogical composition.

Field Data Processing and Interpretation
Analysis. Based on XRD data available for this well, PDFs are generated for each of these lithological components and are consequently used in the algorithm. The algorithm determined two possible solutions, each characterized by the probability (the most likelihood) that this solution could occur: only the solution with higher probability is presented in this communication.    Table 3 shows that the average mineralogical compositions for each solution are consistent with the average mineralogical composition determined from the spectroscopy log over the interval, indicating that both solutions are mathematically acceptable. The solution with the highest probability is presented in Figure 1 along with the core. Aside from depth matching issues, the highly resistive layers observed on the image and associated with siderite-cemented intervals correspond to brownish streaks marking the occurrence of siderite cements on the core. The distribution of the "shale" facies (more argillaceous sediments) also generally corresponds to the darker intervals in the core. The highresolution mineralogy, after applying a 1 ft mobile average filter, shows a good agreement with the spectroscopic mineralogy recorded by the logging tool ( Figure 5).
Some discrepancies are observed in the distribution of "silt" facies in the upper part of the log where an interval with argillaceous composition corresponds to light-colored levels in the core (86-87 ft) ( Figure 6). The occurrence of the "shale" can be explained by the dark streak on the image, probably related to data acquisition that affected the determination of the facies. At 87 ft, a siderite-cemented interval could not be located with confidence on the core. This might be caused by the lighter color of the silt, next to that of the sideritecemented shale in the picture. However, the complex variation of mineralogy between 87.5 and 88.5 ft, as well as below 90 ft, displays a good correspondence to the color variations observed on the core.
In complex environments as the one in the example, where the grain size distribution does not change significantly and the laminations mostly correspond to variations in mineralogy, this result demonstrates that the method is globally working: the high-resolution mineralogy displays a  0  10  20  30  40  50  60  70  80  90  100  0  10  20  30  40  50  60  70  80  90  100 Spectroscopic mineralogy (%) Virtual core (%)  9 Journal of Sensors fairly representative account of the variability observed in the core. The level at 86-87 ft shows that the method is sensitive to thresholding while defining the facies. A good knowledge of the mineralogy for the logged formation and the geology of the area would naturally be important as lateral variation can occur: this ensures the PDF can take into account the regional variations of facies. The process itself, including the workflow and the algorithm, appears to be both robust and versatile to account for complex formations such as siltstone, thinly bedded sandstones, and most probably thinly bedded limestone or partially dolomitized limestone.
Further data from other wells have also been explored, and similar performance is achieved from the algorithm. Due to sensitivity issue, this work only presents one field data case as discussed above in this section.

Conclusion
The presented method enables a direct quantification of mineralogy at the scale of the lamination in thinly bedded formations and thus allows for determining organic content, matrix density, and total porosity at high resolution, resulting in a more accurate calculation of net pay and net-to-gross ratio. The method provides great advantage over conventional mineralogy log interpretation by revealing the full vertical variability of a formation that would otherwise appear very homogeneous, revealing by-passed potentially targets. The interpretation methodology, once fully implemented and validated for the mineralogy log, could also be applied to other nuclear logs such as density and porosity logs, as well as NMR and acoustic logs. Furthermore, a future work will combine machine learning into various aspects of the Digital Core, e.g., facies recognition and log prediction.

Data Availability
Underlying data is available upon request.

Disclosure
The initial work has been presented in 2016 SEG Conference with an abstract submitted to the 2016 SEG Technical Program.