A New Method for Predicting the Permeability of Sandstone in Deep Reservoirs

Anhui University of Science and Technology, State Key Laboratory of Deep Coal Mine Mining Response and Disaster Prevention and Control, Huainan 232001, China
Institute of Energy, Hefei Comprehensive National Science Center, Hefei 230088, China
State Key Laboratory of Nuclear Resources and Environment, East China University of Technology, Nanchang, 330013 Jiangxi, China
School of Civil Engineering, Ludong University, Yantai 264025, China
Shandong University of Science and Technology, Qingdao, China


Introduction
In the exploration and development of oil and gas fields, permeability is recognized as one of the key parameters for understanding reservoir percolation characteristics and evaluating the productivity of oil and gas wells [1][2][3]. However, the pore structure of a reservoir is complex, particularly in a highly heterogeneous reservoir, and the accurate prediction of permeability has always been a difficult problem in reservoir evaluation and oil and gas exploration. Leverett [4] showed that mercury injection can be used to generate capillary pressure curves of core samples, which is significant for studying the reservoir pore structure. Since then, a large amount of research has been carried out on predicting reservoir permeability from mercury injection capillary pressure (MICP) data. Purcell [5] first proposed that MICP data are useful for estimating the permeability: if the porous medium is assumed to consist of numerous parallel bundles of capillary tubes, the permeability can be estimated from Poiseuille's equation and Darcy's law. In addition, many methods and corresponding models have been developed to estimate the permeability as accurately as possible, including models that correlate the permeability with MICP parameters. Common models for predicting the permeability from MICP data include the Swanson parameter-based model [6], the capillary parachor parameter-based model [7], and the R35-, R50-, and R10-based models, where R35, R50, and R10 are the pore throat radii corresponding to mercury injection saturations of 35.0%, 50.0%, and 10.0%, respectively [8][9][10].
By analyzing the mercury injection pressure data of cores with different levels of permeability, parameters reflecting the characteristics of reservoir pore throats have been proposed, and a statistical model for the permeability has been established [11]. In general, the models that estimate the permeability from mercury injection capillary pressure data share one goal: improving the prediction accuracy to meet the needs of reservoir evaluation and oil and gas exploration.
Recently, an increasing number of studies have indicated that intelligent systems are superior to practical and statistical methods for related problems in geosciences and petroleum engineering [12][13][14]. Accordingly, many scholars have turned to artificial intelligence techniques to solve problems in different fields of research and application [15][16][17][18]. This paper applies artificial intelligence technology to permeability estimation. The support vector regression (SVR) method exploits structural risk minimization (SRM) combined with empirical risk minimization (ERM). As a result, the SVR method produces more reliable results than neural networks, which focus on the ERM principle. Therefore, the SVR method is used here to estimate the permeability from MICP data. In addition, the classical methods proposed by previous researchers are also improved in this study. Finally, a comparison of the results shows that the SVR method is superior to the other methods.

Theory and Methodology
2.1. The Analysis of Mercury-Injection Capillary Pressure (MICP) Curves. Mercury injection capillary pressure (MICP) curves are primarily used for classifying rock types, calculating the oil saturation by the height above free water level (HAFWL) method, evaluating the rock quality, and estimating the relative permeability [10]. Purcell [5] first suggested that the permeability could be estimated from MICP data. If the porous medium is assumed to consist of numerous parallel bundles of capillary tubes, Poiseuille's equation and Darcy's law can be used to estimate the permeability, which is expressed as follows: where $g$ is a lithologic parameter; $\sigma$ is the interfacial tension of the two-phase fluid, mN/m; $\theta$ is the surface contact angle, °; $P_c$ is the capillary pressure, MPa; and $S_{nw}$ is the nonwetting phase fluid saturation, %. In field applications, it can be challenging to estimate the permeability from MICP data with equation (1) because numerous input parameters must be obtained first. Numerous methods and corresponding models have therefore been developed to estimate the permeability as accurately as possible, including models that correlate the permeability with MICP parameters. Common models for predicting the permeability from MICP data include the capillary parachor parameter-based model and the Swanson parameter-based model.
In most cases, MICP curves are plotted in semilog coordinates: the linear x-axis represents the mercury injection saturation ($S_{Hg}$), and the logarithmic y-axis represents the mercury injection pressure ($P_c$). However, according to Rahul et al. [19], when the saturation and capillary pressure data are plotted on a log-log scale, a smooth curve that fits all points reasonably well and resembles a hyperbola can be described mathematically as follows:
$$\log\left(\frac{P_c}{P_d}\right)\cdot\log\left(\frac{S_{Hg}}{S_{Hg\infty}}\right) = -C^2 \qquad (2)$$
where $P_c$ refers to the pressure of mercury intrusion, MPa; $P_d$ denotes the displacement pressure, MPa; $S_{Hg}$ is the mercury saturation, %; $S_{Hg\infty}$ refers to the intrusion mercury saturation at infinite capillary pressure, %; and $C^2$ is the pore geometry factor. As shown in Figure 1(a), the inflection point A of the capillary pressure curve corresponds to the vertex of the hyperbola on a log-log scale; it marks the state at which the part of the effective pore space that controls the fluid flow becomes dominated by the nonwetting phase.
The $(S_{Hg}/P_c)_A$ value of point A is called the Swanson parameter [6]. According to Guo et al. [7], under steady state conditions, equation (2) can be decomposed; when a chart is drawn with $S_{Hg}$ as the abscissa and $S_{Hg}/P_c^2$ as the ordinate, the capillary parachor parameter is the highest point, $(S_{Hg}/P_c^2)_{max}$, which also reflects the pore structure of the reservoir. Figure 1(b) shows typical curves of $S_{Hg}/P_c$ and $S_{Hg}/P_c^2$ versus the intrusion mercury saturation $S_{Hg}$; the curves resemble parabolas with a maximum.
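The two pore-structure parameters described above can be read from a measured MICP curve as simple maxima. The following is a minimal Python sketch under that reading; the function names and the sample saturation and pressure values are hypothetical and are not taken from Table 1.

```python
def swanson_parameter(s_hg, p_c):
    """Return (max of S_Hg / P_c, index of the point where it occurs)."""
    ratios = [s / p for s, p in zip(s_hg, p_c)]
    idx = max(range(len(ratios)), key=ratios.__getitem__)
    return ratios[idx], idx


def capillary_parachor_parameter(s_hg, p_c):
    """Return (max of S_Hg / P_c**2, index of the point where it occurs)."""
    ratios = [s / p ** 2 for s, p in zip(s_hg, p_c)]
    idx = max(range(len(ratios)), key=ratios.__getitem__)
    return ratios[idx], idx


# Hypothetical MICP curve: mercury saturation in %, pressure in MPa.
saturation = [5.0, 20.0, 45.0, 60.0, 70.0, 75.0]
pressure = [0.08, 0.1, 0.2, 0.5, 2.0, 10.0]

print(swanson_parameter(saturation, pressure))
print(capillary_parachor_parameter(saturation, pressure))
```

Because the maxima are taken over discrete measurement points, the result depends on how densely the pressure steps sample the curve near the apex.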

2.2. Methodology of Support Vector Regression (SVR).
Vapnik [20] proposed support vector regression as a machine learning technique. Owing to its ability to handle many nonlinear regression problems, the SVR method is regarded as an attractive tool with promising applications. In addition to the empirical risk minimization principle traditionally used by neural networks to develop accurate models, the SVR method exploits the structural risk minimization principle [20][21][22][23]. The underlying structure of the SVR method is described below. The aim of SVR is to determine a linear relation between n-dimensional input vectors $x \in R^n$ and output variables $y \in R$, which is expressed in the following way:
$$f(x) = w \cdot x + b$$
where $w$ and $b$ are the slope and offset of the regression line, respectively. To determine the values of $w$ and $b$, the following expression must be minimized:
$$\frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n} L_\varepsilon\left(y_i, f(x_i)\right)$$
Introduced by Vapnik (1995), the loss function employed in the SVR method is ε-insensitive and is expressed as follows:
$$L_\varepsilon\left(y, f(x)\right) = \begin{cases} 0, & |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise} \end{cases}$$
This problem can be reformulated in a dual space as follows: to maximize
$$-\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)(x_i \cdot x_j) - \varepsilon\sum_{i=1}^{n}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i(\alpha_i - \alpha_i^*)$$
subject to
$$\sum_{i=1}^{n}(\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C$$
where $\alpha_i \ge 0$ and $\alpha_i^* \ge 0$ are the positive Lagrange multipliers, and $C$ is a positive regularization parameter that determines the trade-off between the approximation error and the weight vector norm $\|w\|$. After the Lagrange multipliers $\alpha_i$ and $\alpha_i^*$ are calculated, the training data points that satisfy $\alpha_i - \alpha_i^* \ne 0$ are used to define the decision function. Therefore, the most appropriate linear regression hypersurface is given by:
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)(x_i \cdot x) + b$$
The desired weight vector of the regression hyperplane is given by:
$$w = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)\,x_i$$
In nonlinear regression, the input data are mapped onto a higher dimensional feature space through a kernel function, such that a linear regression hyperplane is generated.
In the SVR method, common kernel functions include the radial basis function (RBF), polynomial function, and sigmoid function. For nonlinear regression, the learning problem is formulated in the same way as in the linear case; that is, the nonlinear hyperplane regression function is obtained as follows:
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)\,K(x_i, x) + b$$

Geofluids
where $K(x_i, x)$ refers to the kernel function, which can be expressed as follows:
$$K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$$
where $\phi(x_i)$ and $\phi(x_j)$ refer to the projections of $x_i$ and $x_j$ in the feature space, respectively. In this study, the Gaussian radial basis function, $K(x_i, x_j) = \exp(-\gamma\|x_i - x_j\|^2)$, was selected as the kernel function for the SVR model construction. In the above equation, $\gamma$ refers to the width of the kernel function.
Only a brief description of the SVR method has been provided here. Detailed treatments of the SVR theory can be found in the references [24][25][26][27].
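The building blocks of the SVR method described in this section can be sketched in a few lines of Python. This is a minimal illustration assuming the standard ε-insensitive loss and a Gaussian RBF kernel of the form exp(-γ‖x_i - x_j‖²); the function names, support vectors, and coefficients are hypothetical, and no training (solution of the dual problem) is performed.

```python
import math


def eps_insensitive_loss(y, f, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y - f) - eps)


def rbf_kernel(x_i, x_j, gamma):
    """Gaussian radial basis function kernel (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for the
    i-th support vector (hypothetical values in this sketch).
    """
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + b


# Hypothetical, already-trained model with two support vectors.
svs = [(0.0, 0.0), (1.0, 1.0)]
coefs = [0.8, -0.3]
print(svr_predict((0.5, 0.5), svs, coefs, b=0.1, gamma=0.5))
```

Only the points with a nonzero coefficient difference contribute to the prediction, which is why a trained SVR model stores the support vectors rather than the full training set.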

Data Processing and Analysis
3.1. Data Preparation. In mercury injection capillary pressure (MICP) data, several parameters can represent the pore structure and characterize the seepage capacity of the pores, such as the quality coefficient of the reservoir, displacement pressure $P_d$, median capillary pressure $P_{c50}$, and mean value of the pore throat radius $R_m$. The displacement pressure $P_d$ is the capillary pressure corresponding to the largest interconnected pore throat in the pore system. Moreover, the displacement pressure may be closely connected with the rock permeability [28]: to a certain extent, the higher the permeability of a rock specimen, the lower its displacement pressure, and vice versa. In addition, the displacement pressure is one of the key parameters by which the reservoir properties of reservoir rocks are classified [29][30][31]. The median capillary pressure $P_{c50}$ is the capillary pressure at a mercury saturation of 50% and is regarded as the main parameter describing the capillary pressure distribution trend. The overall average pore throat size can be calculated from the mean value of the pore throat radius [32][33][34].
Liu et al. [35] analyzed the MICP data together with the porosity and permeability of thirty core samples, and the Swanson parameter and capillary parachor parameter of every core sample were calculated with the method illustrated in Figure 1. Table 1 summarizes the results. The 30 core samples are clearly high porosity and high permeability rock samples.

3.2. Data Analysis.
To evaluate the simple relationships between the quality coefficient of the reservoir and the MICP parameters, this paper used a sensitivity analysis to highlight how strongly the quality coefficient of the reservoir depends on each parameter in the MICP data. This process is also useful for selecting input variables because many unrelated MICP parameters can be eliminated. Otherwise, the number of input parameters of the model is rather large, which may overwhelm the model and generate noise rather than signal. Furthermore, a simple linear regression test (cross plotting) was used, in which the correlation coefficient ($R^2$) is the key measure of the influence of the various MICP parameters on the laboratory-measured permeability. $R^2$ represents the proportion of the overall variance explained by the model and is obtained as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}(K_i - \hat{K}_i)^2}{\sum_{i=1}^{n}(K_i - \bar{K})^2}$$
where $K_i$ is the measured permeability of the ith sample, $\hat{K}_i$ is the corresponding value given by the regression, $\bar{K}$ is the mean measured permeability, and $n$ is the number of samples. Cross plots of the MICP parameters of the 30 core samples are presented in Figure 2. The linear regression shows a positive relation between the capillary parachor parameter and the permeability, with the highest correlation coefficient of 0.9685. The Swanson parameter, $R_m$, and porosity also had positive relations with the permeability, with correlation coefficients of 0.9169, 0.9092, and 0.3314, respectively. In contrast, $P_d$ and $P_{c50}$ exhibited negative relationships with the permeability, with poor correlation coefficients of 0.2587 and 0.1088, respectively (see Figure 2).
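The two goodness-of-fit measures used throughout this paper can be computed directly from paired measured and predicted values. The sketch below assumes the usual coefficient-of-determination definition of R² and the usual root mean square error; the function names and sample numbers are hypothetical.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / n)


# Hypothetical permeability values in mD.
k_measured = [12.0, 85.0, 240.0, 510.0]
k_predicted = [15.0, 80.0, 250.0, 495.0]
print(r_squared(k_measured, k_predicted), rmse(k_measured, k_predicted))
```

For a least-squares straight line fitted to a cross plot, this R² equals the squared Pearson correlation between the two plotted variables.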

Input Selection by Sensitivity Analysis
Using a back-propagation neural network, Dutta and Gupta [36] proposed a stable method based on the partial derivative of the output with respect to the ith input, aimed at capturing the relative effect of every input in estimating the output. The equation below is used to evaluate the partial derivative of the output with respect to the ith input: where $\partial K/\partial x_i$ is the partial derivative of the permeability with respect to the ith input, $W_{oj}$ refers to the weight between the output neuron and the jth hidden neuron, and $h_j$ refers to the jth neuron of the hidden layer. The sum of squares of the partial derivatives ($S$) is used to calculate the relative effect of the back-propagation neural network inputs, which can be expressed as follows:
$$S_i = \sum_{p=1}^{N}\left(\frac{\partial K}{\partial x_i}\right)_p^2$$
where $p$ runs over the $N$ samples. The contribution of each input variable is given by:
$$RC_i = \frac{S_i}{\sum_{j} S_j}$$
where $RC_i$ is the relative contribution of the ith input variable. The variable with the highest RC affects the output the most. An improved approach was proposed, and the optimal number of inputs was calculated. First, as shown in Table 2, all of the available MICP parameters were used to construct a feed forward back-propagation neural network, and the RC value of every input was computed by the sensitivity analysis. Whereas the correlation coefficient serves as a qualitative standard for illustrating the relation between inputs and outputs, the sensitivity results are quantitative and tend to be more dependable. The inputs were then ranked according to their RC values.
Figure 3: Comparison of the RMSE and correlation coefficient ($R^2$) for the SVR models with different numbers of membership inputs.
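The ranking step of the sensitivity analysis can be sketched as follows. This is a minimal Python illustration assuming that the relative contribution of an input is its sum of squared partial derivatives divided by the total over all inputs; the partial derivatives are taken as given (for example, from a trained network), and the function names and numbers are hypothetical.

```python
def relative_contributions(partials):
    """partials[p][i] is dK/dx_i evaluated at sample p.

    Returns one relative contribution per input; the values sum to 1.
    """
    n_inputs = len(partials[0])
    sums_of_squares = [
        sum(row[i] ** 2 for row in partials) for i in range(n_inputs)
    ]
    total = sum(sums_of_squares)
    return [s / total for s in sums_of_squares]


def rank_inputs(rc):
    """Input indices ordered from highest to lowest relative contribution."""
    return sorted(range(len(rc)), key=lambda i: rc[i], reverse=True)


# Hypothetical derivatives for 3 samples and 3 inputs.
derivs = [[0.9, 0.2, 0.05], [1.1, 0.3, 0.02], [0.8, 0.25, 0.04]]
rc = relative_contributions(derivs)
print(rc, rank_inputs(rc))
```

Inputs can then be added to the model in the returned order, one at a time, while the model error is monitored.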

The number of inputs is an important parameter in the design of the SVR model. Thus, the MICP parameters were introduced into the SVR model one by one in order of their RC values, and the performance of the SVR model was assessed for every group of inputs. As shown in Figure 3, the optimal SVR model was obtained with the 4 inputs that had the highest relative contribution values, namely the capillary parachor parameter, Swanson parameter, $R_m$, and porosity.
The flowchart in Figure 4 summarizes the aforementioned procedure. The theory of back-propagation neural networks (BP-NNs) is not introduced in detail in this work; readers who are unfamiliar with BP-NNs are referred to the work of Mohaghegh.

Results and Discussion
To evaluate and compare the performance of the suggested SVR model, earlier methods, namely the capillary parachor parameter-based model and the Swanson parameter-based model, were used to estimate permeability values from the same dataset.
According to Swanson [6], the Swanson parameter correlates well with the absolute permeability of a core sample, as shown by the analysis of 58 excised core specimens with common combinations of high porosity and high permeability. On this basis, Swanson established a relationship between the absolute permeability and the Swanson parameter, which follows the model:
$$K = a\left(\frac{S_{Hg}}{P_c}\right)_{max}^{b}$$
where $K$ refers to the permeability, mD; $S_{Hg}$ denotes the mercury intrusion saturation, %; $P_c$ represents the mercury intrusion pressure, MPa; $(S_{Hg}/P_c)_{max}$ is the Swanson parameter, MPa$^{-1}$; and $a$ and $b$ are undetermined statistical constants, which can be calibrated with data sets from mercury injection experiments on core specimens. As mentioned before, the capillary parachor parameter is the maximum of $S_{Hg}/P_c^2$ on the cross plot of $S_{Hg}/P_c^2$ versus the mercury injection saturation $S_{Hg}$. Figure 1(b) shows how the capillary parachor parameter is determined from the MICP data. Using the mercury injection experimental data from eleven core specimens presented by Purcell [5], Guo et al. [7] developed a model for estimating the permeability based on the capillary parachor parameter, which is expressed as follows: where $(S_{Hg}/P_c^2)_{max}$ refers to the capillary parachor parameter, MPa$^{-2}$.
Furthermore, following the method suggested by Guo et al. [7], Xiao et al. [37] analyzed the MICP data from 157 core samples covering three different oil fields, with absolute permeability ranging from 0.002 to 1150.0 mD. On this basis, Xiao et al. [37] proposed a general formula for estimating the permeability from MICP data based on the capillary parachor parameter. The proposed equation can be expressed as follows:
$$K = c\left(\frac{S_{Hg}}{P_c^2}\right)_{max}^{d}$$
where $c$ and $d$ refer to statistical model parameters, which are calibrated with data from mercury injection experiments on core specimens. Recently, Liu et al. [35] improved the permeability estimation model based on the capillary parachor parameter by adding a porosity factor, and the reconstructed model is as follows: where $\varphi$ refers to the porosity, %; $K$ is the permeability, mD; $S_{Hg}$ represents the mercury intrusion saturation, %; $P_c$ is the mercury intrusion pressure, MPa; $(S_{Hg}/P_c^2)_{max}$ is the capillary parachor parameter, MPa$^{-2}$; and $c$, $d_1$, and $d_2$ are undetermined parameters.
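The statistical constants of a single-parameter empirical model can be calibrated by linear regression in log-log space. The sketch below assumes a power-law form K = a·x^b, where x stands for either the Swanson parameter or the capillary parachor parameter; this form, the function name, and the sample values are assumptions made for illustration and are not the fitted models of this paper.

```python
import math


def fit_power_law(x, k):
    """Fit K = a * x**b by ordinary least squares on log10 values.

    Returns the pair (a, b). All inputs must be strictly positive.
    """
    log_x = [math.log10(v) for v in x]
    log_k = [math.log10(v) for v in k]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_k = sum(log_k) / n
    s_xx = sum((u - mean_x) ** 2 for u in log_x)
    s_xk = sum((u - mean_x) * (v - mean_k) for u, v in zip(log_x, log_k))
    b = s_xk / s_xx
    a = 10.0 ** (mean_k - b * mean_x)
    return a, b


# Hypothetical parameter values and permeabilities (mD).
param = [0.5, 2.0, 8.0, 30.0, 120.0]
perm = [0.4, 3.1, 22.0, 160.0, 1100.0]
a, b = fit_power_law(param, perm)
print(a, b)
```

Fitting in log space weights relative rather than absolute deviations, which suits permeability data that span several orders of magnitude.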
To construct a model for estimating the permeability from the MICP data, an epsilon support vector regression (ε-SVR) algorithm was used, with the capillary parachor parameter, Swanson parameter, $R_m$, and porosity as inputs. To optimize the performance of the kernel function and enhance the precision of the estimation, all data, including inputs and outputs, were normalized to the range [-1, 1] before the SVR model construction.
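The normalization step can be sketched as a min-max scaling to [-1, 1], together with its inverse for converting predictions back to physical units. This assumes a simple linear scaling per variable; the function names and sample values are hypothetical.

```python
def scale_to_unit_range(values):
    """Linearly map values so that min -> -1 and max -> +1.

    Returns (scaled_values, lo, hi) so the mapping can be inverted later.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant variable")
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
    return scaled, lo, hi


def inverse_scale(scaled, lo, hi):
    """Map values from [-1, 1] back to the original range [lo, hi]."""
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]


# Hypothetical porosity values in %.
porosity = [8.2, 12.5, 17.9, 21.4]
scaled, lo, hi = scale_to_unit_range(porosity)
print(scaled)
print(inverse_scale(scaled, lo, hi))
```

The minimum and maximum should be taken from the training data and reused unchanged for any new samples, so that all inputs share one scale.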
Model construction primarily relied on data from the group of 30 core samples (Table 1) mentioned before. Previous studies have demonstrated that the radial basis function (RBF) is a suitable kernel function because it has fewer parameters that require tuning and a lower computational cost (Keerthi and Lin, 2003). Thus, an RBF served as the kernel function of the SVR model. The parameters of the SVR model and kernel function (C, γ, and ε) greatly influence the performance of the SVR model, so it was important to examine them thoroughly. You et al. [35] proposed combining the grid search method with pattern search techniques: the grid search determines the region containing the optimal point, and the pattern search then finds the global optimal point within the region discovered through the grid search.
Figure 5 (axes: measured permeability (mD), estimated permeability (mD)).
The predictive performance of the three models was evaluated through the correlation coefficient ($R^2$) and RMSE, as shown in Figure 5. Figures 5(b) and 5(c) present the cross plots of the model-derived permeability and core assessment results. Among the three methods employed for permeability estimation, the SVR model provided the lowest error and highest correlation coefficient, whereas the Swanson parameter-based model had the largest error and lowest correlation coefficient. Finally, Figure 5(d) shows the relative errors, from which it can be observed that the SVR model proposed in this paper outperformed all the other models, while the capillary parachor parameter-based model with the porosity parameter was superior to the Swanson parameter-based model in predicting the permeability.
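The two-stage tuning of the SVR parameters mentioned above can be sketched as a coarse grid search followed by a local pattern (compass) search. The code below is a generic illustration: the objective is a hypothetical smooth stand-in for a cross-validated model error, evaluated on log2(C) and log2(γ), and none of the names or values come from this study.

```python
def grid_search(objective, c_grid, g_grid):
    """Return the grid point (c, g) with the lowest objective value."""
    best = min((objective(c, g), c, g) for c in c_grid for g in g_grid)
    return best[1], best[2]


def pattern_search(objective, start, step=0.5, tol=1e-3, max_iter=500):
    """Compass search: probe +/- step along each axis, halve step on failure."""
    point = list(start)
    best = objective(*point)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for dim in range(len(point)):
            for sign in (1.0, -1.0):
                trial = list(point)
                trial[dim] += sign * step
                value = objective(*trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return tuple(point), best


def toy_error(log2_c, log2_g):
    """Hypothetical stand-in for a cross-validated RMSE surface."""
    return (log2_c - 3.3) ** 2 + (log2_g + 1.7) ** 2


coarse = grid_search(toy_error, range(-5, 11), range(-10, 6))
refined, err = pattern_search(toy_error, coarse)
print(coarse, refined, err)
```

In practice the objective would train an SVR model for each candidate parameter set and return its validation error, which makes the coarse grid the expensive stage and the pattern search a comparatively cheap refinement.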
In addition, as shown in Figure 5(d), one sample has the maximum relative error for all three methods, which may be caused by the poor prediction correlation of this point.
As shown in Figure 5, the MICP data of a normal sandstone reservoir with high porosity-permeability can be used to estimate the permeability accurately. To investigate the performance of the suggested models in more detail, an additional group of 22 core samples excised from well X4 in the Xujiahe gas sandstone formation, which has low porosity-permeability values, in the central Sichuan Basin, southwest China, was used to assess the models proposed in this study.
Analysis was conducted with the mercury injection capillary pressure (MICP), porosity, and permeability data of the 22 core plugs. In addition, the method illustrated in Figure 6(b) was employed to calculate the Swanson and capillary parachor parameters of every core plug. The obtained data are shown in Table 3.
In the latter stage of this study, the well-known Swanson and capillary parachor models were improved by adding porosity information, since preceding studies had indicated that the porosity has a significant influence on permeability prediction [19,36,38]. Using the data sets obtained from the 22 core samples, the Swanson and capillary parachor parameter-based models and their improved models were established. The final results of the four models constructed are visualized in Figure 7.
After model construction, the obtained predicted permeability results for each model are presented in Table 3. Figure 8 shows the relation between the predicted permeability and measured permeability of the core specimens; Table 4 presents a set of relative errors of all models.
To assess the models' performance, two measures were used: the correlation coefficient ($R^2$) and the root mean square error (RMSE). Figures 8(a) and 8(b) show the cross plots of the model-derived permeability and core assessment results; the correlation coefficient between the predicted and measured permeability values was 0.9418 for the Swanson model and 0.9475 for the improved Swanson model, which demonstrated that the performance of the Swanson model had barely been improved by adding porosity information. Figures 8(c) and 8(d) show the corresponding cross plots for the capillary parachor models, where the correlation coefficient between the predicted and measured permeability values was 0.9401 for the capillary parachor model and 0.9593 for the improved capillary parachor model; these results indicated that the performance of the capillary parachor model had improved by adding porosity information.

A model was considered to perform well when its $R^2$ value was greater than 0.9. The Swanson and capillary parachor models and their improved models all had correlation coefficients greater than 0.9, demonstrating that the permeability had been predicted successfully. Moreover, according to the comprehensive analysis, the improved capillary parachor model provided better prediction accuracy than the earlier Swanson and capillary parachor models, particularly in a high porosity-permeability sand formation. Figure 8 shows that the prediction accuracy varied among the proposed models, yet the SVR model provided the lowest error (RMSE = 4.3151) and highest correlation coefficient ($R^2$ = 0.9855) among the five models aimed at estimating the permeability. In addition, according to the relative errors of all models shown in Figure 8(f), the permeability values predicted by the SVR model had very low relative errors with respect to the measured values, which verified the robustness of the SVR model. Finally, Figure 9 shows the statistical analysis of the models' performance through their error distributions. From Figure 9(e), the lowest mean absolute error (0.3659) and standard deviation (4.4007) belonged to the SVR model, indicating that the SVR model achieved better prediction performance than the other models. Figures 8(a) and 8(b) show that the improved Swanson model was not suitable for the permeability estimation of low porosity-permeability sand formations because of its reduced precision; this is confirmed by its mean absolute error of 0.9373 and standard deviation of 8.3921, compared to the mean absolute error of 0.7255 and standard deviation of 8.5486 of the Swanson model.
The mean absolute error and standard deviation were 0.625 and 8.9007, respectively, for the capillary parachor parameter-based model, while those of the improved capillary parachor parameter-based model were 0.4473 and 7.3365, respectively, both lower than the values of the original model. Thus, it was concluded that the improved capillary parachor parameter-based model had a higher accuracy than the capillary parachor parameter-based model, which illustrated that the improved model was suitable for the permeability estimation of a low porosity-permeability sand formation.
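The error statistics used in this comparison can be computed from paired measured and predicted permeability values. The sketch below assumes that the error is the predicted minus the measured value, that the standard deviation is the sample standard deviation of those errors, and that the relative error is the absolute error divided by the measured value; these definitions, the function name, and the numbers are assumptions for illustration.

```python
import statistics


def error_summary(measured, predicted):
    """Summarize prediction errors for one model."""
    errors = [p - m for m, p in zip(measured, predicted)]
    return {
        "mean_absolute_error": statistics.mean([abs(e) for e in errors]),
        "std_of_errors": statistics.stdev(errors),
        "relative_errors_pct": [
            abs(e) / m * 100.0 for e, m in zip(errors, measured)
        ],
    }


# Hypothetical permeability values in mD.
k_measured = [0.12, 0.85, 2.4, 5.1]
k_predicted = [0.15, 0.80, 2.5, 4.9]
print(error_summary(k_measured, k_predicted))
```

Relative errors are undefined for a measured value of zero, so samples below the measurement limit would need to be excluded or handled separately.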
As illustrated in Figures 5(c) and 8(e), a comparison of the measured permeability with the permeability predicted by the SVR model, for both high porosity-permeability and relatively low porosity-permeability core samples, demonstrated that the SVR model was capable of estimating permeability values with a considerable degree of accuracy. These results verified that the SVR model achieved better results than all the other models and can be considered a method for estimating the permeability of sandstone formations, particularly in situations where a precise estimate is crucial.

Conclusions
The permeability, as one of the most significant quality parameters of reservoirs, provides meaningful data for reservoir characterization and petrophysical studies when used in combination with the porosity. Certain researchers have attempted to determine the formation permeability through empirical correlations based on related experiments, and numerous studies have estimated the permeability from mercury injection capillary pressure (MICP) data because of the importance of permeability knowledge. However, the estimation requires methods with great precision. In response to this requirement, this paper used the support vector regression method and two improved models based on the Swanson model and the capillary parachor parameter-based model. The SVR model used MICP data and porosity information, namely the Swanson parameter, capillary parachor parameter, mean pore throat radius ($R_m$), and porosity, for permeability estimation. All of the models proposed in this study were established from experimental analysis data of two sets of rock samples, one consisting of 30 core samples with high porosity-permeability values and the other consisting of 22 core samples with relatively low porosity-permeability values. The results implied that the performance of the SVR model was satisfactory and that the SVR model could extract the implied permeability knowledge from the MICP data and porosity information. A comparison between the SVR model and the other four models, namely the Swanson model, improved Swanson model, capillary parachor model, and improved capillary parachor model, verified the superiority of the SVR model in permeability estimation for both high porosity-permeability and relatively low porosity-permeability formations.
The comparison showed that the SVR model had a higher correlation coefficient and lower root mean square error, mean absolute error, and standard deviation. In addition, for relatively low porosity-permeability core samples, the improved capillary parachor model outperformed the capillary parachor model, while the improved Swanson model did not perform better than the Swanson model. The improved capillary parachor model was better than both the Swanson and capillary parachor models. In summary, although the accuracy of the improved capillary parachor model had been greatly improved, the SVR model achieved better results than all the other models and was recognized as a significant tool for estimating the permeability of sandstone formations, particularly in situations where a high degree of precision is crucial.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.