The Use of 3D Convolutional Autoencoder in Fault and Fracture Network Characterization

Conventional pattern recognition methods directly use 1D poststack data or 2D prestack data for the statistical pattern recognition of fault and fracture networks, thereby ignoring the spatial structure information in 3D seismic data. As a result, the generated fault and fracture network maps are poorly distinguishable and lack continuity. In this paper, a fault and fracture network characterization method based on a 3D convolutional autoencoder is proposed. First, within the autoencoder training framework, 3D prestack data are used as input, and the 3D convolution operation is used to mine the spatial structure information to the maximum extent while gradually reducing the spatial dimension of the input. Then, a residual network is used to recover the input's details and the corresponding spatial dimension. Lastly, the hidden features extracted by the encoder are recognized via k-means, SOM, and two-step clustering analysis. The validity of the method is verified by testing on simulated seismic data and by application to real seismic data. The 3D convolution can directly process the seismic data and make maximal use of the prestack texture attributes and spatial structure information provided by 3D seismic data, without dimensionality reduction or other preprocessing operations. The interleaved convolution layers and residual blocks overcome the low learning and accuracy rates caused by deepening the network.


Introduction
In fractured reservoirs, faults and fractures play an important role in the occurrence and migration of oil and gas. The key to the study of fractured reservoirs is the development intensity and orientation of faults and fractures [1,2]. Seismic data contain rich geological information, such as reservoir physical properties and fracture information. Seismic attributes such as coherence and curvature are extracted from seismic data to identify faults and fractures [3]. Manual interpretation of faults and fractures largely depends on the interpreter's experience and understanding of particular geological structures and therefore involves many subjective factors. It is also time-consuming and laborious and tends to overlook small faults and fractures [4,5].
Fault and fracture network characterization based on pattern recognition is an important field in seismic reflection structure analysis [6,7]. Several characteristic quantities are obtained from known seismic data and the corresponding mathematical models through a pattern recognition algorithm. Seismic data are trained and classified to reveal the nature of the known data and provide useful decision-making information for researchers. In general, multifeatured clustering is used for quantitative geological interpretation in pattern recognition; however, the feature selection step is easily biased. The gray-level cooccurrence matrix (GLCM) was first proposed by Haralick et al. [8] in the early 1970s. This matrix assumes that the spatial distribution of each pixel contains the image texture information.
Moreover, it is the most widely used method for extracting seismic texture attributes. Di and Gao [9] used nonlinear GLCM texture analysis for improved seismic interpretation. Matos et al. used the wavelet transform to extract the singularities of seismic traces and SOM cluster analysis to construct a classification map, but the approach is particularly sensitive to errors in the interpreted seismic horizon [10]. The Gabor filter is another kind of feature extractor that can describe image texture information. The Gabor filter's frequency and direction responses are similar to those of the human visual system, making it especially suitable for texture representation and discrimination. Wang proposed a strategic implementation sequence for reservoir characterization, including an unsupervised neural network, hierarchical clustering, and the k-means algorithm [11]. Song et al. used the Gabor transform to extract high-dimensional features of the prestack waveform, combined it with submode principal component analysis (SPCA) to reduce the dimensionality of these features, and used self-organizing map (SOM) cluster analysis to construct a fault and fracture map [12]. Because Gabor features vary with frequency and direction, further dimensionality reduction with SPCA may cause the loss of prestack texture attributes. The wavelet transform can serve as a feature extractor that refines the input signal scale by scale via scaling and translation operations. The above methods involve significant subjectivity in feature selection, and their workload is too heavy to meet the needs of actual production tasks [13].
In recent years, feature learning methods based on deep neural networks have been introduced into seismic analysis; such methods can learn useful features from the data itself. Wang et al. presented a seismic analysis method based on a sparse autoencoder and k-means clustering analysis. Intuitively, an autoencoder can be used for feature dimensionality reduction, similar to PCA, but with stronger performance [14]. Qian et al. combined the unsupervised learning of autoencoders with the convolution operation to build a deep convolutional autoencoder, which can intelligently extract the implicit features of prestack data [15]. However, the generated classification maps are poorly distinguishable and lack continuity. These shortcomings are mainly due to the use of pooling layers, which enlarge the receptive field of the upper convolution kernels but discard part of the location information during aggregation [16]. The lossy compression of prestack seismic images becomes increasingly serious as they pass through multilayer networks, and during training, the network must learn how to counteract this inductive bias [17]. Moreover, compared with poststack or 2D prestack data, 3D seismic data contain rich prestack texture attributes and spatial structure information, which can play essential roles in the pattern recognition of seismic facies.
To address the deficiencies of existing research, we take three-dimensional (3D) prestack data as input under the autoencoder training framework and use the 3D convolution operation to mine the spatial structure information to the maximum extent while gradually reducing the spatial dimension of the input. A residual network is used to recover the details of the input and the corresponding spatial dimension, and the hidden features extracted by the encoder are then recognized via two-step clustering analysis. The validity of the method is verified by testing on simulated seismic data and by application to actual seismic data.

Materials and Methods
General Framework.
The 3D convolutional autoencoder proposed in this paper aims to compress the input 3D seismic data into a hidden feature representation and then reconstruct the output from this representation. The network framework consists of an encoder subnet and a decoder subnet (Figure 1). The encoder subnet is mainly composed of 3D convolution block C1, residual block RB, and 3D convolution block C2, and the decoder subnet is composed of 3D deconvolution blocks DC1, DC2, DC3, and DC4. From the input-output perspective, the decoder subnet is the reverse process of the encoder subnet: composed of deconvolution blocks, it maps the encoder subnet's output back to the encoder subnet's input, so that the output dimension of the network is the same as the dimension of the network's input image. The network model parameters are obtained through training by constructing a loss function between the input and the output, which realizes end-to-end unsupervised feature extraction. Specifically, the input data is first converted into an atypical low-dimensional space by the encoder subnet and then expanded by the decoder subnet to reproduce the original data. Finally, the best parameters of the encoder are obtained through backpropagation. In this way, the trained convolutional autoencoder can extract seismic data features well.
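The compress-then-reconstruct workflow described above can be illustrated with a minimal linear autoencoder trained by gradient descent. This is only an illustrative NumPy sketch, not the paper's 3D convolutional model; the array sizes, hidden dimension, and learning rate are all assumed for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature batch: 50 samples of dimension 8 (assumed sizes).
X = rng.normal(size=(50, 8))

# Encoder W1 maps 8 -> 3 (the hidden feature); decoder W2 maps 3 -> 8.
W1 = rng.normal(scale=0.1, size=(8, 3))
W2 = rng.normal(scale=0.1, size=(3, 8))

def loss(X, W1, W2):
    """Mean squared reconstruction error between input and output."""
    return np.mean((X @ W1 @ W2 - X) ** 2)

initial = loss(X, W1, W2)
lr = 0.05
for _ in range(300):
    H = X @ W1               # hidden features (encoder output)
    E = H @ W2 - X           # reconstruction error
    gW2 = 2 * H.T @ E / E.size
    gW1 = 2 * X.T @ (E @ W2.T) / E.size
    W1 -= lr * gW1
    W2 -= lr * gW2

final = loss(X, W1, W2)
```

After training, the encoder `W1` alone can be applied to new data to produce compressed features, mirroring how the trained encoder subnet is reused for feature extraction.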

Convolution Block.
The encoder subnet is an essential part of the proposed model. Its purpose is to transform the input data into an atypical low-dimensional space; this step can also be called feature extraction. The feature extracted by the encoder subnet has a lower dimension than the input data; while eliminating much redundant information, such a feature is more suitable for subsequent processing. The convolution block C1, the residual block RB, and the convolution block C2 are cascaded in the encoder subnet. The convolution blocks C1 and C2 have the same structure (as shown in Figure 2), including a convolution operation to extract features, batch normalization to adjust the data distribution, and a nonlinear activation function. In Figure 2, the input data is x, and the final output is f(x), where f(·) represents the mapping relationship of the convolution block. The convolution block's operation includes three steps. First, the prestack seismic data x is input into the encoder and convolved by the convolution operation in the convolution block. The convolution operation is a mapping that can change the dimensions between input and output; strided convolutions control the output dimension, and in general, the larger the stride, the smaller the output dimension, and vice versa. Assuming there are M convolution kernels, each convolution operation is composed of a convolution kernel tensor W_m and a bias b_m, and the input data x is subjected to the convolution operation to generate M feature maps a_m:

a_m^(i,j,k) = Σ_{p=1}^{P} Σ_{q=1}^{Q} Σ_{r=1}^{R} W_m^(p,q,r) · x^(i+p−1, j+q−1, k+r−1) + b_m.    (1)

In formula (1), the subscript m represents the m-th convolution operation, m ∈ {1, 2, ⋯, M}; the superscripts (i, j, k) and (p, q, r) represent the 3D spatial position coordinates of x and W_m, respectively; and (P, Q, R) represents the dimensions of W_m.
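Formula (1) is a valid (unpadded) 3D cross-correlation; a direct NumPy transcription for a single kernel, with strides included, can be sketched as follows. The input and kernel sizes here are tiny illustrative assumptions, not the paper's dimensions.

```python
import numpy as np

def conv3d_valid(x, W, b, stride=(1, 1, 1)):
    """Valid 3D convolution (cross-correlation) as in formula (1):
    a[i,j,k] = sum_{p,q,r} W[p,q,r] * x[i+p, j+q, k+r] + b."""
    P, Q, R = W.shape
    si, sj, sk = stride
    I = (x.shape[0] - P) // si + 1
    J = (x.shape[1] - Q) // sj + 1
    K = (x.shape[2] - R) // sk + 1
    a = np.empty((I, J, K))
    for i in range(I):
        for j in range(J):
            for k in range(K):
                patch = x[i*si:i*si+P, j*sj:j*sj+Q, k*sk:k*sk+R]
                a[i, j, k] = np.sum(patch * W) + b
    return a

x = np.arange(4 * 5 * 3, dtype=float).reshape(4, 5, 3)  # toy input (assumed)
W = np.ones((2, 2, 2))                                   # toy kernel (assumed)
a = conv3d_valid(x, W, b=0.0, stride=(1, 1, 1))
```

With M kernels, the same computation is repeated per kernel to produce M feature maps, as the text describes.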
The convolution operation is followed by batch normalization, a frequently used element in deep learning. It imposes additional constraints on the distribution of the data, thereby enhancing the model's generalization ability and speeding up its convergence. Assuming that there are N samples in each batch of input data, its general expression is

â_n = γ · (a_n − Mean[a_n]) / √(Var[a_n] + ε) + β.    (2)

In formula (2), n represents the index of a sample of the input data, n ∈ {1, 2, ⋯, N}; Mean[a_n] and Var[a_n] are the mean and variance of the input data a_n, respectively; ε is a sufficiently small number that prevents division by zero; and the parameter vectors γ and β are used to scale and translate the normalized feature map and can be adjusted through learning.
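Formula (2) can be sketched in NumPy as a per-feature normalization over the batch axis; the batch shape and the choice of γ = 1, β = 0 are illustrative assumptions.

```python
import numpy as np

def batch_norm(a, gamma=1.0, beta=0.0, eps=1e-5):
    """Formula (2): normalize each feature over the batch axis,
    then scale by gamma and shift by beta."""
    mean = a.mean(axis=0)
    var = a.var(axis=0)
    return gamma * (a - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(1)
batch = rng.normal(loc=5.0, scale=3.0, size=(64, 10))  # assumed batch of 64 samples
normed = batch_norm(batch)
```

With γ = 1 and β = 0, each feature of `normed` has approximately zero mean and unit variance, which is the constraint on the data distribution described above.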
The activation function that follows aims to enhance the nonlinear expressiveness of the data. This study utilizes Mish, a self-regularized, nonmonotonic activation function that keeps negative inputs as negative outputs [18-26]. This allows better exchange of input information while preserving network sparsity. Its general expression is

Mish(z) = z · tanh(ln(1 + e^z)),    (3)

where z represents the input of the activation function. Figure 3 compares Mish with ReLU, a well-known activation function in deep learning. It is evident from the figure that Mish is not only a continuous function but also does not simply cut off the nonpositive part of the input. Several studies have shown that Mish performs better than ReLU.
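Formula (3) can be written directly in NumPy; a numerically stable softplus (via `np.logaddexp`) is used here, which is an implementation choice rather than part of the original formulation.

```python
import numpy as np

def softplus(z):
    # Numerically stable ln(1 + e^z).
    return np.logaddexp(0.0, z)

def mish(z):
    """Formula (3): Mish(z) = z * tanh(ln(1 + e^z))."""
    return z * np.tanh(softplus(z))

def relu(z):
    return np.maximum(0.0, z)
```

Unlike ReLU, which zeroes all nonpositive inputs, Mish maps small negative inputs to small negative outputs and is close to the identity for large positive inputs.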

Residual Block.
The learning ability of network models such as the convolutional autoencoder can be improved in two ways: making the model deeper or making the model wider. Widening the model adds more computational units and more functions, whereas deepening the model increases not only the number of functions but also the depth of embedding. From this point of view, the model should be deepened. However, as the network model deepens, problems such as gradient vanishing occur. The residual network solves the gradient vanishing problem of deep models. The most critical part of residual networks is the introduction of residual blocks. Research has proven that the shortcut connections of residuals in the network can alleviate gradient vanishing and allow features to be reused. The integrity of the information is protected by passing the input information directly to the output. As shown in Figure 4, F represents the hidden block, a module containing a convolution layer, a BN layer, and an activation layer. After the convolution operation, the network retains some characteristic information and loses some data, and the lost data may be useful. By adding the original input, the network has more choices, so that otherwise missing information is retained. In general, the skip connection of the feature map can be realized by a direct summation operation. Skip connections can be regarded as identity mappings, allowing input data to pass directly through the network. Residual blocks are the basic units of our network, and the output of the (l + 1)-th residual block can be calculated as

x_{l+1} = x_l + F(x_l, W_l).

Decoder Subnet.
The 3D seismic data is reduced by the preceding encoder subnet to a feature output, which usually has a smaller dimension than the original data. The decoder subnet's task is to reconstruct the original 3D seismic data from this feature output. The decoder subnet is composed of deconvolution blocks, which are similar in structure to convolution blocks, as shown in Figure 5. Each deconvolution block uses three layers of operations, namely, a deconvolution operation, batch normalization, and an activation function. The deconvolution operation is the most critical operation in the decoder subnet.
It is the only operation in the deconvolution block that can change the dimension of the output data. Usually, the input data dimension is expanded by a large amount of zero padding, and the expanded data is then subjected to convolution operations. By cascading multiple deconvolution blocks, the decoder subnet output attains the same dimensions as the original data.

Simulation Data Test.
The simulated model is shown in Figure 6. Several groups of fractures are regularly arranged in the simplified fracture area in the north, and four small faults and some fracture zones are distributed in the actual fracture area in the south. The input seismic reflection data are typical complex multidimensional signals, whose multidimensional character appears in the variation of reflection amplitude with inline, crossline, time, and offset or azimuth. Making maximal use of the inherent structural characteristics of multidimensional signals while reducing the computational complexity of multidimensional signal processing is the key premise for the accurate identification of seismic facies. The common reflection point gathers are optimized after normal moveout corrections such as flattening and filtering, and the low signal-to-noise data are partially stacked to form azimuthal gathers. A suitable time window around the interpreted seismic horizon is selected to extract an equal-thickness prestack reflection waveform for each reflection point. For example, along the interpreted seismic horizon of the Daanzhai Formation, time windows of 35 ms upward and 45 ms downward are opened to extract the seismic data that contain the target reservoir information (see Figure 7(a)). These waveforms and those of their adjacent reflection points are assembled into 3D data (i.e., the 3D prestack data of each reflection point are composed of the prestack waveforms of the adjacent 5 × 5 reflection points; see Figure 7(b)); the dimensions of the 3D prestack seismic data are 25 × 81 × 6.
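The assembly of a 3D prestack sample from a 5 × 5 spatial neighborhood can be sketched as follows. The cube layout (inlines × crosslines × time samples × azimuths) and the variable names are assumptions for illustration; only the per-point panel size (81 time samples × 6 azimuths) and the 5 × 5 neighborhood come from the text.

```python
import numpy as np

# Assumed prestack volume: (inlines, crosslines, time samples, azimuths).
rng = np.random.default_rng(0)
volume = rng.normal(size=(20, 20, 81, 6))

def prestack_block(volume, i, j, half=2):
    """Gather the (2*half+1) x (2*half+1) spatial neighborhood of
    reflection point (i, j) and flatten it into one 25 x 81 x 6 sample."""
    n = 2 * half + 1
    patch = volume[i - half:i + half + 1, j - half:j + half + 1]
    return patch.reshape(n * n, volume.shape[2], volume.shape[3])

sample = prestack_block(volume, i=10, j=10)
```

Each such sample becomes one input to the 3D convolutional autoencoder; the center reflection point sits at row 12 of the flattened 25-point spatial axis.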

Since the size of the input data is 25 × 81 × 6, the dimensions of the input are extremely unbalanced, so an unbalanced convolution kernel must also be designed for this experiment. The size of the convolution kernel in convolution block C1 is 7 × 4 × 4 with a stride of 3 × 1 × 1, so the size of the output data of convolution block C1 is 7 × 77 × 3. The size of the convolution kernel in convolution block C2 is 7 × 5 × 3 with a stride of 1 × 3 × 1, so the size of the output data of convolution block C2 is 1 × 25 × 1. At the output of convolution block C2, the extracted feature shows that the spatial dimension of the data block has been reduced from 25 to 1, which means that the output features fuse all spatial dimensions. In general, the output features incorporate rich spatial information and significantly reduce the redundancy of the texture information. The specific feature extraction process is shown in Figure 8, where k represents the size of the convolution kernel and s represents the stride.
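The output sizes quoted above follow the standard valid-convolution formula out = floor((in − kernel)/stride) + 1 along each axis; as a sketch, checking block C2's stated numbers:

```python
def conv_out(in_size, kernel, stride):
    """Output size of a valid strided convolution along each axis:
    out = (in - kernel) // stride + 1."""
    return tuple((i - k) // s + 1 for i, k, s in zip(in_size, kernel, stride))

# Convolution block C2: input 7 x 77 x 3, kernel 7 x 5 x 3, stride 1 x 3 x 1.
c2_out = conv_out((7, 77, 3), (7, 5, 3), (1, 3, 1))
```

The same helper can be used to plan kernel and stride choices for other unbalanced input shapes.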
To verify the effectiveness of feature extraction via the 3D convolutional autoencoder, the k-means clustering analysis method is first applied to the pattern recognition of the prestack seismic data, the feature data extracted by the 2D convolutional autoencoder, and the feature data extracted by the 3D convolutional autoencoder. The seismic map generated by each method is shown in Figure 9. Regardless of which kind of data is used, k-means clustering analysis can clearly identify most of the simplified fractures in the northern region; all kinds of fractures and faults show a dark red color. The result of k-means clustering analysis based on the prestack seismic data is the worst for identifying the four faults in the actual fracture area in the south, because the amount of prestack data is large and the features are not fully extracted; moreover, redundant information causes recognition errors and reduces recognition accuracy. Although k-means clustering analysis based on the 2D convolutional autoencoder can identify most faults and fractures, the continuity of the faults is poor owing to insufficient spatial structure information from the 3D seismic data, as shown by the arrow in the figure. The k-means clustering analysis based on the 3D convolutional autoencoder has the best effect in identifying the four faults in the actual fracture area in the south. However, some cracks in the north are not correctly identified (see the black box in Figure 9) because the feature dataset extracted by the 3D convolutional autoencoder is huge; k-means must repeatedly adjust the sample classification and recompute the adjusted cluster centers, so when the amount of data is large, the time cost of the algorithm is high.
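The k-means procedure described above (repeatedly reassigning samples and recomputing centers, which is exactly what makes it costly on large feature sets) can be sketched in NumPy. The toy two-cluster feature set and the simple spread-out initialization are invented for illustration; in practice an initialization such as k-means++ is common.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Lloyd's algorithm: alternate sample assignment and center update."""
    # Simple spread-out initialization (a sketch, not k-means++).
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        # Each iteration touches every sample: cost grows with data size.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),    # toy cluster A
               rng.normal(10.0, 0.5, size=(20, 2))])  # toy cluster B
labels, centers = kmeans(X, k=2)
```

On well-separated features the algorithm recovers the two groups; on huge, redundant feature sets the repeated full passes over the data dominate the runtime, as noted above.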
To verify the effectiveness of feature extraction by 3D convolutional autoencoder further, the SOM clustering analysis method is used for the pattern recognition of prestack seismic data, feature data extracted by the 2D convolutional autoencoder, and feature data extracted by the 3D convolutional autoencoder. The generated seismic maps by each method are shown in Figure 10. Regardless of what kind of data is used, the SOM clustering analysis method can clearly identify the simplified fractures in the northern region and the four faults in the southern region's actual fracture area. SOM clustering analysis based on 2D prestack data can identify most faults and fractures, but the continuity of faults is poor. SOM clustering analysis based on 3D convolutional autoencoder is the best method to identify the simplified fractures in the northern region. The overall continuity of the four faults in the south is as good as that of the 2D convolutional autoencoder. Although the SOM clustering analysis method can achieve the dimensionality reduction mapping from the input space (n dimension) to the output plane (2 dimensions), the larger the input space is, the more difficult it is to maintain the topological features of the mapping.
To further verify the effectiveness of feature extraction by the 3D convolutional autoencoder, we used the two-step clustering analysis method for the pattern recognition of the prestack seismic data, the feature data extracted by the 2D convolutional autoencoder, and the feature data extracted by the 3D convolutional autoencoder. The seismic map generated by each method is shown in Figure 11. The two-step clustering analysis method is sensitive to all three kinds of data: it can identify the simplified fracture area in the north and the actual fracture area in the south, and all kinds of fractures and faults show the same color. Unlike k-means clustering, two-step clustering has no global objective function and avoids the difficulty of selecting initial points; the similarity of distances and rules is easy to define, and the hierarchical relationship of the classes can be found. The two-step clustering algorithm based on the prestack seismic data can identify the fractures in the actual fracture area in the south but cannot distinguish some small fractures in the north. The two-step clustering algorithm based on the 2D convolutional autoencoder can identify most fractures in the north and south; however, their continuity is poor because the 2D convolutional autoencoder does not consider the 3D spatial structure information of the seismic data. The two-step clustering method based on the 3D convolutional autoencoder can identify the fractures in the north and the four faults in the south and can mine some additional cracks in the actual fracture area.
To study the influence of 3D spatial structure information on the continuity of the clustering results, convolution kernel sizes of 3 × 3, 5 × 5, and 7 × 7 are used for feature extraction. The seismic maps generated by the k-means, SOM, and two-step clustering methods are shown in Figures 12, 13, and 14, respectively. As the convolution kernel size increases, the discriminability and continuity of the seismic maps generated by each method are increasingly enhanced. The seismic maps generated by the k-means and SOM methods are less distinguishable in the actual fracture area in the south, where the faults and the surrounding rock gradually merge. However, the seismic map generated by the two-step method is highly distinguishable and continuous in both the simplified area in the north and the actual fracture area in the south.

Real Seismic Data Application.
To further verify the validity and reliability of our method, we applied it to the Liziba work area in the Sichuan Basin, China. Faults play an important role in improving the reservoir properties of the Maokou Formation in this area. The key challenge for pattern recognition is generating distinguishable faults from the seismic data. The work area has a total of 1,144 inlines and 775 crosslines, and the seismic sampling interval is 1 ms. The azimuth gathers are obtained by stacking traces over six azimuth sectors. Figure 15 shows the azimuth gathers of the Maokou Formation at a typical common midpoint. To demonstrate the advantages of the 3D convolutional autoencoder in feature extraction, poststack and prestack data are first used for two-step clustering analysis; the generated seismic maps are shown in Figures 16(a) and 16(b). Although the main faults in the work area can be identified from the poststack and prestack data, their continuity is not strong. Figures 16(c) and 16(d) show the seismic maps obtained via SOM and two-step clustering analyses of the feature data extracted by the 3D convolutional autoencoder. The 3D convolutional autoencoder can extract the features of seismic data that vary with azimuth and with 3D space, making it easier to distinguish fault and karst cave features. Therefore, the seismic map has high distinguishability and continuity.
In both the simulation data test and the real data application, the seismic data are reconstructed into 3D data blocks that include spatial and prestack information. We choose to retain the original dimensions of the prestack texture information, while in the spatial dimension, the work area is cut into small spatial blocks. Each 3D data block sample contains one spatial block, and the choice of spatial block size is not fixed. In our experiments, spatial block sizes of 3 × 3, 5 × 5, and 7 × 7 are used. A larger size does not necessarily yield a better clustering result: the larger the spatial block, the more continuous the clustering result, but when the block is too large, category "swallowing" appears. In this phenomenon, the spatial block contains original data of multiple categories, and the final map shows only the category with the largest amount of data in the block, "swallowing" the other categories. On the other hand, using the same network model for the entire dataset means using the same set of hyperparameters to extract all features, and different hyperparameters can simply be viewed as mapping to different feature dimensions. From a theoretical point of view, there is a dimension that minimizes the redundancy of the features. According to many experiments, neither a higher nor a lower dimension is necessarily better; the appropriate dimension must be found through experiments.
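The category-"swallowing" effect can be reproduced with a simple majority vote over a spatial block; the labels and block size below are invented for illustration.

```python
from collections import Counter

def block_label(labels):
    """Assign a spatial block the label of its most frequent category,
    'swallowing' any minority categories inside the block."""
    flat = [v for row in labels for v in row]
    return Counter(flat).most_common(1)[0][0]

# A 3 x 3 spatial block containing two categories, e.g. 1 (matrix) and 2 (fracture).
block = [[1, 1, 2],
         [1, 1, 1],
         [1, 2, 1]]
```

Here category 2 is present in the block but disappears from the final map, which is why overly large spatial blocks erase small fractures.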

Conclusions
In this study, a method of seismic pattern recognition is proposed. The proposed method uses a 3D convolutional autoencoder to extract features directly from 3D prestack seismic data for clustering analysis. In contrast to traditional methods of seismic recognition, 3D convolution has the following advantages: (1) It can capture the characteristics of the seismic reflection signal changing with offset or azimuth, thus facilitating studies on the AVO/AVA characteristics of the reservoir. (2) It can capture the characteristics of the seismic reflection signal changing with three-dimensional space, thus facilitating studies of reservoir heterogeneity. Therefore, the seismic pattern recognition method based on the 3D convolutional autoencoder can effectively distinguish the reservoir characteristics of different reflection structures and enhance the continuity of seismic classification maps.

Data Availability
The test data used to support the findings of this study were provided by the Key Laboratory of Geophysical Prospecting at the China University of Petroleum and by the Chuanqing Drilling Geophysical Exploration Company. The data supporting the research results cannot be made available to readers because of privacy issues.

Conflicts of Interest
No conflict of interest exists in the submission of this manuscript.