Automatic Approach for Fast Processing and Data Analysis of Seismic Ahead-Prospecting Method: A Case Study in Yunnan, China

)e seismic ahead-prospecting method is useful to detect anomalous zones in front of the tunnel face. However, most existing seismic detection method is designed for drilling and blasting tunnel. )e detection method should be improved to satisfy the rapid tunneling of Tunnel BoringMachines (TBMs).)is study focuses on reducing the time spent on seismic data processing and result analysis. )erefore, to reduce the data processing time, an automatic initial model establishment method based on surrounding rock grade is proposed. To reduce the time spent on result analysis and avoid subjective judgment, a modified kmeans++ method is adopted to interpret the detecting results and extracting anomalous zones. )e efficacy of the developed method is demonstrated by field tests. )e fractured zones such as cavity collapse and fissure are successfully predicted and identified.


Introduction
More and more tunnels are constructed for efficient transport in long-distance traffic [1]. Tunnel Boring Machines (TBMs) have been widely used in the tunnel industry due to their advantages of higher efficiency and being safe and environmentally friendly compared to the traditional drilling and blasting methods. Currently, TBM excavations are increasingly being employed in complex geologies such as karst terrains and mountain areas. However, TBMs incur high risk and have poor adaptation when passing through adverse geologies [2]. If the distribution of adverse geology cannot be accurately predicted, water and mud inrush may occur during excavation and eventually lead to delays and even casualties. Moreover, water inrush may result in water table dropping and lead to damage to the ecosystem [3]. To reduce the impact of water inrush on tunnel construction and ecosystem, adverse geology investigations should be conducted before grouting. Geophysical exploration methods provide a nondestructive means for detecting geological structures in front of tunnel faces with advantages of short detecting time and low cost, and they are widely used in engineering investigation [4,5]. However, the geotechnical investigation performed on the ground surface can only provide rough profiles of the tunnel construction area rather than the exact geological conditions of the local area in front of the tunnel face during the tunneling [4]. erefore, researchers try to perform geophysical exploration in tunnels for detecting anomalous zones in front of the tunnel face [2]. Detecting in a tunnel not only improves the detection by being closer to the anomalous zones but also can obtain surrounding rock lithology, integrity, and its physical properties as prior information for detection during excavation.
ereby, the safety risks during tunnel construction can be decreased by adjusting the tunneling scheme according to the local geological conditions obtained by the ahead-prospecting method. e seismic detection method is one of the frequently used ahead-prospecting exploration methods, which is relying on the detection of elastic differences in the physical properties such as velocity of shear wave, velocity of pressure wave, and density. ere have been lots of developments in the seismic ahead-prospecting method [6]. In most cases, data processing of the seismic ahead-prospecting method is based on the migration method. During the migration correction process, the velocity needs to be calculated based on picking the first arrival wave for the subsequent imaging procedure. However, first arrival picking is time-consuming, which can take up to 20-30% of the total processing time [7].
To satisfy the requirement of TBM rapid construction, the time spent on ahead prospecting should be reduced to less interference in the construction. Fortunately, seismic propagation velocity is related to lithology and integrity of rock, and the distribution of lithology and integrity of rock along the tunnel trunk can be roughly obtained by the surface survey. erefore, geological information obtained from surface surveys can be used as prior information to establish initial velocity models for migration correction, which can reduce time spent on data processing.
Regarding the seismic imaging result analysis, the imaging results are the distribution of geophysical properties. Usually, geophysical experts should analyze the results and interpret the results as a geological model (distribution of adverse geology such as faults and karst caves) according to their experience. us, the constructors without geophysical experience can easily understand the local geology conditions ahead of the tunnel face and then adjust the construction strategy. A common analysis method is to involve geological experts to manually identify areas with strong reflections. However, this method requires geophysical experts to analyze the imaging results according to their experience, resulting in the accuracy of manual identification depending on the experience of the interpreter. is makes manual analysis time-consuming and the division of anomalous zones boundaries in analysis results rely on subjective judgment [8]. Recently, attempts have been made to adopt automated data analysis methods in the geological field to obtain objective data analysis results within a relatively short time. K-means algorithm is one of the effective unsupervised learning algorithms used to solve the well-known clustering problem. By adopting k-means++ method, Li et al. successfully interpreted the resistivity inversion results and obtained the boundaries of high-risk areas [9]. Di et al. introduced k-means cluster analysis into seismic data to accurately delineate the boundary of salt bodies [10]. erefore, k-means has been proved to be an effective tool to interpret geophysical results. Meanwhile, when conducting manual data analysis, it is necessary to comprehensively consider geological information and geophysical detection results and analyze and compare the detection results under similar engineering geological conditions to improve the accuracy. However, this process is time-consuming due to the data analysis of standards that rely on experience. As opposed to determining the anomalous zones by experience, k-means++ algorithm is based on certain criteria, which is more efficient and objective than manual results analysis.
Gaoligongshan is a national nature reserve located in Yunnan, China. To develop the economy, a tunnel constructed by TBM was built through Gaoligongshan. In this paper, we focused on reducing the time-consumption of seismic processing and result analysis to obtain the local geological conditions in front of the tunnel face. To reduce the time spent on data processing, an initial velocity model related to geological conditions is proposed. e concept was to utilize the distribution of lithology and integrity of formations obtained from the surface survey to establish the initial velocity model and then update it according to the exposed geology during the tunneling to ensure its accuracy. In addition, an automated data analysis method based on k-means++ algorithm is proposed to cluster the seismic imaging results for reducing time spent on result analysis. Finally, the proposed method was tested in Gaoligongshan tunnel for preventing water inrush.

Imaging and Results Analysis of
Anomalous Zone e original seismic data collected by the data acquisition system are seismic traces, but the distribution of the anomalous zones in front of the tunnel face can only be obtained after data processing. In this section, an automated result analysis method based on k-means++ method is adopted to the seismic imaging results for reducing the time spent on result analysis.

Automatic Initial Model Establishment with Geological
Information. e initial velocity model is one of the important factors affecting the migration correction quality [11]. Krebs et al. integrated various sources of subsurface velocity information, such as surface seismic reflections [12], surface seismic direct arrivals, well data, and prior geologic information, to estimate the velocity model. Sala et al. presented a velocity building scheme using information from borehole and laboratory data [13]. In this study, an automated velocity model generation method based on dynamic geological information and historical wave velocity was used to improve the imaging accuracy and reduce the data processing time.
During seismic detection, the average value of the direct wave velocity obtained by all source-detectors pairs is asserted as the average velocity of the elastic wave. According to the lithology and classification of the rock mass between the geophones and shot points, this value is then recorded as the direct wave velocity under this condition. e historical wave velocity is generated from the direct wave velocity at each detection and the wave velocity data from the preliminary geotechnical investigation.
e velocity model is generated from the historical wave velocities and then used as the initial model for the migration correction.
It has to be noted that a geologic longitudinal section map can provide the lithology and classification distribution along the tunnel trunk. However, tunnels can be located hundreds or thousands of meters below the ground surface, and the estimated geological conditions along the tunnel trunk may have deviations with the actual geological conditions. To obtain a more accurate distribution of the formation, the geologic longitudinal section map is revised according to the excavation record. If the actual revealed mileage of the geological structure or phenomenon revealed, such as the lithological interface and weathering slot, is different from the mileage in the geologic longitudinal section map, the section will be corrected according to the deviations between the revealed geological conditions and the geologic longitudinal section map generated from the previous survey on the surface ground. en, the historical wave velocities are automatically matched to generate the initial model according to the lithology and rock mass type distribution of the unexcavated longitudinal profile in front of the tunnel. Finally, the velocity distribution in front of the tunnel face is obtained by using the geologic longitudinal section map. In this manner, wave velocity models can be established automatically according to the distribution of lithology and rock mass types in the geologic longitudinal section map, rather than assigning the same velocity to all rock masses within the imaging region.
A fixed finite-difference grid was established for migration correction. When processing the datasets acquired by the acquisition system, the velocity information is automatically loaded from the records and assigned to the grid. en, a 1D velocity model can be generated and used as the initial velocity model for migration correction.

Automated Result Analysis Based on k-Means++ Method.
In this study, a seismic ahead-prospecting method is adopted to image the anomalous zones [14]. Its main process includes band-pass filtering, diffusion compensation, wavefield separation, and migration correction. e workflow of the seismic data processing is presented in Figure 1. To obtain the local geological conditions ahead of the tunnel face, the result analysis should be conducted to convert the imaging results into geological units. Anomalous zones such as lithological interface and fracture zone are usually of concern when interpreting and detecting the results of geological conditions in front of the tunnel face. e common result analysis of seismic waves requires geophysical experts to analyze the imaging results according to their experience. However, the boundary of anomalous zones is difficult to recognize. erefore, we adopted the k-means++ algorithm [15] to the reflection coefficient to automatically and quickly clarify the area representative of adverse geological structures based on the imaging results.
K-means++ is an important technique for clustering. In this study, the k-means++ method is applied after obtaining the migration correction results based on the automated velocity model. For the result analysis problem, the purpose is to minimize the function as follows [15]: Actually, the imaging results consist of cells with a specific reflection coefficient. We consider each cell as a sample, and the spatial coordinates and reflection coefficients of the cells were used as parameters to cluster the cells. Because the magnitude of the coordinate parameter is too large compared to the reflection coefficient, the coordinate parameters and reflection coefficient are normalized as follows first: where R nor is the normalized parameter of the samples, R ori is the original parameter, R ave is the average value, and σ is the standard deviation. en, a scheme for calculating Euclidean distance is proposed. e distance calculated based on the coordinates and reflection coefficient was used as the criteria for centroid selection and sample classification; the shortest distance D (x) between a sample and the closest centroid is calculated as where s (x, y, z) is the spatial coordinates of the cell, c (x, y, z) is the spatial coordinates of the centroid, s (r) and c (r) are the reflection coefficient of the samples and centroid, respectively, and λ is a factor adjusting the weight of the reflection coefficient.
e probability that the sample is chosen as the next center is as follows: Finally, cells in the migration correction results are classified according to the spatial position and the reflection coefficient. e workflow of the k-means++ is as follows: (1) e purpose of adopting the k-means method is to classify the migration correction into k geological units with different rock mass integrity. erefore, the value of k is determined according to the geological analysis, which mainly depends on the possible number of lithological types and the possible number of rock quality classifications in front of the tunnel face.
(2) A cell of migration correction is randomly selected as the initial centroid. en, the distance between the centroid and all the remaining cells in the migration correction results is calculated as in equation (3), and the cell with the largest distance is selected as the second centroid. Next, the distance calculation and centroid selected steps are repeated until k centroids are determined. (3) e distance between the centroids and the cells, except the k centroids, is calculated separately, and each sample is assigned to the cluster containing the nearest centroid. In this manner, the cells of the migration correction results are assigned to k clusters. However, this classification result is not optimal and needs to be further optimized according to the distance. (4) erefore, the location of the centroids of each cluster is recalculated. (5) If the new centroids are the same as the previous ones, the k-means++ method is considered to be convergent. If the new centroids are different from the previous ones, steps 3 to 5 are repeated until the k-means method becomes convergent.
Finally, the cluster representing the strong reflection areas can be selected based on the statistical results of each cluster. e spatial distribution of such cells belonging to this cluster represents the location of anomalous zones, such as faults and karst caves.

Site Description.
e proposed method was tested through application to an actual tunnel. Gaoligongshan tunnel is the longest railway tunnel in China with a length of 34.538 km, 13 km of which was constructed using a TBM of 9 m diameter. e maximum depth of the tunnel floor from the ground surface is about 1155 m. e tunnel site is located in the boundary of the Eurasian plate and Indian Ocean plate, thus featuring complex geological conditions (Figure 1(a)). Meanwhile, many valleys developed resulting in great topographic relief, which makes it easy to collect groundwater and then cause the rock below to be fractured. erefore, obtaining the adverse geological conditions in front of the tunnel face through ahead-prospecting method and then formulating a tunneling strategy are very important to ensure the safe and efficient construction of TBM.
Actually, the detection was performed practically along the whole line of the tunnel; two tests are selected to verify the proposed method. A longitudinal section map illustrating the geological conditions along the tunnel trunk according to the surface survey is described in Figure 2(c). Meanwhile, the seismic observed configuration is as shown in Figure 3. To illustrate the applicability of the proposed method, a test with a relative accurate surface survey investigation and a test with biased survey investigation were selected here.

Field Test 1 at Mileage of D1k 225 + 313.
e first field geological detection was conducted at the mileage of D1k 225 + 313. According to the geotechnical investigation performed on the ground surface, none adverse geology condition can be inferred at this mileage. However, according to the actual excavation, rock fissures, cavity collapse, and alteration zones are developed as shown in Figure 4. is indicates that the actual rock mass is consistently poor with the survey detection results.
To determine geological conditions in front of the tunnel face and adjust the construction strategy for the sake of construction efficiency and safety, an experiment was conducted at mileage of D1k 225 + 313 during the downtime of tunnel excavation. e observed configuration is as shown in Figure 3. In both two tests, the seismic wave is excited by an 8-pound hammer with a frequency of 700-1500 Hz, recording sampling is 0.125 ms, and the whole recording time is 250 ms. According to the corrected geologic longitudinal section map and rock exposed in the tunnel, only granite is developed between mileages of D1k 225 − 313 and 225 + 213, and the parameters of the model are set to V p � 3822 m/s. e migration correction results are shown in Figure 5. In order to recognize the location and shape of the anomalous zones more clearly, the k-means++ method was adopted to further process the imaging results. e value of k was set to 4 because there are 4 possible classifications of rocks according to geological analysis. e descriptive statistical parameters of each cluster are summarized in Table 1. Cluster 3 contains both the higher positive reflection and the higher negative reflection values of the entire imaging region, and the standard deviation of the cluster is also large. erefore, cluster 3 represents the fragmental area because the statistical parameters of cluster 3 are consistent with the strong positive and negative reflection characteristics of the fragmental area. Because we focused only on anomalous zones, the image of cluster 3 is as shown in Figure 6. e geological sketch is drawn according to the excavating records to verify the analysis results and is as shown in Figure 7.
As shown in Figures 6 and 7 Figure 2: Sketch of the cross-section illustrating (a) (b) the geographic location of the Gaoligongshan tunnel and (c) the distribution of the rock lithology and integrity, which is obtained according to both the exposed surrounding rock and survey investigation [16].

S7 -S9
S4 -S6 S10 -S12    To verify the efficacy of the proposed method, a geological sketch map was drawn during the tunnel construction. e automated analysis result and the geological sketch map are compared in Figure 7. Some cavity collapse and fissure exposed during excavation between mileages of D1k 225 + 313 and D1k 225 + 213 are as shown in Figure 8.

Field Test 2 at Mileage of D1k 223 + 563.
e second field geological detection was conducted at the mileage of D1k 223 + 563. According to the geotechnical investigation performed on the ground surface, the TBM tunnel passed through granite strata at this mileage. When the tunnel reached the section between the mileages of D1k 223 + 600 and D1k 223 + 580, the surrounding rock mass was relatively intact, coinciding well with the surrounding rock classification from geotechnical investigation. However, the rock mass became much more fractured in the section between mileages of D1k 223 + 580 and D1k 223 + 563. To determine geological conditions in front of the tunnel face and adjust the construction strategy for the sake of construction efficiency and safety, an experiment was conducted at mileage of D1k 223 + 563 during the downtime of tunnel excavation. e observed configuration is as shown in Figure 3. According to the corrected geologic longitudinal section map and rock exposed in the tunnel, only granite is developed between mileages of D1k 223 − 563 and D1k 223 + 463, and the parameters of the model are set to V p � 3358 m/s. e migration correction results are shown in Figure 9. In order to recognize the location and shape of the anomalous zones more clearly, the k-means++ method was adopted to further process the imaging results. e value of k was set to 4 because there are 4 possible classifications of rocks according to geological analysis. e descriptive statistical parameters of each cluster are summarized in Table 2. According to the method for judging the clusters corresponding to the broken area mentioned in experiment 1, cluster 2 of experiment 2 corresponds to the broken area. Because we focused only on anomalous zones, the image of cluster 2 is as shown in Figure 10. e geologic sketch is drawn according to the excavating records to verify the analysis results and is as shown in Figure 11.
As shown in Figures 9 and 10   us, it is assumed that the rock mass of this zone is the same as that exposed in the tunnel face.
To verify the efficacy of the proposed method, a geological sketch map was drawn during the tunnel construction. e automated analysis result and the geological sketch map are compared in Figure 11. Some cavity collapse and fissure exposed during excavation between mileages of D1k 223 + 563 and D1k 223 + 463 are as shown in Figure 12.

Conclusion
Existing seismic ahead-prospecting method for detecting anomalous zones ahead of the tunnel face may delay the TBM tunneling. erefore, in this paper, an automatic initial model establishment method with geological information is proposed to reduce time spent on data processing. Moreover, an automated analysis method based on k-means++ method is proposed to facilitate and reduce time spent on data analysis, which converts the seismic imaging results into several geological units. By adopting the proposed methods, the boundaries of the anomalous zones can be obtained, which could help develop reinforcement strategies. Moreover, instead of analyzing and comparing the detection results under similar engineering geological conditions, the anomalous zones can be identified according to the statistical parameters of different clusters in the k-means results. In the field tests, the boundaries of the anomalous zones were obtained, which match well with the exposed rock. ough the detecting results meet the engineering construction requirements, in the future, methods such as full waveform inversion would be studied to further improve our predicting results.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.