Piper-PCA-Fisher Recognition Model of Water Inrush Source: A Case Study of the Jiaozuo Mining Area

Source discrimination of mine water plays an important role in guiding water prevention and control in mine water management. To accurately determine the water inrush source of a mine in the Jiaozuo mining area, a Piper trilinear diagram based on hydrochemical experimental data of stratified groundwater in the area was used to select typical water samples. Principal component analysis (PCA) was then used to reduce the dimensionality of the conventional hydrochemical variables and to extract mutually independent variables. The Piper-PCA-Fisher water inrush source recognition model was established by combining the Piper trilinear diagram, PCA, and Fisher discrimination theory. The screened typical samples were used for back-discrimination verification of the model. Results showed that 28 typical water samples from different aquifers were selected through the Piper trilinear diagram as the training set. After PCA was carried out, the first five factors covered 98.92% of the information in the original data and could effectively represent the original samples. When the 28 training samples were rediscriminated one by one with the Piper-PCA-Fisher model, a correct discrimination rate of 100% was achieved. In the prediction of 13 samples, one water sample was misdiscriminated; hence, the correct prediscrimination rate was 92.3%. Compared with the traditional Fisher water source recognition model, the Piper-PCA-Fisher model established in this study had higher accuracy in both rediscrimination and prediscrimination and therefore a strong ability to discriminate water inrush sources.


Introduction
Mine water inrush accidents pose a prominent threat to coal resources and usually cause partial or even complete submergence of the mining area. Consequently, mine production efficiency degrades or stagnates, which brings about enormous economic loss. The Jiaozuo mining area is a well-known large water-filled mine in China and a typical North China-type coal field. Water damage caused by large-capacity and frequent water inrush occurs during production, resulting in submerging accidents and increasing both the water drainage cost for coal output and the production cost per ton of coal. Water damage can also lead to major injuries, property loss, and loss of life. The main cause of water inrush in coal mines is that fracture channels in the rock connect aquifers with mining roadways, so that a large volume of groundwater from a coal seam floor or a roof aquifer rushes into the roadway.
Water-conducting passages and water-bearing structures concealed ahead of the roadway driving face may easily lead to water bursts in coal mines. Therefore, rapid and accurate recognition of the water inrush source has important practical significance for water prevention and control in coal mines.
Methods of water inrush source recognition used in mines in recent years include hydrochemical characteristic component analysis, isotope analysis, artificial tracing, multivariate statistical analysis, multiclass clustering functions, environmental isotopes [1-12], and other relatively typical analysis methods. The following methods are applied most often in multivariate statistical analysis: the hierarchical clustering linear discriminant method based on Fisher theory [13-15] and the hydrochemical concentration forecast method based on artificial neural networks [16-18]. A distance discriminant analysis model based on six discrimination factors selected from measured data was established to effectively predict the sources of mine water inrush in the Jiaozuo mining area [19]. With the progress of production systems, extensive experimental studies have been carried out in recent years on the application of nonlinear mathematical statistics methods. For instance, Pen et al. [20] used the fuzzy comprehensive judgment method for water inrush source recognition in mines. Yang et al. [21] carried out corresponding research on water inrush grading in the Hebi Mining Bureau. Yan et al. [22] used support vector machine theory to analyze the water inrush source of mines. Xu et al. [23] applied neural networks to practical work on water inrush sources, and Zhang et al. [24] applied quantification theory to discriminate the mine water inrush source.
Numerous scholars have carried out considerable research on water inrush source models and have had great success in their practical application. However, present discriminant methods do not consider the complicated information superposition between hydrochemical data, a problem that leads to misdiscrimination when the established model is applied in practice, and their discrimination accuracy still needs further improvement. Therefore, this paper presents a new discriminant method (Piper-PCA-Fisher) that extracts and compresses the information in hydrochemical data, transforms the original data into mutually independent new data without information superposition, and combines them with a mathematical classification method to establish the discriminating model of the water inrush source. In this way, a high-accuracy water source recognition model can be trained. The Piper-PCA-Fisher water inrush source recognition model has important practical value in the Jiaozuo mining area, and it can also provide theoretical guidance for the prevention and control of water damage in other North China coal mines.

Description of the Study Area
The Jiaozuo mining area is in the northwestern part of Henan Province, bounded by the Taihang Mountains to the north and the Yellow River to the south. This area has a temperate continental monsoon climate with four distinct seasons and an annual average temperature of approximately 13.2 °C [1]. Influenced by meteorological conditions and topographic factors, the annual average rainfall in the northern mountainous area is about 700 mm, while that in the southern piedmont plain is about 600 mm. Rainfall mainly occurs in summer and autumn. The ground level of the northwestern mountainous area is 200-1790 m, with uneven relief. The ground level of the piedmont alluvial plain in the southeast is 80-200 m. This topography exerts a noticeable control on groundwater runoff.
The strata in the Jiaozuo mining area form an overall monoclinal structure that strikes toward the southwest and dips toward the southeast. The dip angle is 6°-16° and locally reaches 25°-30°. This area has faulted structures, mostly normal faults, with few folds. The principal fracture is a strong runoff zone of underground karst water with mature karst development, and it intersects with secondary faults to form a horizontally and longitudinally staggered three-dimensional fracture network. As a result, groundwater in this area is closely connected in both the vertical and horizontal directions, and hydraulic connections of different degrees exist between multiple aquifers, which produces complicated hydrogeological conditions (Figure 1).
From top to bottom, the main aquifers can be divided into four categories according to the lithology, thickness, water features, and burial conditions of the stratum. The first category is the Quaternary aquifer made up of Quaternary sandstone, clay, calcareous nodules, and a basal conglomerate, which is the main aquifer for the segment. The second category is the coal-bearing sandstone aquifer that consists of sandstone, siltstone, and shale layers with a permeability of 0.1-0.3 m/d. The third category is the Carboniferous limestone aquifer composed of Taiyuan limestone. The fourth category is the Ordovician limestone aquifer, which is the sedimentary basement of the coal measure strata. The Jiaozuo coal-mining area therefore mainly contains Ordovician limestone aquifer groundwater (type I), Carboniferous limestone aquifer groundwater (type II), coal sandstone aquifer groundwater (type III), and Quaternary system aquifer groundwater (type IV).

Data Analysis of Modeling Experiment.
Water samples collected in the study area included groundwater from the different aquifers in the coal-mining area (Figure 1). In total, 38 groups of water samples were collected from the coal-mining district, including 8 groups from type I, 10 groups from type II, 9 groups from type III, and 11 groups from type IV. Water samples were collected in clean 550 ml plastic bottles for the determination of hydrochemical ions. A further 13 groups of data (numbers A1-A13) were quoted from Wang et al. [19] and were used to test the model. Table 1 shows the results of water sample analysis. The chemical composition of the water samples was determined in the State Key Laboratory of Hydrology, Henan Polytechnic University, using a Shimadzu CTO-10Avp ion chromatograph and ICP-MS with a relative error of 1%, while HCO₃⁻ was determined using dilute sulfuric acid-methyl orange titration.

Piper Trilinear Diagram Analysis.
The hydrochemical components of groundwater in an aquifer change with the movement of the groundwater. Different aquifers generally have different hydrochemical features, and the hydrochemical components within a single aquifer can also differ noticeably because of varying hydrogeological conditions. However, the chemical components maintain a dynamic equilibrium through a series of physical and chemical reactions, so water samples from the same aquifer are expected to show similar hydrochemical features after hydrochemical analysis. In practice, the hydrochemical components of samples from the same aquifer sometimes vary greatly because the samples are influenced by the hydraulic connection between different aquifers, the movement of groundwater, and other factors. Hence, to establish a high-accuracy water inrush source discrimination model, typical water samples that best represent the hydrochemical components of each aquifer must be selected. The Piper trilinear diagram of the water samples can be used for this purpose. Water samples that deviated significantly from the formation center in the Piper trilinear diagram were judged abnormal and excluded from this study.
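The screening rule above is applied visually on the Piper diagram. As a rough numerical stand-in, the following minimal sketch converts concentrations in mg/L to milliequivalent fractions (the quantities a Piper diagram plots) and flags samples that lie far from the median composition of their group. The distance threshold of 0.25, the treatment of K⁺ + Na⁺ as Na⁺, and all sample values are assumptions made for illustration only; they are not taken from Table 1.

```python
from math import dist
from statistics import median

# Equivalent weights (g/eq) for converting mg/L to meq/L.
# Assumption: K+ + Na+ is treated as Na+ in this sketch.
EQ_WEIGHT = {"Na": 22.99, "Mg": 12.15, "Ca": 20.04,
             "HCO3": 61.02, "Cl": 35.45, "SO4": 48.03}
CATIONS = ("Na", "Mg", "Ca")
ANIONS = ("HCO3", "Cl", "SO4")


def piper_fractions(sample):
    """Return cation and anion meq fractions (each triple sums to 1)."""
    meq = {ion: sample[ion] / EQ_WEIGHT[ion] for ion in EQ_WEIGHT}
    cation_total = sum(meq[ion] for ion in CATIONS)
    anion_total = sum(meq[ion] for ion in ANIONS)
    return ([meq[ion] / cation_total for ion in CATIONS]
            + [meq[ion] / anion_total for ion in ANIONS])


def flag_atypical(samples, threshold=0.25):
    """Flag samples whose fractions lie farther than `threshold` from the group centre."""
    points = [piper_fractions(s) for s in samples]
    # The per-coordinate median is used as a centre that outliers cannot drag.
    centre = [median(column) for column in zip(*points)]
    return [dist(point, centre) > threshold for point in points]


# Hypothetical concentrations (mg/L): four similar samples and one Na-SO4 outlier.
group = [
    {"Na": 20, "Mg": 25, "Ca": 80, "HCO3": 300, "Cl": 15, "SO4": 40},
    {"Na": 22, "Mg": 24, "Ca": 78, "HCO3": 290, "Cl": 16, "SO4": 45},
    {"Na": 18, "Mg": 26, "Ca": 82, "HCO3": 310, "Cl": 14, "SO4": 38},
    {"Na": 21, "Mg": 25, "Ca": 79, "HCO3": 295, "Cl": 15, "SO4": 42},
    {"Na": 150, "Mg": 5, "Ca": 10, "HCO3": 100, "Cl": 20, "SO4": 250},
]
flags = flag_atypical(group)
```

A numerical rule of this kind can only support, not replace, the judgment made on the diagram, because the appropriate threshold depends on how scattered each aquifer's samples are.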

Principal Component Analysis.
Principal component analysis (PCA) is a multivariate statistical analysis method whose basic idea is dimensionality reduction. Multiple observed variables that originally carry overlapping information are transformed through an orthogonal transformation into a few mutually uncorrelated aggregate variables, which extracts the feature information. The correlated variables are thereby simplified with the least possible loss of information [25].
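To make the orthogonal transformation concrete, the following minimal sketch computes principal components from the correlation matrix of standardized data with NumPy. The data are synthetic (two strongly correlated columns and one independent column), and standardizing before the decomposition is an assumption of this sketch rather than a detail stated in the text.

```python
import numpy as np


def pca(X):
    """Return component scores, eigenvalues, and loadings of standardized data."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each variable
    R = np.cov(Z, rowvar=False)                       # correlation matrix of X
    eigvals, eigvecs = np.linalg.eigh(R)              # symmetric eigen-decomposition
    order = np.argsort(eigvals)[::-1]                 # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                              # orthogonal transformation
    return scores, eigvals, eigvecs


# Synthetic data: columns 0 and 1 share most of their information.
rng = np.random.default_rng(0)
base = rng.normal(size=(30, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(30, 1)),
    base + 0.1 * rng.normal(size=(30, 1)),
    rng.normal(size=(30, 1)),
])
scores, eigvals, loadings = pca(X)
score_covariance = np.cov(scores, rowvar=False)
```

The covariance matrix of the scores is diagonal, which is the sense in which the new variables no longer carry superposed information, and the eigenvalues give the share of the total variance carried by each component.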

Fisher Mathematical Principle.
The basic idea of Fisher discriminant analysis is projection. Specifically, it projects high-dimensional points onto a low-dimensional space and uses univariate analysis of variance to establish linear discriminant functions according to the criteria of maximum between-class distance and minimum within-class distance. The class of a sample is then determined by the corresponding criterion. Fisher discriminant analysis thereby avoids the "curse of dimensionality" and solves high-dimensional problems with a one-dimensional method.
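Suppose there are $k$ populations $G_1, G_2, \dots, G_k$, and let $x^{(i)}_j$ ($j = 1, 2, \dots, n_i$) be the $n_i$ samples drawn from $G_i$. The projection of a sample $x$ on the $u$ axis is
$$y = u^{T} x,$$
so that the projected group means and total mean are $u^{T}\bar{x}^{(i)}$ and $u^{T}\bar{x}$, where $n = \sum_{i=1}^{k} n_i$ and $\bar{x}^{(i)}$ and $\bar{x}$ are the sample mean of $G_i$ and the total sample mean, respectively. Therefore, the inner-group difference is as follows:
$$E = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( u^{T} x^{(i)}_j - u^{T} \bar{x}^{(i)} \right)^{2} = u^{T} \left( \sum_{i=1}^{k} S_i \right) u = u^{T} W u,$$
where $S_i = \sum_{j=1}^{n_i} \left( x^{(i)}_j - \bar{x}^{(i)} \right)\left( x^{(i)}_j - \bar{x}^{(i)} \right)^{T}$ is the sample dispersion matrix of the $n_i$ samples $x^{(i)}_j$ ($j = 1, 2, \dots, n_i$) in $G_i$, and the intergroup difference is
$$B = \sum_{i=1}^{k} n_i \left( u^{T} \bar{x}^{(i)} - u^{T} \bar{x} \right)^{2} = u^{T} A u, \qquad A = \sum_{i=1}^{k} n_i \left( \bar{x}^{(i)} - \bar{x} \right)\left( \bar{x}^{(i)} - \bar{x} \right)^{T}.$$
The criterion is
$$\Phi(u) = \frac{B}{E} = \frac{u^{T} A u}{u^{T} W u}.$$
To maximize $B$ and make the solution unique, $u^{T} W u = 1$ is set.
Therefore, the problem is transformed into solving for $u$, which causes $u^{T} A u$ to reach the maximum under $u^{T} W u = 1$. The Lagrange multiplier method is used, and the following is set:
$$\varphi(u, \lambda) = u^{T} A u - \lambda \left( u^{T} W u - 1 \right).$$
The partial derivative of the above equation with respect to $u$ is taken and set to 0; that is,
$$\frac{\partial \varphi}{\partial u} = 2 \left( A - \lambda W \right) u = 0.$$
Through further arrangement, the following equation is obtained:
$$W^{-1} A u = \lambda u.$$
The equation shows that $\lambda$ should be the maximum eigenvalue of $W^{-1} A$, and $u$ is the eigenvector corresponding to $\lambda_{\max}$. Hence, the Fisher discriminant function can be solved.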
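The eigenvalue problem above can be checked numerically. The following minimal sketch builds the within-group scatter matrix (called `W` in the code) and the between-group scatter matrix (called `A`) for synthetic three-class data and solves for the projection directions. These matrix names, the Cholesky-based symmetric reformulation, and the data are choices made for this sketch; they are not details given in the text.

```python
import numpy as np


def fisher_directions(X, y):
    """Return eigenvalues, projection directions (columns), and scatter matrices."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    p = X.shape[1]
    W, A = np.zeros((p, p)), np.zeros((p, p))
    grand_mean = X.mean(axis=0)
    for c in np.unique(y):
        Xc = X[y == c]
        centred = Xc - Xc.mean(axis=0)
        W += centred.T @ centred                      # within-group scatter
        d = (Xc.mean(axis=0) - grand_mean)[:, None]
        A += len(Xc) * (d @ d.T)                      # between-group scatter
    # With W = L L^T, the matrix L^{-1} A L^{-T} is symmetric and has the same
    # eigenvalues as W^{-1} A, so a symmetric solver can be used.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    eigvals, V = np.linalg.eigh(L_inv @ A @ L_inv.T)
    order = np.argsort(eigvals)[::-1]
    U = L_inv.T @ V[:, order]                         # each column has u^T W u = 1
    return eigvals[order], U, W, A


# Synthetic data: three classes in two dimensions.
rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0], [3.0, 1.0], [1.0, 4.0]])
y = np.repeat(np.arange(3), 10)
X = means[y] + rng.normal(size=(30, 2))
eigvals, U, W, A = fisher_directions(X, y)
u = U[:, 0]                                           # direction for the largest eigenvalue
```

The first column of `U` satisfies the unit constraint on the within-group difference, and the ratio of between-group to within-group difference along it equals the largest eigenvalue, so no other direction separates the groups better in this sense.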

Verification of Fisher Discrimination Effect.
To ascertain whether the above criterion was suitable, the back-substitution estimation method was used for rediscrimination to estimate the misdiscrimination rate. For training samples $x^{(i)}_j$ with capacity $n_i$ from population $G_i$ (where $j = 1, 2, \dots, n_i$; $i = 1, 2, \dots, k$), all training samples were successively substituted into the established discriminant function, and the corresponding criterion was used for water source recognition. The total misdiscrimination number was $N$, and the misdiscrimination rate $\alpha$ through back-substitution estimation was as follows:
$$\alpha = \frac{N}{n_1 + n_2 + \cdots + n_k}.$$
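The rate is the number of misdiscriminated samples divided by the total number of samples. As a minimal sketch of the arithmetic, the example below reproduces the case reported in this paper of one misdiscriminated sample out of 13; the individual class labels in the lists are hypothetical, since only the counts matter for the rate.

```python
def misdiscrimination_rate(true_labels, predicted_labels):
    """Return the share of samples whose predicted class differs from the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    errors = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return errors / len(true_labels)


# Hypothetical labels: 13 samples, one type IV sample assigned to type I.
true_labels = ["I"] * 3 + ["II"] * 3 + ["III"] * 3 + ["IV"] * 4
predicted_labels = true_labels[:-1] + ["I"]

rate = misdiscrimination_rate(true_labels, predicted_labels)
correct_percent = round((1 - rate) * 100, 1)
```

Here `rate` is 1/13, which corresponds to a correct discrimination rate of 92.3%.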

Results and Discussion
4.1. Screening of Typical Water Samples.
The Piper trilinear diagram (Figure 2) of the Ordovician water samples showed that water samples numbers 2 and 7 deviated significantly from the formation center. Hence, these samples were judged abnormal and excluded from this study. The remaining six groups were screened out as typical water samples of the Ordovician aquifer. In these six groups, the contents of the cations Ca²⁺, Mg²⁺, and Na⁺ were comparatively stable. Among the anions, HCO₃⁻ varied over a relatively large range, whereas the variation ranges of both Cl⁻ and SO₄²⁻ were small. The main water types of this aquifer were Ca-Mg-HCO₃, Ca-Mg-HCO₃-SO₄, and Na-SO₄ because runoff in the Ordovician limestone aquifer had favorable discharge conditions. Under the leaching effect, carbonate minerals, mainly calcite and dolomite, dissolved to form a hydrochemical type dominated by the cations Ca²⁺ and Mg²⁺ and the anion HCO₃⁻. Meanwhile, oxidation of the pyrite (FeS₂) at the bottom of the coal strata in this area caused the water pH to decline and SO₄²⁻ ions to subsequently enter the Ordovician limestone aquifer and increase its SO₄²⁻ content.
The Piper trilinear diagram of the Carboniferous limestone water samples (Figure 2) showed that Ca-Mg-HCO₃, which is typical of carbonate karst water, is the main hydrochemical type in the Carboniferous limestone aquifer. In addition, water samples numbers 9, 10, and 14 were far from the formation center and clearly outside the cluster of this aquifer. Thus, these samples were excluded, and the remaining seven groups were screened out as typical water samples of the Carboniferous limestone aquifer.
Water samples numbers 28, 29, and 38 of the Quaternary aquifer (Figure 2) were very far from the formation center and were thus excluded.The remaining eight groups were regarded as typical water samples of the Quaternary aquifer.

PCA Treatment of Data.
PCA was carried out for the sample data in Table 1. The six hydrochemical components were clearly correlated; in particular, the correlation coefficient between Mg²⁺ and Ca²⁺ was 0.901, which indicates noticeable information superposition between the variables. PCA treatment of the sample data was therefore necessary to ensure the accuracy of the water inrush source discrimination model. The first five factors covered about 98.92% of the information in the original data. Therefore, these five principal components could effectively represent the original sample data.
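The number of retained components follows from the cumulative share of variance carried by the leading eigenvalues of the correlation matrix. The following minimal sketch shows the calculation; the eigenvalues and the 98% threshold are hypothetical values chosen only to illustrate a six-variable case in which five components are kept, and they are not the eigenvalues obtained in this study.

```python
def components_needed(eigenvalues, threshold):
    """Return (k, share): the smallest k whose cumulative variance share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalues for six standardized variables (they sum to 6).
eigenvalues = [3.0, 1.5, 0.8, 0.4, 0.2, 0.1]
k, share = components_needed(eigenvalues, threshold=0.98)
```

With these illustrative values, four components fall short of the threshold and five exceed it, which mirrors the kind of decision reported above.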

Fisher Discriminant Analysis
(a) Piper-PCA-Fisher Model. The five principal components obtained through PCA were taken as the input variables of the Fisher discriminant analysis model. Three Fisher discriminant functions (the first, second, and third) were solved from these inputs. Table 2 shows the central values of the first, second, and third discriminant functions for each water source. Taking the first discriminant function as an example, the central values of the type I, II, III, and IV water sources were −3.126, −4.828, 7.812, and −0.266, respectively. Water source recognition was implemented by comparing the distances from the function values of the water sample to be discriminated to the central values of each water source and assigning the sample to the nearest source.
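The nearest-central-value rule can be sketched with the four central values quoted above for the first discriminant function. Using only the first function is a simplification made for this sketch, since the model compares distances over all three functions, and the scores passed to the function are hypothetical.

```python
# Central values of the first discriminant function, as quoted in the text.
CENTRAL_VALUES = {"I": -3.126, "II": -4.828, "III": 7.812, "IV": -0.266}


def nearest_source(score, centres=CENTRAL_VALUES):
    """Return the water source type whose central value is closest to the score."""
    return min(centres, key=lambda label: abs(score - centres[label]))


# Hypothetical first-function scores of three samples to be discriminated.
assigned = [nearest_source(s) for s in (6.9, -0.5, -4.2)]
```

The central values of types I and II differ by less than two units on this function, which illustrates why the second and third functions are needed to separate sources that lie close together on the first.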
(b) Fisher Model. The raw hydrochemical component data were taken as the input variables of the Fisher discriminant analysis model. One Fisher discriminant function was solved for each of types I, II, III, and IV; the six input variables were the contents of K⁺ + Na⁺, Mg²⁺, Ca²⁺, HCO₃⁻, Cl⁻, and SO₄²⁻, and the final term of each discriminant function was a constant.
The 28 typical training samples in Table 1 were rediscriminated one by one with the Piper-PCA-Fisher model. The results showed that all water samples were discriminated correctly, with a discrimination rate of 100%. By contrast, the traditional Fisher water source recognition model, which was established without PCA preprocessing of the data, incurred multiple errors in rediscrimination, and its correct discrimination rate was less than 90%. Therefore, the Fisher water source recognition model based on PCA was more accurate, was more stable, and could meet the actual requirements of water inrush source recognition.
In practical application, 13 water samples to be discriminated in the Jiaozuo mining area were substituted into the trained Piper-PCA-Fisher water source recognition model (Table 3). Except for Quaternary water sample number 11, which was misdiscriminated as Ordovician water, the predicted classes of the water samples agreed with their actual classes, and the prediscrimination success rate was 92.3%. The traditional Fisher water source recognition model, however, misdiscriminated several of these water samples, resulting in a prediscrimination success rate of less than 80%. Through comprehensive comparison, the Piper-PCA-Fisher water source recognition model was more accurate and more widely applicable than the traditional Fisher model.

Application of Water Inrush Source Recognition Model.
The identification method of water inrush source can be used in coal mines and can be carried out according to the following steps.
Step 1. Collect water sample data and aquifer information from the coal mine and select the representative water samples based on the Piper diagram.
Step 2. Analyze the hydrochemical ion concentration of the representative water samples through PCA and obtain the key principal component data.
Step 3. Use the Fisher classification model to train the key principal component data and establish the water inrush source recognition model of the mine.
Step 4. Test the water sample using the established water inrush source recognition model.
Step 5. Present the identification results of the water inrush source.
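Steps 2 to 5 can be chained in a few lines once the typical samples of Step 1 have been selected. The following minimal sketch does so with NumPy on synthetic data. The class name `PiperPcaFisherSketch`, the standardization step, the nearest-centre rule in discriminant space, and the data are assumptions of this sketch and not the implementation used in this study.

```python
import numpy as np


class PiperPcaFisherSketch:
    """Standardize -> PCA -> Fisher discriminant -> nearest class centre."""

    def __init__(self, n_components=5):
        self.n_components = n_components

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        # Step 2: standardize and keep the leading principal components.
        self.mean_, self.std_ = X.mean(axis=0), X.std(axis=0, ddof=1)
        Z = (X - self.mean_) / self.std_
        vals, vecs = np.linalg.eigh(np.cov(Z, rowvar=False))
        self.loadings_ = vecs[:, np.argsort(vals)[::-1][: self.n_components]]
        P = Z @ self.loadings_
        # Step 3: within-group (W) and between-group (A) scatter of the components.
        self.classes_ = np.unique(y)
        q = P.shape[1]
        W, A = np.zeros((q, q)), np.zeros((q, q))
        for c in self.classes_:
            Pc = P[y == c]
            centred = Pc - Pc.mean(axis=0)
            W += centred.T @ centred
            d = (Pc.mean(axis=0) - P.mean(axis=0))[:, None]
            A += len(Pc) * (d @ d.T)
        # Symmetric form of the eigenproblem W^{-1} A u = lambda u.
        L_inv = np.linalg.inv(np.linalg.cholesky(W))
        lam, V = np.linalg.eigh(L_inv @ A @ L_inv.T)
        keep = np.argsort(lam)[::-1][: len(self.classes_) - 1]
        self.directions_ = L_inv.T @ V[:, keep]
        scores = P @ self.directions_
        self.centres_ = np.array([scores[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Steps 4 and 5: project new samples and report the nearest class centre.
        Z = (np.asarray(X, dtype=float) - self.mean_) / self.std_
        scores = Z @ self.loadings_ @ self.directions_
        gaps = scores[:, None, :] - self.centres_[None, :, :]
        return self.classes_[np.linalg.norm(gaps, axis=2).argmin(axis=1)]


# Synthetic training set: four well-separated classes, six variables.
rng = np.random.default_rng(42)
idx = np.repeat(np.arange(4), 10)
y_train = np.array(["I", "II", "III", "IV"])[idx]
X_train = 8.0 * np.eye(4, 6)[idx] + rng.normal(size=(40, 6))

model = PiperPcaFisherSketch(n_components=5).fit(X_train, y_train)
predicted = model.predict(X_train)   # back-substitution of the training samples
```

Calling `predict` on the training set corresponds to rediscrimination, and calling it on samples held out of training corresponds to prediscrimination.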
During the application process of the Piper-PCA-Fisher water source recognition model in the Jiaozuo mining area, as proposed in this study, misdiscrimination only appeared in one water sample from the Quaternary water source.The prediscrimination success rate was 92.3%, which indicated that the established water source recognition model was successful.The misdiscrimination of the water sample was primarily caused by the hydraulic connection between the Quaternary and Ordovician aquifers in the mining area.
During model establishment, the whole mining area was taken as the study object. Consequently, the model might not recognize water inrush sources accurately for an individual mine. To apply the model to a single mine, the hydrogeological conditions of that mine must be studied thoroughly, and a complete hydrochemical database of its aquifers must be built.
In this study, discrimination of mine water inrush water sources was based on finite data and was affected by data randomness, representativeness, and accuracy.Thus, we must extensively collect measured data, establish a corresponding training sample database, and enhance the applicability of this model.

Conclusion
Stratified sampling and experimental analysis of water quality were carried out based on the hydrogeological conditions of the mining area. The Piper trilinear diagram was then utilized to analyze and extract typical water samples. Finally, the

Figure 1: Hydrogeological map of the Jiaozuo coal-mining district in China.

Figure 2: Piper trilinear diagram of the water samples from aquifers I, II, III, and IV.


Table 1: Conventional hydrochemical ions in aquifers of the Jiaozuo mining area and samples to be measured (mg/L).

Table 1: Continued. Note. Ordovician limestone aquifer groundwater (simplified as type I), Carboniferous limestone aquifer groundwater (simplified as type II), coal sandstone aquifer groundwater (simplified as type III), and Quaternary system aquifer groundwater (simplified as type IV). The water samples (numbers 1-38) are used to build the model. The data (numbers A1-A13) in Table 1 are not used to build the model and are used only to validate it.

Table 2: Central values of the discriminant function in each category.

Table 3: Classification result of water inrush source discriminant model.