Blind Source Separation Model of Earth-Rock Junctions in Dike Engineering Based on Distributed Optical Fiber Sensing Technology

1 State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing 210098, China
2 College of Water Conservancy and Hydropower Engineering, Hohai University, Nanjing 210098, China
3 National Engineering Research Center of Water Resources Efficient Utilization and Engineering Safety, Nanjing 210098, China
4 Department of Computer Engineering, Nanjing Institute of Technology, Nanjing 211167, China


Introduction
China is strongly affected by monsoons, and floods and other natural disasters often occur during the rainy season. A large number of dikes have been built to reduce the losses caused by flooding and storm tides. However, the integrity and safety of dike projects are undermined by crossing culverts and other hydraulic structures, especially at earth-rock junctions [1]. Operating experience with hydraulic structures in dikes indicates that real-time location and identification of leakage hazards at the earth-rock junctions of dike projects (ERJD) is of special significance for the safety of the culverts and of the dike as a whole [2]. As traditional monitoring techniques and methods have been improved and optimized, research on advanced leakage monitoring in ERJD has attracted increasing attention in engineering and academia [3]. Temperature tracing is a geophysical technique with broad application prospects in seepage monitoring [4]. Compared with traditional leakage monitoring instruments, such as piezometer tubes and osmometers, early temperature tracing was more sensitive and effective and had a lower unit acquisition cost. However, it was still a point-monitoring technology: because of the large spacing between thermistor thermometers and for other reasons, zones of anomalous internal temperature in a structure could go undetected [5].
Distributed temperature sensing (DTS) emerged as a new approach from the combination of optical fiber sensing technology and leakage risk monitoring [6,7]. DTS makes distributed, long-distance, and large-scale leakage monitoring possible [8,9]. In general, the use of distributed fiber sensing in leakage monitoring of structures is still in its infancy in terms of theoretical research, experimental simulation, and engineering application, especially for monitoring the interface between the two media in ERJD. The analysis and processing of the resulting temperature monitoring data also need further study. Therefore, an experimental model of ERJD was established and a distributed fiber leakage monitoring technology was developed according to the working features of ERJD. In-depth study of models and methods for leakage identification in ERJD, based on the features of the monitoring data, is of both scientific and engineering value.

Denoising Research of DTS Monitoring Data with Wavelet Packet
The wavelet transform was proposed by Morlet in the early 1980s and rapidly developed into a powerful tool for digital signal processing, image processing, data compression, data denoising, and so forth.
Wavelet analysis is a breakthrough beyond Fourier analysis. It is a time-frequency analysis method whose window has a fixed area but a changeable shape, so that both the time window and the frequency window can vary. Local features of a signal can therefore be characterized well in both the time domain and the frequency domain.
Its main advantage is the ability to perform local refinement and analysis of a signal, which overcomes the difficulty that the Fourier transform and the short-time Fourier transform have in processing nonstationary signals.
In practical applications, the stretching (scale) factor $a$ and the translation factor $b$ of the wavelet transform are discretized to suit computer calculation. $a$ takes the values $a_0^0, a_0^1, a_0^2, \ldots, a_0^j$, and the sampling interval of the translation $b$ is $\Delta b = a_0^j b_0$. Thus, the wavelet function is
$$\psi_{j,k}(t) = a_0^{-j/2}\,\psi\left(a_0^{-j} t - k b_0\right),$$
where $j$ and $k$ are integers.
Taking $a_0 = 2$ and $b_0 = 1$ in practical operation, the wavelet basis function $\psi_{j,k}(t)$ can be expressed as
$$\psi_{j,k}(t) = 2^{-j/2}\,\psi\left(2^{-j} t - k\right).$$
The corresponding discrete wavelet transform is
$$W_f(j,k) = \int_{-\infty}^{+\infty} f(t)\,\overline{\psi_{j,k}(t)}\,dt.$$
The wavelet transform decomposes a signal into a high-frequency part, denoted $D$, and a low-frequency part, denoted $A$. Compared with the wavelet transform, wavelet packet decomposition is a more refined approach that decomposes both the high-frequency and the low-frequency parts at every layer. It greatly improves the resolution of the signal and overcomes the poor resolution of wavelet decomposition in the high-frequency band. Wavelet packet decomposition therefore has broader application prospects. The 3-layer wavelet packet decomposition process is shown in Figure 1, from which the decomposition relationship of the 3-layer wavelet packet is
$$S = AAA_3 + DAA_3 + ADA_3 + DDA_3 + AAD_3 + DAD_3 + ADD_3 + DDD_3.$$
The wavelet packet denoising procedure for DTS leakage measurement data is summarized below.
(1) Wavelet packet decomposition. Select the wavelet, determine the decomposition level $j$, and perform a $j$-layer wavelet packet decomposition of the original signal $S$. During data processing, the decomposition level is determined by the noise reduction effect achieved.
(2) Threshold quantization. Select an appropriate threshold quantization method and apply it to the coefficients of each decomposed wavelet packet.
(3) Wavelet packet reconstruction. Reconstruct the signal from the low-frequency coefficients of layer $j$ and the corresponding thresholded high-frequency coefficients.
Among these steps, the keys are the choice of the wavelet packet decomposition level and the threshold quantization method. The decomposition level is generally no more than 5. Threshold quantization can be carried out with either the soft threshold or the hard threshold method.
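The decomposition and reconstruction steps can be sketched as follows. This is a minimal illustration only: it assumes the Haar wavelet (the paper does not state which wavelet was used), a signal length divisible by $2^{\text{level}}$, and hypothetical function names. Thresholding would be applied to the returned nodes between the two calls.

```python
import numpy as np

def haar_split(x):
    """One Haar analysis step: return (low-frequency, high-frequency) halves."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # approximation part A
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # detail part D
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split: interleave the reconstructed samples."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def wp_decompose(x, level):
    """Full wavelet packet tree: split BOTH A and D parts at every layer."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(level):
        nodes = [part for node in nodes for part in haar_split(node)]
    return nodes   # 2**level nodes, in natural (not frequency) order

def wp_reconstruct(nodes):
    """Merge sibling nodes pairwise until a single signal remains."""
    nodes = list(nodes)
    while len(nodes) > 1:
        nodes = [haar_merge(nodes[i], nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]
```

A 3-layer decomposition yields eight nodes, matching the eight terms of the 3-layer decomposition relationship. Because the Haar transform is orthonormal, reconstruction without thresholding returns the original signal.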
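The wavelet packet coefficient estimated with the hard threshold is
$$\hat{w}_{j,k} = \begin{cases} w_{j,k}, & |w_{j,k}| \ge \lambda, \\ 0, & |w_{j,k}| < \lambda, \end{cases} \quad (5)$$
and the wavelet packet coefficient estimated with the soft threshold is
$$\hat{w}_{j,k} = \begin{cases} \operatorname{sgn}(w_{j,k})\left(|w_{j,k}| - \lambda\right), & |w_{j,k}| \ge \lambda, \\ 0, & |w_{j,k}| < \lambda. \end{cases} \quad (6)$$
In (5) and (6), $w_{j,k}$ is the wavelet packet coefficient and $\lambda$ is the preset threshold, which is calculated from the noise variance $\sigma$, the decomposition scale $j$, and the signal length $N$. The signal-to-noise ratio (SNR) and the root mean square error (RMSE) are usually used to evaluate denoising performance. They are calculated as
$$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{n=1}^{N} f^2(n)}{\sum_{n=1}^{N} \left[f(n) - \hat{f}(n)\right]^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left[f(n) - \hat{f}(n)\right]^2},$$
where $f(n)$ is the original signal, $\hat{f}(n)$ is the signal after wavelet packet denoising, and $N$ is the signal length.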
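The two threshold rules and the two evaluation indices translate directly into a few lines of NumPy. The sketch below uses hypothetical function names. The helper `universal_threshold` implements the widely used rule $\sigma\sqrt{2\ln N}$ as a stand-in and is an assumption: it is not necessarily the threshold formula used in this paper.

```python
import numpy as np

def hard_threshold(w, lam):
    """Keep coefficients with |w| >= lam unchanged, zero the rest."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) >= lam, w, 0.0)

def soft_threshold(w, lam):
    """Zero small coefficients and shrink the others towards zero by lam."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def universal_threshold(sigma, n):
    """ASSUMPTION: universal threshold sigma*sqrt(2 ln n), for illustration only."""
    return sigma * np.sqrt(2.0 * np.log(n))

def snr_db(f, f_hat):
    """Signal-to-noise ratio in dB of the denoised signal f_hat against f."""
    f, f_hat = np.asarray(f, dtype=float), np.asarray(f_hat, dtype=float)
    return 10.0 * np.log10(np.sum(f ** 2) / np.sum((f - f_hat) ** 2))

def rmse(f, f_hat):
    """Root mean square error between f and the denoised signal f_hat."""
    f, f_hat = np.asarray(f, dtype=float), np.asarray(f_hat, dtype=float)
    return np.sqrt(np.mean((f - f_hat) ** 2))
```

The hard rule preserves the amplitude of retained coefficients but is discontinuous at the threshold, whereas the soft rule is continuous but biases large coefficients by the threshold value. A higher SNR and a lower RMSE indicate better denoising.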

Blind Source Separation Models and Methods of DTS Temperature Monitoring Data
Seepage changes the stable temperature field in a medium and can generate a local zone of temperature fluctuation [10,11]. However, the temperature field of a structure is affected not only by the leakage field but also by soil characteristics, natural phenomena, the laying elevation of the optical fiber, crossing artificial cavities, and other factors, all of which in turn affect the values measured by the optical fiber temperature sensing system. The temperature field of a structure is therefore the combined result of multiple factors. Because of this, locating leakage points inside ERJD under natural conditions can be treated as a blind source separation problem [12].
Each factor that changes the temperature monitoring data, whether leakage, soil characteristics, or another factor, has a corresponding source component that is a function of the distance between the sampling point and the starting point. The contributions of soil characteristics, natural phenomena, and the other factors should be separated from the temperature data as far as possible, so that the remaining data reflect the change in the temperature field caused by leakage alone. The leakage factor is the one of ultimate concern.
The effects of leakage, soil characteristics, and the other factors on the measured temperature data can be regarded as mutually independent. The temperature data in most regions along the fiber are not affected by leakages, gutters, and similar factors, which are low-probability events in time and space. Because the source signals are non-Gaussian and statistically independent, independent component analysis (ICA) is selected as the blind source separation technique. In addition, principal component analysis (PCA) is used as a pretreatment for ICA in order to reduce the computational complexity and remove the correlation among the variables.
PCA relies on a certain degree of correlation and information redundancy between the variables. Adjacent sampling points have similar working environments and are strongly correlated, which makes PCA applicable. The mathematical model of PCA is obtained as follows. First, a $p$-dimensional random vector composed of $p$ scalars is written as $X = (x_1, \ldots, x_p)^T$, where $p$ is an arbitrary number. An orthogonal transformation $T = P^T X$ is applied, where $P$ is an orthogonal matrix. The components of $T$ are mutually uncorrelated; the variance of the first component of $T$ is the largest, that of the second is the next largest, and so on. This process can be written in matrix form as
$$\begin{pmatrix} t_1 \\ \vdots \\ t_p \end{pmatrix} = P^T \begin{pmatrix} x_1 \\ \vdots \\ x_p \end{pmatrix}.$$
The singular value decomposition (SVD) used in PCA is defined as follows. Let $X$ be a matrix with $n$ rows and $p$ columns whose elements are real or complex numbers. Then there exists a decomposition
$$X = U \Sigma V^T,$$
where $U$ is an $n \times n$ unitary matrix called the left singular matrix of $X$; $V$ is a $p \times p$ unitary matrix called the right singular matrix of $X$; $V^T$ is the transpose of $V$ (the same below); and $\Sigma$ is an $n \times p$ positive semidefinite diagonal matrix whose diagonal elements are the singular values of $X$, which equal the square roots of the eigenvalues of $XX^{*}$. A unitary matrix whose elements are real numbers is an orthogonal matrix. The relationship between the principal components $T$ and the monitoring data matrix $X$ is
$$T = XP.$$
Thus,
$$X = T P^T, \quad (12)$$
where $P$ is formed by the eigenvectors of the covariance matrix of $X$ and is an orthogonal matrix, so that $P P^T = I$.
From the theory of SVD,
$$X = U \Sigma V^T. \quad (13)$$
Comparing (12) with (13) shows that $P = V$; then $T = XV = U\Sigma$.
Therefore, the principal component matrix is the product of the left singular matrix and the corresponding singular values, and it also equals the product of the monitoring data matrix and the right singular matrix. The number of principal components can be determined from the cumulative variance contribution rate. The variance contribution rate and the cumulative variance contribution rate are
$$\eta_i = \frac{\lambda_i}{\sum_{k=1}^{p} \lambda_k}, \qquad \eta_{\Sigma}(m) = \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{k=1}^{p} \lambda_k},$$
where $\lambda_i$ are the eigenvalues of the covariance matrix, and $\eta_i$ and $\eta_{\Sigma}(m)$ are the variance contribution rate of the $i$th principal component and the cumulative variance contribution rate of the first $m$ principal components, respectively.
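The identity between the two expressions for the principal components, and the contribution rates, can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the data matrix is assumed to have one row per sample and one column per sampling point, and the columns are assumed to be standardized before the SVD.

```python
import numpy as np

def pca_svd(X):
    """PCA of an (n_samples, p) data matrix via SVD of the standardized data."""
    X = np.asarray(X, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # zero mean, unit variance
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    T = U * s                              # principal components, T = U Sigma
    V = Vt.T                               # right singular matrix (loadings P)
    eigvals = s ** 2 / (X.shape[0] - 1)    # eigenvalues of the covariance matrix
    rate = eigvals / eigvals.sum()         # variance contribution rates
    return T, V, rate, np.cumsum(rate)     # last item: cumulative rates

def n_components(cum_rate, target):
    """Smallest number of components whose cumulative rate reaches target."""
    k = int(np.searchsorted(cum_rate, target)) + 1
    return min(k, len(cum_rate))
```

Because `numpy.linalg.svd` returns the singular values in descending order, the contribution rates are already sorted from the first principal component onwards.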
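To avoid inconsistencies between the dimensions and units of the data, the data are typically normalized:
$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_j}{s_j},$$
where $\bar{x}_j$ and $s_j$ are the mean and standard deviation of the $j$th column. The mathematical model of ICA can be expressed as
$$x = As,$$
where $x$ is the vector of observed signals, $s$ is the vector of statistically independent source components, and $A$ is the unknown mixing matrix. The parallel algorithm for estimating several independent components proceeds as follows:
(1) standardize the data so that the mean and variance of each column are 0 and 1, respectively;
(2) whiten the data obtained in Step (1) to give $z$;
(3) select $m$, the number of independent components to be estimated;
(4) initialize the vectors $w_i$, $i = 1, 2, \ldots, m$, scale every $w_i$ to unit 2-norm, and orthogonalise the matrix $W = (w_1, \ldots, w_m)^T$ by the method of Step (6);
(5) update each $w_i$ by $w_i \leftarrow E\{z\,g(w_i^T z)\} - E\{g'(w_i^T z)\}\,w_i$, where the function $g$ can be selected from formula (17) to formula (19);
(6) orthogonalise $W$ symmetrically, $W \leftarrow (W W^T)^{-1/2} W$, and return to Step (5) if the result has not converged.
The function $g$ can be selected from the following functions:
$$g_1(y) = \tanh(a_1 y), \quad (17)$$
$$g_2(y) = y \exp\left(-y^2/2\right), \quad (18)$$
$$g_3(y) = y^3. \quad (19)$$
The corresponding derivative functions are
$$g_1'(y) = a_1\left[1 - \tanh^2(a_1 y)\right], \qquad g_2'(y) = \left(1 - y^2\right)\exp\left(-y^2/2\right), \qquad g_3'(y) = 3y^2,$$
where $a_1$ is a constant in $[1, 2]$, usually taken as 1, and $\tanh(\cdot)$ is the hyperbolic tangent function.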
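The parallel algorithm can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, $g$ is taken as $\tanh(a_1 y)$, the orthogonalisation step is assumed to be the symmetric form $W \leftarrow (WW^T)^{-1/2}W$, and the covariance matrix of the input is assumed to be nonsingular.

```python
import numpy as np

def whiten(X):
    """Centre the columns of X and transform them to identity covariance."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ vecs @ np.diag(vals ** -0.5) @ vecs.T

def sym_orth(W):
    """Symmetric orthogonalisation W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T @ W

def fastica(X, m, a1=1.0, max_iter=200, tol=1e-8, seed=0):
    """Estimate m independent components from an (n_samples, d) matrix X."""
    Z = whiten(np.asarray(X, dtype=float))
    n, d = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_orth(rng.standard_normal((m, d)))      # random orthonormal start
    for _ in range(max_iter):
        Y = Z @ W.T                                # current projections w_i^T z
        G = np.tanh(a1 * Y)                        # g(y)
        Gp = a1 * (1.0 - G ** 2)                   # g'(y)
        # w_i <- E{z g(w_i^T z)} - E{g'(w_i^T z)} w_i, for all i at once
        W_new = sym_orth((G.T @ Z) / n - Gp.mean(axis=0)[:, None] * W)
        # converged when every row is (anti)parallel to its previous value
        done = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if done:
            break
    return Z @ W.T, W                              # components, unmixing matrix
```

The recovered components are determined only up to order and sign, which is an inherent ambiguity of ICA rather than a property of this sketch.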

Application Case Analyses
A model experiment platform, shown in Figure 2, was built to simulate leakage in ERJD. The Sentinel DTS-LR monitoring system produced by Sensornet was used. The DTS monitoring and heating system is shown in Figure 3, and a schematic diagram of the model is given in Figure 4. A four-core multimode (50/125 μm) armored cable of type ZTT-GYXTW-4A1a was used as the monitoring optical cable.
The ZTT-GYXTW-4A1a is a special optical cable that can be heated; its cross section is shown in Figure 5. The leakage location was flushed through a water pipe and the seepage velocity was controlled. The monitoring data were recorded once a stable velocity had been established.
For simplicity, only one leakage point was introduced in the model. The sampling distance was 1 m and the sampling interval was 2 hours over 5 days, so each measuring point yielded 60 groups of data. The original temperature data were first denoised, and a typical point was selected as an example. After calculation, the wavelet packet decomposition level was set to 3. Figure 6 compares the raw signal with the denoised signal.
The first principal component was calculated and is shown in Figure 7.
The variance contribution rate of the first principal component, $\lambda_1 / \sum \lambda$, was 87.21%, so the first principal component carried most of the information. However, the difference between the leakage point and the other points was much larger in the remaining principal components than in the first principal component. Therefore, the second and third principal components were passed to ICA, which meant $m = 2$. One of the two separated components was used to construct the signal space and the other to construct the residual space. The deviation of separated component 1 at the leakage point was more evident than at the other points, where its distribution was more stable. Separated component 1 was therefore used to establish the residual space, and separated component 2 was used to construct the signal space. The corresponding grey-scale map of the residual space is shown in Figure 8.
The data processing results show that the optical fiber temperature monitoring data can be analyzed as a blind source separation problem, with ICA used to process the data. In Figure 8, the anomalous area at 14 m was larger than that at 32 m. Moreover, the test distance of the model was less than 32 m, so the area at 32 m was not considered. The leakage could thus be located well.
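The final localisation step can be illustrated as follows. This is a hypothetical sketch rather than the procedure used in the experiment: it assumes the separated component is stored as a matrix with one row per sampling time and one column per sampling point, scores each point by its mean absolute deviation from the median along the fiber, and discards points beyond an assumed usable test distance `max_position`.

```python
import numpy as np

def locate_leak(component, positions, max_position):
    """Return (position of the largest anomaly, score of every position).

    component    : (n_times, n_points) separated component (residual space)
    positions    : (n_points,) distance of each sampling point along the fiber, m
    max_position : points farther than this are ignored (assumed test distance)
    """
    component = np.asarray(component, dtype=float)
    positions = np.asarray(positions, dtype=float)
    # baseline at each time = median over all sampling points
    baseline = np.median(component, axis=1, keepdims=True)
    score = np.mean(np.abs(component - baseline), axis=0)
    # exclude points outside the tested length of the model
    score = np.where(positions <= max_position, score, -np.inf)
    return positions[np.argmax(score)], score
```

With a 1 m sampling distance, `positions` would simply be `np.arange(1, n_points + 1)`. The median is used as the baseline because most points along the fiber are unaffected by leakage.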

Conclusions
In this paper, ERJD was the object of study and leakage identification was the research focus. Based on the characteristics of leakage monitoring in ERJD with DTS, the factors affecting optical fiber temperature monitoring data were analyzed. The wavelet packet denoising method and the method of determining the threshold were studied. A blind source separation model of optical fiber temperature data was then built, and the temperature change process corresponding to the leakage source was extracted. The implementation of the blind source separation technique was discussed, the characteristics of optical fiber temperature data were studied in depth, and the conditions under which the blind source separation technique is best applied were analyzed. An implementation combining ICA with PCA was finally proposed. The reliability of the blind source separation method for optical fiber temperature measurement data was validated by a fiber-optic leakage model experiment on ERJD.

Figure 6: Comparative diagram between raw signal and the wavelet denoising signal.

Figure 8: Grey-scale map of residual space.