A PCA and ELM Based Adaptive Method for Channel Equalization in MFL Inspection

Magnetic flux leakage (MFL) inspection is an efficient method for pipeline flaw detection and plays an important role in pipeline safety. This nondestructive testing technique assesses the health of buried pipelines. The signal is gathered by an array of Hall-effect sensors placed at the magnetic neutral plane of a pair of permanent magnets in the pipeline inspection gauge (PIG), which clings to the inner surface of the pipe wall. The magnetic flux measured by the sensors reflects the health condition of the pipe. The signal is influenced not only by the condition of the pipe but also by the lift-off value of the sensors and by various properties of the electronic components. The sensor positions are almost never perfectly consistent, so each sensor measures differently. In this paper, a new channel equalization scheme is proposed for MFL signals in order to correct sensor misalignments, which eventually improves the accuracy of defect characterization. The proposed algorithm adapts to errors in sensor placement caused by manufacturing imperfections and by movement of the sensors. The algorithm is tested on data acquired from an experimental pipeline. The results show the effectiveness of the proposed algorithm.


Introduction
MFL is a widely used nondestructive testing (NDT) method for pipeline inspection. The inspection machine is usually called a PIG. A PIG using MFL consists of several pairs of strong permanent magnets which magnetize the pipe along its axial direction. Each pair of strong permanent magnets, together with the yoke iron, the Hall sensors, and the brushes, is called a carrier. The carrier and the pipe together form a magnetic circuit. Details of such PIGs can be found on the websites of well-known inspection companies such as ROSEN and PII. A lot of research has been done to analyze MFL signals. Carvalho et al. [1], Christen and Bergamini [2], and Xiang and Tso [3] proposed neural network based methods to detect flaws. Mukhopadhyay and Srivastava [4] proposed a wavelet based technique to denoise the MFL inspection signal. Mukherjee et al. proposed a wavelet based inverse mapping system [5]. Kathirmani et al. [6] used PCA [7] and wavelet [8][9][10][11] techniques to compress MFL signal data.
But there is still one problem that needs to be studied. The sensing arrangement, that is, each sensor, its mechanical support, and the underlying electronics for acquiring the magnetic leakage flux data, is commonly referred to as a channel, and the channels are mismatched with one another. Many factors can cause channel-to-channel mismatch, including the lift-off value between the pipeline and the sensors, the position of the Hall and coil sensors, and the various properties of the electronic components. Other factors that influence the ones mentioned above can also affect the output signal, such as differences in sensor location caused by assembly and the shaking of the detector as it runs through the pipeline. All these factors make the output signals differ even under the same testing condition and for the same test object. This lowers the capability of the detector, especially during post-processing of the signal. This kind of mismatch may also exist in other multisensor data processing. Commonly, it can be equalized by using adaptive techniques. An adaptive channel equalization algorithm that deals with the channel-to-channel mismatch of MFL signals is given in [12]. In [12], it is assumed that at least one sensor of the sensor array is ideal and can be used as a reference. In practice, this assumption seriously limits the performance of the post-processing algorithm: the tolerance and misalignment of an individual sensor are not deterministic, so choosing a clear-cut candidate for the reference channel would have to be handled in a stochastic framework. To solve this problem, Mukherjee et al. [13] give an adaptive method which does not need a reference channel. In [13], the reference channel required for channel equalization is replaced with a baseline estimate, which reflects the background leakage flux. However, the baseline estimate is not available under a defect or feature; there it is estimated by first-order forward or backward prediction from the neighboring MFL data.
In this paper, a new adaptive channel equalization algorithm is proposed to minimize channel-to-channel mismatch in MFL signals. In contrast to [12], our algorithm does not need any reference channel. Because an ideal reference channel is hard to obtain and the characteristics of the channels may differ slightly, equalizing the channels adaptively without a reference channel is reasonable. The signal of all channels is learned by a neural network. The training data set is taken from a clean pipe (almost no flaws), so it reflects the characteristics of the pipe. In contrast to [13], we mainly focus on the signal around and including the flaw. In [13], the signal around the flaw is only estimated by first-order forward or backward prediction from the neighboring MFL data, which may distort the flaw signal. Because flaw evaluation needs the exact shape of the signal around and including the flaw, our algorithm has an advantage here. A PCA based flaw detection algorithm is given in this paper to find the location of the flaw signal. A median correction algorithm is also given from the engineering point of view. The simulation results show that the proposed algorithm is efficient.
The paper is organized as follows. In Section 2, an ELM based method is given to dynamically compensate each channel, together with a PCA based statistical method to separate normal and flaw signals and to extract signal characteristics. The details of the proposed algorithm are also stated there. A median correction algorithm is given, and simulation results are shown, in Section 3. Section 4 concludes the paper, indicating the major achievements and the future scope of this work.

PCA and ELM Based Channel Equalization
2.1. Channel Equalization Using ELM Neural Networks. Neural networks are an efficient and popular tool for regression and classification. They have two main advantages: the model of the data does not need to be known, and they deal well with nonlinear problems. There exist many types of neural networks; feedforward neural networks may be among the most popular. A feedforward neural network usually consists of one input layer receiving the stimuli from the external environment, one or more hidden layers, and one output layer sending the network output to the external environment. Widely used neural networks include the backpropagation (BP) neural network [14], the radial basis function (RBF) neural network [15], and the support vector machine (SVM) [16]. Three main approaches are usually used to train feedforward networks: gradient-descent based methods (e.g., BP neural networks), least-squares based methods (e.g., RBF network learning), and standard optimization based methods (e.g., SVM). Different from traditional learning algorithms, ELM [17] tends to reach not only the smallest training error but also the smallest norm of output weights. According to neural network theory, for feedforward neural networks that reach a small training error, the smaller the norm of the weights, the better the generalization performance tends to be. Since the hidden layer need not be tuned in ELM and the hidden layer parameters can be fixed, the output weights can then be resolved using the least-squares method. The model of channel equalization using ELM is shown in Figure 1.
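The output function of single-hidden-layer feedforward networks (SLFNs) with $L$ hidden nodes can be represented by

$$f_L(\mathbf{x}) = \sum_{i=1}^{L} \boldsymbol{\beta}_i h_i(\mathbf{x}), \tag{1}$$

where $h_i(\mathbf{x})$ denotes the output function of the $i$th hidden node and $\boldsymbol{\beta}_i$ is the weight connecting that node to the output.

For $N$ arbitrary distinct samples $(\mathbf{x}_j, \mathbf{t}_j)$, SLFNs with $L$ hidden nodes are mathematically modeled as

$$\sum_{i=1}^{L} \boldsymbol{\beta}_i h_i(\mathbf{x}_j) = \mathbf{o}_j, \quad j = 1, \ldots, N. \tag{2}$$

That SLFNs can approximate these $N$ samples with zero error means that

$$\sum_{j=1}^{N} \left\| \mathbf{o}_j - \mathbf{t}_j \right\| = 0. \tag{3}$$

The parameters in $h_i(\cdot)$ can be trained according to (2). This can be written as

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T}, \tag{4}$$

where

$$\mathbf{H} = \begin{bmatrix} h_1(\mathbf{x}_1) & \cdots & h_L(\mathbf{x}_1) \\ \vdots & \ddots & \vdots \\ h_1(\mathbf{x}_N) & \cdots & h_L(\mathbf{x}_N) \end{bmatrix}_{N \times L}, \quad \boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_1^T \\ \vdots \\ \boldsymbol{\beta}_L^T \end{bmatrix}_{L \times m}, \quad \mathbf{T} = \begin{bmatrix} \mathbf{t}_1^T \\ \vdots \\ \mathbf{t}_N^T \end{bmatrix}_{N \times m}. \tag{5}$$

It is proved in [17] that, given any small positive value $\varepsilon > 0$, an activation function $g: \mathbb{R} \to \mathbb{R}$ which is infinitely differentiable in any interval, and $N$ arbitrary distinct samples $(\mathbf{x}_j, \mathbf{t}_j)$, there exists $L \le N$ such that for any $\{\mathbf{a}_i, b_i\}_{i=1}^{L}$ (the parameters that would otherwise need to be trained in $h_i(\cdot)$) randomly generated from any intervals of $\mathbb{R}^{d} \times \mathbb{R}$, where $d$ is the input dimension, according to any continuous probability distribution, with probability one, $\left\| \mathbf{H}_{N \times L}\boldsymbol{\beta}_{L \times m} - \mathbf{T}_{N \times m} \right\| < \varepsilon$. From the interpolation point of view, the maximum number of hidden nodes required is not larger than the number of training samples. In fact, if $L = N$, the training error can be zero.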
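The least-squares training described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sigmoid activation, the interval $[-3, 3]$ for the random hidden-node parameters, the synthetic targets, and the names `ELM`, `hidden`, `fit`, and `predict` are all assumptions made for illustration.

```python
import numpy as np


class ELM:
    """Single-hidden-layer network whose hidden nodes are random and never tuned."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden-node parameters are drawn once from an assumed interval [-3, 3].
        self.A = rng.uniform(-3.0, 3.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-3.0, 3.0, size=n_hidden)
        self.beta = None

    def hidden(self, X):
        # Row j holds the outputs of all hidden nodes for sample j (sigmoid assumed).
        return 1.0 / (1.0 + np.exp(-(X @ self.A + self.b)))

    def fit(self, X, T):
        # Output weights: minimum-norm least-squares solution of H beta = T.
        self.beta = np.linalg.lstsq(self.hidden(X), T, rcond=None)[0]
        return self

    def predict(self, X):
        return self.hidden(X) @ self.beta


rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(20, 3))    # 20 synthetic samples, 3 inputs
T = np.sin(X.sum(axis=1, keepdims=True))    # synthetic one-dimensional targets

elm_full = ELM(n_inputs=3, n_hidden=20).fit(X, T)    # as many nodes as samples
elm_small = ELM(n_inputs=3, n_hidden=5).fit(X, T)    # fewer nodes than samples
err_full = np.linalg.norm(elm_full.predict(X) - T)
err_small = np.linalg.norm(elm_small.predict(X) - T)
print("training error with 20 hidden nodes:", err_full)
print("training error with 5 hidden nodes: ", err_small)
```

With as many hidden nodes as training samples the hidden-layer matrix is square, so the training error is expected to be close to zero up to rounding, while the smaller network only fits the samples approximately.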
Though the ELM has good regression ability, there is still one obvious problem: the results of the ELM rely on the training data set. Flaws differ not only in length and width but also in depth, and all of these change the MFL signal. It is impossible to include every flaw signal in the training data set. A better way is to train the ELM with a small training data set that still generates good compensation results. One solution is given in Section 2.2.

2.2. PCA Based Flaw Exclusion.
To train the ELM and solve the problem mentioned in Section 2.1, one solution is given.
Because the properties of the pipe are learned by the ELM, the channels with no flaw can be used to predict the channels with a flaw once a flaw is detected. The predicted result is substituted for the channels that detect the flaw, and the compensation is then obtained from the ELM and applied to each channel. This avoids training the ELM with every type of flaw. To exclude the flaw, a PCA based algorithm is stated as follows; it determines which channels and which sampling points detect a flaw signal.
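PCA [18] is an efficient statistical learning algorithm that is useful for multivariable problems. The PIG has $m$ sensors surrounding the vessel. Consider one sampling of all sensors as $\mathbf{x} = (x_1, x_2, \ldots, x_m)^T$. The linear transform of the sensor readings can be written as

$$t_i = \mathbf{p}_i^T \mathbf{x} = p_{i1}x_1 + p_{i2}x_2 + \cdots + p_{im}x_m, \quad i = 1, \ldots, m, \tag{6}$$

where $t_i$ is called the $i$th principal component. The computation steps are as follows.

First, $n$ samples from each sensor are collected and written as a matrix $\mathbf{X} \in \mathbb{R}^{n \times m}$. The matrix is scaled to zero mean and, in addition, to unit variance.

The second step is to compute the singular values. The covariance matrix is computed as

$$\mathbf{S} = \frac{1}{n-1}\mathbf{X}^T\mathbf{X}. \tag{7}$$

An SVD (singular value decomposition) is used to compute the principal components and the associated singular vectors as

$$\mathbf{S} = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}^T, \quad \boldsymbol{\Lambda} = \operatorname{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_m^2), \quad \sigma_1^2 \ge \sigma_2^2 \ge \cdots \ge \sigma_m^2, \tag{8}$$

where the columns $\mathbf{p}_i$ of $\mathbf{P}$ are the singular vectors and $A$ is the number of principal components satisfying

$$\frac{\sigma_i^2}{\sum_{j=1}^{m}\sigma_j^2} \ge \delta \tag{9}$$

for a preset value $\delta$. The ratio on the left-hand side of (9) is called the significance level, and

$$\frac{\sum_{i=1}^{A}\sigma_i^2}{\sum_{j=1}^{m}\sigma_j^2} \tag{10}$$

is called the cumulated significance level, which shows how much the first $A$ principal components reflect the data $\mathbf{X}$. The Hotelling $T^2$ statistic is used to detect faults:

$$T^2 = \mathbf{x}^T \mathbf{P}_A \boldsymbol{\Lambda}_A^{-1} \mathbf{P}_A^T \mathbf{x}, \tag{11}$$

where $\mathbf{P}_A$ consists of the first $A$ columns of $\mathbf{P}$ and $\boldsymbol{\Lambda}_A = \operatorname{diag}(\sigma_1^2, \ldots, \sigma_A^2)$. With a set threshold, the flaw signal can be separated from the normal signal. Suppose there are only two sensors, and take 500 samplings. The result is shown in Figure 2, which also shows the linear transform of $\mathbf{x}$. Using the Hotelling $T^2$ statistic of formula (11), the flaw data can be detected.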
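The two-sensor, 500-sample case mentioned above can be sketched in NumPy as follows. The data are synthetic, and several details are assumptions rather than the authors' choices: components are retained when their share of the total variance is at least `min_share`, and the control limit uses the standard $T^2$ limit for new observations together with the closed-form quantile of the $F$ distribution that exists only for two numerator degrees of freedom (so `t2_limit_two_components` is valid only when two components are retained).

```python
import numpy as np


def hotelling_t2(X, min_share=0.05):
    """Return the T^2 statistic of every row of X and the number of retained components."""
    n = X.shape[0]
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)    # zero mean, unit variance
    S = Z.T @ Z / (n - 1)                               # covariance matrix
    eigvals, P = np.linalg.eigh(S)                      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, P = eigvals[order], P[:, order]
    # Assumed selection rule: keep components with a large enough variance share.
    A = int(np.sum(eigvals / eigvals.sum() >= min_share))
    scores = Z @ P[:, :A]                               # principal components
    return np.sum(scores**2 / eigvals[:A], axis=1), A


def t2_limit_two_components(n, level=0.95):
    """T^2 control limit for two retained components (closed-form F(2, n - 2) quantile)."""
    d2 = n - 2
    f_quantile = 0.5 * d2 * ((1.0 - level) ** (-2.0 / d2) - 1.0)
    return 2.0 * (n - 1) * (n + 1) / (n * d2) * f_quantile


rng = np.random.default_rng(0)
n = 500
background = rng.standard_normal(n)                     # flux seen by both sensors
X = np.column_stack([background + 0.3 * rng.standard_normal(n),
                     background + 0.3 * rng.standard_normal(n)])
X[240:261, 0] += 10.0                                   # synthetic flaw on sensor 1 only

t2, A = hotelling_t2(X)
flagged = t2 > t2_limit_two_components(n)
print("retained components:", A)
print("flagged samples inside the flaw:", int(flagged[240:261].sum()), "of 21")
print("flagged samples outside the flaw:", int(flagged.sum() - flagged[240:261].sum()))
```

For more than two retained components the $F$ quantile has no such closed form, and a statistics library would be needed to compute the limit.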

2.3. Details of PCA and ELM Based Algorithm for Channel Equalization. The data gathered using a PIG can be described as $\mathbf{X} \in \mathbb{R}^{n \times m}$, where $n$ is the sampling number and $m$ is the number of sensors. The algorithm proposed in this paper treats the data from two points of view: the sampling view and the sensor view. From the sampling view, the PCA algorithm is used to detect flaws, which determines at which sampling points a flaw is located. For example, taking $n$ samplings and using PCA with a preset $T^2$ threshold, a flaw located at $[k, k + \Delta k]$, where $1 \le k \le n$ and $1 \le k + \Delta k \le n$, can be detected. The normal part is processed by a trained ELM neural network, called ELM 1, to remove channel mismatches. The flaw part is treated from the sensor point of view. The normal part is added to the training set dynamically in order to learn more of the characteristics of this part of the pipe.
From the sensor view, the flaw data is also processed using PCA to determine which sensors detect the flaw. For example, suppose only sensors $c$ to $c + \Delta c$ detect the flaw, where $1 \le c \le c + \Delta c \le m$. The normal channels are used as input data, and an ELM trained with the same normal channels from the training data set forecasts the flaw part of the signal as if it were normal. This step reverts the flaw part to its normal condition in order to determine how much each channel needs to be compensated with ELM 1. Using ELM 1, the channel mismatch of the flaw part can then be removed. Clearly, each flaw needs its own ELM neural network to compute the compensation, which is called ELM p in Figure 3.
The flow chart of the algorithm is shown in Figure 3 with steps and details stated as follows.
Step 1. An ELM is trained on the training data set to learn the characteristics of the normal condition of the pipe for channel equalization. The target is the compensation of each channel. The trained ELM is denoted ELM 1.
Step 2. PCA with $\mathbf{x} = (x_1, x_2, \ldots, x_m)^T$ representing the sensors is used to detect flaws with a threshold, as stated in Section 2.2. A $T^2$ statistic above the threshold denotes a flaw signal. The signal is thus separated into normal parts and flaw parts, and the flaw parts are examined further in Step 3.
Step 3. The flaw signal detected in Step 2 is tested using PCA with $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ representing the sampling points. A threshold is again set automatically. A $T^2$ statistic above the threshold denotes a channel which detects the flaw. This separates the flaw signal detected in Step 2 into two parts: the signal of the normal channels and the signal of the flaw channels.
Step 4. The signal of the normal parts acquired in Step 2 is processed using ELM 1 trained in Step 1.
Step 5. For the flaw signal detected in Steps 2 and 3, another ELM is trained with the training data set. The normal channels are used as inputs of the ELM, and the flaw channels are used as its outputs. For example, suppose the flaw is detected at sampling points $k$ to $k + \Delta k$ and channels $c$ to $c + \Delta c$. The training data from channels $1$ to $c - 1$ and $c + \Delta c + 1$ to $m$ is used to train this neural network, and the target of the ELM is the training data from channels $c$ to $c + \Delta c$. Each flaw has its own ELM, denoted ELM p, with $p$ representing the index of the flaw.
Step 6. The flaw signal at sampling points $k$ to $k + \Delta k$ and channels $c$ to $c + \Delta c$ is replaced temporarily by the output of ELM p.
Step 7. The signal segment from Step 6 is processed using ELM 1. The output is applied as compensation to the original signal at sampling points $k$ to $k + \Delta k$ and channels $c$ to $c + \Delta c$.
Step 8. The training data set is updated with the normal data acquired in Step 2.
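Steps 5 and 6 can be sketched as follows on synthetic data. This is only an illustration under stated assumptions: the signal model in `make_signal` (a shared background with per-channel gain, offset, and noise), the sigmoid hidden layer, the number of hidden nodes, and the helper names `elm_fit` and `elm_predict` are hypothetical and not taken from the paper. The flaw location is taken as known here, whereas the algorithm obtains it from PCA in Steps 2 and 3.

```python
import numpy as np


def elm_fit(X, T, n_hidden, rng):
    """Train an ELM: random hidden layer, least-squares output weights."""
    A = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    beta = np.linalg.lstsq(H, T, rcond=None)[0]
    return A, b, beta


def elm_predict(model, X):
    A, b, beta = model
    return (1.0 / (1.0 + np.exp(-(X @ A + b)))) @ beta


def make_signal(n, gains, offsets, rng, noise=0.01):
    """Synthetic stand-in for MFL data: one background seen by mismatched channels."""
    background = rng.uniform(-1.0, 1.0, size=(n, 1))
    return background * gains + offsets + noise * rng.standard_normal((n, gains.size))


rng = np.random.default_rng(0)
m = 8
gains = rng.uniform(0.8, 1.2, size=m)        # assumed sensitivity mismatch
offsets = rng.uniform(-0.3, 0.3, size=m)     # assumed baseline mismatch

train = make_signal(400, gains, offsets, rng)          # clean-pipe training set
clean_segment = make_signal(40, gains, offsets, rng)   # ground truth without flaw
flaw_channels = np.arange(3, 5)                        # channels that see the flaw
normal_channels = np.setdiff1d(np.arange(m), flaw_channels)

# Inject a synthetic flaw bump into the flaw channels of the test segment.
bump = 0.5 * np.exp(-0.5 * ((np.arange(40) - 20) / 4.0) ** 2)
segment = clean_segment.copy()
segment[:, flaw_channels] += bump[:, None]

# Step 5: ELM p learns to map normal channels to flaw channels on clean data.
elm_p = elm_fit(train[:, normal_channels], train[:, flaw_channels], 20, rng)

# Step 6: temporarily replace the flaw channels by their predicted normal values.
reverted = segment.copy()
reverted[:, flaw_channels] = elm_predict(elm_p, segment[:, normal_channels])

estimated_flaw = segment[:, flaw_channels] - reverted[:, flaw_channels]
print("peak of the estimated flaw signal:", estimated_flaw.max())
```

In this toy setting the reverted segment stays close to the flaw-free signal, so the difference between the measured and the reverted flaw channels recovers the injected bump. Step 7 would then pass the reverted segment through ELM 1 to obtain the compensation.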

Experiment Results
The PIG used to collect data in this paper consists of 15 carriers with 5 axial sensors on each carrier. An 8-inch seamless steel pipe with a length of about 14 meters is used for this experiment. Nine exterior flaws were made on this pipe. The pipeline in our experiment is connected with two flanges and one weld. The sampling is controlled by an odometer wheel at a rate of one sample per 2 mm. Several experiments were done with different loading angles of the PIG.
In order to train our algorithm, some MFL signal from a pipeline in normal condition with no flaw is needed. The training data is obtained in two ways. (i) The first way is to take it from the original training data. One test data set is selected as the original training data, with the flaw signal excluded manually. (ii) The second way is to generate it from each test using the algorithms stated in Sections 2.1 and 2.2. The algorithm used in this paper processes data batch by batch and separates each batch into normal data and flaw data. The normal part of the data is added to the training data set in order to reinforce the training result. By adding new training data, the training set is updated and new characteristics of the normal condition are learned. This means that, as the process goes on, the result of the proposed algorithm gets better. To keep the training data set from growing too large, its length is fixed. When the training data set reaches this limit, the earliest training data is erased and new training data takes its place.
In order to show the efficiency of our algorithm, the length of the original training data is reduced to less than 1/5 of one test, though theoretically a bigger training data set gives better results. Also, because of the experimental environment, it is not possible to build a pipeline hundreds of meters or even miles long. The results with less original training data also show that the proposed algorithm generates satisfying results.
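The fixed-length training set described above can be sketched as a simple first-in, first-out buffer. The class name `TrainingBuffer`, its interface, and the tiny example data are hypothetical; the flaw mask would come from the PCA based detection.

```python
import numpy as np


class TrainingBuffer:
    """Fixed-capacity training set: the newest normal samples replace the oldest."""

    def __init__(self, capacity, n_channels):
        self.capacity = capacity                  # must be a positive integer
        self.data = np.empty((0, n_channels))

    def add(self, batch, flaw_mask):
        normal = batch[~flaw_mask]                # keep only flaw-free sampling points
        # Append the new rows, then keep the most recent `capacity` rows.
        self.data = np.vstack([self.data, normal])[-self.capacity:]
        return self.data


buffer = TrainingBuffer(capacity=5, n_channels=2)
batch_1 = np.arange(8.0).reshape(4, 2)
batch_2 = np.arange(8.0, 16.0).reshape(4, 2)
buffer.add(batch_1, np.array([False, True, False, False]))   # second row is a flaw
buffer.add(batch_2, np.zeros(4, dtype=bool))                 # no flaw in this batch
print(buffer.data)
```

After the second batch the buffer holds seven candidate rows, so the two oldest are dropped and five remain.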
A median corrector algorithm is also adopted for comparison with our results. Among the many factors that influence channel-to-channel mismatch, the lift-off value and the difference in baseline play the main role. The influence of differences in lift-off value among sensors can be reduced by improving assembly skills. The influence of the baseline (zero output) of the sensors can be reduced by using the average of the sensors as the baseline. This channel equalization is computed in three steps. Assume $\mathbf{X}$ is data gathered by sensing a clean pipe (almost no flaws). First, calculate the average $\bar{x}_i$ of each channel. Second, calculate $x_{\text{mean}}$, the average of the $\bar{x}_i$. Third, add the compensation $x_{\text{mean}} - \bar{x}_i$ to each channel. However, MFL data is processed automatically, whereas this method has to be operated manually, and it is not easy to find such a steady-state signal. To overcome these problems, we use a median corrector for rough channel equalization in engineering practice. Instead of the average of each channel, $\bar{x}_i$ in the first step is the median value of each channel. The other steps are the same as in the average method.
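The three steps above, in both the average and the median variant, can be sketched as follows. The function name `baseline_correct` and the toy three-channel data are assumptions for illustration.

```python
import numpy as np


def baseline_correct(X, use_median=True):
    """Shift every channel (column) of X so that all channel baselines coincide."""
    # Step 1: per-channel baseline, either the median or the average.
    center = np.median(X, axis=0) if use_median else X.mean(axis=0)
    # Steps 2 and 3: add (average of the baselines - own baseline) to each channel.
    return X + (center.mean() - center)


offsets = np.array([0.0, 0.2, -0.1])     # toy baseline mismatch of three channels
X = np.tile(offsets, (100, 1))           # 100 samplings of a flat background
X[40:50, 1] += 1.0                       # a flaw seen by the second channel

by_median = baseline_correct(X, use_median=True)
by_mean = baseline_correct(X, use_median=False)
print("normal-row spread, median corrector:", by_median[0].max() - by_median[0].min())
print("normal-row spread, average method:  ", by_mean[0].max() - by_mean[0].min())
```

In this toy case the flaw pulls the average of the second channel away from its baseline, so the average method leaves a residual mismatch on the normal samples, while the median is unaffected as long as the flaw covers fewer than half of the samples.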
All data of one test are processed using the algorithm proposed in this paper. Figure 4 shows the PCA result of Step 2 in Section 2.3. The threshold is generated automatically with an $F$-distribution parameter of 95%, and all the components and flaws are described in Figure 4. The sensor-view PCA of Step 3 uses a threshold generated with the same 95% parameter. The normal channels are used as the input of ELM p, and the test results are shown in Figures 7-12.
To show the result, the standard deviation (SD) and the peak signal-to-noise ratio (PSNR) are adopted. The results are shown in Tables 1 and 2. A lower standard deviation means that the signal is cleaner, with less channel mismatch. To illustrate the result, two segments of flaw signal are plotted. The baseline estimation algorithm proposed in [13] is also compared in Tables 1 and 2. From Table 1, it can be seen that the flaws shown are typical small-size flaws. The PSNR results in Table 2 also indicate that our algorithm minimizes channel-to-channel mismatches.
From Figures 7 and 10, it is clear that the raw signal has large channel mismatches, which make it impossible to evaluate the flaw size. In the data corrected by the median algorithm, shown in Figures 8 and 11, the flaw signal is prominent and the signal of the normal channels is smooth, but some ripple still exists. Figures 9 and 12 show the results of the algorithm proposed in this paper: the flaw signal is more prominent and the normal part is smoother, which makes the flaw easy to evaluate. The SD shown in Table 1 also indicates that data processed by our algorithm has less channel mismatch and noise.

Conclusion
In this paper, a new adaptive channel equalization scheme is proposed for processing MFL signals prior to flaw characterization. The scheme performs channel equalization by using single-hidden-layer neural networks, and the fast learning algorithm of ELM is used to give excellent processing speed. Focusing on the flaw signal, a PCA based flaw detection algorithm is given to locate it. The proposed algorithm minimizes the channel-to-channel mismatch and reduces the distortion of the flaw signal. Both theoretical analysis and simulation results show the efficiency of our algorithm. For the flaw signal, the proposed algorithm needs to locate the flaw first. For shallow and small flaws, how to locate them with fewer false detections is still a problem in both theory and engineering.

Figure 1: Neural network model for channel equalization.

Figure 3: Flow chart of algorithm proposed.

Details of the raw signal of 3 flaws are shown in Figure 5 to illustrate the following steps. The dashed lines show the flaw sampling-point intervals detected using PCA. By applying the sensor view of PCA stated as Step 3 in Section 2.3, the flaw signal channels are marked within the solid lines in Figure 5. The sensor-view PCA result is shown in Figure 6, with an $F$-distribution parameter of 95%.
[Figure plots: axes are channel index, sampling number, and sensor output (V).]

Figure 12: PCA and ELM based algorithm treated data of flaw 2.


Table 1: SD of raw data and corrected data.

Table 2: PSNR of raw data and corrected data.