Magnetic flux leakage (MFL) inspection is one of the most efficient nondestructive methods for pipeline in-line inspection. Estimating the size of a defect from the MFL signal is one of its key problems. Because the inspection signal is usually contaminated by noise, sizing the defect is an ill-posed inverse problem, especially when the depth profile has a complex shape. This paper proposes an actor-critic structure-based algorithm for sizing complex depth profiles. By learning with more information from the depth profile without knowing the corresponding MFL signal, the proposed algorithm saves computational cost and is robust. A pinning strategy is embedded in the reconstruction process, which greatly reduces the dimension of the action space. The pinning actor-critic structure (PACS) makes the reward for the critic network more efficient when reconstructing depth profiles with high degrees of freedom. A nonlinear FEM model is used to test the effectiveness of the proposed algorithm under 20 dB noise. The results show that the algorithm reconstructs the depth profile of defects with good accuracy and is robust against noise.

Magnetic flux leakage (MFL) inspection is one of the most widely used nondestructive testing (NDT) techniques and has been applied to oil and gas pipeline inspection since the 1960s. It is efficient in finding corrosion, mechanical damage, and other metal-loss defects in pipelines and storage tanks [

The solutions of the MFL inverse problem can be classified as either non-model-based or model-based methods. Non-model-based methods solve the inverse problem by building a mapping between the sampled signal and the shape of the defect. Neural networks are usually used to build this mapping [6–10]: the MFL inspection signal is the input, and the sizing information of the defect is the output. These methods are fast, but their accuracy depends heavily on the quality of the training data set.

Model-based methods involve a forward physical model, which produces a simulated signal for a given depth profile. The simulated signal is compared with the reference signal, and the residual error between them guides the iteration strategy. By minimizing the residual error, the estimate of the defect size is iteratively refined [

Considering the state-of-the-art solutions, some common problems remain in sizing defects. First, in non-model-based methods, the mapping is trained on the data alone, without exploring data outside the training set, so it relies heavily on the distribution of the training data. Because the MFL inverse problem is ill-posed, the mapping from signal to defect profile also suffers from nonuniqueness. Second, in model-based methods, the iteration strategy is designed around the forward model in use and relies heavily on it. A numerical model simulates the inspection signal with high fidelity, but it is hard to build an iteration strategy on top of it.
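For reference, the residual-driven loop used by model-based methods can be sketched as follows. This is only a minimal illustration: the forward model is a hypothetical linear smoothing operator standing in for a real FEM solver, and the gradient-descent update, step size, and all function names are assumptions made for the example, not a method taken from the cited works.

```python
import numpy as np

# Hypothetical symmetric smoothing kernel used as a toy forward operator.
KERNEL = np.array([0.25, 0.5, 0.25])

def toy_forward_model(depth):
    """Stand-in for an MFL forward model: maps a depth profile to a signal."""
    return np.convolve(depth, KERNEL, mode="same")

def iterate_depth(reference_signal, n_points, step=0.5, n_iter=500, tol=1e-8):
    """Refine a depth estimate by reducing the residual to the reference signal."""
    depth = np.zeros(n_points)  # start from a defect-free profile
    for _ in range(n_iter):
        residual = toy_forward_model(depth) - reference_signal
        if np.mean(residual ** 2) < tol:
            break
        # Gradient of 0.5 * ||A d - s||^2 is A^T r; the kernel is symmetric,
        # so applying the same convolution implements A^T here.
        gradient = np.convolve(residual, KERNEL, mode="same")
        depth -= step * gradient
    return depth
```

With a nonlinear FEM forward model no such closed-form gradient is available, which is the difficulty the paragraph above points to.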

The similarity between the state-of-the-art machine learning technique of game play [

The principle of actor-critic based structure is introduced in Section

The principle of MFL inspection is based on electromagnetic theory. When the test material is magnetized to saturation, magnetic flux leaks out where a defect is located and can be detected by Hall-effect sensors. Strong permanent magnets are usually used to magnetize the test material, and the Hall-effect sensors are usually located close to its surface. The magnetizing and sensing principles are illustrated in Figure

Principle of MFL inspection.

MFL inspection is a magnetic phenomenon whose behavior can be described by Maxwell’s equations:

Illustration of magnetic flux leakage of a simple defect.

As analytical solutions cannot provide enough accuracy for complex defects, numerical methods are usually used for these problems. FEM is a widely used method for obtaining numerical solutions of partial differential equations. The general process of a FEM solution is as follows. First, the partial differential equation is transformed into the corresponding variational functional equation. Then, the computational domain is divided into a certain number of finite elements. By assembling the variational functional equations of all the elements within the domain, the solution can be obtained by solving

Commonly, the MFL inspection model is built with components as described in Figure

Illustration of simplified FEM model used in this paper with complex depth profile.

Reinforcement learning (RL) considers an agent that interacts with its environment and aims to learn a behavior that maximizes the reward. The agent consists of an actor network and a critic network. The actor network is trained to decide which action should be taken in the current state. The critic network evaluates each action based on the current state and the reward, and it improves the strategy of the actor network. There are four definitions in RL, state

The actor policy is updated using sampled gradient as follows:
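For orientation, in the standard DDPG formulation the sampled policy gradient takes the form below. The symbols are the usual DDPG ones and are assumed here rather than taken from this paper: $N$ is the minibatch size, $Q(s, a \mid \theta^{Q})$ the critic, and $\mu(s \mid \theta^{\mu})$ the actor.

```latex
\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_{i=1}^{N}
  \nabla_{a} Q\left(s, a \mid \theta^{Q}\right)\Big|_{s = s_i,\; a = \mu(s_i)}
  \, \nabla_{\theta^{\mu}} \mu\left(s \mid \theta^{\mu}\right)\Big|_{s = s_i}
```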

In the problem of sizing the depth profile of MFL inspection, four parameters are involved in the reconstruction process, the reference depth profile

This paper is inspired by the similarity between the training process of RL and model-based iteration method. The similarity is illustrated as shown in Figure

Similarity between the learning process of reinforcement learning and classic iteration method.

In this paper, an actor-critic structure-based RL method for complex depth profile reconstruction is proposed. The algorithm of Deep Deterministic Policy Gradients (DDPG) is adopted to train the actor-critic structure [

As the target of this MFL inverse problem is sizing the depth profile as precisely as possible, it means the subdefect

Illustration of pinning-based depth reconstruction within the ROI.
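The pinning idea can be illustrated with a short sketch that expands a few pinning-subdefect depths into a full depth profile. The counts of 11 pinning subdefects and 49 subdefects follow the setup reported later in the paper; the even spacing of the pinning positions, the use of linear interpolation, and the function name are assumptions made for this example.

```python
import numpy as np

def interpolate_profile(pin_depths, n_subdefects=49):
    """Expand pinning-subdefect depths into a full depth profile."""
    pin_depths = np.asarray(pin_depths, dtype=float)
    # Assumption: pinning subdefects are evenly spaced and cover both ends of the ROI.
    pin_positions = np.linspace(0, n_subdefects - 1, num=pin_depths.size)
    all_positions = np.arange(n_subdefects)
    # Assumption: linear interpolation between neighboring pinning subdefects.
    return np.interp(all_positions, pin_positions, pin_depths)

# Example: 11 pinning depths produce a 49-point profile.
profile = interpolate_profile(np.linspace(0.1, 0.5, 11))
```

Under this scheme the actor only has to output 11 values per step instead of 49, which is the reduction of the action space that the pinning strategy aims at.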

Flowchart of PACS learning process within one episode.

Initialize actor network

For episode = 1, M do

Initialize pinning subdefects

Get initial observation state

For

Generate an action

Execute action

Interpolate to get the full depth profile

Store

If capacity of replay buffer

Randomly sample

Update the critic network and actor network with (

Update the target networks:

end if

If error between each reference subdefect and reconstructed subdefect is less than

break

end if

end for

end for
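Two supporting pieces of Algorithm 1, the replay buffer and the target-network update, can be sketched as follows. This is a generic DDPG-style illustration, not the authors' implementation: the class and function names, the soft-update rule with mixing factor `tau`, and its default value are assumptions.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # Oldest transitions are discarded automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw a random minibatch and stack each field into an array."""
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states = map(np.array, zip(*batch))
        return states, actions, rewards, next_states

    def __len__(self):
        return len(self.buffer)

def soft_update(target_weights, online_weights, tau=0.01):
    """Move each target weight array a fraction tau toward the online weights."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_weights, online_weights)]
```

In a training loop, sampling would begin only once `len(buffer)` reaches the chosen batch size, mirroring the replay-buffer check in Algorithm 1.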

From Algorithm 1, it can be seen that, within one episode,

To test the accuracy of the algorithm proposed along with robustness, a simplified nonlinear numerical forward model is adopted as in [

Dimension of the forward model with a defect of 49 subdefects.

In order to test the effectiveness of the algorithm proposed in this paper, three error measurements are given. These measurements are root mean squared error (RMSE), peak depth error (PDE), and maximum deviation (MD). These measurements are described in (

Illustration of PDE and MD.
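The three error measurements can be sketched as below. The RMSE is the standard definition; for PDE and MD the code assumes, based on their names, that PDE is the absolute difference between the peak depths of the two profiles and MD is the largest pointwise absolute deviation. The function names are illustrative.

```python
import numpy as np

def rmse(reference, reconstructed):
    """Root mean squared error between two depth profiles."""
    diff = np.asarray(reconstructed, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def peak_depth_error(reference, reconstructed):
    """Assumed PDE: absolute difference between the two peak depths."""
    return float(abs(np.max(reconstructed) - np.max(reference)))

def max_deviation(reference, reconstructed):
    """Assumed MD: largest pointwise absolute deviation."""
    diff = np.asarray(reconstructed, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.max(np.abs(diff)))
```

Note that under these assumed definitions PDE can be zero even when the peak is reconstructed at the wrong position, whereas MD penalizes any local mismatch.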

The structure of the actor network and critic network of the PACS algorithm is described as follows. There are 82 neurons in the input layer of the actor network: 11 neurons for the normalized depth vector of the pinning subdefects and 71 neurons for the normalized reference signal. The sampling position of the signal used as part of the state is illustrated in Figure

Illustration of sampling position of the inspection signal as part of state.
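A minimal sketch of assembling the 82-element state from its two parts is given below. The sizes 11 and 71 come from the network description above; the normalization choices (depth divided by a hypothetical `wall_thickness`, signal divided by its peak magnitude) and the function name are assumptions, since the normalization scheme is not spelled out here.

```python
import numpy as np

N_PIN = 11      # pinning subdefect depths
N_SIGNAL = 71   # sampled points of the reference signal

def build_state(pin_depths, reference_signal, wall_thickness):
    """Concatenate normalized pinning depths and normalized signal samples."""
    pin = np.asarray(pin_depths, dtype=float)
    sig = np.asarray(reference_signal, dtype=float)
    if pin.size != N_PIN or sig.size != N_SIGNAL:
        raise ValueError("expected 11 pinning depths and 71 signal samples")
    # Assumption: depths are normalized by the wall thickness.
    pin_norm = pin / wall_thickness
    # Assumption: the signal is scaled by its peak magnitude.
    peak = np.max(np.abs(sig))
    sig_norm = sig / peak if peak > 0 else sig
    return np.concatenate([pin_norm, sig_norm])
```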

The number of episodes

A total of 10000 complex defects are randomly generated, and the corresponding sampled signals are computed with COMSOL Multiphysics 5.3a with MATLAB. Of these, 5000 are used as the training data set and the other 5000 as the testing data set. The proposed algorithm is implemented in Python with TensorFlow 1.15. All computations are run on a laptop with an Intel i7-10750H processor and 16 GB of RAM.
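Since the algorithm is tested on signals with 20 dB noise, a noisy test signal can be produced from a simulated one along the following lines. The sketch assumes additive white Gaussian noise and defines the signal-to-noise ratio as the ratio of mean signal power to noise power; neither detail is stated in this section, and the function name is illustrative.

```python
import numpy as np

def add_noise(signal, snr_db, rng=None):
    """Add white Gaussian noise so that the result has the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise
```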

Sizing results with different ranking of MD value. (a) 10% ranking. (b) 30% ranking. (c) 70% ranking. (d) 90% ranking.

From error definition (

The signals for corresponding reconstructed depth profile in Figure

Signal used to get the results in Figure

Error convergence for corresponding results in Figure

To show the robustness of the proposed algorithm, it is trained with training data sets of different sizes. The size of the testing data set is 5000, which is the same as that in Figure

Error distribution of results from training data sets of different sizes. (a) PDE. (b) MD. (c) RMSE.

The actor-critic structured DDPG, the direct Gauss–Newton optimization (DGNO) in [

Results shown as distribution of error definitions from different reconstruction methods. (a) PDE. (b) MD. (c) RMSE.

In this paper, a pinning actor-critic structure-based solution is studied for sizing complex depth profiles with high degrees of freedom in MFL inspection. The actor-critic structure provides a novel way of using a fine numerical forward model to reconstruct the depth profile. To address the deficiency of the reward, which is measured as a Euclidean distance, a pinning strategy is introduced. With pinning subdefects, the action space has less variability than when every subdefect is given its own action. The robustness of the reconstruction results is improved by PACS.

The effectiveness of the proposed PACS is tested with simulation results from nonlinear numerical FEM forward models of MFL inspection, and the results are presented statistically. The depth profiles reconstructed from signals with 20 dB noise are close to those reconstructed from noise-free signals, which indicates the robustness of PACS. The results also show good accuracy compared with representative solutions for depth profile reconstruction.

No data were used to support this study.

The authors declare that they have no conflicts of interest.

This work was supported by the National Natural Science Foundation of China (Grant nos. 61703087 and 71502029).