Remaining Useful Life Prediction Techniques of Electric Valves for Nuclear Power Plants with Convolution Kernel and LSTM

Electric valves have significant importance in industrial applications, especially in nuclear power plants. Keeping in view the quantity and criticality of valves in any plant, it is necessary to analyze the degradation of electric valves. However, it is difficult to inspect each valve in conventional maintenance. Thus, there exists a genuine demand for remote sensing of valve condition through nonintrusive methods as well as prediction of remaining useful life (RUL). In this paper, typical aging modes are summarized. The data for sensing valve conditions were gathered during aging experiments through acoustic emission sensors. During data processing, a convolution kernel integrated with LSTM is utilized for feature extraction. Subsequently, LSTM, which has an excellent ability in sequential analysis, is used for predicting RUL. Experiments show that the proposed method predicts RUL more accurately than other typical machine learning and deep learning methods. This will further enhance the maintenance efficiency of a plant.


Introduction
Effective maintenance of nuclear power plants (NPPs) is required to avoid equipment failure. At present, corrective maintenance is the general mode of operation and maintenance of NPPs, which requires a large amount of human and material resources [1]. Since predictive maintenance was implemented in aerospace, the process industry, and other fields, unplanned maintenance caused by system failure has been reduced significantly [2]. For NPPs, preliminary estimates suggest that predictive maintenance could save hundreds of millions of dollars annually [3].
The operating life of an NPP is usually long, with a running time of about 60 years, while related components also age with protracted running time. At present, many nuclear reactors have reached or are about to reach their design life. Therefore, there is an urgent need to evaluate the health status of key components in NPPs so that necessary operational and maintenance plans can be formed along with the decision of whether to extend the life span or not [4]. This makes it necessary to fully evaluate the aging and failure mechanisms of related components according to their characteristics. Meanwhile, advanced perception technologies are being employed to compile operational and maintenance data of running NPPs. The data will form the basis for the application of intelligent analysis techniques such as online health status assessment and RUL prediction [5].
Valves are one of the most common components in NPPs and play a significant role in their operation. In a typical 1000 MW pressurized water reactor, about 30,000 valves of various types, including gate valves, globe valves, ball valves, butterfly valves, and check valves, are installed [6]. During routine maintenance, most valves are inspected onsite while some are even required to be examined in their running state. This limitation can be overcome by sensing means and accurate RUL assessment methods for determining their status, so as to detect abnormalities in time while saving operational and maintenance cost. More importantly, nuclear-grade valves, other nuclear safety-related valves, and other expensive valves cannot simply be replaced without knowing whether they are damaged, because the large number of such valves and the limited installation space add to the cost. Therefore, effective techniques must be adopted to prevent the failure of critical valves that cannot be disassembled for inspection.
Since the 1980s, the rapid development of the aerospace and nuclear industries has driven the continuous development of health management and life prediction techniques [7]. However, higher levels of system and equipment complexity have restricted further development of model-based techniques, mainly because of their dependence on precise analytical solutions [8]. On the other hand, data-driven methods can use data directly to evaluate the status of equipment. Currently, deep learning methods have gained popularity in industry and academia, among which the recurrent neural network (RNN) is one of the most popular artificial neural networks (ANNs) [9]. Furthermore, LSTM is an improvement to basic RNNs and has outstanding advantages in dealing with RUL prediction.
In line with the task requirements and environmental characteristics of electric valves, this paper lays down new requirements for RUL prediction, designs an experimental platform for testing, and then introduces nondestructive testing methods to detect the operational status of the valves. In order to make full use of the measurements, the features implied in the raw data during the aging process are arranged as a two-dimensional data matrix, which allows us to treat the aging data at each time as a picture whose features are extracted through a convolution kernel. The convolution kernel is particularly effective in this regard as it extracts each feature without manual selection. On the downside, the convolution kernel processes each time step separately, thus ignoring the order of and correlation between different time steps; hence, its output needs further processing.
This is done by transmitting the results of the convolution kernel to a long short-term memory (LSTM) network. By utilizing the advantage of LSTM in dealing with time-series datasets, more accurate RUL prediction results are obtained in this paper. The structure of this paper is as follows. The first section is the introduction, and the second section analyzes the theory and method of the convolution kernel and LSTM network. Aging and failure tests for electric valves are introduced in Section 3. In the fourth section, the complete scheme of RUL prediction is presented; the convolution kernel and LSTM are tested, and some of the hyperparameters are optimized with data gathered from the electric valve experiment platform. Section 5 is the conclusion of this paper.

Methodology Overview.
The overall process of electric valve RUL prediction is shown in Figure 1, where the sensors collect data for state and feature engineering analysis while data processing and feature extraction are carried out separately. These steps are necessary for status monitoring and accurate RUL prediction. Once the relevant features exceed predetermined thresholds during online condition monitoring, intelligent algorithms are used to predict the RUL of the valve.
RUL is the length of time from the current time to the end of service life. The main task of RUL prediction is to predict the remaining time before a component becomes inoperative on the basis of online monitoring information [10]. At present, RUL prediction techniques are classified into four categories, as shown in Figure 2.
Instead of building complex physical models, rapid development of artificial intelligence and big data mining techniques has led to the popularity of deep learning techniques in the field of prediction where aging and degradation patterns of components from existing historical data can be used for machine learning [11]. In this paper, relevant intelligent algorithms based on deep learning are mainly used for RUL prediction due to their superiority in dealing with highly nonlinear regression problems.
Shallow neural networks have been proposed and developed rapidly since the 1960s. Among them, the feedforward neural network is the most commonly used. Wang et al. used a three-layer feedforward neural network to model the trends of health indicators. These results were used to estimate the status of components on the basis of a proportional model [12]. Although shallow neural networks could learn complex nonlinear relations, they cannot accurately describe degradation over time. Support Vector Regression (SVR) was proposed based on the theory of statistical learning and structural risk minimization and was applied to minimize the empirical risk and confidence range according to the limited information [13]. Liu et al. developed an improved probabilistic SVR model to predict the degradation of components in NPPs [14]. However, SVR also has some limitations. Firstly, it could only provide single-point prediction. Secondly, it is more suitable for regression with limited samples but has limited applicability for big data. Deep neural networks (DNNs), on the other hand, could easily handle big data.
DNN has stronger pattern regression ability than shallow neural networks, and its analytical accuracy is clearly higher when there is enough data. Currently, there are various types of DNN models such as autoencoders, convolutional neural networks (CNNs), RNNs, and their variants. Among them, the autoencoder is usually associated with feature extraction and manifold learning. CNN is generally used for image recognition and video tracking. RNNs could effectively remember historical information so as to process explicit time-series datasets, and they are widely used in RUL prediction. Zemouri and Gouriveau proposed an RNN-based radial basis function (RBF) network and used it to predict the RUL of mechanical equipment [15]. Malhi et al. proposed an RNN training method based on competitive learning, aiming to improve the long-term prediction accuracy of RNN [16].

Convolution Kernel.
Although some scholars have applied RNN or its variants for RUL prediction, they were unable to extract features efficiently. Therefore, this paper integrates the convolution kernel into the RUL prediction process. Convolutional neural network (CNN) is a common neural network architecture. In 2012, Krizhevsky et al. won the ImageNet image recognition challenge with AlexNet [17]. After AlexNet achieved success, people regained their enthusiasm for DNN research and successively proposed many highly efficient CNNs [18]. The difference between CNNs and fully connected neural networks is that CNNs utilize convolutional kernels and pooling layers for feature extraction. The characteristic of the convolutional layer is that the sharing of weights and biases makes it similar to a biological neural network [19]. If a fully connected neural network is adopted, a neuron needs to connect to each single point in the two-dimensional data, i.e., the number of neurons in the hidden layer is equal to the amount of two-dimensional data. Therefore, as the number of layers increases, the parameter scale of the entire network becomes extremely large. CNN could alleviate these problems. On the one hand, the convolutional layer adopts local connection and scans the global dataset with a fixed step size, thus reducing the number of neurons. On the other hand, the same convolution kernel shares its weights when traversing the whole dataset. The pooling layer is usually added after a convolutional layer. Its function is to filter redundant features and reduce parameters to prevent overfitting by performing aggregate statistics on the features. By stacking convolution layers, pooling layers, nonlinear activation functions, and dropout layers in a specific order, a CNN could be constructed to any arbitrary depth [20].
Based on the above principles, this paper mainly adopts the convolution kernel and pooling layer to extract features and treats them as a preprocessing module for LSTM. This allows LSTM to extract deep features while avoiding excessive extraction at the same time. Thus, loss of sequential information is avoided. This paper adopts two convolutional layers and one pooling layer to design the RUL prediction model. The structure of the convolution kernel formed in this paper is shown in Figure 3.
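As a minimal illustration of treating one time window as a small picture, the sketch below applies two convolution passes followed by one pooling pass to a toy window. The 2 x 2 kernel, the stride of 1, the ReLU activation, and pooling along the time axis only are assumptions made for the example; the paper's actual kernel sizes and trained weights are not given in this section, and plain Python is used instead of TensorFlow to keep the example dependency-free.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D list (stride 1, no padding), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0.0, s))  # ReLU activation (assumed)
        out.append(row)
    return out


def max_pool_time(feature_map, pool=2):
    """Max-pool along the time axis (rows) only, keeping the feature axis."""
    n_cols = len(feature_map[0])
    pooled = []
    for r in range(0, len(feature_map) - pool + 1, pool):
        pooled.append([max(feature_map[r + k][c] for k in range(pool))
                       for c in range(n_cols)])
    return pooled


# Toy window: 6 time steps x 4 features (values made up for illustration).
window = [[float(t + f) for f in range(4)] for t in range(6)]
kernel = [[0.5, 0.5], [0.5, 0.5]]  # hypothetical fixed kernel, not a trained one

layer1 = conv2d_valid(window, kernel)   # 5 x 3 feature map
layer2 = conv2d_valid(layer1, kernel)   # 4 x 2 feature map
features = max_pool_time(layer2)        # 2 x 2 after pooling over time
```

Because the same kernel is reused at every position, the number of weights is independent of the window size, which is the weight-sharing property described above. The rows of `features` remain ordered in time, so they can be passed on to a sequence model.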

LSTM.
LSTM neural networks are an effective solution to sequence regression problems and are suitable for RUL prediction. Their superiority lies in their ability to effectively mitigate the vanishing or exploding gradient problems common in other deep networks. This also allows the network to remember long-term dependencies [9]. The structure of an RNN is designed so that information in the nodes of hidden layers is recycled to achieve time-series memory.
This means that each time new information is processed, a decision is made to retain or forget previous information dependencies. Thus, the final output can be calculated on the basis of long-term dependencies [21]. In theory, time-series data of any length could be trained on and predicted.
LSTM is an improvement to the basic RNN that adds a "cell state" and a "processor" throughout the whole time series to judge whether information is useful or not. It includes an input gate, a forget gate, and an output gate, as depicted in Figure 4 [23]. The forget gate utilizes the sigmoid function to determine what needs to be dropped from $C_{t-1}$, giving the output $f_t$. The input gate adopts the sigmoid and tanh functions to determine what needs to be added to the cell state, while $o_t$ denotes the result of the output gate. The current output $h_t$ and cell state $C_t$ are calculated after activation by the tanh function [22]. It has been shown that LSTM is an effective technique for solving the problems of long sequence dependence and vanishing or exploding gradients existing in the basic RNN.
In Figure 4, $x_t$ is the input vector, $h$ denotes the cell output, and $C$ denotes the cell memory. For each time step, four vectors $f_t$, $i_t$, $\tilde{C}_t$, and $o_t$ are calculated from the products of the weights with the previous hidden state and the input vector, as given by

$$f_t = \sigma(W_f x_t + u_f h_{t-1}),$$
$$i_t = \sigma(W_i x_t + u_i h_{t-1}),$$
$$\tilde{C}_t = \tanh(W_C x_t + u_C h_{t-1}),$$
$$o_t = \sigma(W_o x_t + u_o h_{t-1}).$$

Here, $f_t$ is the forget gate, $i_t$ is the input gate, $\tilde{C}_t$ is the candidate gate, and $o_t$ is the output gate. Also, $u$ and $W$ are the weights in each gate. Depending on the value of the forget gate, the value of the next memory state is calculated as

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t.$$

This allows the required memory state to be transmitted to the next cell unaffected. In this way, long-term dependencies are passed through the whole network, as the forget gate is applied to filter only redundant information. Finally, the output $h_t$ of each cell is calculated as

$$h_t = o_t \odot \tanh(C_t).$$
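The gate computations described above can be sketched for a single time step as follows. This is an illustrative plain-Python version of the standard LSTM cell, not the TensorFlow implementation used in the experiments; the dictionary layout of `params`, the helper names, the omission of bias terms, and the all-zero placeholder weights are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def preactivation(weights, x_t, h_prev):
    """Compute W x_t + U h_prev for one gate, one output unit per weight row."""
    W, U = weights
    return [sum(w * x for w, x in zip(W_row, x_t))
            + sum(u * h for u, h in zip(U_row, h_prev))
            for W_row, U_row in zip(W, U)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: returns the new hidden state h_t and cell state c_t."""
    f_t = [sigmoid(z) for z in preactivation(params["f"], x_t, h_prev)]  # forget gate
    i_t = [sigmoid(z) for z in preactivation(params["i"], x_t, h_prev)]  # input gate
    o_t = [sigmoid(z) for z in preactivation(params["o"], x_t, h_prev)]  # output gate
    c_hat = [math.tanh(z) for z in preactivation(params["c"], x_t, h_prev)]  # candidate
    # New memory: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy sizes: 3 inputs, 2 hidden units; all-zero weights are placeholders.
zeros = ([[0.0] * 3 for _ in range(2)], [[0.0] * 2 for _ in range(2)])
params = {"f": zeros, "i": zeros, "c": zeros, "o": zeros}
h_t, c_t = lstm_step([0.2, -0.1, 0.4], [0.0, 0.0], [1.0, -2.0], params)
```

With all-zero weights every gate outputs 0.5 and the candidate is 0, so the cell state is simply halved. This makes the role of the forget gate in scaling the previous memory easy to see.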

Aging and Failure Tests for Electric Valves
In this paper, convolution kernel and LSTM are combined to predict RUL, for which the most critical requirement is a large amount of aging and fault data. This section first describes the typical aging modes and characteristics of electric gate valves. Afterwards, we describe the experimental setup built for gathering run-time data by simulating the normal and aging states of electric valves.

Typical Aging and Failure Modes of Electric Valves.
The electric valve is in direct contact with the working fluid and controls the flow rate by cutting or connecting the pipeline. Improperly used valves are prone to a variety of faults such as valve body wear, internal valve leakage, external valve leakage, actuator degradation, and so on.
Valve body wear: unreasonable selection of power parameters for the electric valve is the main cause and affects its normal operation. If the torque is less than the required level, the valve will not open or close normally, which exposes the motor to the risk of burning out. If the torque is too large, the valve will lose control, thereby destroying the valve structure.
Internal valve leakage: firstly, improper adjustment of the electric valve stroke causes the valve to close loosely. Secondly, because the valve is in direct contact with the working fluid, the fluid constantly scours and wears the valve parts.
External valve leakage: the leakage is caused by a loose seal between the electric valve and the coupling joint or screw, by a loose seal between the valve stem and the packing gland, or by sand holes (casting porosity) in the oil circuit plate.
Damage to actuator electrical components: the actual valve position is inconsistent with the feedback signal, so the valve fails to open or close.

Design of Experiment Platform.
In order to gather data for different states, multiple electric valves of the same type are installed on a platform. This allows different aging and failure modes to be introduced simultaneously. Accordingly, the aging and degradation of the electric gate valve is simulated by setting up a water circulation test bed as shown in Figure 5.
The three electric gate valves installed are all Z941H-25P straight screw gate valves. In order to obtain sufficient adjustment ranges, a nominal pressure of 2.5 MPa was selected, and the rigid single-gate valve was connected with a flange and driven by a three-phase squirrel cage coil motor. The valve takes 46 s for a full stroke from open to closed when driven by the motor. Furthermore, two pressure gauges, a differential gauge, and a flowmeter are installed in the main pipe for measuring the process parameters.
Figure 4: The flowchart of LSTM algorithm [22].

Test and Measurements of Aging Electric Valve.
In this paper, normal operation, internal leakage by wear, and external leakage by cracking were analyzed. For internal leakage by wear, the valve position is intentionally set through operation of a hand wheel so that it leaves a gap between the sealing surfaces. For external leakage by cracking, the aging phenomenon is simulated by a lax seal between the electric valve and the coupling screw. The cracks could be adjusted by slightly turning the screw. The measurements of flow and pressure alone are not enough to represent the status of the electric valve. Thus, it is necessary to adopt other methods to measure the characteristic parameters of various aging modes. Common nondestructive testing methods for electric gate valves include bubble testing, ray testing, and acoustic emission testing. Compared with bubble testing and ray testing, acoustic emission testing is a better nondestructive testing method for gate valves. It measures the transient stress waves on the body surface when a fault occurs. Therefore, this paper mainly uses the acoustic emission technique for state perception. Figure 6 shows the arrangement of acoustic emission sensors on the electric valves. The acoustic emission sensor is installed as close as possible to the sound source, as it could be triggered by the turbulent field generated near the crack after aging.
Monitored acoustic emission signals include ringing count, amplitude, duration, energy, root mean square value (RMS), and average signal level (ASL), as shown in Figure 7.

Complete Flowchart of RUL Prediction
The flowchart of the proposed RUL prediction is shown in Figure 8. Before RUL prediction, offline training should be implemented, in which training datasets are acquired from the experimental platform through a large number of repeated experiments as described in Section 3. The training process is as follows:
Step 1. The characteristic parameters obtained from acoustic emission sensors and process measurements are acquired. After that, parameters that change only slightly during degradation are removed, and the training data is normalized and standardized.
Step 2. Define the RUL label. Unlike common regression problems, the precise RUL label is not available beforehand. It is usually impossible to evaluate the precise health condition and estimate the RUL at each time step without an accurate physics-based model. For this kind of application, a piecewise linear degradation model has been proposed [24]. The piecewise linear degradation model assumes that the equipment (engines in the original application) has a constant RUL label in the early cycles and that the RUL then decreases linearly until it reaches 0. In this paper, we likewise assume a constant RUL value at the early stages.
Step 3. Preprocessing of input data: in order to reflect the time-series characteristics of the LSTM calculation, the input data with dimension N × D needs to be converted to (N - num_steps + 1) × (num_steps × D), where N is the total length of time, D is the number of measurements, and num_steps is the length of the time-series sliding window used to obtain the two-dimensional data block x at each time. Since consecutive windows overlap, the total input length is (N - num_steps + 1). In this way, the input at each moment is not isolated, and the combination of time series better reflects the continuity of timing characteristics during degradation.
Step 4. Through the TensorFlow framework, several layers of convolution kernel and pooling calculation are constructed as described in Section 2.2, which makes it easier for LSTM to learn from these deeper-level features.
Step 5. On the basis of Step 4, LSTM tuple model is first established; meanwhile, the dropout operation is used which could make the LSTM network more "robust." After obtaining the LSTM tuple unit, the stack function is adopted to construct the entire LSTM network.
Step 6. Training the proposed network: during training, the datasets are divided into a number of batches to improve training efficiency. Random shuffling of the datasets is also adopted to reduce uncertainty.
Step 7. Loss function and parameters optimization: mean square error (MSE) is used to evaluate the network. In order to optimize the weight and bias of the proposed network, SGD optimization algorithm was used in the training process which further makes the loss values as small as possible. During optimization, learning rate of the first five iterations was set to 0.001 in each backpropagation iteration while the attenuation rate of each subsequent iteration was set to 0.99.
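The batching, shuffling, loss, and learning-rate settings of Steps 6 and 7 can be sketched as below. The schedule is one possible reading of the text (0.001 held for the first five iterations, then multiplied by 0.99 at each later iteration); the function names, the zero-based iteration index, and the toy sizes are assumptions for illustration only.

```python
import random


def learning_rate(iteration, base_lr=0.001, hold=5, decay=0.99):
    """Hold base_lr for the first `hold` iterations, then decay geometrically."""
    if iteration < hold:
        return base_lr
    return base_lr * decay ** (iteration - hold + 1)


def mse(y_true, y_pred):
    """Mean square error between label and prediction sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def shuffled_batches(n_samples, batch_size, seed=0):
    """Yield lists of sample indices in random order, one list per mini-batch."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]


schedule = [learning_rate(k) for k in range(8)]
batches = list(shuffled_batches(n_samples=10, batch_size=4))
loss = mse([1.0, 2.0], [1.0, 4.0])  # (0 + 4) / 2 = 2.0
```

In a full training loop, each batch of indices would select the corresponding windows and RUL labels, and the SGD update for that iteration would use `learning_rate(k)`.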
When the offline training process is completed, the randomly selected test data is standardized as shown in Steps 1-3. After that, the optimized convolution kernel and LSTM models could be used to predict the RUL for electric valves.
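The data preparation in Steps 1 and 3 can be illustrated with the following sketch, which removes nearly constant features, standardizes the remaining ones, and cuts the series into overlapping windows. The variance threshold `min_std`, the use of z-score scaling, and the toy data are assumptions; the paper does not state the exact criteria.

```python
from statistics import mean, pstdev


def drop_flat_features(rows, min_std=1e-3):
    """Remove columns whose standard deviation is below min_std (assumed threshold)."""
    cols = list(zip(*rows))
    keep = [j for j, col in enumerate(cols) if pstdev(col) >= min_std]
    return [[row[j] for j in keep] for row in rows], keep


def standardize(rows):
    """Z-score each column: subtract its mean and divide by its standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(col) for col in cols]
    sigmas = [pstdev(col) for col in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


def sliding_windows(rows, num_steps):
    """Turn N x D rows into (N - num_steps + 1) overlapping blocks of num_steps x D."""
    return [rows[k:k + num_steps] for k in range(len(rows) - num_steps + 1)]


# Toy data: N = 6 time steps, D = 3 features; the middle feature never changes.
raw = [[float(t), 7.0, float(2 * t)] for t in range(6)]
filtered, kept = drop_flat_features(raw)
scaled = standardize(filtered)
blocks = sliding_windows(scaled, num_steps=4)
```

For test data, the means and standard deviations computed on the training set would normally be reused rather than recomputed, so that both sets share the same scale.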

Simulation Analysis
In this section, aging and degradation are simulated by leakage from the valve body into the environment due to cracks, under the assumption that the valves are in a single aging mode. A large number of experimental tests were then conducted to obtain datasets under different degrees of aging so as to verify the practicability and accuracy of the RUL prediction method. This kind of aging phenomenon is selected mainly because it is the most common aging and degradation mode of electric valves and may have a serious impact on the environment through the release of radioactive substances. As this method is still in the research stage, manufactured parts for such faults are not available. Therefore, to save the cost of manufacturing valves with fault-simulating parts, a connection screw that could be adjusted repeatedly was added to the electric valve. This allowed us to mimic the degradation of electric valves repeatedly and continuously so that a large amount of aging data could be acquired.

Experimental Tests and Data Acquisition.
The normal operation and degradation (leakage from the valve body into the environment due to cracks) of the electric valves are measured throughout the experimental tests. After confirming that the data collection software and hardware work correctly and the instrument works normally, the relevant parameters of the acoustic emission card are set as follows: sampling frequency, 5000 kHz; digital filter band, 15 kHz to 70 kHz; interval of parameters, 500 μs; hangover time, 1000 μs; peak interval, 300 μs; locking time, 1000 μs; single-channel waveform threshold, 40 dB; and single-channel parameter threshold, 40 dB.
During the aging experiment, the leakage can be continuously adjusted by slowly rotating the coupling screw to change its tightness, which better simulates the aging process of electric valves. Additionally, the process can be measured repeatedly, effectively increasing the amount of aging and fault data. This ensured that a reliable and sufficient amount of data was obtained for RUL prediction.
In the experiment, the screw is slowly rotated under a certain circulating pump frequency and opening degree of the electric valve, making the leakage increase gradually. Different pump frequencies and opening degrees represent different operating conditions of electric valves. During the experiments, a total of 5 different circulating pump frequencies and 8 valve openings were set, giving a total of 40 operating conditions. Under each operating condition, 30 groups of experiments were carried out with various levels of screw tightness. In this way, the randomness and uncertainty in the aging process could be effectively mimicked. In each group of experimental data, the measurements include the frequency of the circulating pump, the actual opening degree of the electric valve, the pressure difference between the front and rear of the electric valve, the fluid flow rate through the valve, and 5 kinds of acoustic emission signals, namely, amplitude, ringing count, energy, RMS, and ASL. Moreover, the length of time for each group varies from 3 hours to 6 hours; the amplitude and ringing counts measured during degradation are shown in Figure 9.
Comprehensive analysis of the measurements during the aging process shows that all acoustic emission signals follow approximately the same trend, i.e., the degradation progresses gradually under a certain pump frequency and valve opening degree. When the degradation exceeds a certain value, the parameters show an obvious trend of change. After a critical value, however, the parameters remain approximately unchanged, as shown in Figure 9.
This is because the flow through the valves is no longer in a turbulent state, as the valves are completely open after the degradation crosses a certain value.
In estimating the RUL of components and equipment, a linear reduction in RUL over time is often used, as equipment inevitably runs to failure. However, in reality, aging and degradation at the beginning of operation are often negligible. Therefore, the RUL label was defined as a piecewise function in this paper. Specifically, the initial maximum life was set to a fixed value, after which the RUL label gradually decreases. When the leakage volume increases but the acoustic emission parameters do not change, the RUL is considered to be stable.
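A piecewise RUL label of this kind can be generated with a few lines of code. The sketch below is illustrative only: the cap `max_rul` and the run length are hypothetical values, time is counted in samples, and the final stable stage mentioned above is not modeled.

```python
def piecewise_rul_labels(total_steps, max_rul):
    """RUL label per time step: held at max_rul early, then linear down to 0."""
    return [min(max_rul, total_steps - 1 - t) for t in range(total_steps)]


labels = piecewise_rul_labels(total_steps=10, max_rul=6)
# labels == [6, 6, 6, 6, 5, 4, 3, 2, 1, 0]
```

The flat early segment reflects the assumption that degradation is negligible at the start of operation, so the network is not asked to distinguish between healthy states that look alike in the measurements.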

Comparison of Different Hyperparameters in Convolution Kernel and LSTM.
During the training process of the convolution kernel and LSTM networks, numerous hyperparameters need to be set. Nonoptimized parameters would reduce network accuracy, which would in turn induce a great deal of uncertainty in RUL prediction. Therefore, the hyperparameters of the convolution kernel and LSTM are optimized first to ensure RUL prediction accuracy. The default sliding window size for inputs is 40. For the LSTM network, the default LSTM_size in each layer is 256, and the default number of LSTM layers is 2. For the convolution kernel, 2 hidden layers with 64 kernels per layer are set by default. For the training process, the total number of training epochs is set to 200, while the data is divided into 64 batches. Moreover, the loss function is mean square error (MSE), and the SGD algorithm is utilized for optimizing weights and biases. Finally, explained variance score, mean absolute error, mean squared error, and R² score are adopted for evaluating the performance of RUL prediction.
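For reference, the four evaluation scores can be computed from their usual definitions as in the sketch below. The function name and the toy labels are hypothetical; a library such as scikit-learn provides equivalent metrics.

```python
def regression_metrics(y_true, y_pred):
    """Explained variance, MAE, MSE, and R^2 from their standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_err = sum(errors) / n
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return {
        "explained_variance": 1.0 - var_err / var_true,
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "r2": 1.0 - mse / var_true,
    }


# Toy example with made-up RUL labels and predictions.
scores = regression_metrics([10.0, 8.0, 6.0, 4.0], [9.0, 8.0, 7.0, 4.0])
```

Explained variance and R² coincide when the prediction errors have zero mean, as in this toy example; a systematic bias in the predictions lowers R² but not the explained variance.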
(1) The number of kernels in convolution kernel layers: During the simulation test, the number of convolution kernels was compared first. The training loss for 32, 64, and 128 kernels tends to converge in all cases. However, as can be seen from Table 1, when the number of kernels is 64, the average MSE and the other indicators are approximately the smallest. Thus, in order to ensure the accuracy and calculation speed of further RUL predictions, the number of kernels is set to 64.
(2) Comparison of num_layer in LSTM: When the num_layer is set as 1, 2, and 3, respectively, the changes of training loss and test loss in all three cases tend to converge as shown in Figure 10, which means that all three structures could achieve the goal.
However, from the average test losses under these three conditions in Table 2, a lower RUL prediction error is achieved with num_layer = 2.
(3) Comparison of LSTM_size in LSTM: After obtaining the optimal value of num_layer, the LSTM_size is set to 64, 128, 256, and 512, respectively. The changes of training loss and test loss are shown in Figure 11.
The training loss tends to settle for the larger values of LSTM_size, but when LSTM_size is 64, the training loss oscillates without converging, indicating that this structure is too simple to learn the trends. As can be seen from Table 3, when LSTM_size is 256, the average MSE and the other indicators are lower than in the other cases.
The network architecture applied during the study, along with the hyperparameters, is summarized in Table 4. This would help in future comparison studies as well as in any reproduction of such experiments.

Comparison of the Proposed Method with Other Typical Algorithms.
In order to verify the performance of the proposed convolution kernel and LSTM model, Support Vector Regression (SVR), Convolutional Neural Network (CNN), and LSTM models were implemented and compared on the same dataset.
SVR is a classical machine learning model which has relatively good prediction results under limited samples [25]. Therefore, by comparing with the SVR model, the advantage of deep learning methods could be demonstrated. Due to its advantages of translation invariance and sparse connections, CNN is widely used in a large number of areas such as computer vision and image recognition [26]. In order to show the merits of the proposed method for RUL prediction in such sequential processes, the classic CNN model is built with two convolution layers and pooling layers followed by a standard feedforward neural network. Also, the original data is transformed into a two-dimensional sliding matrix (time step = 40, feature = 9), the same as for the proposed RUL model.
The RUL prediction results of these models are shown in Table 5. Table 5 clearly shows that the proposed model of convolution kernel and LSTM has better results than the other three models.
Figure 12 shows the comparison between the predicted RUL and the corresponding real labels under randomly selected conditions with optimized hyperparameters in the convolution kernel and LSTM networks. Since the valve does not degrade significantly in the early stage, the real RUL during this period is constant. At this stage, the predicted RUL and the real RUL are nearly consistent with each other apart from minor fluctuations, as shown in Figures 12(a)-12(c). After the degradation begins, the predicted RUL and real RUL both show downward trends and are in sync. In the middle of the degradation process, due to the rapid changes in features, the predicted RUL has a downward trend but does not fit the real labels completely. At the end of degradation, there is a deviation between the predicted and real RUL, but this error remains within 5%, which is acceptable.
Therefore, it can be concluded that the proposed methodology based on convolution kernel and LSTM could accurately predict the RUL under different conditions.

Conclusion
This paper mainly focuses on RUL prediction for electric valves and presents a relevant technical architecture based on convolution kernel and LSTM. The originality and useful features of this study are summarized as follows:
(1) This study has analyzed typical degradation modes of electric gate valves. Moreover, during the course of the study, an experimental platform with reusable faulty valves was designed. This setup was able to provide a large amount of degradation data for RUL prediction.
(2) Acoustic emission sensors and other process sensors were integrated for sensing degradation, which could perceive the aging characteristics in multiple dimensions.
(3) Convolution kernel was integrated with LSTM in such a way that the demerits of both could be avoided. Furthermore, hyperparameters were compared and optimized.
In conclusion, the proposed methodology achieves significant improvement and relatively good accuracy in RUL prediction. Moreover, to maintain NPP's safety and reliability, this method could be integrated into the prognostics and health management system of NPP. Our future work will focus on improvements to the existing experimental platform, analysis of further degradation modes along with an improved hybrid RUL prediction model.

Data Availability
The "aging data of electric gate valves" used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this work.