Fault Detection of Wind Turbine Sensors Using Artificial Neural Networks

This paper proposes a method for sensor validation and fault detection in wind turbines. Ensuring the validity of sensor measurements is a significant part of overall condition monitoring, as sensor faults lead to incorrect results when monitoring a system's state of health. Although identifying abrupt failures in sensors is relatively straightforward, calibration drifts are more difficult to detect. Therefore, a detection and isolation technique for sensor calibration drifts was developed for the purpose of measurement validation. Temperature sensor measurements from the Supervisory Control and Data Acquisition system of a wind turbine were used for this aim. The low output rate of the measurements and the nonlinear characteristics of the system make an advanced fault detection algorithm necessary. Artificial neural networks were chosen for this purpose considering their high performance in nonlinear environments. The results demonstrate that the proposed method can effectively detect the existence of a calibration drift and isolate the exact sensor with faulty behaviour.


Introduction
In the last decades, both the size and capacity of wind turbines have increased by virtue of technological developments in the wind energy field. This has resulted in an increasing focus on topics such as wind turbine fault detection [1][2][3][4][5][6]. Condition monitoring and fault detection algorithms for wind turbines are important systems that contribute to reducing the maintenance costs and downtime of wind power plants. Furthermore, wind power plants are generally located at remote sites, which makes reliability even more important. Wind turbine faults lead to the need for repair and/or replacement actions and result in a loss of energy production. Moreover, in some cases, a failure in one component also affects other components and even the entire wind turbine. Considering all these facts, it is important to detect and isolate wind turbine faults as early as possible so that the required actions can be taken to prevent such undesired results. Besides, with the decrease in maintenance costs, wind energy will become commercially more competitive compared to other energy sources.
Fault detection methods can be classified into two categories, namely, model-based and data-driven approaches. In model-based methods, a mathematical model expressing the wind turbine dynamics is generated first. The output of that model is compared to real measurements from wind turbine sensors, and a fault alarm is given by analyzing the residuals between the outputs of the real system and the model. Some examples of model-based fault detection studies in wind turbines can be found in [7,8]. Model-based methods have the advantage of not requiring high sampling rate measurements [3]. However, the success of these methods depends on the consistency between the mathematical model and the real behaviour of the system. This approach is challenging because, in addition to their system complexity, wind turbines have complex control systems, which increase the difficulty of obtaining a reasonable mathematical model of the whole system [9].
Data-driven methods do not require explicit mathematical models describing the physical system. Instead, they are based on the analysis of measurements gathered from sensors mounted on different parts of wind power plants. Different alternatives exist in terms of choosing the sensor data to be processed. The first alternative is to use sensor outputs from the Supervisory Control and Data Acquisition (SCADA) system, which is a built-in part of most modern wind turbines and incurs no additional costs. Many attempts have been made to benefit from SCADA data for fault detection [9][10][11][12][13][14][15]. The other alternative is to use purpose-designed measurement systems with sensors specifically mounted for condition monitoring. Common sensor types employed for this aim are vibration, acoustic emission, strain, torque, and bending moment sensors. Past research on wind turbine fault detection using such dedicated sensors can be found in [16][17][18][19][20][21][22][23][24]. Both data-gathering methods have certain advantages and disadvantages. Sensor data from the SCADA system are of low frequency and generally have 10 min sampling intervals. Such a low frequency brings difficulties due to the loss of noise characteristics, which carry important information on fault formation. Besides, there are generally many missing values and imperfections in the data collected by the SCADA system. As an advantage, however, the SCADA system is a built-in part of most large-scale wind turbines; therefore, no additional cost is required to obtain these data. On the other hand, the data output frequency of purpose-built wind turbine condition monitoring sensors can be selected high enough not to lose information in the signal. However, this approach adds extra cost and, as a result, becomes a disadvantage in terms of cost competitiveness against other energy conversion methods.
In this paper, data gathered from the temperature sensors of a wind turbine SCADA system were used to detect and isolate sensor faults. To build an effective fault detection system, it is crucial to distinguish sensor errors from faults originating in other components of the turbine, because faulty behaviour in sensors can lead to errors in data evaluation and cause false alarms and performance degradation of the fault detection system.
To compensate for the disadvantages of low-frequency sensor measurements from the SCADA system, a sophisticated signal processing method is required. Artificial neural networks (ANN) were selected for this purpose considering their high performance in modelling nonlinear systems. Different ANN architectures were built and compared to find the best option among the alternatives.
The layout of this paper is as follows. In Section 2, the sensor selection and data collection parts of the research are presented. In Section 3, the ANN architectures designed in this research are explained, and in Sections 4 and 5, results and discussions are presented, respectively.

Data Collection
Wind turbines comprise different interconnected subsystems. The primary subsystems can be listed as aerodynamic, mechanical, electronic, and control systems (Figure 1). A sensor fault in any of these components results in a degradation in the turbine's performance. Detecting and isolating sensor faults is an important part of the overall fault detection system, as distinguishing the cause of an abnormality is required for repair or replacement actions.
Various types of sensor faults can be encountered in the subsystems of wind turbines. Typical sensor faults can be listed as multiplicative, additive, and offset faults, and faults resulting in changing dynamics in the system [25]. In this paper, a multiplicative fault in temperature sensor measurements is investigated. Multiplicative faults may arise from calibration drifts and act like a scaling factor on sensor measurements. This type of sensor fault was selected because, unless the scaling factor of a multiplicative fault is very large, it does not cause an easily recognizable change compared to the normal response of the sensor, which makes it harder to detect. The SCADA measurements used in this work supply some statuses of main faults such as mains failures, feeding faults, and pitch control errors; however, the SCADA system does not provide information on sensor faults. Ensuring measurement validity is nevertheless a necessary step in the design of an overall fault detection system. Therefore, the multiplicative sensor fault to be detected was artificially created in this research. Measurement differences in the form of a multiplicative effect can originate either from sensor calibration drifts or from real temperature deviations, so it is necessary to distinguish these two situations. For this reason, the designed method was tested in different cases representing both situations.
Temperature measurements from the SCADA system of a 900 kW wind turbine were collected to train the ANNs designed in this work. The data have 10 min sampling intervals and were gathered between 01 November 2015 and 30 November 2015. The SCADA system includes 10 temperature sensors mounted on different components of the turbine, which are presented in Table 1.
Sensors marked with the * sign in Table 1 were selected for sensor validation in this work. They measure rear hub bearing temperature (S1), control cabinet temperature (S2), tower temperature (S3), and transformer temperature (S4). The proposed algorithm detects calibration drifts in one of these sensors based solely on their own measurements, without using any operational or environmental data. Therefore, this subset of sensors was selected considering the similar temperature characteristics of the areas in which they are installed. Each of them is monitored by operators to find out whether any abnormalities appear in the components around them. The locations of the sensors used in this research are presented in Figure 2.
The ANNs were first trained using measurements from these sensors. For training, 75% of the data was used. The remaining 25% of the data formed the test data set used to evaluate and compare the performance of the networks. The ANN models were designed and trained in the MATLAB environment.
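The 75%/25% partition described above can be sketched as follows. This is only an illustration in Python rather than the MATLAB code used in the study; the paper does not state whether the split was random or chronological, so a chronological split is assumed here, and the function name `split_train_test` and the placeholder readings are hypothetical.

```python
def split_train_test(rows, train_fraction=0.75):
    """Split time-ordered rows into a leading training part and a trailing test part.

    Assumption: a chronological split; the paper does not specify the scheme.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# 30 days of 10 min samples give 30 * 24 * 6 = 4320 rows if no values are missing.
n_samples = 30 * 24 * 6
# Placeholder readings for S1..S4 (not real SCADA data).
rows = [[20.0, 25.0, 15.0, 30.0] for _ in range(n_samples)]

train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows))  # 3240 1080
```

In practice the number of usable rows would be smaller, since SCADA records typically contain missing values that have to be removed first.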

Artificial Neural Network Architectures
ANNs have proven to be powerful algorithms for various aims including classification, estimation, and fault detection. Due to their capability of dealing successfully with nonlinearities and their advantages in real-time applications, it is convenient to design ANN models for detecting wind turbine faults. Several ANN models were trained and analyzed to obtain a successful network architecture that satisfies the requirements of this research.
In terms of input-output relations, 2 approaches were used, namely, autoassociative and MISO (multiple-input single-output) structures. The autoassociative case, where the input vector is associated with itself, is presented in Figure 3. As can be seen in the figure, all the sensors were set as both inputs and outputs of the autoassociative network. In the MISO case, 3 out of the 4 selected sensors were set as inputs and the remaining sensor was set as the output. The structure was repeated 4 times, with each of the sensors used as the output in turn. Figure 4 shows the MISO structure with S4 as the output of the system. Similar networks with the other sensors in the output layer were also created.
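The two input-output arrangements can be illustrated by how the training pairs are assembled from one row of sensor readings. The sketch below is a plain-Python illustration under the assumption that each row holds the readings of S1 to S4 in that order; the function names and the sample values are hypothetical.

```python
def autoassociative_pairs(rows):
    """Each row of sensor readings serves as both the input and the target."""
    return [(list(r), list(r)) for r in rows]


def miso_pairs(rows, target_index):
    """Three sensors form the input; the sensor at target_index is the target."""
    pairs = []
    for r in rows:
        inputs = [v for i, v in enumerate(r) if i != target_index]
        pairs.append((inputs, r[target_index]))
    return pairs


# Two illustrative rows of readings for S1..S4 (made-up values).
rows = [[21.0, 27.5, 14.2, 33.1], [21.4, 27.9, 14.5, 33.6]]

aa = autoassociative_pairs(rows)
miso_s4 = miso_pairs(rows, target_index=3)            # S4 as output, as in Figure 4
all_miso = [miso_pairs(rows, k) for k in range(4)]    # one data set per output sensor
```

One MISO network is then trained per output sensor, which gives four MISO models alongside a single autoassociative model for each network type.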
The input-output relations presented in Figures 3 and 4 were used in different neural network types, namely, backpropagation neural network (BPNN), radial basis function neural network (RBFNN), and general regression neural network (GRNN). Table 2 presents the computational principles and input-output relations of the networks designed.

Backpropagation Neural Network. BPNN is a type of feedforward neural network where information flows from inputs to outputs without any feedback connections, whereas the error propagates from outputs to inputs. Detailed mathematical foundations of BPNNs can be found in [26,27]. This kind of neural network has been commonly used for fault detection purposes.
A typical BPNN consists of an input layer, one or multiple hidden layers, and an output layer. Different activation functions can be used depending on the requirements of the problem. With the help of nonlinear activation functions in the hidden or output layers, neural networks become capable of learning nonlinear relationships between inputs and outputs. In this research, several BPNN architectures with one hidden layer were designed. Firstly, the type of activation function for the hidden and output layers was determined. For this purpose, networks with different activation function pairs, namely "logarithmic sigmoid-linear," "logarithmic sigmoid-logarithmic sigmoid," "tangent sigmoid-linear," and "tangent sigmoid-tangent sigmoid," were designed and evaluated using a smaller sample data set. In each pair, the former function is the hidden layer's activation function, and the latter is the output layer's activation function. The best results were obtained with the "logarithmic sigmoid-linear" activation function pair. Therefore, the detailed network development in the following parts of the research was continued with these activation functions.
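To make the role of the activation function pair concrete, the following is a minimal forward-pass sketch of a network with one hidden layer. It is written in plain Python, not the MATLAB toolbox used in the study, and the weights are arbitrary untrained values chosen only for illustration; the function names mirror the common logsig/tansig terminology but are defined here.

```python
import math


def logsig(z):
    """Logarithmic sigmoid: maps any real input to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tansig(z):
    """Tangent sigmoid: maps any real input to the interval (-1, 1)."""
    return math.tanh(z)


def linear(z):
    """Linear (identity) activation."""
    return z


def forward(x, w_hidden, b_hidden, w_out, b_out, hidden_act=logsig, out_act=linear):
    """Forward pass through one hidden layer followed by an output layer."""
    hidden = [hidden_act(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [out_act(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]


# Illustrative (untrained) weights: 3 inputs, 2 hidden neurons, 1 output (MISO shape).
x = [0.0, 0.0, 0.0]
w_hidden = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.1]]
b_hidden = [0.0, 0.0]
w_out = [[1.0, 1.0]]
b_out = [0.5]

out = forward(x, w_hidden, b_hidden, w_out, b_out)  # "logarithmic sigmoid-linear" pair
```

With a zero input both hidden neurons output logsig(0) = 0.5, so the linear output is 0.5 + 0.5 + 0.5 = 1.5. Swapping `hidden_act` or `out_act` reproduces the other pairs that were compared.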
As presented in Table 2, autoassociative and MISO structures were used in the design of the BPNN models. To obtain an autoassociative BPNN with high performance, different network architectures with 2 and 3 hidden neurons were designed, and 10 trials were made with each architecture. Multiple trials were made because the performance of the network varies with the initial values of the weights connecting different layers, and repeating the training helps to ensure that a minimum point of the cost function is reached.
For the MISO BPNNs, the number of hidden neurons was increased from 2 to 15, and again 10 trials with different random initial conditions were run for each architecture. The best network to use in the next parts of the research was selected by comparing R² goodness-of-fit values. Amongst the autoassociative BPNNs, the R² values of the best network were 0.999, 0.999, 0.996, and 0.993 for sensors 1 through 4, respectively. For the MISO BPNNs, the highest scores were 0.836, 0.979, 0.984, and 0.985. Figure 5 shows the regression graphs of the test set for the autoassociative and MISO BPNNs with the best performances.
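As an illustration of the selection criterion, the sketch below computes a goodness-of-fit value between measured and estimated series. It assumes the coefficient-of-determination definition R² = 1 − SS_res/SS_tot; the paper does not spell out which definition was used (regression plots often report the squared correlation coefficient instead), and the numbers are made up.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Made-up values, only to show the calculation.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(measured, predicted)
```

Here SS_res = 0.10 and SS_tot = 5.0, so the score is 0.98. The same function can be applied per sensor and per trial to pick the architecture with the highest value.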

Radial Basis Function Neural Network. RBFNN is a type of feedforward neural network typically having a single layer of hidden units that are connected to linear output units. They are trained in a hybrid manner. Computational parts in the hidden layer use unsupervised learning, and each of them is described by a radial basis function. The size of the units in the hidden layer is the same as the size of the training vector. The output layer is trained in a supervised manner. Mathematical details of RBFNNs can be found in [28].
The output of a neuron of RBFNN is given in the following:
$$y(\mathbf{x}) = \sum_{k=1}^{N} w_k \, \varphi_k\left(\lVert \mathbf{x} - \mathbf{c}_k \rVert\right),$$
where $N$ is the number of neurons in the hidden layer, $\varphi_k$ is the kernel of the radial basis function for each unit, $\mathbf{c}_k$ is the center vector of the radial basis function for neuron $k$, and $w_k$ is the weight of neuron $k$ in the output layer.
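The most common choice for the RBFNN kernel is the Gaussian activation function, which was also used in this research.
It is presented in the following:
$$\varphi_k\left(\lVert \mathbf{x} - \mathbf{c}_k \rVert\right) = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{c}_k \rVert^{2}}{2\sigma_k^{2}}\right),$$
where $\lVert \mathbf{x} - \mathbf{c}_k \rVert^{2}$ is the squared Euclidean distance between the associated center vector and the input vector and $\sigma_k$ is the width factor of the $k$th hidden unit in the hidden layer, which controls the smoothness properties of the interpolating function.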
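A minimal sketch of an RBF network output with a Gaussian kernel is given below. It assumes the common kernel form exp(−‖x − c_k‖² / (2σ_k²)); other conventions scale the exponent differently, and the centers, widths, and weights here are arbitrary illustrative values rather than trained parameters.

```python
import math


def gaussian_kernel(x, center, sigma):
    """Gaussian radial basis function of the squared Euclidean distance to the center."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_output(x, centers, sigmas, weights):
    """Weighted sum of the kernel responses of all hidden units."""
    return sum(w * gaussian_kernel(x, c, s)
               for w, c, s in zip(weights, centers, sigmas))


# Illustrative parameters: two hidden units in a 2-dimensional input space.
centers = [[1.0, 2.0], [100.0, 100.0]]
sigmas = [1.0, 1.0]
weights = [3.0, 5.0]

y = rbf_output([1.0, 2.0], centers, sigmas, weights)
```

Because the input coincides with the first center and lies far from the second, the first kernel responds with 1 and the second with practically 0, so `y` is approximately equal to the first weight. A larger width factor makes each unit respond over a wider region, giving a smoother interpolation.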
Several autoassociative and MISO networks based on the RBFNN approach were developed in this research. The width factors of the networks were selected from values ranging between 0.1 and 100. The performances of the networks were evaluated based on the R² values. The best R² scores amongst the autoassociative RBFNNs were greater than 0.999 for all sensors. The highest scores for the MISO RBFNNs were 0.824, 0.979, 0.982, and 0.984. The networks with the best performances were used in the detailed analysis in the next parts of the research. Regression plots for the test set of the RBFNN networks with the highest performance results are presented in Figure 6.

General Regression Neural Network. GRNN computes the most probable value of an output y given only training vectors x.
In this approach, a specific functional form describing the relation between inputs and outputs is not required; instead, the appropriate form is expressed as a probability density function, which is determined from observed data. Since the parameters are calculated directly from examples, an iterative computation is not required [29].
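The output is calculated by
$$\hat{y}(\mathbf{x}) = \frac{\sum_{i=1}^{n} y_i \exp\left(-\frac{D_i^{2}}{2\sigma^{2}}\right)}{\sum_{i=1}^{n} \exp\left(-\frac{D_i^{2}}{2\sigma^{2}}\right)}, \qquad D_i^{2} = (\mathbf{x} - \mathbf{x}_i)^{T}(\mathbf{x} - \mathbf{x}_i),$$
where $\hat{y}$ is the estimate of the output, which is a weighted average of all the observed samples $y_i$, $\mathbf{x}$ is the input vector, $\mathbf{x}_i$ is the input vector of the $i$th observed sample, $n$ is the number of sample observations, and $\sigma$ is the width factor. Like the approach used for the RBFNNs, the GRNN models were also designed with width factors varying from 0.1 to 100. The highest R² values for the autoassociative GRNN networks were 0.992, 0.990, 0.990, and 0.994 for sensors 1 through 4, respectively, and the best scores for the MISO GRNNs were 0.808, 0.976, 0.973, and 0.981. Figure 7 shows the regression plots for the autoassociative and MISO GRNN models with the highest goodness-of-fit results.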
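The GRNN estimate, a kernel-weighted average of the stored training targets, can be sketched in a few lines. The sketch assumes the standard Gaussian weighting exp(−D_i² / (2σ²)) with D_i² the squared Euclidean distance to the i-th training input; the data are made up and the function name `grnn_predict` is hypothetical.

```python
import math


def grnn_predict(x, train_inputs, train_targets, sigma):
    """Weighted average of training targets, weighted by closeness of x to each input."""
    weights = []
    for xi in train_inputs:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernel responses underflowed: x is too far from every sample for this sigma.
        raise ValueError("input is too far from all training samples for this sigma")
    return sum(w * y for w, y in zip(weights, train_targets)) / total


# Made-up 1-dimensional training data.
train_inputs = [[0.0], [2.0]]
train_targets = [10.0, 20.0]

midpoint = grnn_predict([1.0], train_inputs, train_targets, sigma=1.0)
near_first = grnn_predict([0.0], train_inputs, train_targets, sigma=0.1)
```

An input halfway between the two samples yields their average, 15, whereas a small width factor makes the estimate follow the nearest sample almost exactly. No iterative training is involved: the only tunable quantity is the width factor.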

Results and Discussion
As stated in Section 2, SCADA data do not supply information on incipient temperature sensor faults. Therefore, a multiplicative fault representing a calibration drift was artificially created in this work. The models do not evaluate single-sensor measurements independently; instead, they evaluate them by observing their relationship with the other sensors in the group, which were selected from physically relevant places. This makes it possible to distinguish variations caused by calibration drifts from variations due to real temperature changes. A temperature rise originating from environmental effects would be seen in more than one sensor, whereas a calibration drift would result in a change only in the faulty sensor. To ensure that the proposed algorithm distinguishes the root cause of the temperature change, the networks were tested in 3 different cases. In Case 1, the performances of the ANN models were observed for the test set without any temperature drifts. In Case 2, a calibration fault was simulated by multiplying the output of only one of the sensors by a constant factor. In Case 3, an overall shift in all temperature measurements was simulated, which represents a change originating not from a fault but from a real environmental temperature change.
The details and different expectations from ANN outputs for all cases are summarized as follows.
Case 1. No fault case.
In this case, temperature measurements gathered from the sensors were directly used as network inputs. The 25% of the data that had been set aside as the test set was used for this aim. The data represent the normal behaviour of the turbine without any faults; therefore, the networks are expected to produce output values as close as possible to the real sensor measurements.
Case 2. Multiplicative fault in one of the sensors.
Case 2 was designed to introduce a fault to be detected into the measurement system. A multiplicative fault was artificially created in one of the sensors. The test set for the measurements of the control cabinet temperature sensor (S2) was multiplied by the constant term 1.2. The performances of the networks for this case were evaluated by analyzing the residuals between network outputs and sensor measurements. The expectation is to obtain greater residuals between network outputs and sensor measurements for the faulty sensor.
In Case 3, the measurements from all 4 sensors were shifted by multiplying them by the constant term 1.2. The aim of this overall shift is to ensure that the networks used for this fault detection algorithm do not produce a false fault alarm when none of the sensors are faulty but a real temperature rise is recorded instead. The expectation from the networks for this case is again to produce estimates as close as possible to the actual temperature values.
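The three test inputs can be generated from the same test set as sketched below. The factor 1.2 and the choice of S2 follow the description above; the helper names and the sample readings are hypothetical, and each row is assumed to hold S1 to S4 in order.

```python
def scale_column(rows, index, factor):
    """Multiply the readings of one sensor by a constant factor (calibration drift)."""
    return [[v * factor if i == index else v for i, v in enumerate(r)] for r in rows]


def scale_all(rows, factor):
    """Multiply the readings of every sensor by the same factor (overall shift)."""
    return [[v * factor for v in r] for r in rows]


# Made-up test rows holding readings for S1..S4.
test_rows = [[20.0, 25.0, 15.0, 30.0], [21.0, 26.0, 16.0, 31.0]]

case1 = [list(r) for r in test_rows]                   # Case 1: unmodified measurements
case2 = scale_column(test_rows, index=1, factor=1.2)   # Case 2: drift in S2 only
case3 = scale_all(test_rows, factor=1.2)               # Case 3: overall shift, no fault
```

Each case is then fed to the trained networks, and the residuals between network outputs and the supplied measurements are compared against the expectation for that case.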
Input data for all 3 cases were applied to the resulting 6 models with the best R² values. Outputs of the networks were analyzed based on the different requirements for each case. For the MISO BPNN, as shown in Figure 11, the real data and the network outputs are consistent in the normal operation case. Figure 12 shows that in Case 2, unlike the results from the autoassociative network, the residuals for the control cabinet temperature are significantly bigger than the residuals for the other sensors, with an RMSE value of 7.2 °C. Therefore, this network can be used for the isolation of the fault location. For Case 3 (as presented in Figure 13), the networks produce residuals, which is undesired for this scenario; however, the decision on fault existence would be made with the support of the autoassociative BPNN to prevent false alarms. With this combined decision-making algorithm using different networks, the fault detection system would be able to distinguish faulty and nonfaulty situations in a more sensitive way and give information on the exact location of faults.
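One possible reading of the combined decision logic is sketched below: the autoassociative network decides whether a fault exists, and the MISO networks are consulted only afterwards to locate it. The paper does not give a decision rule or thresholds, so the threshold, the function `diagnose`, and all RMSE values other than the reported 7.2 °C are assumptions made for illustration.

```python
import math


def rmse(measured, estimated):
    """Root mean square error between measured and estimated series."""
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated))
                     / len(measured))


def diagnose(aa_rmse, miso_rmse, detect_threshold):
    """Return the index of the suspected faulty sensor, or None if no fault is declared.

    aa_rmse:   per-sensor RMSE of the autoassociative network (fault detection).
    miso_rmse: per-sensor RMSE of the MISO networks (fault isolation).
    detect_threshold: hypothetical alarm level in degrees Celsius.
    """
    if max(aa_rmse) <= detect_threshold:
        return None  # autoassociative network sees no fault, so no alarm is raised
    # Fault declared: the sensor with the largest MISO residual is the suspect.
    return max(range(len(miso_rmse)), key=lambda k: miso_rmse[k])


# Hypothetical Case 2 residuals; only the 7.2 for S2 is reported in the text.
faulty = diagnose([2.1, 2.4, 1.9, 2.2], [0.8, 7.2, 0.9, 1.0], detect_threshold=1.0)

# Hypothetical Case 3 residuals: MISO networks show residuals, but no fault is declared.
shifted = diagnose([0.2, 0.3, 0.2, 0.1], [3.0, 3.1, 2.9, 3.2], detect_threshold=1.0)
```

In the first call the result is index 1 (sensor S2); in the second it is `None`, reflecting that MISO residuals alone would not trigger an alarm. A deployed system would need thresholds tuned on fault-free data.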

Conclusions
A data-driven method using temperature sensors of a wind turbine SCADA system to detect and isolate sensor faults is proposed in this paper. A significant advantage of this method is the use of sensor data from the SCADA system, which is a built-in part of most large-scale modern wind turbines; therefore, the data collection system does not bring any additional hardware requirements or extra costs. However, since SCADA measurements are of low frequency for fault detection, an intelligent model was needed, which was provided by artificial neural networks in this research. Several artificial neural network architectures were trained for this aim. Based on the evaluation of the simulation results, it is proposed to benefit from the strengths of different architectures. The autoassociative BPNN has proven to be a successful network for distinguishing faulty situations from nonfaulty ones, and the MISO BPNN gives satisfactory results in terms of finding the exact sensor with faulty behaviour. The results demonstrate that the proposed method is feasible and effective for sensor fault detection and isolation. As future work, the research can be extended to cover other measurements gathered from wind turbines to build an overall fault detection system.

Figure 1: Subsystems of the wind turbine.

Figure 2: Locations of the sensors used in this research.

Figure 8: Measured and network values for Case 1 with autoassociative BPNN. Panel title: Rear hub bearing temperature.

Figure 9: Measured and network output values for Case 2 with autoassociative BPNN.

Figure 10: Measured and network output values for Case 3 with autoassociative BPNN. Panel title: Rear hub bearing temperature.

Figure 11: Measured and network values for Case 1 with MISO BPNN.

Figure 12: Measured and network values for Case 2 with MISO BPNN. Panel title: Rear hub bearing temperature.

Figure 13: Measured and network values for Case 3 with MISO BPNN.

Table 1: Temperature sensors from the SCADA system.
Case 3. No fault case. Overall shift in all measurements due to environmental temperature rise.