Nonlinear Cointegration Approach for Condition Monitoring of Wind Turbines

Monitoring of trends and removal of undesired trends from operational/process parameters in wind turbines is important for their condition monitoring. This paper presents homoscedastic nonlinear cointegration as a solution to this problem. The cointegration approach used leads to stable variances in cointegration residuals. An adapted Breusch-Pagan test procedure is developed to test for the presence of heteroscedasticity in cointegration residuals obtained from the nonlinear cointegration analysis. Examples using three different time series data sets (one with a nonlinear quadratic deterministic trend, another with a nonlinear exponential deterministic trend, and experimental data from a wind turbine drivetrain) are used to illustrate the method and demonstrate possible practical applications. The results show that the proposed approach can be used for effective removal of nonlinear trends from various types of data, allowing for possible condition monitoring applications.


Introduction
Recent forecasts show that renewable energy sources will be generating more than 25% of the world's electricity by 2035, with a quarter of this coming from wind [1]. The data imply that wind energy is one of the fastest growing renewable energy sources. The growing interest in the wind energy sector has led to the rapid expansion of onshore and offshore wind farms. This expansion has drawn attention to the operation and maintenance of wind turbines (WTs), especially when turbines are deployed offshore [1-3]. In addition, accurate forecasting of long-term wind speed and annual wind power production is greatly desired to minimize scheduling errors and in turn increase the reliability of the electric power grid and reduce power production costs [4,5].
It is well known that unexpected failures of turbine components or subsystems (such as gearboxes, generators, rotors, and electric systems) can lead to costly repair and often months of machine unavailability, thereby increasing operation/maintenance costs and subsequently the cost of energy. Therefore, condition monitoring (CM) and fault diagnosis of WTs, in particular at the early stage of fault occurrence, is an essential problem in wind turbine engineering [2,3].
Many CM techniques have been developed to detect and diagnose abnormalities of WTs with the goal of improving gearbox reliability and increasing turbine availability, thereby reducing operation and maintenance costs, as reviewed in the literature [2,6-8]. These include vibration analysis, oil monitoring and analysis, acoustic emission, ultrasonic testing techniques, strain measurement, process performance monitoring, radiographic inspection, and thermography. Another solution, based on the use and analysis of Supervisory Control And Data Acquisition (SCADA) data, has been recently employed in [3,9-13]. This technique is cost-efficient, readily available, does not require investments related to dedicated CM systems, and is beneficial for identifying abnormal WT components since only key operational or process parameters need to be tracked [3,11,12]. Monitoring of trends and removal of undesired trends from these parameters is one of the most important problems when SCADA approaches are used. Various methods have been developed for data trend analysis. Recent years have seen numerous applications based on cointegration. The major idea used in these investigations is based on the concept of stationarity. In a simplified description, nonstationary processes are cointegrated if a linear combination of these processes leads to a stationary process. When cointegration is used for structural health monitoring (SHM) and damage detection, monitored variables (signals or features) are cointegrated to create a stationary residual whose stationarity represents the intact (or normal) condition. Then any departure from stationarity can indicate that monitored processes or structures are no longer operating under normal conditions.
The cointegration approach, originally developed in the field of Econometrics in the late 1980s and early 1990s [14-16], has been successfully employed as a reliable tool for dealing with the problem of operational and environmental variability in Process Engineering [17] and Structural Health Monitoring (SHM) [18-24]. All these applications utilized the linear cointegration concept that is intimately connected with the concept of linear error correction models. More recently, research on linear cointegrated time series was extended in Econometrics to two major nonlinear approaches, as overviewed in [25]. The first approach focused on nonlinear short-run dynamics in error correction models, with the goal being to model potentially nonlinear adjustment mechanisms to deviations from long-run equilibrium relations. The best-known example of this approach is the concept of threshold cointegration and its smooth versions that were intensively studied in [26,27]. The second approach attempted to make the cointegrating relations themselves nonlinear. The model used in this context is a nonlinear cointegrating regression or a nonlinear regression with integrated regressors, as discussed in [28,29]. The work in [30] brought the concept of nonlinear cointegration to SHM, where data trends have nonlinear characteristics. That work proposed two possible approaches to nonlinear cointegration, that is, an optimisation-based method and a variation of the well-established Johansen procedure that is based on the use of an augmented basis. Both methods were examined using a simple theoretical example (i.e., time series with a nonlinear quadratic deterministic trend) and experimental vibration bridge data. Although this study demonstrates some interesting results, two major problems were observed. Firstly, with respect to the theoretical example, the variance of cointegration residuals increased with time, although cointegrated variables were mean stationary. This behavior, known in mathematics as heteroscedasticity, implied that strictly stationary cointegration residuals could not be obtained. Secondly, with respect to the bridge case study, to avoid the problem of nonlinearity between modal frequencies and temperature, the entire data set was not used in the analysis and thus the nonlinear temperature-dependent trends were not completely removed. It is clear that reliable trend removal and damage detection/monitoring methods based on nonlinear cointegration will require homoscedastic cointegration residuals (that is, residuals that are strictly stationary) to avoid false monitoring and detection results.
The paper addresses the problem of trend removal/analysis of wind turbine operational data. A homoscedastic (or variance-stabilizing) nonlinear cointegration approach is proposed for this task. The objective is to demonstrate a new approach that could potentially be used for condition monitoring and fault detection of wind turbines in the presence of nonlinearity between operational parameters. It is important to note that, in the context of the material presented, homoscedasticity relates to the stable behavior of the variance of cointegration residuals.
Previous approaches generally dealt with the existence of heteroscedasticity in the primitive or original data before performing any further analysis. In this paper, however, we address the existence of heteroscedasticity in cointegration residuals obtained from the nonlinear cointegration of time series data. In more detail, we have solved the problem of increasing (or unstable) variance of cointegration residuals. To the best of the authors' knowledge, this problem, as well as heteroscedasticity in nonlinear cointegration in general, has not been previously investigated in the literature.
The paper is structured as follows. Sections 2 and 3 introduce the concepts of linear and nonlinear cointegration, respectively. The latter addresses two existing problems with heteroscedasticity and nonlinear trend removal in the nonlinear cointegration method when used for trend monitoring/analysis. Section 4 presents a new variance-stabilizing nonlinear cointegration method to overcome these problems. An adapted procedure to test for the presence of heteroscedasticity in cointegration residuals obtained from the nonlinear cointegration analysis is also proposed. Examples using three different time series data sets (one with a nonlinear quadratic deterministic trend, another with a nonlinear exponential deterministic trend, and one utilizing experimental wind turbine data) are given in Section 5 to illustrate the method and demonstrate possible wind turbine condition monitoring applications. Finally, the paper is concluded in Section 6.

Linear Cointegration
For the sake of completeness, this section briefly introduces the concept of linear cointegration. Firstly, stationarity and nonstationarity of time series are discussed.
In mathematics the concept of stationarity can be introduced using time series analysis. A given time series $y_t$ can be presented in the form of the first-order autoregressive AR(1) process, which is defined as [31]

$$y_t = \phi\, y_{t-1} + \varepsilon_t, \tag{1}$$

where $\varepsilon_t$ is an independent Gaussian white noise process with zero mean, that is, $\varepsilon_t \sim \mathrm{IWN}(0, \sigma^2)$. Then three different time series can be distinguished for different values of the coefficient $\phi$ [31]. These are (1) stationary time series ($|\phi| < 1$); (2) nonstationary time series ($\phi > 1$); and (3) random walk ($\phi = 1$). Any time series $y_t$ that exhibits the form of a random walk without a trend is considered as an integrated series of order 1, denoted as $I(1)$ [32]. For such a series (1) yields

$$y_t - y_{t-1} = \varepsilon_t. \tag{2}$$

Equation (2) shows that the first difference of $y_t$, that is, $y_t - y_{t-1}$, is just a stationary white noise process $\varepsilon_t$. In other words, a nonstationary $I(1)$ time series becomes a stationary $I(0)$ time series after the first difference. By analogy, a nonstationary $I(2)$ time series would require differencing twice to induce a stationary $I(0)$ time series. The number of differences required to achieve stationarity is called the order of integration and therefore time series of order $d$ are denoted as $I(d)$.
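The distinction between a stationary AR(1) series and a random walk, and the effect of first differencing, can be illustrated with a minimal simulation. The sketch below is not part of the original study: the function names, the coefficient values, the sample size, and the split into two halves as a crude stationarity check are all assumptions made for illustration.

```python
import random
import statistics


def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate y_t = phi * y_{t-1} + eps_t with Gaussian white noise and y_0 = 0."""
    rng = random.Random(seed)
    y, series = 0.0, []
    for _ in range(n):
        y = phi * y + rng.gauss(0.0, sigma)
        series.append(y)
    return series


def first_difference(series):
    """Return y_t - y_{t-1} for every consecutive pair of samples."""
    return [b - a for a, b in zip(series, series[1:])]


def half_variances(series):
    """Sample variance of the first and second halves of a series."""
    mid = len(series) // 2
    return statistics.variance(series[:mid]), statistics.variance(series[mid:])


if __name__ == "__main__":
    stationary = simulate_ar1(phi=0.5, n=2000)   # |phi| < 1
    random_walk = simulate_ar1(phi=1.0, n=2000)  # phi = 1, an I(1) series
    print("AR(1), phi = 0.5    :", half_variances(stationary))
    print("random walk         :", half_variances(random_walk))
    print("differenced walk    :", half_variances(first_difference(random_walk)))
```

For the random walk the two half-sample variances generally differ markedly, whereas after one difference the series is again white noise, as stated by (2).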
Following this short introduction, the concept of linear cointegration can be introduced using a vector $Y_t$ of $I(1)$ time series defined as $Y_t = (y_{1t}, y_{2t}, \ldots, y_{nt})^T$. This vector is linearly cointegrated if there exists a vector $\beta = (\beta_1, \beta_2, \ldots, \beta_n)^T$ such that

$$\beta^T Y_t = \beta_1 y_{1t} + \beta_2 y_{2t} + \cdots + \beta_n y_{nt} \sim I(0). \tag{3}$$

In other words, the nonstationary $I(1)$ time series in $Y_t$ are linearly cointegrated if there exists at least one linear combination of them that is stationary, that is, having the $I(0)$ status. This linear combination, denoted as $\beta^T Y_t$, is referred to as a cointegration residual or a long-run equilibrium relationship between time series [32]. The vector $\beta$ is called a cointegrating vector. The action of creating the cointegration residual ($u_t = \beta^T Y_t$) is considered as the action of projecting the vector $Y_t$ on the cointegrating vector $\beta$. The cointegration relationship given by (3) can be extended to multiple cointegration. Then the vector $Y_t$ is cointegrated with $r$ (where $0 < r < n$) linearly independent cointegrating vectors if there exists a matrix $B = (\beta_1, \ldots, \beta_r)$ such that

$$B^T Y_t = \left(\beta_1^T Y_t, \ldots, \beta_r^T Y_t\right)^T = (u_{1t}, \ldots, u_{rt})^T \sim I(0). \tag{4}$$

The stationary linear combinations $u_t = B^T Y_t$ are referred to as the $r$ cointegration residuals that are formed through projecting the vector $Y_t$ on the cointegrating matrix $B$.
In essence, testing for linear cointegration is testing for the existence of long-run equilibria (or stationary linear combinations) among all elements of $Y_t$. Such tests have two important requirements [32]. Firstly, the analysed time series must share at least one common trend. Secondly, the analysed time series must have the same degree of nonstationarity, that is, they must be integrated of the same order.
In general, the linear cointegration test consists of two steps.
(1) The first step is to determine the existence of cointegration relationships and the number of linearly independent cointegrating vectors among multivariate (nonstationary) time series and to form the cointegration residuals.
(2) The second step is to perform unit root tests on the cointegration residuals found to determine if they are stationary series (i.e., testing for stationarity).
For the first step, the Johansen cointegration method, developed in [15], has been widely used. It is a sequential procedure based on maximum likelihood techniques, which basically is a combination of cointegration and error correction models in a Vector Error Correction Model (VECM). Two test statistics (i.e., trace and maximum eigenvalue statistics) for determining the existence of cointegration and the number of linearly independent cointegrating relationships among the time series in $Y_t$ were developed in [15]. These test statistics are quite complex and thus are not presented in this paper. For a more detailed description of the entire procedure, readers are referred to [15]. For the second step, the augmented Dickey-Fuller (ADF) test, described in [33], is the most popular unit root test. The ADF test checks the null hypothesis that a time series is nonstationary against the alternative hypothesis that it is stationary, assuming that the dynamics in the data have an Autoregressive Moving Average (ARMA) structure [32].
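The two-step logic (form a residual, then test it for a unit root) can be sketched in a few lines. The sketch below deliberately swaps in simpler techniques than those named above: ordinary least squares replaces the Johansen procedure for finding the cointegrating relation, and a plain Dickey-Fuller regression without lagged differences replaces the ADF test. The simulated data, the function names, and the approximate 5% critical value of −3.34 (a commonly tabulated value for a residual-based test with two variables and a constant) are assumptions for illustration only.

```python
import random


def ols_line(x, y):
    """Least-squares intercept and slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def df_t_statistic(u):
    """t-statistic of rho in du_t = rho * u_{t-1} + e_t (no lags, no constant)."""
    lagged = u[:-1]
    diffs = [b - a for a, b in zip(u, u[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    rss = sum((d - rho * v) ** 2 for v, d in zip(lagged, diffs))
    sigma2 = rss / (len(diffs) - 1)
    return rho / (sigma2 / sxx) ** 0.5


def simulate_pair(n=1000, seed=1):
    """Two I(1) series driven by one shared random-walk trend (hypothetical data)."""
    rng = random.Random(seed)
    trend, x, y = 0.0, [], []
    for _ in range(n):
        trend += rng.gauss(0.0, 1.0)
        x.append(trend + rng.gauss(0.0, 0.5))
        y.append(2.0 * trend + rng.gauss(0.0, 0.5))
    return x, y


if __name__ == "__main__":
    x, y = simulate_pair()
    intercept, slope = ols_line(x, y)                              # step 1: cointegrating relation
    u = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    print("estimated slope          :", slope)
    print("DF statistic of residual :", df_t_statistic(u))         # step 2: unit root check
    print("assumed 5% critical value: -3.34")
```

A residual statistic far below the critical value rejects the unit root, which is the indication that the two series are cointegrated. For real data with more than two variables the Johansen procedure and a properly augmented test remain the appropriate tools.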

Nonlinear Cointegration
It is well known that time series responses from engineering structures often exhibit nonlinear behavior. Moreover, operational and/or environmental common trends are typically believed to be nonlinearly related to response data used for damage detection. If this is the case, then the linear cointegration theory described in Section 2 is in practice no longer suitable for condition monitoring and structural damage detection, and therefore a nonlinear approach to cointegration is needed. This section provides a brief introduction to nonlinear cointegration and recalls one previously investigated example from the literature. The latter is shown to demonstrate the major difficulty associated with nonlinear cointegration.
In the last twenty years, nonlinear cointegration has been studied in many different contexts, as discussed in [25-30,34-36]. Previous research work, summarized in [34,35], has demonstrated that nonstationarities and nonlinearities should be analysed simultaneously because in time series analysis nonlinearities often exist in a nonstationary context. However, it is not easy to reach this goal because the inherent difficulties in analysing nonlinear time series models within a stationary and ergodic framework are enhanced in nonstationary contexts. This issue also applies to cointegration analysis when it is used for nonlinear and nonstationary processes. Hence, as discussed in [34,35], other definitions of stationarity and nonstationarity are needed in order to better characterise the usual notion of stationary $I(0)$ and nonstationary $I(1)$ time series and cointegration in nonlinear contexts. The concepts of short memory and extended memory variables are commonly used to ease this task.
A time series is said to be short memory if its information decays through time. In particular, a variable is short memory in mean (or in distribution) if the conditional mean (or conditional distribution) of the variable at time $t$ given the information at time $t - h$ converges to a constant as $h$ diverges to infinity. Shocks in short memory time series have transitory effects. In contrast, a time series is said to be extended memory in mean (or in distribution) if it is not short memory in mean (or in distribution). Shocks in extended memory time series have permanent effects. This means that the concept of short memory in this context can be considered as a somewhat stronger condition than stationarity, and the concept of extended memory can be thought of as a somewhat weaker condition than nonstationarity.
Following this introduction, a general definition of nonlinear cointegration has been proposed in [35]: "If two or more series are of extended memory, but a nonlinear transformation of them is short memory, then the series are said to be nonlinearly cointegrated." However, a simpler and more common definition of nonlinear cointegration is used in the current investigations. Two nonstationary time series $x_t$ and $y_t$ are nonlinearly cointegrated if there exists a nonlinear function $f$ such that $u_t = f(x_t, y_t)$ is stationary.
This simplified definition is still too general to be fully operative, and, moreover, identification problems might arise in this general context [35]. Hence, in practice some classes of function $f$ are often used to avoid such identification problems. For example, one can consider a function of the form $u_t = g(x_t) - h(y_t)$ and estimate $g$ and $h$ by using nonparametric estimation procedures, as performed in [36]. Another approach is to consider transformations of the form $y_t = f(x_t) + u_t$, $u_t = f(x_t) - y_t$, or $u_t = y_t - f(x_t)$, as discussed in [25,35]. The second approach is believed to be convenient for exploring nonlinear cointegration; therefore it has been used in the current paper. However, the question of how to construct a nonlinear function still remains. This problem is further discussed in the following sections.
Nonlinear cointegration has been recently proposed for SHM applications in [30]. The results showed that nonlinear cointegrating vectors were created, the nonlinear trend was successfully removed, and stationary residuals were found for the analysed time series. However, the variance of cointegration residuals was increasing with time, although cointegrated variables were mean stationary. The analysed cointegration residuals were therefore not strictly stationary. It is important to note that, regardless of the nature of the driving trend, the approach used in [30] will always result in cointegration residuals whose variances depend on that trend, as concluded in [30]. As a result, heteroscedasticity will always be present in cointegration residuals obtained from that approach. When the method is used for condition monitoring and damage detection, this can lead to serious consequences.
It is well known that the variance (or the volatility, which is the square root of the variance) of a time series often changes over time [31,32]. This characteristic, referred to as heteroscedasticity, was first recognized in the early 1960s [37]. The complementary notion of heteroscedasticity is called homoscedasticity. In regression analysis, homoscedasticity means a situation in which the variance of the dependent variables is the same for all analysed data, whereas heteroscedasticity means a situation in which the variance of the dependent variables varies across the analysed data. Consequently, homoscedasticity facilitates analysis because most methods in regression analysis are based on an assumption of equal variance, whereas heteroscedasticity complicates analysis [38-40]. It is well known that serious violations of homoscedasticity, that is, assuming that a given distribution of data is homoscedastic when actually it is heteroscedastic, can lead to invalid, imprecise, and ineffective analyses of heteroscedastic time series, as explained in [32,38-40]. For example, when statistical uncertainty or probability of damage detection was analysed in SHM under the assumption of homoscedasticity while the time series data were actually heteroscedastic, the resulting confidence intervals could be erroneous. It is also well known in regression analysis that in the presence of heteroscedastic disturbances the loss of efficiency in using ordinary least squares could be substantial and, more importantly, the biases in estimated standard errors could lead to invalid inferences [32,39]. In addition, the presence of heteroscedasticity may signal inadequacy of the estimated model [32]. Hence, it is important to test for the presence of heteroscedasticity in time series before any analysis.

Homoscedastic Nonlinear Cointegration
4.1. Theoretical Background. A homoscedastic nonlinear cointegration method is proposed in this section. The method overcomes the heteroscedasticity problem related to cointegration residuals by offering a variance-stabilizing nonlinear cointegration.
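Following the work presented in [30], two time series can be defined as

$$x_t = d_t + \varepsilon_{1,t}, \tag{5}$$

$$y_t = f(d_t) + \varepsilon_{2,t}, \tag{6}$$

where $d_t$ is some deterministic trend caused by the external disturbance; $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ are independent and identically distributed random processes; and the function $f(d_t)$ has a continuous and differentiable first derivative. It is assumed that $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ have zero mean and that they are small relative to $d_t$. Then nonlinear cointegration can take the form

$$u_t = g(x_t) - y_t. \tag{7}$$

Substituting (5) and (6) into (7) yields the cointegration residual as

$$u_t = g(d_t + \varepsilon_{1,t}) - f(d_t) - \varepsilon_{2,t}. \tag{8}$$

It is clear that for $u_t$ to become a zero mean series the cointegrating function $g = f$ can be used to obtain

$$u_t = f(d_t + \varepsilon_{1,t}) - f(d_t) - \varepsilon_{2,t}. \tag{9}$$

The application of the first-degree Taylor approximation formula, defined as

$$f(a + h) \approx f(a) + f'(a)\,h, \tag{10}$$

for $f(d_t + \varepsilon_{1,t})$ gives

$$f(d_t + \varepsilon_{1,t}) \approx f(d_t) + f'(d_t)\,\varepsilon_{1,t}. \tag{11}$$

Then, substituting (11) into (9) yields

$$u_t \approx f(d_t) + f'(d_t)\,\varepsilon_{1,t} - f(d_t) - \varepsilon_{2,t}. \tag{12}$$

The above equation reduces to

$$u_t \approx f'(d_t)\,\varepsilon_{1,t} - \varepsilon_{2,t}. \tag{13}$$

Equation (13) shows that the cointegration residual $u_t$ is zero mean, but its variance is not constant and strongly depends on the deterministic trend $d_t$. Since $\varepsilon_{2,t}$ is independent of $\varepsilon_{1,t}$ and $d_t$, the variance of $u_t$ can be estimated as

$$\operatorname{Var}(u_t) \approx \operatorname{Var}\!\left(f'(d_t)\,\varepsilon_{1,t}\right) + \operatorname{Var}(-\varepsilon_{2,t}), \tag{14}$$

$$\operatorname{Var}(u_t) \approx [f'(d_t)]^2 \operatorname{Var}(\varepsilon_{1,t}) + \operatorname{Var}(\varepsilon_{1,t}) = \left(1 + [f'(d_t)]^2\right)\operatorname{Var}(\varepsilon_{1,t}). \tag{15}$$

It should be noted that the term $\operatorname{Var}(-\varepsilon_{2,t})$ in (14) was replaced by the term $\operatorname{Var}(\varepsilon_{1,t})$ in (15). This is justified because $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ are independent and identically distributed random variables and, as mentioned above, $\varepsilon_{2,t}$ is independent of $\varepsilon_{1,t}$ and $d_t$. Therefore, without loss of generality, one can take $\operatorname{Var}(\varepsilon_{1,t}) \approx \operatorname{Var}(-\varepsilon_{2,t})$. Furthermore, the substitutions in (15) are valid because $d_t$ is deterministic and by using the formula $\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)$ for a constant $a$.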
From (5) one can take that $d_t \approx x_t$ and then (15) becomes

$$\operatorname{Var}(u_t) \approx \left(1 + [f'(x_t)]^2\right)\operatorname{Var}(\varepsilon_{1,t}). \tag{16}$$

Equation (16) shows that the variance $\operatorname{Var}(u_t)$ is not constant because the standard deviation of $u_t$ scales with $\sqrt{1 + [f'(x_t)]^2}$. This is where the problem of heteroscedasticity appears. In order to solve this problem, the transformation $T$ for the cointegration residual $u_t$ can be proposed as

$$T(u_t) = \frac{u_t}{\sqrt{1 + [f'(x_t)]^2}}. \tag{17}$$

Finally, one obtains the transformed cointegration residual $T(u_t)$ that has the form

$$T(u_t) = \frac{g(x_t) - y_t}{\sqrt{1 + [f'(x_t)]^2}}. \tag{18}$$

Equation (18) shows that $\sqrt{1 + [f'(x_t)]^2}$ is constant if and only if $f'(x_t)$ is constant. Moreover, when $f'(x_t)$ is constant then $f(x_t)$ is linear, which thus implies that $x_t$ and $y_t$ are linearly related. This explains why cointegration residuals created in the context of linear cointegration are homoscedastic without any modification. When (7) is used together with the condition for $u_t$ to become a zero mean series (i.e., when $g = f$), (18) becomes

$$u_t^{*} = \frac{f(x_t) - y_t}{\sqrt{1 + [f'(x_t)]^2}}. \tag{19}$$

This equation presents the modified cointegration residual $u_t^{*}$ that is approximately zero mean and homoscedastic. The proposed method is general and therefore can apply to any heteroscedastic time series data.
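The effect of the variance-stabilizing division can be checked numerically. The following sketch simulates two series of the form $x_t = d_t + \varepsilon_{1,t}$ and $y_t = f(d_t) + \varepsilon_{2,t}$, computes the original residual $f(x_t) - y_t$ and its scaled counterpart, and compares their variances over the first and second halves of the record. The trend $d_t = t/100$, the noise level, the sample size, and all function names are hypothetical choices made for illustration, not values taken from the study.

```python
import math
import random
import statistics


def residuals(x, y, f, f_prime):
    """Return (u, u_star): the residual f(x) - y and its variance-stabilized version."""
    u = [f(a) - b for a, b in zip(x, y)]
    u_star = [r / math.sqrt(1.0 + f_prime(a) ** 2) for r, a in zip(u, x)]
    return u, u_star


def simulate(f, n=2000, sigma=0.1, seed=2):
    """x_t = d_t + eps1 and y_t = f(d_t) + eps2 with an assumed trend d_t = t / 100."""
    rng = random.Random(seed)
    x, y = [], []
    for t in range(1, n + 1):
        d = t / 100.0
        x.append(d + rng.gauss(0.0, sigma))
        y.append(f(d) + rng.gauss(0.0, sigma))
    return x, y


def half_variances(series):
    """Sample variance of the first and second halves of a series."""
    mid = len(series) // 2
    return statistics.variance(series[:mid]), statistics.variance(series[mid:])


if __name__ == "__main__":
    quadratic = lambda v: v * v        # f(x) = x^2
    derivative = lambda v: 2.0 * v     # f'(x) = 2x
    x, y = simulate(quadratic)
    u, u_star = residuals(x, y, quadratic, derivative)
    print("original residual, half variances :", half_variances(u))
    print("modified residual, half variances :", half_variances(u_star))
```

Under these assumptions the variance of the original residual grows from the first half to the second, while the variance of the modified residual stays close to the noise variance in both halves.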

Adapted Breusch-Pagan Test Procedure for Heteroscedasticity in Cointegration Residuals.
Various tests for heteroscedasticity can be used in practice [38-40]. The Breusch-Pagan test [39] is one of the most widely used procedures. In principle, the Breusch-Pagan test checks for conditional heteroscedasticity; that is, it checks whether the estimated variance of the residuals from a regression is dependent on the values of the independent variables. The procedure is based on the Lagrange Multiplier (LM) test statistic with an assumption that the error terms are normally distributed [32].
The linear time series regression model for one independent variable $x_1$ can be written as

$$y_t = \beta_0 + \beta_1 x_{1,t} + u_t, \tag{20}$$

where $y_t$ is a dependent variable, $u_t$ is a random error term (or a residual), and $(\beta_0, \beta_1)$ are coefficients. In order to test for the presence of heteroscedasticity in the residual $u_t$ the auxiliary regression model is formed as

$$u_t^2 = \gamma_0 + \gamma_1 x_{1,t} + v_t, \tag{21}$$

where $(\gamma_0, \gamma_1)$ are coefficients and $v_t$ is the error term of the auxiliary regression. The Breusch-Pagan heteroscedasticity test is performed by regressing the squared residuals directly on the independent variables. In the linear time series regression model, one can assume that the mean of the residual $u_t$ is zero. Hence, the estimated variance of the residual (i.e., $u_t^2$) in (21) is constant if and only if it is independent of the independent variable $x_1$. If this is the case, then $\gamma_1$ should be close or equal to zero. The LM test statistic is used to evaluate the significance of $\gamma_1$.
It should be noted that only one independent variable has been used in the current investigations. In the general case, when more than one independent variable is employed, the test statistic equals $nR^2$, where $n$ is the sample size and $R^2$ is the coefficient of determination in the auxiliary regression. For a more detailed description of the original Breusch-Pagan test procedure in the general case, readers are referred to [39].
Because the original Breusch-Pagan test can only be used to test for heteroscedasticity in a linear regression model, the test has been adapted to be suitable for the work presented in this paper, that is, to test for the presence of heteroscedasticity in the cointegration residuals obtained from nonlinear cointegration analysis. In order to achieve this, the linear regression model in (20) is rewritten in the form $y_t - f(x_t) = u_t$, where $f(x_t) = \beta_0 + \beta_1 x_{1,t}$. Next, in the general case, the mean of $u_t$ should not be assumed to be equal to zero, so that the estimated variance of the residual $u_t$ can take the form $[u_t - E(u_t)]^2$, where $E(u_t)$ is the mean of $u_t$, which can be estimated by taking the average value of all residuals. Then the auxiliary regression model can be formed as

$$[u_t - E(u_t)]^2 = \gamma_0 + \gamma_1 x_t + v_t. \tag{22}$$

Following the same discussion as above, (22) shows that the residual $u_t$ is homoscedastic if the term on the left, that is, $[u_t - E(u_t)]^2$, is independent of $x_t$. This implies that the coefficient $\gamma_1$ should be equal to zero. In the current work, the significance of $\gamma_1$ is assessed by using the Student $t$-test statistic (instead of the LM test statistic) since it is more common. The $t$-test statistic used for the adapted Breusch-Pagan test procedure can be described as follows.
The hypotheses to be tested are the following.
(i) Null Hypothesis. The variances of cointegration residuals of the auxiliary regression model are constant ⇒ Heteroscedasticity is not present in the cointegration residual.
(ii) Alternative Hypothesis. The variances of cointegration residuals of the auxiliary regression model are unequal ⇒ Heteroscedasticity is present in the cointegration residual.
More specifically, the null hypothesis is retained (the cointegration residual $u_t$ is treated as homoscedastic) if the coefficient $\gamma_1$ is insignificant ($p$ value > 0.05). Conversely, the null hypothesis is rejected in favour of the alternative (the cointegration residual $u_t$ is treated as heteroscedastic) if the coefficient $\gamma_1$ is significant ($p$ value ≤ 0.05).
It should be noted that  1 can be considered in the auxiliary regression model in (22), instead of   .Since   has been considered in the auxiliary regression model, the correlation between the absolute values of   and  1 can be checked to determine how the values of  1 deviate from   .
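The adapted procedure amounts to regressing the centred, squared residual on the independent variable and testing the slope of that auxiliary regression. A minimal sketch is given below under two stated assumptions: the function name `adapted_breusch_pagan` is hypothetical, and the $p$ value is obtained from the standard normal distribution as a large-sample stand-in for the Student $t$ distribution, so it is only appropriate for long records. The synthetic residuals in the demonstration are likewise invented for illustration.

```python
import math
import random
from statistics import NormalDist


def adapted_breusch_pagan(u, x):
    """Regress [u_t - mean(u)]^2 on x_t and return (t statistic, p value) of the slope."""
    n = len(u)
    mean_u = sum(u) / n
    z = [(r - mean_u) ** 2 for r in u]          # left-hand side of the auxiliary regression
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - mz) for a, b in zip(x, z)) / sxx
    intercept = mz - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, z))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_err
    # Assumption: normal approximation to the t distribution (large n only).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


if __name__ == "__main__":
    rng = random.Random(4)
    x = [1.0 + 9.0 * i / 1999 for i in range(2000)]
    u_het = [rng.gauss(0.0, 1.0) * xi for xi in x]   # spread grows with x
    u_hom = [rng.gauss(0.0, 1.0) for _ in x]         # constant spread
    print("heteroscedastic residual:", adapted_breusch_pagan(u_het, x))
    print("homoscedastic residual  :", adapted_breusch_pagan(u_hom, x))
```

A $p$ value at or below 0.05 leads to rejection of the null hypothesis of constant variance. For short records the exact Student $t$ distribution with $n - 2$ degrees of freedom should be used instead of the normal approximation.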

Application Examples
Three examples that explain the homoscedastic nonlinear cointegration method and illustrate its application to nonlinear trend removal and a possible condition monitoring solution for wind turbines are presented in this section. These examples use three different time series data sets, that is, one data set with a nonlinear quadratic deterministic trend, another with a nonlinear exponential deterministic trend, and one set of experimental data from a wind turbine.

Quadratic Cointegrating Function.
This section recalls the nonlinear cointegrating function $f(x_t) = x_t^2$ that has been used in [30]. The objective is to demonstrate that the homoscedastic nonlinear cointegration method presented in Section 4.1 can remove the heteroscedasticity from cointegration residuals.
When the nonlinear cointegration form given by (7) is used, the original cointegration residual can be calculated as

$$u_t = x_t^2 - y_t. \tag{23}$$

Similarly, the homoscedastic nonlinear cointegration given by (19) can also be used to obtain the modified cointegration residual:

$$u_t^{*} = \frac{x_t^2 - y_t}{\sqrt{1 + 4x_t^2}}. \tag{24}$$

Figures 1(a) and 1(b) present the original cointegration residual $u_t$ and the modified cointegration residual $u_t^{*}$, respectively. The results show that the nonlinear quadratic deterministic trend was successfully removed in both cases. However, the variance of $u_t$ increases with time (i.e., $u_t$ is heteroscedastic), whereas the variance of $u_t^{*}$ is relatively stable (i.e., $u_t^{*}$ is homoscedastic). The adapted Breusch-Pagan test procedure described in Section 4.2 was used to confirm these results. The test statistic for $u_t$ is significant because $p$ value < 0.0001 < 0.05; therefore $u_t$ is heteroscedastic. In contrast, the relevant test statistic for $u_t^{*}$ is insignificant because $p$ value = 0.579 > 0.05, so that $u_t^{*}$ is homoscedastic. In addition, the correlations between $x_t$ and the absolute values of $u_t$ and $u_t^{*}$ were calculated as 0.567 and −0.006, respectively. This means that $x_t$ contains less information about the deviation of $u_t^{*}$ in comparison with $u_t$. This simple example demonstrates that the proposed homoscedastic nonlinear cointegration method can successfully remove heteroscedasticity from cointegration residuals.

Exponential Cointegrating Function.
The data used in this example consist of two different time series variables $x_t$ and $y_t$. The first variable $x_t$ reacts linearly with respect to time $t$, whereas the second variable $y_t$ reacts nonlinearly, in an exponential way, with time $t$. The given variables take the form

$$x_t = t + \varepsilon_{1,t}, \tag{25}$$

$$y_t = e^{t/50} + \varepsilon_{2,t}, \tag{26}$$

where $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ are independent and identically distributed random processes. Following the homoscedastic nonlinear cointegration method presented in Section 4.1, the nonlinear cointegrating function $f(x_t) = e^{x_t/50}$ has been selected for this case. The original cointegration residual, calculated by using the nonlinear cointegration form given by (7), has the form

$$u_t = e^{x_t/50} - y_t. \tag{27}$$

In the same way, when the homoscedastic nonlinear cointegration form given by (19) is used, the modified cointegration residual takes the form

$$u_t^{*} = \frac{e^{x_t/50} - y_t}{\sqrt{1 + \left[e^{x_t/50}/50\right]^2}}. \tag{28}$$

Figures 2(a) and 2(b) present the original cointegration residual $u_t$ and the modified cointegration residual $u_t^{*}$, respectively. The results show that the nonlinear exponential deterministic trend was successfully removed in both cases. However, the variance of $u_t$ increases with time (i.e., $u_t$ is heteroscedastic), but the variance of $u_t^{*}$ is fairly stable (i.e., $u_t^{*}$ is homoscedastic). The adapted Breusch-Pagan test procedure described in Section 4.2 was used to confirm these results. The test statistic for $u_t$ is significant because $p$ value < 0.0001 < 0.05; therefore $u_t$ is heteroscedastic. In contrast, the relevant test statistic for $u_t^{*}$ is insignificant because $p$ value = 0.977 > 0.05, so that $u_t^{*}$ is homoscedastic. Moreover, the correlations between $x_t$ and the absolute values of $u_t$ and $u_t^{*}$ were calculated as 0.384 and 0.0002, respectively. This implies that $x_t$ contains less information about the deviation of $u_t^{*}$ in comparison with $u_t$. Thus the second example also demonstrates that the homoscedastic nonlinear cointegration method has effectively removed the heteroscedasticity from the cointegration residuals.
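The correlation diagnostic used in both synthetic examples (the correlation between the independent variable and the absolute value of each residual) can be reproduced in spirit with a short simulation. The sketch below assumes a linear time trend, an exponential cointegrating function with the divisor 50, unit-variance Gaussian noise, a time step of 0.1, and 3000 samples. The noise level, step, and sample size are not given in the text and are hypothetical, so the printed correlations will not match the reported values.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def exponential_example(n=3000, step=0.1, sigma=1.0, seed=3):
    """Simulate x_t, |u_t| and |u*_t| for an exponential cointegrating function."""
    rng = random.Random(seed)
    x, abs_u, abs_u_star = [], [], []
    for k in range(1, n + 1):
        t = k * step
        xt = t + rng.gauss(0.0, sigma)                     # linear in time
        yt = math.exp(t / 50.0) + rng.gauss(0.0, sigma)    # exponential in time
        u = math.exp(xt / 50.0) - yt                       # original residual
        slope = math.exp(xt / 50.0) / 50.0                 # f'(x_t)
        x.append(xt)
        abs_u.append(abs(u))
        abs_u_star.append(abs(u) / math.sqrt(1.0 + slope ** 2))
    return x, abs_u, abs_u_star


if __name__ == "__main__":
    x, abs_u, abs_u_star = exponential_example()
    print("corr(x, |u|)  :", pearson(x, abs_u))
    print("corr(x, |u*|) :", pearson(x, abs_u_star))
```

A clearly positive correlation for the original residual and a near-zero correlation for the modified residual is the pattern the diagnostic is meant to expose.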

Experimental Wind Turbine Data from a Drivetrain.
Wind turbines are designed to operate in remote onshore and/or offshore areas, where strong winds are available. The WT converts wind kinetic energy into useful electrical energy. The main components of a typical utility-scale WT drivetrain consist of the gearbox, main shaft, main bearing, brake, generator shaft, and generator. The gearbox is placed between the hub and the generator and is used to convert the low-speed high-torque power from the WT rotor to high-speed low-torque power used by the generator [2]. The wind turbine data used in this paper originate from a series of experimental measurements for a WT drivetrain (shown in Figure 3). Operational parameters were monitored; these parameters can be grouped into three categories as follows.
(i) Speed Parameters. This category consists of three speed parameters related to the wind speed (in mps, i.e., meters per second), the rotor speed (in rpm, i.e., revolutions per minute), and the generator speed (i.e., the generator shaft rotational speed in rpm).
(ii) Energy Conversion Parameters. This category includes six parameters related to the energy conversion process, that is, the active power (in kW), the active power delivered (i.e., the generated power in kW), the reactive power (in kW), the reactive power delivered (in kW), the generator voltage (in V), and the generator current (in A).
(iii) Temperature Parameters. This category consists of three temperature parameters (in °C) measured at turbine components, that is, one in the gearbox and two at the generator (one at the front and another at the back of the generator).
The SCADA data for the WT were collected at 10-minute intervals over a period of thirty days. As a result, 4320 data samples (or records) were acquired for each parameter for a variety of different operating conditions. The wind speed is a key operational parameter in wind energy systems [3-5, 11]. Therefore, the relations between this parameter and the other parameters should be identified. Figures 4(a)-4(d) display four examples of nonlinear relations between the wind speed and the generator speed, the generated power, the front generator temperature, and the gearbox temperature, respectively. Since the relation between the generated power (i.e., the power output) and the wind speed, referred to as the power-wind speed curve [3,11], is generally used to evaluate the health of WTs [3], this feature has been selected for the analysis in this study. This nonlinear relation, shown in Figure 5, has the shape of a sigmoid function and represents the relationship between the power produced by the WT (in kW) under normal operating conditions and the wind speed in the range between the starting speed (about 3.5 mps) and the rated speed (about 13.5 mps). This characteristic, restricted to positive generated power, was used for the nonlinear cointegration analysis in the presented example. It is important to note that only experimental data from an intact wind turbine were available and used in these investigations. The objective was to demonstrate that the proposed method leads to homoscedastic cointegration residuals for an unfaulty wind turbine drive, regardless of changes in the nonlinear relation between the wind speed and the generated power.
The homoscedastic nonlinear cointegration method presented in Section 4.1 was applied to the selected experimental data. The original cointegration residual $u_t$ was computed using (7), whereas the modified cointegration residual $u_t^*$ was calculated using (19). In this example, wind speed is the variable $x_t$, generated power is the variable $y_t$, and the function $f(x_t)$ describes the relation between the generated power and the wind speed. Since $f(x_t)$ is unknown, a local regression algorithm was used to estimate this function. An estimate of the function $f$ at $x_0$, that is, $f(x_0)$, was calculated using the twelve values nearest to $x_0$ in the training data. Then, least squares regression was applied to this subset of values to estimate the value of $f(x_0)$. The gradient (or slope) of the fitted regression was taken as the first derivative of the function $f$ at $x_0$, that is, $f'(x_0)$. The entire procedure was repeated for all remaining values, that is, $x_1, x_2, \ldots$, which resulted in the estimate of the function $f$. Only 40% of the experimental data, corresponding to the first twelve days of the condition monitoring process, were used for training to avoid overfitting. A further reason for this split was to validate the estimated function $f(x_t)$ when the entire data set was used to create the cointegration residuals.
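The local regression step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that the local fit is a straight line through the twelve nearest training points, and the function names `local_linear_fit` and `estimate_curve` are hypothetical.

```python
import numpy as np

def local_linear_fit(x_train, y_train, x0, k=12):
    """Estimate f(x0) and f'(x0) from the k training points nearest to x0.

    A straight line is fitted by least squares to the k nearest neighbours;
    its value at x0 is taken as f(x0) and its slope as f'(x0).
    Note: the fit is ill-conditioned if all k neighbours share the same x value.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Indices of the k training values closest to x0.
    idx = np.argsort(np.abs(x_train - x0))[:k]
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(x_train[idx], y_train[idx], deg=1)
    return intercept + slope * x0, slope

def estimate_curve(x_train, y_train, x_query, k=12):
    """Repeat the local fit for every query point x_0, x_1, x_2, ..."""
    fits = [local_linear_fit(x_train, y_train, x0, k)
            for x0 in np.asarray(x_query, dtype=float)]
    f_hat = np.array([fit[0] for fit in fits])
    df_hat = np.array([fit[1] for fit in fits])
    return f_hat, df_hat
```

With the wind speed and generated power stored in arrays `x` and `y` (hypothetical names), the 40% training split corresponds to `n_train = int(0.4 * len(x))`, followed by `estimate_curve(x[:n_train], y[:n_train], x)` to evaluate the estimate over the whole record.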
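Figures 6(a) and 6(b) present the original cointegration residual $u_t$ and the modified cointegration residual $u_t^*$, respectively. The test statistic for $u_t$ was significant because the $p$ value < 0.0001 < 0.05, indicating that $u_t$ is heteroscedastic. In contrast, the test statistic for $u_t^*$ was insignificant because the $p$ value = 0.425 > 0.05, indicating that $u_t^*$ is homoscedastic. Moreover, the correlations between the wind speed ($x_t$) and the absolute values of $u_t$ and $u_t^*$ were calculated as 0.36 and 0.07, respectively. This means that the wind speed $x_t$ contains less information about the deviation of $u_t^*$ than about the deviation of $u_t$.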
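A significance check of this kind can be illustrated with the standard Breusch-Pagan test in its studentized (Koenker) form. The sketch below is not the adapted procedure of Section 4.2; it only shows the textbook version with a single explanatory variable, and the function name `breusch_pagan_lm` is hypothetical. The squared residuals are regressed on the explanatory variable, and the statistic $nR^2$ is compared with a chi-square distribution with one degree of freedom.

```python
import math
import numpy as np

def breusch_pagan_lm(resid, x):
    """Studentized Breusch-Pagan LM statistic and p value (one regressor).

    Regresses the squared residuals on [1, x] and returns n * R^2 together
    with its p value under a chi-square distribution with 1 degree of freedom.
    """
    resid = np.asarray(resid, dtype=float)
    x = np.asarray(x, dtype=float)
    n = resid.size
    sq = resid ** 2
    design = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(design, sq, rcond=None)
    fitted = design @ beta
    r2 = 1.0 - np.sum((sq - fitted) ** 2) / np.sum((sq - sq.mean()) ** 2)
    lm = max(n * r2, 0.0)
    # For 1 degree of freedom: P(chi2 > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value
```

A $p$ value below 0.05 leads to rejection of the null hypothesis of homoscedasticity, which is the decision rule used for the residuals in this section.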
In summary, when the data from an intact wind turbine were analysed, the modified cointegration residual $u_t^*$ was homoscedastic for the whole data set investigated, regardless of changes in the relation between the wind speed and the generated power. Homoscedasticity is therefore interpreted as a sign of an undamaged condition. It is anticipated that this homoscedastic characteristic would be lost if data from faulty wind turbine drives were available and analysed, as explained in [18][19][20][21][22][23][24].

Conclusions
Monitoring of trends and removal of undesired trends from operational/process parametric data in wind turbines has been addressed in this paper. The recently proposed homoscedastic nonlinear cointegration has been applied. The method has been illustrated using three different time series data sets, that is, one with a nonlinear quadratic deterministic trend, another with a nonlinear exponential deterministic trend, and one experimental data set from a wind turbine drivetrain.
The results show that the proposed method can effectively remove nonlinear trends from the analysed data and also remove heteroscedasticity from cointegration residuals. For the case study using experimental wind turbine data, the modified cointegration residuals have been shown to be homoscedastic for the data representing an undamaged condition. It is expected that these modified cointegration residuals would instantly become heteroscedastic for the data from damaged wind turbines, thereby providing an effective condition monitoring and fault detection tool for wind turbines. It is clear that further research work is required to test the method for different types of wind turbine data and trends. In particular, the proposed methodology should be investigated for operational data representing various types of faults in wind turbine drivetrains.
In this paper, the homoscedastic nonlinear cointegration method was effectively used for condition monitoring of wind turbines. However, because the proposed method is a general approach, based simply on the analysis of measurement data in the form of time series responses acquired by sensors from the investigated processes or structures, the authors believe that this method can also be applied to other engineering applications. It is also possible that researchers and practitioners from the field of econometrics might benefit from employing the proposed method when they encounter heteroscedasticity in economic and financial time series in general, and in residuals obtained from linear or nonlinear cointegration processes in particular.

Figure 3: Wind turbine drivetrain used in the current study: (a) generator and phase marker; (b) gearbox and transmission systems.

Figure 4: Experimental wind turbine data showing the nonlinear relations between wind speed and other operational parameters: (a) generator speed versus wind speed; (b) generated power versus wind speed; (c) generator temperature (front part) versus wind speed; (d) gearbox temperature versus wind speed.

Figure 5: The nonlinear relation between the generated power (or the power output) and the wind speed.

Figure 6: Nonlinear cointegration results obtained for the experimental wind turbine data: (a) original cointegration residual $u_t$; (b) modified cointegration residual $u_t^*$.