SPA-Based Modified Local Reachability Density Ratio wSVDD for Nonlinear Multimode Process Monitoring

. Many industrial processes are operated in multiple modes due to diﬀerent manufacturing strategies. Multimodality of process data is often accompanied with nonlinear and non-Gaussian characteristics, which makes data-driven monitoring more complicated. In this paper, statistics pattern analysis (SPA) is introduced to extract low-and high-order statistics from raw process data. Support vector data description (SVDD), which can deal with nonlinear and non-Gaussian problems, is applied to monitor multimode process in this paper. To improve detection performance of SVDD for training multimode data with outliers, modiﬁed local reachability density ratio (mLRDR) is proposed as a weight factor to be embedded in the weighted-SVDD (wSVDD) model, in which the local neighbors in terms of both space and time are considered. Finally, the eﬀectiveness and superiority of our proposed method are demonstrated by the Tennessee-Eastman (TE) process and wastewater treatment process (WWTP).


Introduction
e operation conditions of industrial processes will inevitably change with diverse customer requirements, setpoints variation, and different intrinsic features, which leads to multiple modes [1][2][3]. Because of the complexity of multimodal process, it is difficult to obtain satisfactory monitoring results. In the past decades, effective monitoring of multimode processes has attracted a lot of attention. e data-driven monitoring methods of multimode process were recently reviewed extensively by Quinones-Grueiro et al. [4]. ey indicated that multimodel scheme is one of the widely applied monitoring methods, in which the identification of different modes should be performed in advance. For instance, Du et al. [5] applied K-means to address the multimode clustering problem. Similarly, Zhang and Zhao [6] clustered multimode data into different clusters through the kernel fuzzy c-means (KFCM) algorithm. In these methods, the cluster number should be confirmed in advance. As an alternative, the moving window-based methods, for example, recursive local outlier factor (LOF) [7], measure the similarity between windows with the spatial and temporal information of the features to identify the clusters. After mode identification, each mode is monitored by such traditional methods as principal component analysis/independent component analysis (PCA/ ICA) and their extensions and so forth. [8][9][10]. Recently, several efforts have been made on applying the multiple models with Bayesian fusion to multimode monitoring. Specifically, the Gaussian Mixture Model (GMM) was deployed to accomplish fault diagnosis by considering multiple models simultaneously [11,12].
Compared with the multimodel scheme, single-modelbased method simplifies the modeling and monitoring procedure. Hidden Markov model-(HMM-) based methods were proposed in [13,14], whereas their parameter setting is still a complex task. Zhu et al. [15] proposed a recursive mixture factor analyzer for multimode time-variant process modeling and monitoring. Considering that there are interconnections among different modes, a strategy was developed to grasp common characteristics in them to construct a monitoring model. Hwang and Han [16] proposed the super principal component analysis (PCA) method, which can be considered as the first attempt based on this idea. en, Zhang et al. [17] built the common and individual monitoring model by PCA-based and partial least squares-(PLS-) based scheme. Besides, Zhang et al. [17] utilized the common basis vectors to extract the common features among multimode data. For nearest-neighbor approaches, different modes should be standardized [18]. For example, Ma et al. [19] presented a neighborhood-based global coordination (NBGC) method by aligning the local models into a global one for multimode process monitoring.
Among the single-model-based methods, SVDD model and its extensions are widely applied. Developed from support vector machine [20], SVDD is proposed based on the idea of one-class classification. SVDD projects the data into high-dimensional feature space without the requirement of Gaussian distribution [21]. erefore, the SVDD method can deal with both nonlinear and non-Gaussian data [22]. However, SVDD is insensible to a very small fraction of outlier, and the trained hypersphere will sometimes tend to encircle outliers that are far away from the normal dataset. Besides, the normal samples with different density have no different effects on SVDD modeling. In the multimode training dataset, the sampling number and data density vary dramatically with different modes. us, SVDD is not suitable to be directly applied to multimode monitoring. e wSVDD method was developed to solve this problem, in which the weight factor is usually determined by nearestneighbor information. To reduce the impact of outliers on modeling in single-mode process, Chen et al. [23] proposed robust-SVDD by introducing cutoff-distance-based local density of each data sample and the ε-insensitive loss function with negative samples. Wang and Lan [24] used the SD outlyingness to assign lower weight values to outliers. For multimode process, Zhao et al. utilized the weighted mean and standard deviation of each sample's neighbors to standardize the dataset and applied weighted local standardization (WLS) strategy to wSVDD [25]. Li et al. [26] proposed a local density ratio weighted support vector data description (LDR-wSVDD) to maintain the monitoring efficiency of a single hypersphere model. Nevertheless, the spatial information is still not fully mined, and the temporal information is not employed in the construction of the weight for wSVDD, which may lead to unsatisfactory performance of the existing methods. Consequently, wSVDD still needs to be further studied.
Recently, Breunig et al. [27] defined local reachability density (LRD) and indicated that the LRD of a sample is related to the local information of not only itself but also its nearest neighbors. By containing more comprehensive density information, LRD has the potential to be transformed into a new weight factor for wSVDD. However, LRD varies drastically with different modes; thus, it does not yet reflect their local information equally. In addition, conventional LRD only considers spatial information and does not take temporal information into account.
Moreover, the data in different modes may overlap each other in actual industrial process, so that the original data sometimes fails to embody the unique characteristics of certain modes. Statistics pattern analysis (SPA) was firstly introduced by Wang and He [28], which is able to mine underlying statistic features of data and has been applied in process monitoring [29]. Zhu and Gu integrated SPA into local kernel principal component analysis (LKPCA) to enhance fault detection performance [30].
In this paper, SPA is applied to obtain statistical characteristics for fully mining the underlying information of multimode dataset firstly.
en, wSVDD is selected as a monitoring model, and modified local reachability density ratio (mLRDR) is proposed as a new weight for wSVDD to remove the multimodality of process and reduce the impact of outliers. e local information in terms of both space and time is considered in mLRDR calculation to further enhance monitoring performance. Ultimately, a novel scheme called SPA-based modified local reachability density ratio weighted support vector data description (SmLRDR-wSVDD) for nonlinear multimode monitoring is established. e remainder of the paper is organized as follows: in Section 2, the proposed nonlinear multimode process monitoring scheme is elaborated in detail; in Section 3, the results and discussion of TE process and WWTP are presented; finally, conclusions are given in Section 4.

Methodology
In this section, SmLRDR-wSVDD method is proposed to monitor nonlinear multimode process, in which statistics pattern dataset is formed by SPA, and a new weight factor named mLRDR is proposed based on LRD for wSVDD monitoring method.

Construction of Statistics Pattern.
Although specific characteristics of different modes may not be embodied from the original data, they can be expressed in terms of the statistics. In the SPA framework [28], different statistics patterns for observed variables are selected to capture the dominant process characteristics such as dynamics and nonlinearity and are used as the modeling and monitoring object in the proposed method.
Suppose that X ∈ R n×m is the original measured dataset with n samples and m variables, and a window of samples is denoted as where h is the window width and t is the time index. Generally, statistics pattern (SP) consists of three groups of statistics: first-order, second-order, and high-order statistics, which can be expressed as 2 Complexity where μ denotes the first-order statistics, that is, the mean, and its elements are calculated from the data in a window: where V denotes the second-order statistics, which includes variance (v i ), correlation (r i,j ), autocorrelation (r i ), and cross-correlation (r i,j ) of different variables in the window as with d denoting the time lag between the variables; Ξ denotes the high-order statistics including skewness (c i ) and kurtosis (κ i ) to measure the degree of nonlinearity and quantify the non-Gaussianity of the process variables [31] as e SP is constructed by stacking λ selected statistics by equations (1)-(9) into a row vector. In the paper, three kinds of statistics μ i (t), v i (t), and κ i (t) are selected from oneorder, two-order, and high-order statistics to construct statistics matrix S ∈ R N×M with N � n − h + 1, M � λ × m, and λ � 3.

mLRDR Weighted SVDD.
e main idea of SVDD is to project data into high-dimensional space for constructing a minimum hypersphere. e training dataset S � [s T 1 , s T 2 , . . . , s T N ] T ∈ R N×M is obtained according to equation (2), and it is normalized as S. In the procedure of SVDD modeling, S is firstly mapped from the original space to a higher feature space by a nonlinear transformation function Φ(s i ). For conventional SVDD, all the training data has the same impact on the model construction, which makes it insensible to outliers and data density. erefore, wSVDD is adopted by introducing a weight for each training data, and the corresponding hypersphere can be calculated by solving the following primal optimization problem: min R,a,ξ where a and R are the center and the radius of hypersphere, respectively; trade-off parameter C � (1/(p × N)) is introduced to make the hypersphere as small as possible while preventing the misclassification of the samples, in which p denotes the percentage of outliers permitted in the training set [26]; ξ i is slack variable that allows outliers of the hypersphere; and w i is weight of the i th data, where smaller w i indicates more possibility of an outlier. For multimode processes, the weight factor of wSVDD should fully reflect the local distribution characteristics of each piece of training data. Density is an indicator reflecting the data distribution, which is widely used in the design of weight factor. Local reachability density (LRD) [27] of a data point depends on its distance to not only the point's neighbors but also its neighbors' neighbors.
Besides the space density, the current data is also related to the previous and the next points in time sequence, and monitoring performance will be improved by taking both the temporal information and spatial information into account [32]. Consequently, the modified LRD is proposed in this paper by considering the local neighbors in both time and space to comprehensively extract the local features of the data.
Firstly, all K-nearest neighbors of s i (i � 1, 2, . . . , N) are selected to construct a set KNN(s i ), which satisfies K � K 1 + K 2 , where K 1 and K 2 are the numbers of neighbors in space and time, respectively.
Suppose that K − distance(s i ) is the Euclidean distance from s to its K th -nearest neighbor. en the distance d(s i , s k i ) between s i and s k i (k � 1, 2, . . ., K) is computed, where s k i is the k th -nearest neighbor of s i . e reachability distance of s i with respect to s k i is defined as us, mLRD is constructed as e main property of weight factor for multimode process should equalize different mode and reduce the influence of outlier in modeling. According to equation (12), if distribution characteristics vary greatly with different modes, mLRD in different modes will change a lot; and some outliers will be hardly separated from normal data; that is, the local density of normal data with lower global density may be similar to that of the outliers around the data with Complexity 3 higher global density. Accordingly, to weaken the multimodality feature and widen the gap between outlier and the normal data, a modified local reachability density ratio (mLRDR) is developed to be the weight factor of wSVDD as It is obvious that w(s i ) is kept within [0, 1]. According to equation (13), if s i is a normal point, its mLRD is close to its neighbors and the corresponding mLRDR approaches 1. If s i is an outlier, its mLRD is significantly smaller than that of its normal neighbors and mLRDR is close to 0, which indicates that outliers will have trivial influence on the determination of the hypersphere.
Compared with the existing weight factors such as LDR [26], mLRDR contains more comprehensive spatial and additional temporal information, and it can distinguish normal data with lower density and the outliers around the data points with higher density data; thus it will improve the monitoring performance of wSVDD.
For wSVDD modeling, equation (10) can be transformed into a dual-optimization problem by introducing Lagrange multipliers α � [α 1 , α 1 , . . . , α N ] T as follows: where , is kernel function and σ can be determined via detecting the "tightness" of the decision boundaries [33]. After solving equation (14), only objects s i with α i > 0 are called the support vectors (SVs), and their subscripts set is SV � i|α i > 0, i � 1, 2, . . . , N . Hence, the center and radius of the hypersphere are respectively.
When h new samples are obtained for a window, SPs are calculated by equation (2) and normalized as s new , and the distance between s new and the center R can be calculated by Ker s i , s j .
If Dist(s new ) ≤ R, the system is considered in normal state. Otherwise, a fault occurs.

2.3.
e Procedure of SmLRDR-wSVDD Scheme. In this subsection, the proposed SmLRDR-wSVDD scheme is illustrated in Figure 1. e specific offline modeling and online monitoring steps are listed as follows: Offline modeling procedure: Step 2: normalize statistics matrix S and obtain S. (iii) Step 3: compute the mLRDR as a weight factor W(s i ) by equations (11)-(13). (iv) Step 4: set up weighted-SVDD model with the weight W, solve equation (14), and obtain the center and the radius of the hypersphere by equations (15) and (16), respectively.
Online monitoring procedure: In this paper, if Dists of 3successive windows are over the control limit R, the system is considered in faulty condition, and the last sample in the first faulty window is determined as the beginning of the fault.
It must be noted that the identification of new normal modes is not considered in the proposed method. Although a new normal mode will be detected as a fault by SmLRDR-wSVDD, it may be redefined by analyzing the stationarity of sampling data after enough data are collected.
In summary, the proposed method adopts SPA to mine more features from different-order statistics of the original data compared with other existing SVDD-based methods. Besides, a new weight factor is proposed for wSVDD modeling, in which the additional neighbors' local spatial information and temporal information are considered, which makes the proposed method more sensitive to outliers and density compared with other methods.

Case Study
In this section, the proposed SmLRDR-wSVDD method is applied on Tennessee-Eastman (TE) benchmark process and wastewater treatment process (WWTP). In both cases, the proposed method is compared with the existing methods including KPCA [34], conventional SVDD, and LDR-wSVDD [26].

3.1.
Tennessee-Eastman Benchmark Process. Tennessee-Eastman process (TE process) was put forward in 1993 by Downs and Vogel [35], which has been widely adopted for scientific research [36].
ere are five units, reactor, condenser, compressor, stripper, and separator, in this nonlinear process. According to different (G/H) mass ratios (G and H are two products), 6 steady modes can be obtained as listed in Table 1. e control scheme and normal operations were provided by Ricker [37] and the corresponding simulation platform can be downloaded from http://depts.washington.edu/control/ LARRY/TE/download.html. Mode 1 and Mode 3 are simulated to generate modeling and testing datasets. ere are 53 variables: 12 manipulated variables, 22 continuous process variables, and 19 composition measurements. Since the steady-state values of the recycle value and steam value in Mode 1 and the agitator rates in both modes do not change, they are not included in the monitored variables, and the remaining 9 manipulated variables and 22 easily measured continuous variables are chosen in this case. In addition, TE process simulation platform contains 20 faults. Excluding faults 3, 9, and 15, which are slight disturbances, and unknown faults 16-20, 12 faults given in Table 2  e confidence level of KPCA is chosen as 95%. For the other three algorithms, the Gaussian kernel width parameter σ is determined by the method in [33], and p � 5% is selected for the penalty parameter C. In the proposed method, the window width h is assigned as 10, and the numbers of neighbors in space and time K 1 and K 2 are chosen as 8 and 2, respectively. e above-mentioned parameters are the same in the two cases. e four methods, that is, KPCA, SVDD, LDR-wSVDD, and the proposed SmLRDR-wSVDD, are applied and compared on the TE process. e missed alarm rates (MARs) [38] for each fault type of Mode 1 and Mode 3 are shown in Table 3, in which the best results are marked in bold, and the false alarm rates (FAR) are shown in Table 4 Figure 2(c), Dist fluctuates slightly, and some fall below the limit by LDR-wSVDD; meanwhile, with the proposed method shown in Figure 2(d), all the statistics are far above the control limit. Similarly, the proposed SmLRDR-wSVDD can separate normal condition and fault more obviously than the other 3 methods as shown in Figures 3 and 4. e monitoring result of fault 13 for Mode 3 is shown in Figure 5. For this fault, there are 48, 47, 27, and 9 samples of detection delay by KPCA, SVDD, LDR-wSVDD, and SmLRDR-wSVDD. us, one can infer that SmLRDR-wSVDD is more sensitive to the four faults than the other methods.
FARs of all methods are summarized in Table 4. For all faults of Mode 1, FARs of KPCA and LDR-wSVDD are within [0.01, 0.03] and [0.01, 0.02], respectively, while those of conventional SVDD and SmLRDR-wSVDD are within [0, 0.01]. In Mode 3, FARs of SPE in KPCA are the largest for all faults, sometimes even larger than 0.05, and those of LDR-wSVDD for all faults range from 0.015 to 0.035. Note that all FARs of the proposed SmLRDR-wSVDD method are less than 0.02, though most of them are higher than those of conventional SVDD within 0.01. Figure 6 shows LRD and the weight of each training sample based on LDR and the proposed mLRDR, respectively, in which blue, green, and red dots are from Mode 1, Mode 3, and outliers, respectively. In Figure 6(a), it is easy to find that the normal modes vary greatly in terms of LRD, and there is no distinctive gap between Mode 3 and outliers. As shown in Figures 6(b) and 6(c), LDRs and mLRDRs of samples in two normal modes distribute around 0. 6    Step 2 Composition, A/C ratio constant (stream 4) Step 4 Reactor cooling water inlet temperature Step 5 Condenser cooling water inlet temperature Step 6 A feed loss (stream 1) Step 7 C header pressure loss-reduced availability (stream 4) Step 8 A, B, C feed composition (stream 4) Step 10 C feed temperature (stream 4) Random variation 11 Reactor cooling water inlet temperature Random variation 12 Condenser cooling water inlet temperature Random variation 13 Reaction kinetics Slow drift 14 Reactor cooling water valve Sticking               Complexity lower than those of normal data. erefore, the decision boundary is not susceptible to outliers by the proposed method, which will improve its monitoring performance compared with other methods.

Wastewater Treatment Process (WWTP).
In this subsection, benchmark simulation model no. 1 (BSM1) of WWTP [39] is considered. is benchmark is a complicated nonlinear system, which consists of five compartments (two anoxic and three aerobic) in the biological reactor and a tenlayer clarifier. e layout of the plant is shown in Figure 7, and more detailed information about BSM1 can be found at http://www.benchmarkwwtp.org/.
By changing the flow rates and oxygen transfer coefficients in biological reactor, three different modes can be obtained [40]. Without loss of generality, Mode 1 and Mode 2 are chosen in this case, and the corresponding parameters are shown in Table 5. e process contains 223 variables, 20 of them is related to the biological phenomena, and they are selected in this example as shown in Table 6.
ere are two disturbance types in the benchmark simulation [41]: external disturbances and internal disturbances. Specifically, dry weather, storm events, and prolonged rain are three weather situations, among which the first one is regarded as normal and the latter two are external disturbances, which can be detected by the influent characteristics. As for the internal disturbances, decreasing nitrification, decreasing settling velocity, nitrate sensor failure, and setpoint change of DO controller were mentioned in [41].
In this example, the samples are collected every 15 minutes. 1344 samples of dry weather of Mode 1 and Mode 2 shown in Table 7 are mixed with 20 additional outliers as training dataset. 5 simulation cases (Case 1-Case 5) are designed for the purpose of testing [40,41], which are also illustrated in Table 7.
e four above-mentioned algorithms are performed for the 5 cases, and the parameters of them are the same as those in the TE process study. e MARs for all faults are shown in Table 8. Specifically, the monitoring results for Case 1 are shown in Figure 8. As shown in Figures 8(a)-8(c), the fault is detected with 6, 8, and 7 points' delay by SPE of KPCA, the conventional SVDD, and LDR-wSVDD, respectively; meanwhile the proposed method can detect the fault immediately after its occurrence, as shown in Figure 8(d). For Case 2 shown in Figure 9, the fault occurs after the 672 nd sample, and it changes slowly away from normal. From Figures 9(a) and 9(b), there are a lot of statistics falling below the control limits after the 672 nd sample under KPCA and the conventional SVDD. As shown in Figure 9(c), though the weight factor, that is, LDR, is applied in wSVDD, many statistics are still under the limit at the beginning of the fault. As shown in Figure 9(d), the fault is detected quickly by the proposed method, and the following statistics are far above the control limit. Compared with other methods, SmLRDR-wSVDD is more effective for the faults with slow drift in the budding stage.
As for FARs shown in Table 9, the proposed SmLRDR-wSVDD method has the lowest FARs in Cases 2, 3, and 4. Although its FARs are slightly higher than those of conventional SVDD for Cases 1 and 5, both of them are fewer than 0.03, which is too trivial to be considered.
Similarly, LRD, LDR, and mLRDR of each training sample are visualized in Figure 10 to explain the superiority of our proposed method. Blue, green, and red dots are from Mode 1, Mode 2, and outliers, respectively. Obviously, LRD changes with different modes, and that of outliers is close to normal in Figure 10(a). Although LDRs can remove multimodality as shown in Figure 10  and normal samples are not distinguished. As a comparison, mLRDRs of outliers are far away from those of normal as shown in Figure 10(c). It is demonstrated once again that the proposed method can deal with multiple modes evenly and reduce the influence of outliers on modeling.

Conclusion
In this paper, a nonlinear multimode process monitoring method named SmLRDR-wSVDD is proposed to further improve the monitoring performance. Specifically, the SPA is used first to extract low-and high-order statistics; then, weight-SVDD is adopted to build a monitoring model, in which a new weight factor mLRDR is proposed based on LRD. Compared with the traditional methods, more features that are difficult to display are mined from raw data by the proposed method, and more comprehensive local space and time information is integrated into the weight factor, which can remove the multimodality uniformly and greatly weaken the influence of outliers on the boundary of the hypersphere. e applications in the TE process and WWTP have demonstrated that the proposed method is effective in nonlinear multimode process monitoring with lower MAR/ FAR and higher fault sensitivity compared with the other three existing methods.
Data Availability e data in the manuscript are generated according to http:// depts.washington.edu/control/LARRY/TE/download.html and http://www.benchmarkwwtp.org/.

Conflicts of Interest
e authors declare that they have no conflicts of interest.