Statistical Process Monitoring with Biogeography-Based Optimization Independent Component Analysis

Independent Component Analysis (ICA), a typical data-driven fault detection technique, has been widely applied to the monitoring of industrial processes. FastICA is a classical ICA algorithm that extracts independent components using the Newton iteration method. However, choosing the initial iterative point of the Newton iteration is difficult; different initial points can lead to completely different fault detection results. So far, there is no good strategy for obtaining an ideal initial iterative point for ICA. To solve this problem, a modified ICA algorithm based on biogeography-based optimization (BBO), called BBO-ICA, is proposed for multivariate statistical process monitoring. The Newton iteration method is replaced with BBO for extracting independent components. BBO is a novel and effective optimization method for searching for extrema. Compared with traditional intelligent optimization algorithms such as particle swarm optimization (PSO), BBO searches the solution space with greater capability and accuracy. Numerical simulations are carried out on the DAMADICS benchmark. The results demonstrate the practicability and effectiveness of BBO-ICA, which shows better process monitoring performance than FastICA and PSO-ICA on DAMADICS.


Introduction
Industrial processes play an extremely significant role in economic development. Once any part of a process breaks down, production efficiency is greatly affected and, more seriously, disasters may be caused. Therefore, it is important for plants to detect and diagnose faults as early as possible, so that proper measures can be taken to minimise economic loss.
Process monitoring technologies can be divided into three types: model-based, knowledge-based, and data-driven methods [1]. More detailed classifications are shown in Figure 1.
The model-based method is established on mathematical models of industrial process systems, so the efficiency of process monitoring depends largely on the accuracy of these models. Nevertheless, it is difficult to obtain accurate models of complex industrial processes. In addition, it is troublesome to establish different models for different industrial processes.
The knowledge-based method requires a large amount of empirical knowledge. Machines learn from this knowledge and make their own judgements and decisions on the basis of the actual situation. For a complex industrial process, however, it is impossible to consider all the possibilities, and some knowledge of the process is hard to obtain.
The data-driven method determines whether there is a fault according to the operating data obtained from a large number of sensors. With the arrival of the era of big data, large volumes of operating data from industrial processes have been stored in computers; however, these data have not been fully utilized, so mining useful information from them has become very important. Because it needs neither models nor prior knowledge, the data-driven method has attracted more and more attention in the field of process monitoring. ICA, as one of the data-driven monitoring techniques, has been widely applied in several fields [2], including image processing, signal extraction [3], and medical science [4,5], for example electroencephalography (EEG) [6] and neuroscience [7]. ICA was originally proposed for Blind Source Separation (BSS) [8][9][10], and its mathematical description is defined in [11]. ICA was later introduced for process monitoring [12]. The classical FastICA algorithm, based on the Newton iteration method, is employed to extract independent components [13]. More recently, several process monitoring methods based on FastICA have been proposed. Kernel Independent Component Analysis (KICA) [14] performs better in detecting faults in process systems with nonlinear characteristics. Considering the autocorrelation of process variables in time series, Dynamic Independent Component Analysis (DICA) [15] is used to reduce the sensitivity to the dynamic features of processes. Canonical Variate Analysis (CVA) is combined with ICA [16] for data preprocessing, which improves the efficiency of fault detection. A fault classifier based on Support Vector Machines (SVM) and ICA [17] is used to reveal the hidden factors in massive process data. Because the traditional ICA has no mathematical expression for noise, a Noisy-ICA method [18] is proposed to amend the classical ICA model. Dynamic Bayesian Independent Component Analysis [19] is proposed to detect faults in multimode processes. Probability is introduced to improve the performance of ICA [20]. The Jarque-Bera test is used to divide the observed data into different data blocks [21], which are then handled by DICA and DPCA.
Although the ICA algorithm has been improved by many scholars, the influence of the initial iterative point on the performance of the algorithm has been neglected. It is worth noting that the effectiveness of the FastICA algorithm depends largely on the choice of the initial iterative point: different initial points lead to completely different convergence properties. In most cases, it is difficult to obtain an ideal initial iterative point, and obtaining one by repeated trials is inconvenient and inadvisable.
When the FastICA algorithm is applied to process monitoring, different initial iterative points can lead to opposite diagnosis results, which is unacceptable in practice. Take the safety system of an aircraft as an example, where an alarm means that there is a fault in the aircraft. If the diagnosis is unreliable, a false alarm would cause panic among the passengers, and the safety of the passengers could not be guaranteed. Thus, to some extent, the diagnosis result of the FastICA algorithm is unreliable because of its sensitivity to the initial iterative point.
Biogeography-based optimization (BBO) is a heuristic optimization method based on simulating the collective behavior of biological populations. It was first proposed in [22], where it was compared with seven other traditional intelligent optimization algorithms, such as Genetic Algorithms (GA), Particle Swarm Optimization (PSO), and the Ant Colony Algorithm (ACO). The performance of the various algorithms on 14 standard benchmark functions shows that BBO is a more efficient method for solving optimization problems.
Though PSO-ICA has been proposed [23] to separate independent components, it cannot separate them accurately, which leads to poor performance in process monitoring.
In this paper, a method named BBO-ICA is proposed to monitor processes. It replaces the Newton iteration method with BBO for extracting independent components and achieves more accurate results. The process monitoring performance of FastICA, BBO-ICA, and PSO-ICA is compared on the DAMADICS benchmark [24]. This paper is organized as follows: Section 2 gives the basic procedure of process monitoring based on ICA. Then, the BBO-ICA method is proposed in Section 3. A simulation study on the DAMADICS benchmark is presented in Section 4. Finally, conclusions are drawn in Section 5.

Process Monitoring Based on ICA
In the ICA model (1), $A \in \mathbb{R}^{m \times r}$ is the mixing matrix, which is unknown and of full rank. It is usually assumed that $r$ is equal to $m$. With $n$ samples, the model can be rewritten as

$$X = AS. \tag{2}$$

Usually, the measured data $X$ need to be whitened before the ICA algorithm is applied, which eliminates the cross-correlation among the measured variables. The whitened variable matrix $Z$ is obtained by

$$Z = QX = QAS = BS, \tag{3}$$

where $Q = \Lambda^{-1/2} U^{T}$, and $\Lambda$ and $U$ are acquired by eigendecomposition of the covariance matrix $R_x$, which is given by

$$R_x = E(XX^{T}) = U \Lambda U^{T}, \tag{4}$$

where $E$ represents expectation. After the whitening transformation, it is easily verified that the covariance matrix $R_z = E(ZZ^{T})$ is an identity matrix and that $B = QA$ is an orthogonal matrix, as shown by the following relation:

$$R_z = E(ZZ^{T}) = B\,E(SS^{T})\,B^{T} = BB^{T} = I. \tag{5}$$

Thus, (3) can also be transformed into

$$\hat{S} = B^{T} Z = B^{T} Q X. \tag{6}$$

The objective of ICA is to estimate the independent components $S$ or the mixing matrix $A$ from the measured data matrix $X$ without any knowledge of $S$ or $A$. This is equivalent to finding a demixing matrix $W$ that reconstructs the independent components $\hat{S}$ from the measured data matrix $X$:

$$\hat{S} = WX. \tag{7}$$

The reconstructed system of ICA is illustrated in Figure 2.
Considering (3), (6), and (7), we can derive

$$W = B^{T} Q. \tag{8}$$

The principle of estimating $B$ is to make the components of $\hat{S}$ as independent of each other as possible. To calculate $B$, each column vector $b_i$ of $B$ is randomly initialized and then updated so that the $i$th independent component $\hat{s}_i = b_i^{T} Z$ has maximum independence.
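The whitening transformation described above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `whiten`, the synthetic uniform sources, and the random mixing matrix are hypothetical and are not taken from the benchmark data.

```python
import numpy as np


def whiten(X):
    """Whiten data X of shape (m, n): m variables, n samples.

    Returns the whitened data Z = Q X and the whitening matrix
    Q = Lambda^{-1/2} U^T, where R_x = U Lambda U^T.
    """
    Xc = X - X.mean(axis=1, keepdims=True)   # remove the mean of each variable
    R_x = Xc @ Xc.T / Xc.shape[1]            # sample covariance matrix
    eigvals, U = np.linalg.eigh(R_x)         # eigendecomposition of R_x
    Q = np.diag(eigvals ** -0.5) @ U.T       # whitening matrix
    return Q @ Xc, Q


# Synthetic example (assumption): three unit-variance uniform sources
# mixed by a random full-rank matrix.
rng = np.random.default_rng(0)
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(3, 5000))
A = rng.normal(size=(3, 3))
Z, Q = whiten(A @ S)
R_z = Z @ Z.T / Z.shape[1]                   # should be close to the identity
```

By construction, `R_z` equals the identity matrix up to floating-point error, which is the property stated for the covariance of the whitened variables.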
Independence can be measured by non-Gaussianity. Negentropy is a common measure of non-Gaussianity, which is based on the information-theoretic quantity of differential entropy. A flexible and reliable approximation of negentropy is

$$J(y) \approx \left[E\{G(y)\} - E\{G(v)\}\right]^{2}, \tag{9}$$

where $y$ is assumed to have zero mean and unit variance, $v$ is a Gaussian variable of zero mean and unit variance, and $G$ is a nonquadratic function. By choosing $G$ properly, one obtains good approximations of negentropy. Hyvärinen and Oja suggested three functions for $G$ [3]:

$$G_1(u) = \frac{1}{a_1}\log\cosh(a_1 u), \qquad G_2(u) = \exp\left(-\frac{a_2 u^{2}}{2}\right), \qquad G_3(u) = u^{4}, \tag{10}$$

where $1 \le a_1 \le 2$ and $a_2 = 1$. Among these three functions, $G_1$ is a good general-purpose contrast function and is therefore selected in this paper. So, in order to find the maximum independence of the $i$th independent component, the problem can be simplified as

$$\max_{b_i} J(b_i^{T} Z) \quad \text{subject to} \quad E\{(b_i^{T} Z)^{2}\} = 1. \tag{11}$$

Because $Z$ is whitened, $E\{(b_i^{T} Z)^{2}\} = b_i^{T} E(ZZ^{T})\, b_i = \lVert b_i \rVert^{2}$, so the constraint can be transformed into

$$\lVert b_i \rVert = 1. \tag{12}$$

2.2. Process Monitoring with ICA.
After solving this optimization problem, we obtain $m$ independent components $\hat{S}$. Not all of them are needed for process monitoring, because the independent components may contain noise and redundant information. A criterion is used to choose the main independent components: the $m$ independent components are sorted by the Euclidean norm of each $b_i$, and the $d$ dominant independent components $\hat{S}_d = B_d^{T} Z$ are selected. Two statistics are calculated from the process model in normal operation: the $I^2$ statistic for the systematic part of the process variation and the SPE statistic for the residual part of the process variation, which are defined as follows:

$$I^{2}(k) = \hat{s}_d(k)^{T}\,\hat{s}_d(k),$$

$$\mathrm{SPE}(k) = e(k)^{T} e(k) = \left(x(k) - \hat{x}(k)\right)^{T}\left(x(k) - \hat{x}(k)\right),$$

where $\hat{x}(k) = Q^{-1} B_d\, \hat{s}_d(k)$ is the reconstruction of $x(k)$ from the dominant components. A limit value is also needed to determine whether the process is in control. Note that independent components seldom follow a Gaussian distribution. Hence, the confidence limits of the $I^2$ and SPE statistics cannot be determined directly from a particular approximate distribution.
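The two monitoring statistics can be computed for all samples at once. The sketch below assumes NumPy and takes the whitening matrix $Q$ and the dominant columns $B_d$ as given; the function name `monitoring_statistics` and the argument layout (variables in rows, samples in columns) are illustrative choices, not the authors' implementation.

```python
import numpy as np


def monitoring_statistics(X, Q, B_d):
    """Compute the I^2 and SPE statistics for every sample.

    X   : (m, n) centered measured data, one sample per column
    Q   : (m, m) whitening matrix
    B_d : (m, d) columns of B belonging to the dominant components
    """
    Z = Q @ X                                  # whitened data
    S_d = B_d.T @ Z                            # dominant independent components
    I2 = np.sum(S_d ** 2, axis=0)              # I^2(k) = s_d(k)^T s_d(k)
    X_hat = np.linalg.solve(Q, B_d @ S_d)      # x_hat(k) = Q^{-1} B_d s_d(k)
    residual = X - X_hat
    SPE = np.sum(residual ** 2, axis=0)        # SPE(k) = e(k)^T e(k)
    return I2, SPE
```

When all components are retained ($B_d = B$ with $B$ orthogonal), the reconstruction is exact and SPE vanishes, so SPE only carries information once some components are discarded.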
Kernel density estimation is used to determine the confidence limits by estimating the density functions of the $I^2$ and SPE statistics under normal operation. A univariate kernel estimator is given by

$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left\{\frac{x - x_i}{h}\right\},$$

where $x$ is the value under consideration, $x_i$ is an observation point, $h$ is a smoothing parameter, $n$ is the number of observations, and $K\{\cdot\}$ is a kernel function. The upper quantile at the 95% confidence level is taken as the control limit for normal operating data. The flowchart of process monitoring with ICA is illustrated in Figure 3.
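As an illustration of how such a control limit might be obtained, the sketch below uses a Gaussian kernel and finds the point where the estimated cumulative distribution reaches the confidence level. The Gaussian kernel, the default bandwidth (Silverman's rule of thumb), and the function name `kde_control_limit` are assumptions made for this example; the source does not specify them.

```python
import math
import statistics


def kde_control_limit(values, alpha=0.95, h=None):
    """Return the alpha-quantile of a Gaussian kernel density estimate.

    values : statistic values (e.g. I^2 or SPE) from normal operation
    alpha  : confidence level of the control limit
    h      : smoothing parameter; defaults to Silverman's rule (assumption)
    """
    n = len(values)
    if h is None:
        h = 1.06 * statistics.stdev(values) * n ** (-0.2)

    def cdf(x):
        # Integral of the kernel estimator: mean of Gaussian CDFs centered at x_i.
        return sum(0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0))))
                   for xi in values) / n

    # Bisection on the monotone CDF over a bracket that contains the quantile.
    lo, hi = min(values) - 10.0 * h, max(values) + 10.0 * h
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A sample whose statistic exceeds the returned limit would then be flagged as out of control.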

BBO-ICA
3.1. Biogeography.
The science of biogeography dates back to the nineteenth century. Naturalists such as Darwin and Beer [26] did pioneering research on it. In the 1960s, MacArthur and Wilson [27] began to focus on the distribution of species among neighboring islands and were dedicated to establishing mathematical models of the extinction and migration of species, which greatly advanced the subject of biogeography. It was not until 2008 that this approach was introduced to solve engineering problems by Simon [22]. In nature, thousands of species are distributed in different habitats all over the world, and different habitats contain different numbers of species. Geographical areas that are well suited as residences for biological species tend to have a high habitat suitability index (HSI) [28]. The factors that determine the suitability of a habitat include temperature, water resources, land area, vegetation, and topographic features. These factors, which reflect habitability, are called suitability index variables (SIVs). SIVs and HSI can thus be considered the independent variables and the dependent variable, respectively.
A habitat with a high HSI tends to have a large number of species. Because it is saturated with species, such a habitat has a high rate of emigration to neighboring habitats. A habitat with a low HSI has a small number of species, which means that it has a high immigration rate thanks to its sparse population. The immigration of new species to a low-HSI habitat may raise its HSI, owing to the improvement in biological diversity. If the habitat's HSI remains low, the species will go extinct. Therefore, low-HSI habitats are more dynamic and high-HSI habitats are more static in their species distribution. The process of migration of species between habitats is shown in Figure 4.
Biogeography reveals nature's rules for distributing species, which can also be used to solve general problems. A good solution is analogous to a habitat with a high HSI, while a poor solution represents a habitat with a low HSI.
Figure 5 illustrates that the immigration rate $\lambda$ and the emigration rate $\mu$ vary linearly with the number of species. At the equilibrium point $S_0$, the immigration rate is equal to the emigration rate. This balance can be disturbed by a sudden spurt of immigration from neighboring habitats or a sudden outbreak of disease. In nature, it takes a long time for the number of species to regain equilibrium [29,30].
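The mathematical relationship between the number of species and the migration rates can also be described by

$$\lambda_s = I_{\max}\left(1 - \frac{s}{n}\right), \qquad \mu_s = \frac{E_{\max}\, s}{n}, \tag{17}$$

where $\lambda_s$ and $\mu_s$ denote the immigration rate and the emigration rate when the habitat contains $s$ species, $I_{\max}$ and $E_{\max}$ represent the maximum immigration rate and the maximum emigration rate (here $I_{\max}$ is considered equal to $E_{\max}$), and $n$ is the maximum number of species.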
Considering (17), we can derive

$$\lambda_s + \mu_s = E_{\max}.$$

Now, $P_s(t)$ is introduced to represent the probability that a habitat contains $s$ species at time $t$. At time $t + \Delta t$, the probability that the habitat contains $s$ species is

$$P_s(t + \Delta t) = P_s(t)\left(1 - \lambda_s \Delta t - \mu_s \Delta t\right) + P_{s-1}(t)\,\lambda_{s-1}\Delta t + P_{s+1}(t)\,\mu_{s+1}\Delta t.$$

For the habitat to contain $s$ species at time $t + \Delta t$, one of the following conditions must hold:
(1) There were $s$ species in the habitat at time $t$, and no immigration or emigration occurred between $t$ and $t + \Delta t$.
(2) There were $s - 1$ species in the habitat at time $t$, and exactly one species immigrated between $t$ and $t + \Delta t$.
(3) There were $s + 1$ species in the habitat at time $t$, and exactly one species emigrated between $t$ and $t + \Delta t$.
(4) The probability of more than one immigration or emigration between $t$ and $t + \Delta t$ can be ignored.
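The linear migration model and the probability update above can be sketched in a few lines. The function names `migration_rates` and `step_probabilities` are hypothetical, and the arguments `I` and `E` stand for the maximum immigration and emigration rates; the boundary handling (no habitat with fewer than 0 or more than $n$ species) is an assumption of this example.

```python
def migration_rates(s, n, I=1.0, E=1.0):
    """Immigration and emigration rates of a habitat holding s of at most n species."""
    lam = I * (1.0 - s / n)   # immigration falls as the habitat fills up
    mu = E * s / n            # emigration grows with the number of species
    return lam, mu


def step_probabilities(P, n, dt, I=1.0, E=1.0):
    """Advance P[s], the probability of holding s species, by one small step dt."""
    new_P = []
    for s in range(n + 1):
        lam_s, mu_s = migration_rates(s, n, I, E)
        # Condition (1): s species and no migration during dt.
        stay = P[s] * (1.0 - lam_s * dt - mu_s * dt)
        # Condition (2): s - 1 species and one immigration.
        gain_below = P[s - 1] * migration_rates(s - 1, n, I, E)[0] * dt if s > 0 else 0.0
        # Condition (3): s + 1 species and one emigration.
        gain_above = P[s + 1] * migration_rates(s + 1, n, I, E)[1] * dt if s < n else 0.0
        new_P.append(stay + gain_below + gain_above)
    return new_P
```

With equal maximum rates, the two rates sum to a constant for every $s$ and are equal at $s = n/2$, and one update step leaves the total probability unchanged.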

Biogeography-Based Optimization.
The biogeography-based optimization algorithm has been applied to many engineering problems, such as economic load dispatch [31], parameter optimization of neural networks [32], smart energy management [33], pathological brain detection [34], and optimal VAR control in circuits [35]. The algorithm involves two important steps, migration and mutation, which are closely related to the theory of biogeography. Emigration and immigration are used to share information among habitats probabilistically. Mutation is the other significant mechanism, which increases the diversity among habitats. However, the best solution is easily corrupted by migration and mutation; thus, a number of elites are kept for the next generation. Figure 6 is a general flowchart of biogeography-based optimization.
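Taking the optimization of a function $f(x_1, x_2, \ldots, x_D)$ as an example, the BBO algorithm is described as follows:
(1) Initialize the number of habitats, the number of SIVs, the number of elites, the maximum number of iterations, the probability of habitat migration, the maximum mutation rate, the maximum immigration rate, and the maximum emigration rate, denoted, respectively, as $N$, $D$, $p$, $G_{\max}$, $P_{\mathrm{mod}}$, $m_{\max}$, $I$, and $E$.
(2) Initialize $N$ solutions of the function randomly within the feasible region; $N$ initial habitats are thus obtained, and the whole population can be written in matrix form:

$$H = \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_N \end{bmatrix} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1D} \\ x_{21} & x_{22} & \cdots & x_{2D} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{ND} \end{bmatrix}.$$

(3) The HSI value of the $i$th habitat is calculated by

$$\mathrm{HSI}_i = f(H_i) = f(x_{i1}, x_{i2}, \ldots, x_{iD}).$$

(4) Sort the $N$ habitats according to their HSI values; the $p$ elite solutions are kept until the next generation. Initialize the counters $i = j = q = g = 1$.
(5) Select a habitat $H_i$ according to the immigration rate $\lambda_i$ of the $i$th habitat; if $i < N$, set $i = i + 1$ and go to step (6); else go to step (9). The immigration rate of the $i$th habitat is rewritten as

$$\lambda_i = I\left(1 - \frac{k_i}{N}\right),$$

where $k_i$ stands for the fitness rank of the $i$th habitat after sorting by HSI value.
(6) Generate a random number between 0 and 1, denoted as $\mathrm{rand}_1$. If $\mathrm{rand}_1 < \lambda_i$, go to step (7); else go to step (5).
(7) Select a habitat $H_j$ according to the emigration rate of the $j$th habitat, which can be rewritten as

$$\mu_j = \frac{E\, k_j}{N}.$$

(8) Replace the SIV of the selected $H_i$ with a randomly selected SIV of $H_j$; if $j < D$, set $j = j + 1$ and go to step (7); else go to step (5).
(9) Compute the mutation probability of the $q$th habitat, represented as $m_q$; set $q = q + 1$.
(10) Generate a random number between 0 and 1, denoted as $\mathrm{rand}_2$. If $\mathrm{rand}_2 < m_q$, replace the SIV of $H_q$ with a randomly generated SIV within the feasible region.
(11) Sort the $N$ habitats again according to their HSI values; keep the $p$ elite solutions until the next iteration.
(12) If $g < G_{\max}$, go to step (5); else go to step (13).
(13) Output the best solution of the problem.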
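The steps above can be condensed into a compact loop. The following is a simplified sketch rather than the exact procedure of the paper: it maximizes a user-supplied function over box bounds, assigns species counts by rank (the best habitat gets the most species), chooses the emigrating habitat by roulette-wheel selection, and, as a simplifying assumption, applies a constant mutation rate `m_max` to every SIV. The function name `bbo_maximize` and all default parameter values are illustrative.

```python
import random


def bbo_maximize(f, bounds, n_habitats=30, n_elite=2, n_iter=100,
                 p_mod=1.0, m_max=0.05, I=1.0, E=1.0, seed=0):
    """Maximize f over the box `bounds` with a basic BBO loop (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial habitats inside the feasible region.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_habitats)]
    for _ in range(n_iter):
        pop.sort(key=f, reverse=True)                  # best habitat (highest HSI) first
        elites = [h[:] for h in pop[:n_elite]]         # copies of the elite solutions
        # Species count by rank: best habitat holds N species, worst holds 1.
        k = [n_habitats - r for r in range(n_habitats)]
        lam = [I * (1.0 - ki / n_habitats) for ki in k]    # immigration rates
        mu = [E * ki / n_habitats for ki in k]             # emigration rates
        new_pop = [h[:] for h in pop]
        for i in range(n_habitats):
            if rng.random() > p_mod:                   # habitat not selected for migration
                continue
            for d in range(dim):
                if rng.random() < lam[i]:
                    # Roulette-wheel choice of the emigrating habitat, weighted by mu.
                    j = rng.choices(range(n_habitats), weights=mu)[0]
                    new_pop[i][d] = pop[j][d]
        # Mutation: constant rate per SIV (simplifying assumption).
        for h in new_pop:
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < m_max:
                    h[d] = rng.uniform(lo, hi)
        # Elitism: the worst habitats are replaced by the saved elites.
        if n_elite:
            new_pop.sort(key=f, reverse=True)
            new_pop[-n_elite:] = elites
        pop = new_pop
    return max(pop, key=f)
```

Because the elites are copied back after migration and mutation, the best HSI in the population never decreases from one generation to the next, which is the purpose of the elite-keeping steps (4) and (11).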

BBO-ICA Algorithm.
The classical FastICA algorithm has difficulty acquiring an ideal initial iterative point, which has a great influence on its performance. To avoid this problem, BBO is substituted for the Newton iteration method to find the independent components. However, BBO is an optimization method that searches for a single optimum, which means that one run of BBO-ICA obtains only the one independent component of maximum independence.
Nevertheless, ICA needs to estimate several independent components. The following strategy solves this problem.
An independent component $s_i$ is considered to be related to some of the information in $X$. So, the observation matrix $X$ can be updated by subtracting the information to which the extracted independent component corresponds. This effectively prevents the algorithm from converging to the same $b_i$ again. The ICA algorithm based on biogeography-based optimization, including this update, is described in Algorithm 1.
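To make the optimization target concrete, the sketch below evaluates a possible HSI for a candidate vector $b_i$, assuming that the HSI is taken to be the negentropy approximation of Section 2 with the contrast function $G_1$. This choice of fitness, the function names, and the numerical integration used for the Gaussian reference term are assumptions of this example, not details confirmed by the source.

```python
import math


def expected_g_gauss(a1=1.0, steps=4000, span=8.0):
    """E{G1(v)} for a standard Gaussian v, by midpoint integration."""
    h = 2.0 * span / steps
    total = 0.0
    for i in range(steps):
        v = -span + (i + 0.5) * h
        total += math.log(math.cosh(a1 * v)) / a1 * math.exp(-0.5 * v * v)
    return total * h / math.sqrt(2.0 * math.pi)


def negentropy(y, a1=1.0):
    """Approximate negentropy [E{G1(y)} - E{G1(v)}]^2 of a sample y."""
    n = len(y)
    mean = sum(y) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    # Standardize y to zero mean and unit variance before applying G1.
    e_y = sum(math.log(math.cosh(a1 * (v - mean) / std)) / a1 for v in y) / n
    return (e_y - expected_g_gauss(a1)) ** 2


def fitness(b, Z):
    """Hypothetical HSI of candidate b: negentropy of the projection b^T z.

    Z is a sequence of whitened samples, each a tuple of length len(b).
    b is normalized so that the unit-norm constraint is satisfied.
    """
    norm = math.sqrt(sum(c * c for c in b))
    y = [sum(c * z for c, z in zip(b, sample)) / norm for sample in Z]
    return negentropy(y)
```

For whitened data built from two independent uniform sources, a vector aligned with one source should score higher than a vector that mixes both sources equally, since the mixture is closer to Gaussian.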

Simple Example.
In this section, the Newton iteration method is used to find the extrema of a simple explicit function $f(x)$, which is plotted in Figure 7. The purpose of this example is to verify the influence of different initial iterative points on the Newton iteration method. The Newton iteration method was originally used to search for roots, and it can also find extrema by solving $f'(x) = 0$. Three different initial iterative points are used: $x_0 = -2$, $1$, and $5$. The maximum number of iterations is 1000 and the iteration accuracy is $10^{-6}$. The iteration process and the convergence for the different initial points are illustrated in Figure 8 and Table 1.
In contrast to finding the roots of $f'(x)$ with the Newton iteration method, BBO is used to find the extrema of $f(x)$ directly.
The iteration process of BBO is shown in Figure 9. It can be concluded from Figure 8 and Table 1 that the effectiveness of the Newton iteration method depends largely on the choice of the initial iterative point. Different initial points result in different convergence rates, and some initial points even lead to nonconvergence.
BBO avoids this problem because its population is initialized randomly. It finds the extrema of $f(x)$ accurately within 22 iterations. BBO is able to obtain the global optimal solution in the solution space, at the expense of a longer running time.
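The sensitivity of the Newton iteration to its starting point can be reproduced with a small experiment. The test function $f(x) = x^3 - 3x$ used below is a hypothetical stand-in, since it is not the function of this example; the stopping tolerance and iteration cap follow the values quoted above, and the name `newton_extremum` is illustrative.

```python
def newton_extremum(df, d2f, x0, tol=1e-6, max_iter=1000):
    """Search for a stationary point of f by Newton iteration on f'(x) = 0.

    Returns (x, iterations) on convergence, or (None, iterations) when the
    step is undefined or the iteration cap is reached.
    """
    x = x0
    for k in range(1, max_iter + 1):
        curvature = d2f(x)
        if curvature == 0:
            return None, k                 # Newton step undefined: f''(x) = 0
        x_new = x - df(x) / curvature
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return None, max_iter


# Hypothetical test function f(x) = x**3 - 3*x with extrema at x = -1 and x = 1.
def df(x):
    return 3.0 * x * x - 3.0


def d2f(x):
    return 6.0 * x


for x0 in (2.0, -2.0, 0.001, 0.0):
    print(x0, newton_extremum(df, d2f, x0))
```

Starting from 2 or -2 the iteration settles quickly on the nearby extremum, starting very close to 0 it first jumps far away and needs many more iterations, and starting exactly at 0 the step is undefined, so no result is returned.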

DAMADICS Process.
In this section, the monitoring performance of the BBO-ICA method is compared with that of conventional FastICA on DAMADICS process simulation data. Two commonly used indices, the fault detection rate (FDR) and the false alarm rate (FAR), are used to evaluate monitoring performance.
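The two indices can be computed directly from a monitoring statistic and its control limit. In the sketch below, FDR is taken as the fraction of faulty samples that raise an alarm and FAR as the fraction of normal samples that raise an alarm; the function name `detection_rates` and the convention that the fault starts at a known sample index are assumptions of this example.

```python
def detection_rates(statistic, limit, fault_start):
    """Return (FDR, FAR) for one monitoring statistic.

    statistic   : sequence of statistic values (e.g. I^2 or SPE), in time order
    limit       : control limit; a value above it raises an alarm
    fault_start : index of the first faulty sample (earlier samples are normal)
    """
    alarms = [value > limit for value in statistic]
    normal, faulty = alarms[:fault_start], alarms[fault_start:]
    far = sum(normal) / len(normal)    # alarms raised during normal operation
    fdr = sum(faulty) / len(faulty)    # alarms raised while the fault is present
    return fdr, far
```

For instance, with the values `[1, 2, 6, 1, 7, 8, 3, 9]`, a limit of 5, and a fault starting at index 4, three of the four faulty samples and one of the four normal samples exceed the limit.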
DAMADICS is the abbreviation of Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems. The general structure of the benchmark actuator is shown in Figure 14; it consists of three main parts: the control valve, the pneumatic servomotor, and the positioner. The control valve is installed in the pipeline system to control the flow of the passing fluid. The pneumatic servomotor is a device that converts an electrical signal into the force that moves the valve stem. The positioner detects the actual displacement of the valve stem and controls the valve more precisely. The benchmark is set with 19 different faults [30], which can be classified into four classes: control valve faults (1-7), pneumatic servomotor faults (8-11), positioner faults (12-14), and general/external faults (15-19).
Six process variables are chosen for process monitoring; they are listed in Table 3. Only the monitoring performance for several typical faults is shown here, namely faults 7, 10, 13, and 17, which represent the four different fault types. The FDR and FAR of BBO-ICA and FastICA for the 19 kinds of faults are compared in Table 2. For FastICA, two initial iterative points are set to test the influence of the initial point on the algorithm. Initial iterative point 1 is an identity matrix and initial iterative point 2 is a null matrix. With initial iterative point 2, the fault detection performance of the FastICA algorithm is poor. Comparing the process monitoring performance of FastICA and BBO-ICA, it can be concluded that BBO-ICA shows a higher FDR and a lower FAR for most of the faults. The BBO parameters $N$, $D$, $p$, $G_{\max}$, $P_{\mathrm{mod}}$, $m_{\max}$, $I$, and $E$ are set to 100, 6, 2, 200, 1, 0.9, 1, and 1, respectively.
In addition, the fault detection performance of BBO-ICA is compared with that of PSO-ICA. Most of the PSO parameters correspond to those of BBO, such as the number of particles and the number of iterations. Although PSO-ICA can also avoid the problem of selecting the initial iterative point of FastICA, it shows a lower FDR and a higher FAR than both BBO-ICA and FastICA under initial iterative point 1. It can therefore be inferred that BBO searches the solution space with greater capability and accuracy than PSO, which makes BBO-ICA more suitable than PSO-ICA for wider application.
The proposed BBO-ICA avoids the problem of selecting initial iterative points, which improves the stability and robustness of the ICA algorithm. Moreover, BBO-ICA also improves the performance of the ICA algorithm. However, BBO-ICA has the drawback of a large amount of calculation: when handling high-dimensional complex problems, the computation is quite slow.
It should be noted that half of the faults have high FDRs and low FARs. For some faults, however, such as faults 3, 4, 5, and 6, the fault detection efficiency is poor, which may be attributed to their small influence on the monitored variables.

Conclusion
A BBO-ICA algorithm for process monitoring is proposed in this paper, and some drawbacks of FastICA and PSO-ICA are analysed. The proposed BBO-ICA is more robust and stable than FastICA and separates independent components more accurately than both FastICA and PSO-ICA. The proposed method is applied to process monitoring on the DAMADICS benchmark platform; the results demonstrate that BBO-ICA detects faults more efficiently than FastICA and PSO-ICA. Future research will focus on the application of BBO-ICA in other fields.

Figure 1: Categories of process monitoring technologies.

Figure 5: Relation between number of species and migration rate.

Figure 8: Process of iteration of different initial iterative points of Newton iteration method.

Figure 14: Structure of benchmark actuator system.
In ICA, $m$ measured variables $x(k) = [x_1, x_2, \ldots, x_m]^{T}$ are obtained. Measured signals are often mixed with noise and multiple source signals. Suppose that the $m$ measured variables can be expressed as linear combinations of $r$ ($r \le m$) unknown independent components $s_1, s_2, \ldots, s_r$. The mathematical relationship between them can be expressed by

$$x(k) = A\,s(k), \tag{1}$$

where $s(k) = [s_1, s_2, \ldots, s_r]^{T}$.

Table 1: The convergence of different initial iterative points of the Newton iteration method.

Algorithm 1: Process monitoring using BBO-ICA.
(7) Employ BBO to extract the independent component of maximum independence and obtain $b_i$.
(8) Output the vector $b_i$ and update the observation matrix $X$.
(9) If $i \le m$, set $i = i + 1$ and go back to step (5).