Spacecraft Actuator Diagnosis with Principal Component Analysis: Application to the Rendez-Vous Phase of the Mars Sample Return Mission

This paper presents a fault detection and isolation (FDI) approach for detecting and isolating actuator (thruster and reaction wheel) faults of an autonomous spacecraft involved in the rendez-vous phase of the Mars Sample Return (MSR) mission. Principal component analysis (PCA) has been adopted to estimate the relationships between the various variables of the process. To assess the feasibility of the proposed FDI approach, a data set provided by the industrial “high-fidelity” simulator of the MSR, representing the opening rates of the spacecraft thrusters and the rotation rates of the reaction wheels, has been considered. The test results demonstrate that fault detection and isolation are successfully accomplished.


Introduction
In industry, an embedded system can be subjected to various stresses that may significantly affect its performance and safety. Any dysfunction in the process can then lead to serious consequences. It is therefore essential, in many domains (monitoring of industrial facilities, automotive, aerospace, satellite autonomy, ...), to establish a rigorous monitoring process that ensures the detection of any failure and the automatic reconfiguration of the system [1][2][3].
In this regard, aerospace is considered the most critical area of application. Autonomy has thus become a key technology for increasing the survival capability of spacecraft. Various researchers have developed diagnosis systems to detect and isolate faults that can appear in a spacecraft [4][5][6].
The primary purpose of fault protection is to ensure that anomalies or operational problems encountered during the operation of the spacecraft do not result in a permanent reduction in the spacecraft's capabilities or in the loss of the mission itself. To avoid these risks and any deviation from the nominal orbits, it is indispensable to provide the spacecraft with an onboard fault detection and isolation (FDI) functionality and an automatic reconfiguration capability.
Hardware redundancy is considered the most adaptable method for providing the FDI functionality onboard spacecraft. Nevertheless, such an approach may be very expensive and complex because of the additional weight and volume of the redundant elements. Model-based FDI methods, which rely on analytical redundancy, address this drawback by modeling the monitored system in order to make a fault/no-fault decision [7,8]. The models used can be either quantitative, expressed as mathematical equations, or qualitative, expressed as logical relations [9,10].
The main remaining problem with model-based methods is that it is difficult, and sometimes impossible, to find the right mathematical model for a real system, because of the numerous reconfigurations involved in the production process or the complexity of the system. The solution is then to use analytical redundancy based on a model-free method that usually does not require detailed knowledge of the system, such as Principal Component Analysis (PCA).
PCA is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called "principal components." Monitoring and diagnosis using PCA have been applied successfully to many complex systems [11][12][13], for sensor/actuator fault detection and isolation and also for operating mode change detection. To diagnose the faults, the PCA approach includes a PCA model for normal operating conditions and a PCA model for each faulty situation. By comparing these models, residual signals may be generated. These residuals are compared with their thresholds, with and without faults. The only residual that does not exceed its threshold identifies the fault and the actual system situation.
Historically, thruster and reaction wheel failures are considered the most frequent failures that can occur on orbit. Numerous fault diagnosis systems have been developed to detect and isolate thruster failures. Most of these systems are based on an analytic method using a mathematical model [6,14].
As mentioned before, working in the aerospace domain demands a certain level of precision and assurance. Given the difficulty of any human intervention when anomalies occur during a spaceflight mission, achieving the diagnosis of the spacecraft actuators is one of the most important goals of this work. In this paper, we therefore present a diagnosis approach for spacecraft actuators (thrusters and reaction wheels) based on the PCA technique, analyzing their performance during the rendez-vous phase of the MSR mission.
The paper is organized as follows. In Section 2, the PCA technique is briefly summarized. In Section 3, the methods for fault detection and isolation are introduced. The Mars Sample Return (MSR) mission is described in Section 4, together with the results of thruster and reaction wheel fault detection and isolation with the PCA method. Conclusions and future developments are presented in Section 5.

The PCA Principle
PCA has been frequently used for fault detection and isolation. Indeed, it is considered a very effective statistical method for extracting information from measured data [12,15].
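Consider the data matrix $X = [x(1)\; x(2)\; \cdots\; x(N)]^T \in \mathbb{R}^{N \times m}$ gathering $N$ samples of $m$ observed variables recorded on the system under normal operation, where $x(k) \in \mathbb{R}^{m}$ is the measurement vector at instant $k = 1, \ldots, N$. The linear transformation of the data matrix $X$ into a new set of variables $T$ is then given by
\[ T = XP, \]
where $T = [t_1, t_2, \ldots, t_m] \in \mathbb{R}^{N \times m}$ and $P = [p_1, p_2, \ldots, p_m] \in \mathbb{R}^{m \times m}$ is the matrix of the eigenvectors (also called principal vectors) associated with the eigenvalues of the covariance matrix $\Sigma$:
\[ \Sigma = P \Lambda P^T, \qquad P P^T = P^T P = I_m, \]
with $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_m)$ the eigenvalues sorted in decreasing order.
The right choice of the number $\ell$ of principal components to retain is a very important step in the development of a PCA model. Once $\ell$ is fixed, the data are reduced to the first $\ell$ ($\ell < m$) variables, which are the most significant components. The data and the two matrices $P$ and $T$ can be partitioned into a principal part and a residual one:
\[ P = [\hat{P} \;\; \tilde{P}], \qquad T = [\hat{T} \;\; \tilde{T}]. \]
The data matrix can then be decomposed in the following form:
\[ X = \hat{X} + \tilde{X}, \qquad \hat{X} = X\hat{C}, \qquad \tilde{X} = X\tilde{C}. \]
The matrices $\hat{X}$ and $\tilde{X}$ represent, respectively, the modeled and the nonmodeled variations of $X$ from $\ell$ components. The first $\ell$ eigenvectors, forming the matrix $\hat{P} \in \mathbb{R}^{m \times \ell}$, constitute the representation space, whereas the last $(m - \ell)$ eigenvectors, forming the matrix $\tilde{P} \in \mathbb{R}^{m \times (m - \ell)}$, constitute the residual space, with $\hat{C} = \hat{P}\hat{P}^T$ and $\tilde{C} = \tilde{P}\tilde{P}^T = (I_m - \hat{C})$, which are the two matrices that form the PCA model.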
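As an illustration of this decomposition, the following is a minimal NumPy sketch, not the authors' implementation. The function name `pca_model`, the argument `n_pc` (the number of retained components) and the synthetic data are assumptions introduced here for the example only.

```python
import numpy as np

def pca_model(X, n_pc):
    """Build the PCA projection matrices from fault-free data X (N x m).

    X is assumed to be already centred and scaled; n_pc is the number
    of retained principal components (hypothetical argument name).
    """
    N, m = X.shape
    sigma = X.T @ X / (N - 1)                  # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(sigma)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, P = eigvals[order], eigvecs[:, order]
    P_hat = P[:, :n_pc]                        # representation space
    C_hat = P_hat @ P_hat.T                    # projector on principal space
    C_tilde = np.eye(m) - C_hat                # projector on residual space
    return eigvals, P, C_hat, C_tilde

# Synthetic example: 4 variables driven by 2 latent sources plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 4))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigvals, P, C_hat, C_tilde = pca_model(X, n_pc=2)
X_hat, X_tilde = X @ C_hat, X @ C_tilde        # modeled / nonmodeled parts
```

The two parts `X_hat` and `X_tilde` add up to `X`, which mirrors the decomposition of the data matrix into its principal and residual parts.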
In the literature, several methods can be adopted to determine the required number ($\ell$) of principal components [16]. In this paper, the adopted approach is based on minimizing the variance of the reconstruction error (VRE) [17].
The choice of $\ell$ using the VRE approach is based on the best reconstruction of the variables. In fact, this index shows a minimum corresponding to the best reconstruction.
According to Qin and Dunia [17], the measurement vector, given by (7), is corrupted with a fault along a direction $\xi_i$. At an instant $k = 1, \ldots, N$,
\[ x(k) = x^{*}(k) + f\,\xi_i, \tag{7} \]
where $x^{*}$ is the fault-free measurement vector, $f$ is the fault magnitude, and $\xi_i \in \mathbb{R}^{m}$ is the fault direction. In order to find an estimate for $x^{*}$ in the direction of $\xi_i$ that corrects the effect of the fault $f$, we try to find $f_i$ such that
\[ x_{rec}(k) = x(k) - f_i\,\xi_i, \]
where $x_{rec}$ is the reconstructed measurement vector.
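The Reconstruction Principle. Considering that all equations are functions of $k$, with $k = 1, \ldots, N$,
\[ \hat{x}(k) = \hat{C}\,x(k). \tag{9} \]
More explicitly,
\[ \begin{bmatrix} \hat{x}_1 \\ \vdots \\ \hat{x}_i \\ \vdots \\ \hat{x}_m \end{bmatrix} = \begin{bmatrix} \hat{c}_{11} & \cdots & \hat{c}_{1i} & \cdots & \hat{c}_{1m} \\ \vdots & & \vdots & & \vdots \\ \hat{c}_{i1} & \cdots & \hat{c}_{ii} & \cdots & \hat{c}_{im} \\ \vdots & & \vdots & & \vdots \\ \hat{c}_{m1} & \cdots & \hat{c}_{mi} & \cdots & \hat{c}_{mm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_i \\ \vdots \\ x_m \end{bmatrix}. \tag{10} \]
Thus, to reconstruct the $i$th variable, $x_i$ should be replaced by $\hat{x}_i$. The measurement vector $x$ is then reestimated using (9). This process is repeated until $\hat{x}_i$ converges to a value denoted by $x_i^{rec}$, which represents the reconstruction of the $i$th variable.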
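The iteration described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names, the two-component model `C_hat` and the sample vector `x` are hypothetical. The fixed point of the update $x_i \leftarrow \sum_{j \neq i} \hat{c}_{ij} x_j + \hat{c}_{ii} x_i$ is $\sum_{j \neq i} \hat{c}_{ij} x_j / (1 - \hat{c}_{ii})$, and the sketch compares the iterated value with it.

```python
import numpy as np

def reconstruct_iterative(x, C_hat, i, tol=1e-12, max_iter=10_000):
    """Repeat x_i <- (C_hat x)_i until the value stops changing."""
    z = np.array(x, dtype=float)
    for _ in range(max_iter):
        previous = z[i]
        z[i] = C_hat[i] @ z            # i-th row of x_hat = C_hat z
        if abs(z[i] - previous) < tol:
            break
    return z[i]

def reconstruct_closed_form(x, C_hat, i):
    """Fixed point of the iteration: sum_{j != i} c_ij x_j / (1 - c_ii)."""
    others = np.delete(np.arange(len(x)), i)
    return C_hat[i, others] @ x[others] / (1.0 - C_hat[i, i])

# Hypothetical 2-component model for 5 variables (orthonormal loadings).
P_hat = np.column_stack([
    np.ones(5) / np.sqrt(5.0),
    np.array([1.0, -1.0, 1.0, -1.0, 0.0]) / 2.0,
])
C_hat = P_hat @ P_hat.T
x = np.array([0.3, -1.2, 0.8, 0.1, 2.0])

x_iter = reconstruct_iterative(x, C_hat, i=0)
x_closed = reconstruct_closed_form(x, C_hat, i=0)
```

The iteration converges because $0 \le \hat{c}_{ii} < 1$ for a reconstructible variable, so in practice the closed form can replace the loop.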
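Once the $i$th variable has been reconstructed, (10) can be written as follows:
\[ \begin{bmatrix} \hat{x}_1 & \cdots & x_i^{rec} & \cdots & \hat{x}_m \end{bmatrix}^T = \hat{C} \begin{bmatrix} x_1 & \cdots & x_i^{rec} & \cdots & x_m \end{bmatrix}^T. \]
For row $i$, this equation can be rewritten as follows:
\[ x_i^{rec} = \sum_{j \neq i} \hat{c}_{ij}\,x_j + \hat{c}_{ii}\,x_i^{rec}. \]
Accordingly,
\[ x_i^{rec} = \frac{1}{1 - \hat{c}_{ii}} \sum_{j \neq i} \hat{c}_{ij}\,x_j. \]
The nonreconstructed variance quantifies the information lost during the reconstruction process. In fact, there is always a part of the measurement variation that cannot be reconstructed. This part is called the reconstruction error and, in the direction $\xi_i$ (the $i$th column of $I_m$), it can be computed using
\[ \xi_i^T (x - x_{rec}) = x_i - x_i^{rec} = \frac{\tilde{\xi}_i^T x}{\tilde{\xi}_i^T \tilde{\xi}_i}, \]
where $\tilde{\xi}_i = (I_m - \hat{C})\,\xi_i$ and $\tilde{\xi}_i^T \tilde{\xi}_i = (1 - \hat{c}_{ii})$. The variance of the reconstruction error, in the direction $\xi_i$, is then defined as follows:
\[ \sigma_i(\ell) = \operatorname{var}\{\xi_i^T (x - x_{rec})\} = \frac{\tilde{\xi}_i^T \Sigma\, \tilde{\xi}_i}{(\tilde{\xi}_i^T \tilde{\xi}_i)^2}. \]
The number of PCs to retain is obtained by minimizing the VRE with respect to the number $\ell$, $\ell = 1, \ldots, m - 1$:
\[ \ell = \arg\min_{\ell} \sum_{i=1}^{m} \frac{\sigma_i(\ell)}{\xi_i^T \Sigma\, \xi_i}. \]
To equalize the importance of each variable, their contributions to the criterion have been weighted by their variances $\xi_i^T \Sigma\, \xi_i$. Algorithm 1 summarizes the different steps for building a PCA model.
[Algorithm 1: PCA model identification. Outputs: $\ell$, $\hat{C}$, $\tilde{C}$. Step (1): calculate $\Sigma$.]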
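The VRE criterion can be sketched in a few lines of NumPy. This is a minimal illustration rather than the paper's Algorithm 1; the function name `vre_profile` and the constructed covariance matrix (two orthogonal loading vectors plus isotropic noise of variance 0.01) are assumptions made for the example.

```python
import numpy as np

def vre_profile(sigma):
    """Return {n_pc: VRE} for n_pc = 1 .. m-1 from a covariance matrix."""
    m = sigma.shape[0]
    eigvals, eigvecs = np.linalg.eigh(sigma)
    P = eigvecs[:, np.argsort(eigvals)[::-1]]   # descending eigenvalues
    profile = {}
    for n_pc in range(1, m):
        C_tilde = np.eye(m) - P[:, :n_pc] @ P[:, :n_pc].T
        total = 0.0
        for i in range(m):
            xi_tilde = C_tilde[:, i]            # (I - C_hat) xi_i
            denom = xi_tilde @ xi_tilde         # equals 1 - c_hat_ii
            if denom < 1e-12:                   # variable not reconstructible
                total = np.inf
                break
            var_err = xi_tilde @ sigma @ xi_tilde / denom**2
            total += var_err / sigma[i, i]      # weight by xi_i^T Sigma xi_i
        profile[n_pc] = total
    return profile

# Hypothetical covariance: two orthogonal loadings plus isotropic noise.
a1 = np.ones(5)
a2 = np.array([1.0, -1.0, 1.0, -1.0, 0.0])
sigma = np.outer(a1, a1) + np.outer(a2, a2) + 0.01 * np.eye(5)

profile = vre_profile(sigma)
best_n_pc = min(profile, key=profile.get)
```

For this constructed covariance, the two signal directions are exactly separated from the noise, so the criterion is smallest when two components are retained; on real data the minimum is usually less sharp.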

Fault Detection.
Once the PCA model identification is achieved, it is necessary to proceed to the fault detection step. Several indices are used to represent any variations in the data and thereby to detect faults in the principal subspace, in the residual one, or in both spaces [13,18,19].
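A general representation of these indices is
\[ \text{index}(k) = x^T(k)\,\Upsilon\,x(k), \]
where $\Upsilon$ is defined as follows, with $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_m)$ holding the eigenvalues of $\Sigma$ in decreasing order, $\hat{\Lambda}$ its first $\ell$ diagonal entries, and $\tilde{\Lambda}$ its last $(m - \ell)$ ones.
(i) For the Squared Prediction Error (SPE) index, $\Upsilon = \tilde{C} = \tilde{P}\tilde{P}^T$.
(ii) For the Squared Weighted Error (SWE) index, $\Upsilon = \tilde{P}\tilde{\Lambda}^{-1}\tilde{P}^T$.
(iii) For the $T^2$ index (Hotelling index), $\Upsilon = \hat{P}\hat{\Lambda}^{-1}\hat{P}^T$.
(iv) For the $\varphi$ index (Combined index), $\Upsilon = \tilde{C}/\delta^2 + \hat{P}\hat{\Lambda}^{-1}\hat{P}^T/\tau^2$, where $\delta^2$ and $\tau^2$ are the thresholds of the SPE and $T^2$ indices.
(v) For the $D$ index (Mahalanobis distance), $\Upsilon = P\Lambda^{-1}P^T = \Sigma^{-1}$.
In order to decide whether the system is faulty or not, the fault detection index used should be compared with its threshold. Table 1 summarizes the expression of each index's threshold, where $\chi^2_{h,\alpha}$ is a chi-square distribution with $h$ degrees of freedom and $\alpha$ is a given confidence level (usually $\alpha = 99\%$); these parameters are explained in [20].
[Table 1: thresholds of the detection indices, with additional parameter definitions for the SPE and $\varphi$ thresholds.]
Only the SPE index, which operates in the residual subspace, will be considered in the rest of this paper. To support this choice, a comparison between the detection results obtained with different indices will be presented in Section 4.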
The residual vector is defined by the projection of the measurement vector onto the residual subspace.
At an instant $k$,
\[ \tilde{x}(k) = \tilde{C}\,x(k) = \tilde{P}\tilde{P}^T x(k). \]
The matrix $\tilde{P}$ is constructed from the last $(m - \ell)$ eigenvectors of $\Sigma$. The SPE index is then defined by
\[ \mathrm{SPE}(k) = \tilde{x}^T(k)\,\tilde{x}(k) = \|\tilde{x}(k)\|^2. \]
Once inequality (27) holds, the system is declared faulty:
\[ \mathrm{SPE}(k) > \delta_\alpha^2. \tag{27} \]
In order to delimit the fault occurrence interval, we have included a new binary variable flagging a detected fault. Whenever (27) holds, this variable is set to 1, indicating that a fault has been detected.
Algorithm 2 summarizes the different steps of the fault detection process, where $X_f$ denotes the faulty data matrix.
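The detection step can be sketched as follows. This is a hedged illustration, not the paper's Algorithm 2: the threshold is taken here as the empirical 99% quantile of the SPE on fault-free data, which is a substitution for the chi-square-based threshold of Table 1, and the helper names, the mixing matrix, and the injected bias are all assumptions. Only the noise standard deviation of 0.01 echoes the value used in the application.

```python
import numpy as np

def fit_residual_projector(X_train, n_pc):
    """Return the training mean and the residual projector C_tilde."""
    mean = X_train.mean(axis=0)
    Xc = X_train - mean
    sigma = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(sigma)
    P_hat = eigvecs[:, np.argsort(eigvals)[::-1][:n_pc]]
    return mean, np.eye(X_train.shape[1]) - P_hat @ P_hat.T

def spe(X, mean, C_tilde):
    """Squared norm of the residual vector for every sample (row) of X."""
    residual = (X - mean) @ C_tilde
    return np.sum(residual**2, axis=1)

rng = np.random.default_rng(1)
mixing = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
                   [1.0, -1.0, 1.0, -1.0, 0.0]])   # hypothetical process

def simulate(n):
    """Five correlated variables with measurement noise of std 0.01."""
    return rng.normal(size=(n, 2)) @ mixing + 0.01 * rng.normal(size=(n, 5))

X_train = simulate(2000)            # fault-free data
X_test = simulate(600)
X_test[200:220, 0] += 1.0           # bias fault injected on variable 0

mean, C_tilde = fit_residual_projector(X_train, n_pc=2)
# Assumption: empirical 99% quantile instead of the chi-square threshold.
threshold = np.quantile(spe(X_train, mean, C_tilde), 0.99)
fault_flag = (spe(X_test, mean, C_tilde) > threshold).astype(int)
```

The array `fault_flag` plays the role of the binary detection variable: it equals 1 at the samples where the SPE exceeds its threshold, which delimits the fault occurrence interval.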

Fault Isolation.
After the fault detection step, it is necessary to identify and isolate the faulty variables. Among the various fault localization strategies, the variable reconstruction approach is adopted in this application [13,19,21]. This method assumes in turn that each variable is faulty and reconstructs it from the remaining variables using the PCA model [15,22].
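Let $\Xi_R$ be the matrix indicating the reconstruction directions, where $R$ is a subset containing the indices of the $r$ reconstructed variables. This matrix is built from 0s and 1s, where 1 marks the reconstructed variables and 0 marks the other ones.
For example, for a system of 4 variables ($m = 4$), to reconstruct the first and the third variables simultaneously ($R = \{1, 3\}$), $\Xi_R$ is formed as follows:
\[ \Xi_R = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}^T. \]
At an instant $k$, the reconstructed vector is given by
\[ \hat{x}_R(k) = G_R\,x(k), \]
with
\[ G_R = I_m - \Xi_R \left( \tilde{\Xi}_R^T \tilde{\Xi}_R \right)^{-1} \tilde{\Xi}_R^T, \qquad \tilde{\Xi}_R = \tilde{C}\,\Xi_R. \]
Reconstruction Condition. Let us note that if $\tilde{\Xi}_R$ has full column rank, then $(\tilde{\Xi}_R^T \tilde{\Xi}_R)^{-1}$ exists and the variables of the subset $R$ are completely reconstructible. This condition implies that the number of reconstructed variables $r$ must satisfy $r \le m - \ell$ and that the columns of the matrix $\tilde{\Xi}_R$ are neither null nor collinear.
The residual vector can be defined by
\[ \tilde{x}_R(k) = \tilde{C}\,\hat{x}_R(k) = P_R^{(\ell)}\,x(k), \]
with
\[ P_R^{(\ell)} = \tilde{C} - \tilde{\Xi}_R \left( \tilde{\Xi}_R^T \tilde{\Xi}_R \right)^{-1} \tilde{\Xi}_R^T. \]
In order to isolate the faulty variable, the SPE index may be computed by $\mathrm{SPE}_R(k) = \hat{x}_R^T(k)\,\tilde{P}\tilde{P}^T\,\hat{x}_R(k)$.
The isolation index is then defined, using the detection index $\mathrm{SPE}_R$ and its threshold $\delta_R^2$, by
\[ A_R^{spe}(k) = \frac{\mathrm{SPE}_R(k)}{\delta_R^2}, \]
where $R = 1, \ldots, m$ corresponds to the reconstructed variable. The variable for which the isolation index is lower than one is declared faulty. Algorithm 3 summarizes the different steps of the fault isolation process. The reconstructed data are given by $x_R^{rec}(k) = \hat{x}_R(k)$.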
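Isolation by single-variable reconstruction can be sketched as below. This is an illustration under assumptions, not the paper's Algorithm 3: each threshold $\delta_R^2$ is taken as an empirical 99% quantile on fault-free data, and the helper names, the simulated process, and the injected fault are hypothetical.

```python
import numpy as np

def fit_residual_projector(X_train, n_pc):
    """Return the training mean and the residual projector C_tilde."""
    mean = X_train.mean(axis=0)
    Xc = X_train - mean
    sigma = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(sigma)
    P_hat = eigvecs[:, np.argsort(eigvals)[::-1][:n_pc]]
    return mean, np.eye(X_train.shape[1]) - P_hat @ P_hat.T

def spe_after_reconstruction(Xc, C_tilde, i):
    """SPE of centred data Xc once variable i has been reconstructed."""
    xi_tilde = C_tilde[:, i]                      # C_tilde xi_i
    f_hat = Xc @ xi_tilde / (xi_tilde @ xi_tilde) # estimated fault magnitude
    X_rec = Xc.copy()
    X_rec[:, i] -= f_hat                          # x_rec = x - f_i xi_i
    return np.sum((X_rec @ C_tilde) ** 2, axis=1)

def isolation_indices(X, X_train, mean, C_tilde, level=0.99):
    """Column i holds SPE_i(k) / delta_i^2 (empirical-quantile threshold)."""
    A = np.empty(X.shape)
    for i in range(X.shape[1]):
        delta2 = np.quantile(
            spe_after_reconstruction(X_train - mean, C_tilde, i), level)
        A[:, i] = spe_after_reconstruction(X - mean, C_tilde, i) / delta2
    return A

rng = np.random.default_rng(2)
mixing = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
                   [1.0, -1.0, 1.0, -1.0, 0.0]])   # hypothetical process
simulate = lambda n: (rng.normal(size=(n, 2)) @ mixing
                      + 0.01 * rng.normal(size=(n, 5)))

X_train = simulate(2000)
X_faulty = simulate(20)
X_faulty[:, 0] += 1.0                              # fault on variable 0

mean, C_tilde = fit_residual_projector(X_train, n_pc=2)
A = isolation_indices(X_faulty, X_train, mean, C_tilde)
suspect = int(np.argmin(A.mean(axis=0)))
```

Reconstructing the truly faulty variable removes the fault from the residual, so only its index drops below one; reconstructing any other variable leaves most of the fault in the residual.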

Mars Sample Return Mission.
The Mars Sample Return (MSR) mission intends to collect several samples from the Martian surface and return them to Earth for detailed analysis. This flagship mission is included in ESA's Aurora program and is envisioned to take place in the 2020-2025 time frame [23,24]. Nine steps, presented in Figure 1, are to be considered in order to ensure the success of this mission. The diagnosis process developed in this application concerns only the fifth step, called the "rendez-vous." This phase consists in capturing the Orbiter Sample (OS), which is carried by the Mars Ascent Vehicle (MAV). The OS remains attached to the MAV during the OM rendez-vous effort. A command from the OM then releases the OS. Once separated from the MAV, the OS is captured by the OM and placed in the Earth Reentry Capsule (ERC) [25].
The last 100 meters of this rendez-vous/capture phase are considered the most critical with respect to the occurrence of failures.
In this phase, in order to control the orientation of the chaser spacecraft (see Figure 2 the OM sensors, the difficulty will be in controlling the attitude, the movement, and the two axes describing the motion relative to the orbital plane.
The thrusters and reaction wheels are the main actuators ensuring the attitude control of the spacecraft.
On the one hand, the thrusters (Figure 2(b)) are the most important elements of the propulsion system. These propulsive devices are capable of generating a thrust (force) to control the attitude and a torque to control the rotation (about the three axes of the spacecraft: roll, pitch, and yaw). To stabilize the system, thrusters must be placed about all three axes, with at least two thrusters on each one.
On the other hand, the reaction wheels of a spacecraft (Figure 2(c)) are driven by electric motors fed from the spacecraft's electrical power supply. They are managed and controlled by the spacecraft's on-board attitude control computer. These flywheels make it possible to rotate a spacecraft, based on the principle of angular momentum transfer and Newton's third law of action-reaction.
Therefore, the failure of one or more reaction wheels or thrusters can cause a spacecraft to lose its ability to maintain its position and hence potentially cause a mission failure. The objective of this application is thus to detect and isolate the thruster and reaction wheel faults of the MSR chaser spacecraft, considering that the propulsion system is composed of 8 thrusters and 4 reaction wheels.
The fault diagnosis process of each spacecraft's actuators requires two sets of data, one to build the PCA model and another to test the influence of a fault in the residual and/or the principal space.In order to represent the measurement noise, realizations issued from centered normal distributions with the same standard deviation equal to 0.01 are added to these sets of data.

PCA Model.
To build the PCA model, the data matrix $X \in \mathbb{R}^{N \times m}$, in the absence of faults in the system, has been generated using a MATLAB/SIMULINK simulator. The simulation block describing the opening rates of the thrusters is presented in Figure 3, while the block describing the rotation rates of the reaction wheels is presented in Figure 4. To test the effectiveness of the fault detection and isolation method described previously, two faults have been injected for each actuator type (thrusters and reaction wheels).
For the reaction wheels' fault detection, (1) fault1 in the first reaction wheel between 200 s and 220 s ($X_{f1}$), (2) fault2 in the second reaction wheel between 400 s and 460 s ($X_{f2}$).
In this case study we consider $X_{f1} \in \mathbb{R}^{N \times m}$ and $X_{f2} \in \mathbb{R}^{N \times m}$, denoting the data matrices in the presence of fault1 and fault2, respectively, where $N = 13514$ is the number of observations of the $m = 8$ chaser's thrusters or the $m = 4$ chaser's reaction wheels. The measurement vectors represent, respectively, the opening rates of the 8 thrusters and the rotation rates of the 4 reaction wheels at a time $k$.
Once the data matrix in the absence of faults is scaled (using mean and standard deviations of the variables), the PCA model can be then built using the VRE criterion.
Figures 5 and 6 present the evolution of the VRE according to the number of principal components (PCs). The first figure shows a minimum for $\ell_{TH} = 4$ and the second one shows a minimum for $\ell_{RW} = 2$. The retained PCA models of the thrusters and the reaction wheels therefore include 4 and 2 components, respectively.

Fault Detection and Isolation.
Once the modeling phase is completed, we present the fault detection and isolation results. The three algorithms (Algorithms 1, 2, and 3) summarize the different steps of the adopted approach for the actuator fault diagnosis.
As explained previously, two faults in two different time intervals have been injected into each actuator type. Figures 7 and 8 show the simulation results of fault detection using different indices after the injection of faults, where (i) the blue curve refers to the detection index after the fault injection; (ii) the red line refers to the threshold of that index.
According to these results, the $T^2$ index, which operates in the principal subspace, does not allow the faults to be detected. The SWE index and another of the compared indices do detect the faults, but they also produce false alarms that can affect the diagnosis process. Using the SPE index therefore seems to be the best solution for accurate fault detection.
The simulation results in the first graphs of both Figures 7 and 8 show two main peaks exceeding the SPE threshold. Thus, two failures in the thrusters and two in the reaction wheels may be inferred.
Once a fault is detected, it is then necessary to locate the faulty actuator. To do so, we have used the variable reconstruction approach.
Figures 9 and 10 present the evolution of the thrusters' isolation indices $A_R^{spe}$, computed after the reconstruction of each variable at an instant $k$ of $X_{f1}$ and $X_{f2}$, respectively. The evolution of the reaction wheels' isolation indices $A_R^{spe}$, computed after the reconstruction of each variable in $X_{f1}$ and $X_{f2}$, is presented in Figures 11 and 12, respectively. As we can see, for both actuator types, the two indices $A_1^{spe}$ and $A_2^{spe}$ are lower than 1. These results confirm that thrusters number 1 and 2 and reaction wheels number 1 and 2 are the faulty ones.

Conclusion
In this paper, we have presented a fault diagnosis process based on the Principal Component Analysis (PCA) technique. This approach has been used to perform the fault detection and isolation of spacecraft actuators (thrusters and reaction wheels) during the rendez-vous phase of the Mars Sample Return mission.
In order to estimate the PCA model, a data matrix consisting of all measurements of the opening rates of the nominal spacecraft thrusters and of the rotation rates of the reaction wheels has been used.
PCA reduces the data representation space and enables the determination of the redundancy relationships.
The redundancy relations are then used to detect and isolate the faults.
Once the PCA model is built, the predefined faults can be revealed using the SPE detection index, which operates in the residual space.
After the detection phase, a fault isolation approach based on the reconstruction principle has highlighted the predefined actuators' faults.
Detecting and isolating a fault injected in the first and then in the second thruster and, respectively, in the first and second reaction wheels have been successfully achieved in this application. The results of the diagnosis process support the reliability of our choices, since this method has been efficient for fault detection and isolation.
Achieving the diagnosis of the spacecraft actuators using the PCA method was the goal of this work. However, the PCA approach has mainly been developed for the analysis of single-valued variables, without considering any uncertainty in the system, which can certainly distort the diagnosis results. Developing a new optimized approach that generalizes principal component analysis to interval-valued data is our work in progress and constitutes the heart of a future article. In addition, we wish to develop an accommodation approach in order to safely combine the necessary robustness/stability of the spacecraft control, the trajectory dynamics, and the vehicle nominal performance.

Figure 3: Simulation block measuring the thrusters opening rates.

Figure 4: Simulation block measuring the reaction wheels rotation rates.