Aluminium Process Fault Detection and Diagnosis

The challenges in developing a fault detection and diagnosis system for industrial applications are considerable, particularly for complex materials processing operations such as aluminium smelting. However, grouping the various fault detection and diagnosis systems for the aluminium smelting process can assist in identifying the key elements of an effective monitoring system. This paper reviews aluminium process fault detection and diagnosis systems and proposes a taxonomy that includes four key elements: knowledge, techniques, usage frequency, and results presentation. Each element is explained together with examples of existing systems. A fault detection and diagnosis system developed based on the proposed taxonomy is demonstrated using aluminium smelting data. A potential new strategy for improving fault diagnosis is discussed, based on the ability of a new technology, augmented reality, to augment operators’ view of an industrial plant so that it permits situation-oriented action in real working environments.


Introduction
A variety of fault detection systems for the aluminium smelting process can be found in the literature. This diversity arises principally from the way in which each system utilizes the available resources, using an approach that is appropriate for the process control system in question. Investigating these systems by identifying the elements that shape them may help us to understand the different kinds of fault detection system in the aluminium smelting process. Thus, we classified these elements into the following groups.
(1) Fault detection and diagnostic knowledge: what knowledge is used in the fault detection and diagnosis systems of the aluminium smelting process?
(2) Fault detection and diagnostic techniques: how is the system built by utilizing the knowledge?
(3) Usage frequency: how frequently can the system monitor the process?
(4) Results presentation: how are the results of the system presented to the operators?
The aim of this work is to propose a taxonomy of aluminium process fault detection and diagnosis systems with four key elements: knowledge, techniques, usage frequency, and results presentation. This work also aims to identify the potential of augmented reality as one of the techniques in results presentation. This paper first describes a fault detection and diagnostic taxonomy that has been developed from reviews of the literature and knowledge pertaining to the aluminium smelting process. Secondly, the groups and elements that comprise the taxonomy are explained. Next, the key elements of the new system for the aluminium smelting process that have been identified in this research based on the taxonomy are discussed and demonstrated with an example. Finally, in order to further assist in fault diagnosis, the integration of augmented reality as a potential new strategy is discussed.

The Proposed Taxonomy for Aluminium Process Fault Detection and Diagnosis
The groups and elements that create a fault detection and diagnostic taxonomy for the aluminium smelting process are illustrated in Figure 1. The groups and elements of this taxonomy are briefly described in the following sections.

Fault Detection and Diagnostic Knowledge.
The first group comprises fault detection and diagnostic knowledge. The elements of this group represent particular knowledge of the aluminium smelting process that has been used, and can be used, to develop fault detection and diagnosis systems. A brief explanation of each element is given below.
(1) The first element in this group is a spectrum of resistance, in which the specifications of the spectra in three cases were identified for assisting in fault diagnosis. The cases are normal cell, aluminium roll, and abnormal anode [1].
(2) Patterns of noise constitute the second element in this group. Three different patterns of noise were recognized to assist in fault diagnosis. These are bubble noise [2], short-circuiting noise, and metal pad roll noise [2,3].
(3) The third element in the group is a theoretical resistance/alumina concentration curve. Several researchers have selected data for developing fault detection systems by using this curve as an important reference, such as Meghlaoui et al. [4], Yurkov et al. [5], and Nagem et al. [6].
(a) The first example stems from research by Meghlaoui et al. [4] in which two dynamic trend indicators were generated based on the theoretical resistance/alumina concentration curve.
(b) The second example comes from research carried out by Yurkov et al. [5] in which selected data were deemed appropriate for analysis based on feeding cycles. These cycles were formed following the controlling of alumina feeding based on the theoretical resistance/alumina concentration curve.
(c) The third example is from research by Nagem et al. [6] in which data were divided into four regions based
(4) The fourth element is a set of colour and textural features grouped according to the varying alumina content of anode cover materials. These colour and textural features were identified using multivariate image analysis techniques. These features can be used to estimate the alumina content of anode cover materials [8].
(5) The fifth element is the diagnosis and correction of operating cells that were recorded by operators and engineers. This knowledge can be used to form a knowledge database in an expert system (e.g., [9,10]). It can also assist in the discovery of new knowledge for fault diagnosis and then in validating that new knowledge by using the procedure for knowledge discovery from databases (e.g., [11]).

Fault Detection and Diagnostic Techniques.
The development of fault detection and diagnosis systems involves not only various knowledge domains but also a variety of methods. In the taxonomy proposed here, the group pertaining to the techniques to be used for fault detection and diagnosis is described as the second group. This group concerns the development of a fault detection system by using a suitable technique and utilizing specific knowledge. A brief explanation of each element is given below.
(1) The first element in this group is an analytical approach, because two common methods for this approach, parameter estimation and diagnostic observers, were used to develop an aluminium process detection system [12]. The approach was based on a quantitative model in a well-accepted taxonomy developed by Venkatasubramanian et al. [13] in which precise first principles or mathematical models of the process are used to model a system based on the relationship between the inputs and outputs of the process. The differences between actual system behaviour and that of the system model are then calculated and called residuals [13].
Figure 2 shows the two main stages in model-based fault detection and diagnosis [7]. Frequently used residual generation methods include diagnostic observers, Kalman filters, and parameter estimation. The residuals are then evaluated in order to identify the occurrence of faults in the process [7].
In a fault detection system for the aluminium smelting process, an extended Kalman filter was used not only to estimate the alumina concentration in different sections of an aluminium reduction cell but also to indicate an abnormal alumina distribution. A mathematical model was developed to estimate the alumina concentration. Residuals were generated from the difference between the alumina concentration expected by the system model and the actual concentration [12]. Abnormal alumina distribution was detected when the residuals were significant. However, the residuals do not only indicate abnormal events; they may also reflect other sources such as noise, disturbances, and model errors [7]. This issue of robustness may limit the effectiveness of using the Kalman filter or other model-based approaches.
(2) An expert system, which is a process history-based approach, is the second element in this group. In the process history-based approach, prior knowledge is extracted from a large amount of historical data. This feature extraction can be divided into qualitative and quantitative methods as shown in Figure 3 [14]. A popular example of a qualitative method is the expert system, where prior knowledge from experts is extracted to represent human knowledge in a particular domain. It is used in fault diagnosis to infer a conclusion about an out-of-control situation by combining the facts from a user with the knowledge from human experts represented in knowledge databases. In the aluminium smelting process, knowledge relating to diagnosis and correction of operating cells was incorporated in a number of expert systems such as those of Haldris [9], the FMFA-based expert system [10], and the CVG Venalum potline supervisory system [15]. In an aluminium electrolysis process expert system (AEPES) [16], for example, there were two subsystems; the first one incorporated more general knowledge of the aluminium reduction cell including unstable cell voltage, anode carbon quality, and higher iron impurity. The second one incorporated specific knowledge including bath temperature, metal level, and bath ratio. The use of an expert system, however, lacks statistical inference and pattern recognition [17].
(3) The third element in this group, neural networks, is also a process history-based approach. As shown in Figure 3, the quantitative method can be divided into statistical and nonstatistical. The use of artificial neural networks is a nonstatistical approach used in fault diagnosis to recognise the received pattern of data by using a nonlinear mapping between input (data patterns) and output (fault classes). This mapping consists of hidden neurons that are highly interconnected and arranged in layers [17]. In the aluminium smelting process, a backpropagation neural network was used to map spectra of cell resistance to output vectors for three cases, which were normal cell, aluminium roll, and abnormal anode [1]. In addition, a feedforward neural network was used to predict cell resistance and as a fast dynamic indicator [4]. Both systems used simulation data to train the networks. The use of neural networks, however, lacks the ability to generalise/explain behaviour [18].

Advances in Materials Science and Engineering
(4) The fourth element in this group is the use of multivariate statistical techniques, which is also a quantitative and process history-based approach. Multivariate statistical techniques such as PCA and PLS are used to extract a number of latent variables from normal operating data, which are retrieved from historical databases, in order to form an empirical model [19,20]. Thus, in the future, whenever the behaviour of the operation of the plant differs from the empirical model of the normal process, unexpected changes in the process can be detected [20]. The following are examples of the use of PCA/PLS for process monitoring in materials processing including aluminium processing: (a) a combination of PCA and linear discriminant analysis (LDA) was used for monitoring the quality of iron and steel [21]; (b) PCA was used for monitoring the quality of copper [22]; (c) multivariate image analysis was used for estimating alumina concentration on anode cover [8]; (d) estimation was carried out for aluminium reduction cell performance using PLS [23]; (e) multivariate monitoring of aluminium reduction cells was undertaken using PCA [24]; (f) multivariate online monitoring of preheating, startup, and early operation of aluminium reduction cells was investigated using PLS regression [25].
These examples show that the multivariate techniques, PCA/PLS, have been investigated for analysis of historical data and monitoring of processes in various complex process industries because of their ability to handle large volumes of highly correlated data.
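The empirical-model idea behind PCA-based monitoring can be illustrated with a minimal sketch. It assumes NumPy is available, uses synthetic data in place of historical plant data, and sets the control limits as empirical 99th percentiles of the training statistics; none of these choices is taken from the cited systems, and the function names `fit_pca_monitor`, `monitor`, and `make_normal_data` are hypothetical.

```python
import numpy as np


def fit_pca_monitor(X_normal, n_components):
    """Fit a PCA reference model on normal operating data (rows = samples)."""
    X_normal = np.asarray(X_normal, dtype=float)
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant variables
    Z = (X_normal - mean) / std
    # SVD of the autoscaled data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T
    score_var = s[:n_components] ** 2 / (Z.shape[0] - 1)
    return {"mean": mean, "std": std, "loadings": loadings, "score_var": score_var}


def monitor(model, x):
    """Return (T2, SPE) for one new observation x."""
    z = (np.asarray(x, dtype=float) - model["mean"]) / model["std"]
    scores = model["loadings"].T @ z
    # T2: distance from the centre of normal operation inside the model plane.
    t2 = float(np.sum(scores ** 2 / model["score_var"]))
    # SPE: squared distance of the observation from the model plane.
    residual = z - model["loadings"] @ scores
    spe = float(residual @ residual)
    return t2, spe


def make_normal_data(n_samples=500, seed=0):
    """Synthetic stand-in: four variables driven by one latent factor plus noise."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, 1))
    return latent @ np.ones((1, 4)) + 0.05 * rng.normal(size=(n_samples, 4))


X = make_normal_data()
model = fit_pca_monitor(X, n_components=1)
train_stats = np.array([monitor(model, row) for row in X])
# Assumption: empirical 99th-percentile limits instead of distribution-based ones.
t2_limit, spe_limit = np.percentile(train_stats, 99, axis=0)

# An observation that breaks the usual correlation between variables 0 and 1.
faulty = model["mean"] + model["std"] * np.array([3.0, -3.0, 0.0, 0.0])
t2_fault, spe_fault = monitor(model, faulty)
print("SPE alarm:", spe_fault > spe_limit)
```

In this sketch the faulty observation stays close to the centre of the model plane but violates the correlation structure, which is the kind of deviation the SPE statistic is intended to expose.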

Usage Frequency.
The third group to be considered in this taxonomy is usage frequency, which refers to how often the fault detection and diagnosis system performs its analysis of the process. A brief explanation of each element is given below.
(1) The first element in this group is continuous analysis. An online fault detection and diagnosis system monitors the process continuously by analysing continuous data from the process. The system may signal abnormal events immediately after they happen.
Examples of these systems include a backpropagation neural network developed by Shuiping et al. [1] and a feedforward neural network system developed by Meghlaoui et al. [4].
(2) Periodic analysis is the second element in this group.
In an aluminium smelting process, an offline fault detection and diagnosis system periodically analyzes data at a frequency ranging from daily to once every two days. This frequency enables the detection of abnormal events using bath chemistry and heat balance parameters. Examples of such systems include (1) process monitoring using PCA [24] and (2) an analytical model for estimating alumina concentration and abnormal events [12].

Results Presentation.
The fourth group in this taxonomy describes the three modes for presenting the detection results: text, graphics (visual), and three-dimensional (3D) visualization. The presentation of detection results to the operator can be more informative if the operator's needs are considered in terms of a clear visual indication in the screen design [26]. This view is supported by research done by Harris et al. [27] where colour and statistical graphs were incorporated in the design of a supervisory control system. The use of bold, contrasting colour in this system clearly indicates when there is an alarm so that the section leader in a smelter can act accordingly. Furthermore, potential operator colour sensitivity should be considered by choosing colour palettes that provide effective contrast for all potential colour vision levels.
Although many contributors to the reference literature pertaining to aluminium process fault detection cited in this taxonomy do not provide screen design in their approach, there are some articles that do provide or describe how the results are presented. Three major examples are given here. Firstly, in a multivariate statistical application, a 3D visualization was used to illustrate Hotelling's T² statistic with a 3D control envelope which is based on bath temperature, liquidus point, and cumulative sum of alumina feed ratios [28]. Secondly, a fault diagnosis system based on a neural network had a screen interface in which two modes of presentation were used: text (querying history report, spectrum analysis of cell resistance, and fault diagnosis for the cell state) and graphics (real-time curve and history curve of the cell signals) [1]. Thirdly, a supervision system for aluminium reduction cells based on mathematical models had an interface displaying five functions including real-time display and curve change for specified parameters [29]. The state of the cells is displayed using a text box, and the temperature, the voltage, the current, and the alumina concentration are displayed in charts. The user interface also consists of control boxes, such as combo boxes, and control buttons. These three examples show that a combination of text and graphics may be more effective for revealing monitoring results to the operator than solely using either text or graphics [29].
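The four groups of the taxonomy can also be read as a record structure for classifying existing systems. The following sketch is only an illustration: the class and field names are hypothetical, the three entries are filled in from the descriptions given above, and a field is left as `None` where the text above does not state it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Frequency(Enum):
    CONTINUOUS = "continuous"
    PERIODIC = "periodic"


@dataclass(frozen=True)
class FddSystem:
    """One system described along the four groups of the taxonomy."""
    source: str
    knowledge: Optional[str]
    technique: str
    frequency: Frequency
    presentation: Optional[Tuple[str, ...]]


SYSTEMS = [
    FddSystem("Shuiping et al. [1]", "spectrum of resistance",
              "backpropagation neural network", Frequency.CONTINUOUS,
              ("text", "graphics")),
    FddSystem("Meghlaoui et al. [4]",
              "theoretical resistance/alumina concentration curve",
              "feedforward neural network", Frequency.CONTINUOUS, None),
    FddSystem("Tessier et al. [24]", None,
              "multivariate statistics (PCA)", Frequency.PERIODIC, None),
]


def by_frequency(systems, frequency):
    """Return the sources of all systems with the given usage frequency."""
    return [s.source for s in systems if s.frequency is frequency]


print(by_frequency(SYSTEMS, Frequency.CONTINUOUS))
```

Grouping by any other field follows the same pattern, which is one way the taxonomy can be used to distinguish between existing systems.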
A fault detection and diagnosis system will now be shown and discussed in the next sections in order to demonstrate how a new system can be developed based on the proposed taxonomy.

Cascade Fault Detection and Diagnosis System
The cascade fault detection and diagnosis system [30,31] was designed to detect any faults and then diagnose faults that are related to anode effect, anode spike, block feeder, and low alumina dissolution. This system is presented as an example of how faults such as an anode effect can be detected and diagnosed with multivariate statistical techniques, as can be seen in Figure 4. The key elements of this system are discussed below by referring to Figure 4.

Fault Detection and Diagnostic Knowledge.
The first element of this system is the discovery of new knowledge based on the established relationship between pseudoresistance and alumina concentration. In addition to the extraction of knowledge from the prior research of experts and the production of a theoretical resistance/alumina concentration curve through experiment, learning to identify abnormal patterns from data is one of the practical ways by which to discover fresh knowledge relating to fault detection and diagnosis [32].
Since there is a need to develop a fault detection and diagnosis system based on the changes of cell voltage and cell resistance patterns within overfeed/underfeed cycles, ascertaining abnormal patterns within these cycles using data mining to discover new knowledge was carried out in this research [30].
In Figure 4, for example, abnormal patterns within the current cycles were detected and diagnosed to be related to the patterns of an anode effect and patterns of the previous cycles to be related to a block feeder and low alumina dissolution.

Fault Detection and Diagnostic (FDD) Techniques.
In the first element of the system, the established relationship between pseudoresistance and alumina concentration is used as a basis for discovering new knowledge. In the second element, the established relationship is used as the basis for monitoring the process with the added use of a suitable FDD technique. It is interesting to note that the established relationship between pseudoresistance and alumina concentration has become the basis for many applications, from linear to nonlinear models, for a range of purposes such as (1) the estimation of alumina concentration using the Kalman filter approach [12], (2) the prediction of anode effects using a linear time-series model and a simple nonlinear exponential rise curve [33], and (3) the prediction of feed control decision variables using neural networks [4]. The strengths and weaknesses of some of these applications were discussed by Stevens McFadden et al. [34], who suggested an application using the neural network model as a suitable approach for a predictive modelling task. As discussed above, many fault detection techniques have been employed previously. The main interest of this research, however, is a technique that is capable of early fault detection in the industrial application of the aluminium smelting process. All of the previously mentioned applications stem from analytical and knowledge-based approaches, the focus of which has mainly been on the avoidance of anode effects. Less attention has been given to the use of data-driven approaches such as PCA and PLS for observing the changes of patterns within the overfeed-underfeed cycle for the detection and diagnosis of problems. Also, many researchers have used only simulated data instead of real data. Since the aluminium smelting process is complex, having many problems that require effective process monitoring, it may be impractical to develop an accurate and explicit mathematical model of the process for this purpose. Therefore, the model-based methods, both quantitative and qualitative, have not been considered in this work.
On the other hand, there has been growing interest in using process history-based approaches for fault detection of industrial applications [14,19]. Venkatasubramanian et al. [14] listed three key reasons for this increasing interest: (1) they are easy to put into practice, (2) little modelling effort is required, and (3) little prior knowledge is needed. A number of process history-based fault diagnostic techniques have been developed for the aluminium smelting process including expert systems, neural networks, and multivariate statistical techniques (PCA/PLS). Firstly, a number of expert systems were developed for fault diagnosis [9,10]. Due to the complexity of the aluminium smelting process, the cause of an abnormal operating pattern is often difficult to diagnose. Process engineers may interpret the abnormal pattern themselves before or while using an expert system. A computerized system that is capable of solving the persistent problem of diagnosing abnormal patterns for multiple aluminium reduction cells is needed. Furthermore, expert systems require considerable effort in order to build a knowledge-based diagnosis system for a complex and large process. An existing solution for this problem has been based on neural networks [1]. However, this requires comprehensive and excessive amounts of data, causing Shuiping et al. [1] to use simulation data instead of real data in their study. The use of PCA and PLS is a viable option because only moderate amounts of historical data are needed. Based on this, an application of PCA was developed by Tessier et al. [24] for monitoring the aluminium electrolysis process. However, in order for a monitoring system to be rendered effective, consideration needs to be given to dynamic cell behaviour.
Therefore, the objective of this system is to incorporate the dynamic behaviour of the two important events of anode changing and alumina feeding during the aluminium smelting process, for effective and timely fault detection and diagnosis. This can be done by using a new multivariate statistical framework using PCA and PLS. In this system, PCA has been chosen for the development of a fault detection system and a combination of PCA and PLS has been chosen for the development of a system for fault diagnosis. This is because these multivariate statistical techniques can address some of the problems arising in the detection and diagnosis of faults in the aluminium smelting process.
(1) Firstly, PCA or PLS can handle a substantial quantity of data which is both correlated and noisy.
(2) Secondly, both PCA and PLS use a noncausal model so that the lack of a causal model in the aluminium smelting process is not an issue. A causal model needs a first principles model.
(3) Thirdly, multiway PCA (MPCA) and multiway PLS (MPLS), extensions of PCA and PLS, respectively, are able to handle any nonlinear behaviour during the process of alumina feeding.
(4) Finally, PCA and PLS are effective in practice for the monitoring of the aluminium smelting process since the reference models have been mainly built from process data [35].
Principally, the use of multivariate statistical techniques such as PCA and PLS needs to be investigated not only for the prediction of anode effects, but also for the diagnosis of problems that cause anode effects and for the early detection of anode spikes. This advanced monitoring of aluminium processing can lead to a reduction in energy consumption and emission of PFCs. Abnormal patterns within the alumina feeding cycles were analysed using MPCA and MPLS. As shown in Figure 4, the monitoring charts used in the system were based on MPCA. These charts are Hotelling's T² chart and the SPE chart. The abnormal events detected by these charts were then diagnosed using MPLS in order to classify patterns related to these abnormal events [31].
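The multiway extension can be pictured as a data-arrangement step that precedes ordinary PCA or PLS. The sketch below assumes NumPy and a common convention in which each feeding cycle is a block of equally long samples that is unfolded into one row; the layout used in [30,31] is not specified here, so the array shape and the function names `unfold_cycles` and `autoscale` are assumptions.

```python
import numpy as np


def unfold_cycles(cycles):
    """Unfold (n_cycles, n_samples, n_vars) into (n_cycles, n_samples * n_vars).

    Each row then holds one whole cycle: all variables at sample 0,
    followed by all variables at sample 1, and so on.
    """
    cycles = np.asarray(cycles, dtype=float)
    n_cycles, n_samples, n_vars = cycles.shape
    return cycles.reshape(n_cycles, n_samples * n_vars)


def autoscale(X):
    """Centre and scale each column, removing the average cycle trajectory."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - mean) / std, mean, std


# Toy example: 2 cycles, 3 samples per cycle, 2 variables per sample.
toy = np.arange(12).reshape(2, 3, 2)
unfolded = unfold_cycles(toy)
scaled, col_mean, col_std = autoscale(unfolded)
print(unfolded.shape)
```

Because the column means taken over cycles are subtracted, what remains for the PCA or PLS step is the deviation of each cycle from the average trajectory, which is how a multiway model can accommodate a nonlinear average profile.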

Usage Frequency.
The continuous monitoring of changes of variability patterns within the overfeed/underfeed cycles is preferred in this research for early fault detection and diagnosis [30]. As shown in Figure 4, five-minute data were used for monitoring the process. The monitoring charts detected and diagnosed an anode effect 25 minutes before it occurred in the real operation. This shows an early detection and diagnosis of an anode effect.

Results Presentation.
Charts that can show changes of pattern against acceptable limits for operations are one of the important elements in monitoring. Information about the current process and the results of the diagnosis, provided in textual form, were put together with the charts. In this research [30,31], a mixture of text and graphics incorporating suitable colour (red and green) and user control boxes, such as a combo box for selecting cells, was used instead of only one mode in order to demonstrate abnormal events clearly. In Figure 4, for example, the operator's screen indicated this situation by a change in the colour of the button for cell 2004 from green to red, the status of the process from "IN CONTROL" to "OUT OF CONTROL," and the status of the anode effect detection from "NO" to "YES." A clear indication of abnormal events as shown in this example can help process engineers and operators to respond in a timely manner to problems that occur in the process.
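The screen logic described for Figure 4 can be sketched as a small mapping from monitoring results to display fields. The rule that a cell is out of control when either chart exceeds its limit is an assumption made for illustration, as are the numerical values and the names `CellDisplay` and `build_display`; only the colours and status strings are taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellDisplay:
    """Fields shown on the operator's screen for one cell."""
    cell_id: str
    button_colour: str
    process_status: str
    anode_effect: str


def build_display(cell_id, t2, spe, t2_limit, spe_limit, anode_effect_diagnosed):
    # Assumed rule: the cell is out of control if either chart exceeds its limit.
    out_of_control = t2 > t2_limit or spe > spe_limit
    return CellDisplay(
        cell_id=cell_id,
        button_colour="red" if out_of_control else "green",
        process_status="OUT OF CONTROL" if out_of_control else "IN CONTROL",
        anode_effect="YES" if out_of_control and anode_effect_diagnosed else "NO",
    )


# Made-up chart values and limits for cell 2004.
normal = build_display("2004", t2=2.1, spe=0.8, t2_limit=9.0, spe_limit=3.0,
                       anode_effect_diagnosed=False)
alarm = build_display("2004", t2=14.5, spe=0.9, t2_limit=9.0, spe_limit=3.0,
                      anode_effect_diagnosed=True)
print(normal)
print(alarm)
```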

The Need for Augmented Reality (AR).
Augmented reality is a viable option for improving the results presentation. Results from the system were mostly based on computer-generated information such as text, graphics, charts, and tables. Operators in the smelters will take actions based on this information. Integrating this digital information with a real situation might help further in fault diagnosis. In fact, this is the basis of augmented reality, which has been defined in a broad sense as augmenting natural feedback to the operator with simulated cues [36]. The main reason for using AR is its capability of augmenting a user's view of an industrial plant, so that it permits a situation-oriented action in real working environments. The integration of augmented reality within an aluminium process fault detection and diagnosis system is a potential new strategy for improving decision making in fault diagnosis.

The New Strategy for Aluminium Process Fault Detection and Diagnosis
A new strategy for fault detection and diagnosis was proposed to incorporate the AR technique. This adds a new element in terms of results presentation as shown in Figure 5. AR was selected because it is a novel human-computer interaction tool that overlays computer-generated information on a real-world environment [37]. This technique has been applied in industry, for example, in the Boeing wire harnessing project [38], car engine maintenance [39], and an intelligent welding gun [40]. These works have shown the potential of AR to be combined with human abilities to offer efficient and complementary tools to assist manufacturing tasks [37]. This motivates this research to propose a new strategy for improving fault detection and fault diagnosis in the aluminium smelter.
4.1. Procedure for the New Strategy. In this new strategy, there are five steps in incorporating AR in the fault detection and diagnosis system: requirement, design, development, implementation, and evaluation. These steps are described in the sections that follow.
4.1.1. Requirement. In the first step, the requirements for AR technology for a specific task in manufacturing are identified. This identification is based on the need for error-free job execution, reduced cognitive load, ease of learning a task [37], and assisted decision making. When the specific task has been identified, the current industry situation needs to be studied in order to support the task, in terms of its end-users, level of expertise, and current environment [41].
One of the tasks of operators in an advanced supervisory control and management system (named integrated potline control and improvement, referred to as IPC-Im hereafter) is root cause diagnosis [42]. This task can be combined with AR (as illustrated in Figure 6) in order to offer efficient information presentation and to assist the developer of the system in providing an improved interaction between the human and the system.

Design.
A number of the main elements, which were identified from the AR-assisted maintenance system by Nee et al. [37], can be considered in this design phase. These elements are as follows.
(1) Display Device. What device is used for visual output? Examples include head-mounted displays (HMD) (e.g., [43,44]), handheld devices (HHD) (e.g., [45,46]), and projectors (e.g., [47]). Since handheld devices, such as mobile phones, can be used as a tool to view this information overlay, mobile AR has gained increased attention from academia and industry, due to the portability of mobile phones and the ubiquitous nature of camera phones [37]. Therefore, the mobile phone is one of the viable options as a display device in this new strategy.
(2) Tracking Technologies. What technology is used for tracking the camera's position, in order to register virtual objects? Examples include vision-based tracking (marker) (e.g., [48]), sensor-based tracking (e.g., [49]), or a hybrid (i.e., vision- and sensor-based tracking) (e.g., [50]). In this new strategy, vision-based tracking (marker) has been used to simulate how information of a cell can be superimposed on a live view of the cell. An example of such markers is shown in Figure 7, where the camera first locates the marker (which in this case is an image of the aluminium reduction cell's number, 2053). When the marker is recognized, a superimposed image (shown as information of a cell, e.g., temperature, excess AlF3, liquidus temperature, and voltage) will appear on the screen, in order to mix the virtual world with the real world that is being viewed. This innovative technology offers a solution for assisting in monitoring a complex process industry, such as the aluminium smelting industry.
(5) User Collaboration. How does an AR-based application provide collaboration among users? Examples include using a microphone and a remote laser pointer (e.g., [52]). If abnormalities can be diagnosed using the proposed mobile AR-based approach, a real-time fault diagnosis system could be developed as an advanced tool to diagnose problems in an aluminium smelter as shown in Figure 8. In this application, all the processing work and file saving can be done in the cloud, in view of the limited processing capability of the mobile phone [37].
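The marker-based step described under Tracking Technologies can be sketched as a lookup that turns a recognized cell number into the lines of text to be superimposed. The sketch deliberately leaves out the camera and image recognition stage and starts from the decoded marker text; the data store `CELL_DATA`, its field names, the function `build_overlay`, and all numerical values are hypothetical and serve only as an illustration.

```python
# Hypothetical latest readings keyed by cell number; the values are made up.
CELL_DATA = {
    "2053": {
        "temperature_c": 960.0,
        "excess_alf3_pct": 10.5,
        "liquidus_c": 950.0,
        "voltage_v": 4.2,
    },
}


def build_overlay(marker_text, cell_data):
    """Return the text lines to superimpose for a recognized marker.

    Returns None when the marker does not correspond to a known cell,
    in which case nothing should be drawn on the live view.
    """
    record = cell_data.get(marker_text)
    if record is None:
        return None
    return [
        f"Cell {marker_text}",
        f"Temperature: {record['temperature_c']:.0f} C",
        f"Excess AlF3: {record['excess_alf3_pct']:.1f} %",
        f"Liquidus temperature: {record['liquidus_c']:.0f} C",
        f"Voltage: {record['voltage_v']:.2f} V",
    ]


overlay = build_overlay("2053", CELL_DATA)
for line in overlay:
    print(line)
```

In a cloud-based arrangement such as the one mentioned above, the lookup would presumably run on the server side, with the phone sending the decoded marker and receiving the overlay lines.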
The mobile AR module should provide sufficient information for the process operators to diagnose operating problems. Six functions that need to be considered are the plant information system, linking of documents, machine history, interactive troubleshooting, error tracking and feedback, interactive video, and a virtual laser pointer. The potential view of the mobile AR module for a process operator in an aluminium smelting plant is illustrated in Figure 9 where there are four main functions: (1) buttons for interactive manipulation,
In the fourth step, implementation, possible problems in setting up the application should first be identified [41] before the application is implemented. A clear action plan should be developed in order to assist end-users or workers in using the new technology. The fifth step is evaluation, where user satisfaction is evaluated and the benefits of the application are identified. These five steps (requirement, design, development, implementation, and evaluation) can be used as guidance in developing an AR application for any manufacturing plant, such as the aluminium smelting plant. In addition, the AR module can also be added to corrective action guidelines because AR can be used to highlight dangerous areas in a plant. A virtual fire in the plant, for example, might help an operator to gain an in-depth understanding of operating procedures when an abnormal situation occurs. Therefore, operator behaviour in normal and abnormal situations can be tested in order to improve operating procedures.

Conclusions
Developing a fault detection and diagnosis system for the aluminium smelting process is a major challenge. Such a system should be able to accurately indicate abnormal situations even though the process is complex and dynamic. In this paper, a taxonomy was proposed and described with examples of existing systems. The taxonomy clearly highlights the key elements of a fault detection and diagnosis system, which cover utilization of knowledge, FDD techniques, usage frequency, and results presentation. The taxonomy has many uses including the following: (1) to identify the key elements that distinguish between existing systems, (2) to identify areas of improvement for the existing systems, and (3) to provide an overview of the systems in which various techniques have been applied to detect and diagnose faults.
This taxonomy has helped in the development of this work by identifying the gap in existing fault detection and diagnosis systems and realizing a new approach to developing a new system that is practical, provides timely detection and diagnosis, and is easy for operators to understand. In the future, the use of AR technology can enhance the competence of the diagnostic module to diagnose problems in a more practical manner. AR can provide an interactive environment where operators and remote experts can communicate using the same field of vision. Since AR can be used to augment a user's view of an industrial plant, it provides alternative solutions for design, quality control, monitoring and control, service, and maintenance in complex process industries, such as the aluminium smelting industry.

Figure 4: Operator screen shows an indication of an anode effect and its possible causes: a block feeder and low alumina dissolution.

Figure 5: The addition of new elements, augmented reality, in the detection mode of results.

Figure 7: Augmented view of cell 2053 with superimposed information for four process variables.

Figure 9: Augmented view of the real world with superimposed information for four main functions.