Research



Introduction
Energy is a crucial element for the subsistence of our modern civilization. Almost all human activities require energy. This requirement increases year after year, especially due to the growing population, which is estimated to reach about 10 billion people by the middle of this century [1]. Nowadays, fossil fuels are the main source of energy because of their relatively low cost of production and high energy content. However, they are not a long-term option. Alternatives such as renewable energies are increasing their share in modern life, but the current technology of renewable sources is still not able to supply all the energy needed [2,3].
In contrast, nuclear sources can provide great quantities of energy. Although fusion energy is still under development, its potential is enormous, even compared with nuclear fission. Nuclear fusion is the process by which two or more atomic nuclei join together to form a single heavier nucleus. This is usually accompanied by the release of large quantities of energy. Fusion is the process that powers active stars, the hydrogen bomb, and some experimental devices. Nuclear fusion could be cheaper, cleaner, and safer. Fusion power would provide much more energy than any other technology currently in use, and the fuel required for fusion, mainly deuterium, exists abundantly in the oceans. Fusion could, in theory, supply all the energy needs of the world for millions of years [4,5].
Achieving full control of the energy generated by nuclear fusion devices involves analyzing huge databases with thousands of signals, which is impossible to do manually.
This amount of data implies performing the analysis (e.g., finding significant or regular patterns) in high-dimensional spaces, so it is essential to automate the process using machine learning [6][7][8][9]. To this end, several algorithms for automatic pattern recognition can be found in the literature. In the context of the pattern classification problem, the most popular algorithms are Support Vector Machines and Neural Networks; both have shown high performance in previous applications in fusion [10][11][12], but with an important drawback: these algorithms produce black-box models, from which it is not possible to obtain explicitly a simple mathematical relationship that outputs the classification. The aim of this article is to present a new approach that combines online pattern (waveform) classification with internal information about the decision model (i.e., interpretability). In the literature, there are several examples of using black-box models to automate the classification problem. The main reason for this is the high success rates reported in different fields such as nuclear fusion [12][13][14]. However, a black-box model does not give any hint about the reason for the classification; for example, we are not able to know which input variable is the most important in the decision.
In this sense, ensemble methods provide a good balance between success rate and internal information about the model [15,16]. In particular, the AdaBoost algorithm yields an explicit set of simple rules that outputs the class of the signal from the input data [17]. This property adds interpretability to the models, which could be useful to understand the reason for the classification and, ultimately, to improve knowledge of the underlying physical phenomenon.
This fact will allow a much more precise adjustment of the obtained models. The main contributions of this article are as follows: (1) the waveform classification using ensemble methods generates rule-based models (if-then rules) that are not black boxes and could be useful to understand the entire process of the plasma discharges in nuclear fusion devices and (2) the waveform classification system works online, which implies that we do not need to wait until the discharge finishes to obtain the class from the input data. The rest of the paper is structured as follows: Section 2 introduces some basic aspects of nuclear fusion energy, the AdaBoost algorithm, and the signals used. Section 3 presents the offline and online approaches to classify the signals. A brief analysis of the models and features obtained is also presented. Finally, Section 4 summarizes the main conclusions.

Nuclear Fusion Energy.
In order to reproduce fusion power on Earth, several fusion reactions can be used. One of the most important is the deuterium-tritium cycle [18], which releases 17.58 MeV as follows:

$$ {}^{2}_{1}\mathrm{D} + {}^{3}_{1}\mathrm{T} \rightarrow {}^{4}_{2}\mathrm{He} + {}^{1}_{0}\mathrm{n} + 17.58\ \mathrm{MeV} \qquad (1) $$

In a fusion device, the reaction is produced at very high temperatures, about 150 million degrees Celsius. At this temperature, the matter inside fusion devices exists as plasma, which is a state of matter similar to gas with a portion of its particles ionized. Magnetic fields are used to confine the plasma in the shape of a torus. The most common configurations for magnetic confinement of plasma are stellarators and tokamaks. Figure 1 shows a simple and general scheme of the process of generating electrical energy from nuclear fusion. The reactor uses deuterium (D) and tritium (T) to produce the reaction. Water is heated by the energy of the reaction and feeds a turbine generator that produces the electricity.
The International Thermonuclear Experimental Reactor (ITER) is an international nuclear fusion research and engineering project, which is currently building the world's largest and most advanced experimental tokamak nuclear fusion reactor at Cadarache (France) [18]. ITER is expected to demonstrate that more energy is obtained than is used to initiate the fusion process, something that has not been achieved by any experimental fusion reactor. After ITER, the first commercial demonstration fusion power plant, named DEMO [19], is planned. Currently, there are many experimental fusion devices in operation. The Joint European Torus (JET) [20] is an experimental tokamak reactor located in Oxfordshire (UK). It is currently the largest facility of its kind in operation. TJ-II [21] is a medium-size stellarator located at CIEMAT in Madrid (Spain). DIII-D is another tokamak machine developed by General Atomics in San Diego (USA) [22].
Experiments on fusion reactors are carried out by producing discharges, or shots, in which plasma exists inside the torus. The duration of a shot is normally tens of seconds. ITER would sustain a shot for about 30 minutes. During the discharges, many diagnostics around the reactor acquire data at high sampling frequencies. About 10 GBytes per discharge can be acquired in JET [20] (ITER could store 1 TByte per shot). Bolometry, density, temperature, and soft X-rays are just some examples of the thousands of data sets acquired during a discharge. Huge databases, with an enormous amount of data, are a common situation in experimental fusion reactors.
However, nowadays only 10% of the generated data is processed, while the rest is not processed at all. Therefore, in order to achieve fusion energy as a clean, inexhaustible, safe, and cheap energy source, the current databases of experimental devices (tokamaks and stellarators) should be analyzed completely. A complete analysis will support optimal operation planning of ITER and, in turn, will be fundamental for a successful design of DEMO. For that reason, this project proposes the use of advanced pattern recognition and machine learning techniques in order to analyze massive fusion databases in a faster and more efficient way.
In this paper, the AdaBoost algorithm is used to build a rule-based model to classify five different waveform classes of the TJ-II stellarator. The plasmas in TJ-II are produced and heated with ECRH (2 gyrotrons, 300 kW each, 53.2 GHz, 2nd harmonic, X-mode polarization) and NBI (300 kW) [21]. Figure 2 shows a view of the TJ-II device.

AdaBoost Algorithm.
The Adaptive Boosting algorithm (AdaBoost) was proposed by Yoav Freund and Robert Schapire [17,24]. AdaBoost is a general method to obtain a strong classifier (in our case a rule-based model) from a set of T weak classifiers (also called hypotheses, or rules in our case). This algorithm takes as input a set S = {(F_i, y_i)}, i = 1, ..., m, where F_i is the feature vector of the i-th signal to be classified, y_i is the class label of that signal (+1 or −1), and m is the total number of signals. Note that h_t is a rule, and T represents the number of rules that compose the rule-based model. Algorithm 1 is the pseudocode that shows the implementation of this algorithm. The basic idea of boosting is to select the best weak (and simple) classifier at each iteration. The selected hypothesis is weighted according to its capacity to classify the training set correctly. Samples that were not correctly classified are also given more weight in order to look for a suitable hypothesis for them in the next iteration. AdaBoost uses the exponential loss as its error criterion. The final model corresponds to a weighted sum of the selected weak hypotheses. The most important point is that the resulting model is based on if-else rules, which means that the model is not a black box. This represents an advantage compared to other classification algorithms [24].
In this way, AdaBoost can be used in a straightforward manner with signals. For example, a simple rule could be to predict a class if the average of the last 30 milliseconds is greater than a threshold. Thus, we can use if-then sentences such as if (avg(signal) > threshold) then +1, else −1 as a weak rule h_t (as in lines 9 and 10 of Algorithm 1). The output of the AdaBoost classifier is finally the sign of the weighted sum of the T rules (line 18 of Algorithm 1), as in equation (2). Note that α_t corresponds to the importance or weight of each rule:

$$ H(F) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t \, h_t(F) \right) \qquad (2) $$

The algorithm can be easily extended to a multiclass problem (more than two classes) using the one-versus-the-rest approach, which implies building a model to classify the waveforms of a particular class (+1) versus the waveforms that belong to any other class (−1). This process is repeated in order to build one classifier for each class. A detailed explanation of the fundamental concepts of AdaBoost can be found in [24]. In [11,25], there are good descriptions of implementing classifiers for two or more classes in nuclear fusion databases combined with other algorithms (autoencoder and wavelet). Figure 3 shows an illustrative example of the AdaBoost algorithm from [26], which is a previous work of the authors.
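The training loop and the weighted vote described above can be illustrated with a minimal sketch. It assumes the standard discrete AdaBoost formulation with single-feature threshold rules ("stumps") found by exhaustive search; the function names, the rule encoding as a (feature, threshold, polarity) tuple, and the search strategy are illustrative choices, not taken from the authors' implementation.

```python
import math

def stump_predict(rule, F):
    """Apply one if-then rule: if F[feature] > threshold then polarity else -polarity."""
    feature, threshold, polarity = rule
    return polarity if F[feature] > threshold else -polarity

def best_stump(X, y, D):
    """Weak learner (assumed): exhaustive search for the rule with lowest weighted error."""
    best_rule, best_err = None, float("inf")
    for feature in range(len(X[0])):
        values = sorted({F[feature] for F in X})
        # Candidate thresholds: one below the minimum, then midpoints between values.
        thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for threshold in thresholds:
            for polarity in (+1, -1):
                rule = (feature, threshold, polarity)
                err = sum(d for F, label, d in zip(X, y, D)
                          if stump_predict(rule, F) != label)
                if err < best_err:
                    best_rule, best_err = rule, err
    return best_rule, best_err

def adaboost_train(X, y, T):
    """Return the rule-based model as a list of (alpha_t, rule_t) pairs."""
    m = len(X)
    D = [1.0 / m] * m                          # D_1(i) = 1/m
    model = []
    for _ in range(T):
        rule, err = best_stump(X, y, D)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, rule))
        # Misclassified samples gain weight, correctly classified ones lose weight.
        D = [d * math.exp(-alpha * label * stump_predict(rule, F))
             for F, label, d in zip(X, y, D)]
        Z = sum(D)                             # normalization factor Z_t
        D = [d / Z for d in D]
    return model

def adaboost_predict(model, F):
    """Equation (2): sign of the alpha-weighted sum of the T rules."""
    score = sum(alpha * stump_predict(rule, F) for alpha, rule in model)
    return 1 if score >= 0 else -1
```

For a multiclass problem, the same functions could be called once per class with the labels recoded as +1 for that class and −1 for all the others.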
There are seven samples of two classes (red circle and blue cross) in the upper image. Let us assume that a new sample located at (3.5; 3.0) has to be classified in one of the two classes. We can use the seven samples to build (train) a supervised data-driven model to predict the class of the new sample by using AdaBoost. The feature vectors are the respective Cartesian coordinates x1 and x2. After some iterations, the new sample is classified as a cross. The image

TJ-II Waveforms.
The data generated by the experiments of the TJ-II device is stored in a relational database called TJ2RDB. This database allows searches to find shots with particular properties and to do scaling studies. More general information about this database can be found in [27,28].
To illustrate the complexity of the data used for this work, we show the waveform of one of the signals involved in this research. Figure 4 shows the signal ECE7 for 200 shots. As can be seen, for the same signal, the shape of the waveform differs greatly from one shot to another. This implies that the classification of these kinds of signals can be a difficult task if it is carried out manually.
In order to evaluate and reveal the benefits of this approach, we have implemented a proof of concept using an illustrative example of the online classification of five different waveforms. This explanatory classification problem has been selected because the proposed approach can be easily compared with previous works, where black-box algorithms have been implemented. Figure 5 shows the temporal evolution of the 5 waveforms used to test the proposed approach in this work. From top to bottom and from left to right, the waveforms (classes) are ECE7, GR, GR2, HALFAC3, and IACCEL1. Table 1 presents a brief description of the selected TJ-II signals. Note that the selection of other signals might provide different results from those presented here, but the approach is general enough to obtain a classification with similar success rates.

Algorithm 1: Pseudocode of the AdaBoost algorithm.
(1) Input: S = {(F_i, y_i)}, ∀i = 1 ... m
(2) # S: training set
(3) # D_1: initial weight distribution
(4) # m: size of the training set
(5) # y_i ∈ {−1, +1}
(6) D_1(i) = 1/m, ∀i = 1 ... m
(7) # T: number of rules that compose the rule-based model
(8) for t := 1 to T do
(9)    # Get weak hypothesis h_t
       ...
       # Z_t: normalization factor, so that D_{t+1} is a distribution
       ...
Finally, note that a supervised training scheme requires a previously labelled data set. Since in this context each signal is acquired by a separate sensor system, all the labels are known when the data is stored. In a different context, the labelling process could require the assistance of many specialists to obtain such data sets.

Waveforms Classification
The waveform classification is developed using AdaBoost with two approaches: (1) offline and (2) online. In the offline approach, the obtained model uses the entire signal to perform the classification, which means that the classification is done after the discharge has finished. For the online approach, on the other hand, a sensitivity analysis was also performed to select a reduced set of features in order to classify the waveform before it actually finishes, which could be very interesting for real-time applications. For comparison purposes, the rule-based model has been tested with signals used in previous works.

Offline Classification.
For the offline approach, the AdaBoost algorithm has been implemented to classify the TJ-II waveforms using all the samples of the discharge. In this case, 340 waveforms have been used in total (68 waveforms for each class). Each entire waveform is resampled to 1024 samples to form the feature vector (F) in order to feed the AdaBoost algorithm. Finally, AdaBoost outputs a rule-based model (AdaBoost model) that allows classifying a new waveform. Figure 6 shows the block diagram of the implemented stages. Table 2 shows three rules (h_t) and their associated weights (α_t) of the obtained rule-based model to classify GR signals (class 2). Note that the features F_578 (magnitude of the GR signal at sample 578), F_842 (sample 842), and F_1024 (sample 1024) are used to perform the classification.
In the case of class 3 (GR2 signal), the classification can be performed by using only the following rule: if (F_1 < −1.151) then +1 else −1, which implies that using only the first feature of a discharge (F_1), the approach is able to classify GR2 signals successfully.
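The offline steps described above (resampling a waveform to 1024 samples and applying threshold rules to individual samples) can be sketched as follows. The text does not state how the resampling is done, so linear interpolation is an assumption here; the rule encoding as (alpha, feature index, operator, threshold) tuples and the weight of 1.0 given to the GR2 rule are also placeholders for illustration. Only the GR2 rule itself is quoted from the text.

```python
def resample(signal, n_out=1024):
    """Linearly interpolate `signal` onto n_out equally spaced points (assumed method)."""
    n_in = len(signal)
    if n_in < 2 or n_out < 2:
        raise ValueError("need at least two input and two output samples")
    step = (n_in - 1) / (n_out - 1)
    out = []
    for k in range(n_out):
        pos = k * step
        i = min(int(pos), n_in - 2)        # index of the left neighbour
        frac = pos - i
        out.append(signal[i] * (1 - frac) + signal[i + 1] * frac)
    return out

def apply_rule(F, index, op, threshold):
    """Evaluate one if-then rule on feature F_index (1-based, as in the text)."""
    value = F[index - 1]
    fired = value < threshold if op == "<" else value > threshold
    return 1 if fired else -1

def classify(F, rules):
    """Sign of the alpha-weighted vote of all rules in the model."""
    score = sum(alpha * apply_rule(F, index, op, thr)
                for alpha, index, op, thr in rules)
    return 1 if score >= 0 else -1

def describe(rules):
    """Render the model as readable if-then sentences, one per rule."""
    return [f"if (F_{index} {op} {thr}) then +1 else -1 [weight {alpha:.3f}]"
            for alpha, index, op, thr in rules]

# Single-rule model for class 3 (GR2) quoted in the text; alpha = 1.0 is a placeholder.
gr2_model = [(1.0, 1, "<", -1.151)]
```

Because every rule names the sample it tests, printing the model with `describe` is enough to see which parts of the waveform drive the decision.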
In order to evaluate the model, we split the data into two subsets (cross-validation). The training stage was carried out with 60% of the data set (205 waveforms, i.e., 41 for each of the 5 classes), while the remaining 40% of the data set was used for the test stage (135 waveforms, i.e., 27 for each class). Table 3 shows the results of the offline classification of the 5 types of signals. As can be seen, the results are encouraging. All the success rates are above 93%. The average success rate of the ensemble model reaches 98%, improving the results of previous works. In [25], using a Wavelet Transform with Support Vector Machines (WT + SVM), the results were up to 92%. More recently, in [11], using a Stacked Autoencoder (a type of Neural Network) in combination with Support Vector Machines (NN + SVM), the results were up to 94%.
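As an illustration of the evaluation procedure, the sketch below performs a 60/40 split and computes a success rate. It assumes that the split is made separately within each class (stratified), which the text does not state explicitly; under that assumption, 68 waveforms per class give 41 for training and 27 for testing, consistent with the 27 tested discharges per class reported for the online case. The function names and the fixed random seed are illustrative.

```python
import random

def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split sample indices into train/test sets, class by class (assumed scheme)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)
        n_train = round(train_fraction * len(idxs))
        train += idxs[:n_train]
        test += idxs[n_train:]
    return train, test

def success_rate(y_true, y_pred):
    """Percentage of test waveforms whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)

# Five classes with 68 waveforms each, as in the offline experiment.
labels = [c for c in range(1, 6) for _ in range(68)]
train_idx, test_idx = stratified_split(labels)
```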
One interesting aspect of the proposed approach is the ability to see the importance of each feature in the classification. Figure 7 shows the features (samples) selected by the algorithm to classify the ECE7 signals. The blue line represents the shot and the red circles represent the features used to classify this signal. The size of the circles is proportional to the importance of the rule (α_t) that uses the feature. As can be seen, the most important values are located before sample 200, which implies that some signals could be classified at the beginning of the discharge; that is, an online classification could be performed.
Based on the previous results, the idea of online classification came up. In this way, it is not necessary to wait until the end of the discharge in order to perform the classification. The next section presents the online classification algorithm.

Online Classification.
This approach starts the classification at the very beginning of the discharge. First, the signal is preprocessed by grouping 10 consecutive samples and taking only one representative sample for each group. In this way, the signal is reduced by a factor of 10. Then, the feature extraction stage is applied to obtain some specific characteristics of the signal that help in the classification. Figure 8 shows the block diagram of the online approach. Figure 9 shows an explanatory diagram of the algorithm. The red solid line represents the signal that is being analyzed. The blue dashed rectangle represents the sliding window, which contains the segment of the signal analyzed at the current iteration. Then, the four features of this window are obtained: average value (F_1), minimum value (F_2), maximum value (F_3), and, finally, slope value (F_4), which is calculated by performing a least-squares fit. After that, the AdaBoost model classifies the signal into one of the five classes. When the AdaBoost model for a class outputs three consecutive positive values (+1), the signal is assigned to that class. In this example, the signal is classified as HALFAC3 (class 4), as represented by the red dashed rectangle.
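The online procedure can be sketched as follows. Several details are assumptions, since the text does not specify them: the representative sample is taken to be the first of every 10, the window length (20 reduced samples) is arbitrary, and each per-class classifier is passed in as a plain function that maps the four features to +1 or −1. In practice those functions would be the trained AdaBoost models.

```python
def decimate(signal, factor=10):
    """Keep one representative sample (assumed: the first) of every `factor` samples."""
    return signal[::factor]

def window_features(window):
    """Return (average, minimum, maximum, least-squares slope) of one window."""
    n = len(window)
    mean_y = sum(window) / n
    mean_x = (n - 1) / 2
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in enumerate(window))
    slope = sxy / sxx if sxx else 0.0      # slope of the least-squares line
    return mean_y, min(window), max(window), slope

def classify_online(signal, classifiers, window_size=20, needed=3):
    """Slide a window over the reduced signal and stop at the first class whose
    classifier outputs +1 for `needed` consecutive windows.

    classifiers: dict mapping class name -> function(features) returning +1 or -1.
    Returns (class_name, reduced_samples_used), or (None, total) if undecided.
    """
    samples = decimate(signal)
    streak = {name: 0 for name in classifiers}
    for end in range(window_size, len(samples) + 1):
        feats = window_features(samples[end - window_size:end])
        for name, clf in classifiers.items():
            streak[name] = streak[name] + 1 if clf(feats) == 1 else 0
            if streak[name] >= needed:
                return name, end
    return None, len(samples)
```

Dividing the returned number of samples by the length of the reduced signal gives the fraction of the discharge that was needed to reach a decision.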
Similar to the offline case, we can easily obtain the attributes used by the algorithm to perform the classification. Table 4 shows the features used for each class. Note that classes 1, 2, and 5 use only three features. Table 5 presents the confusion matrix that shows the results of the online classification of the five classes. Rows represent the class that is being classified and columns represent the predictions of the classification for the actual signal. As can be seen, almost all of the 27 tested discharges for each class are correctly classified. As a result, the average success rate is over 99%. Table 6 shows the time fraction of the discharge required to perform the online classification of the five classes for 27 randomly selected shots. The second column is the average time fraction (in percentage) over all shots needed to carry out the classification. The third column is the standard deviation of the time fraction (in percentage) needed to make the classification.
The fourth and fifth columns are the minimum and maximum values. As can be seen, the class that takes the longest to classify is HALFAC3 (0.23% of the signal on average), which is a good result because this value is still small. The standard deviation is also small, which means that all the signals are classified using a small initial segment of each signal. The minimum value indicates that, for all classes, the algorithm never classifies a signal before 0.08% of the discharge time. The maximum value indicates that the algorithm can classify all the signals before 1.21% of the entire signal has elapsed, which is a very good result.
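The quantities reported in Tables 5 and 6 can be computed along the following lines. This is a sketch with hypothetical function names; in particular, whether the reported standard deviation is the population or the sample version is not stated, and the population version is assumed here.

```python
import statistics

def confusion_matrix(y_true, y_pred, classes):
    """Rows: class being classified; columns: predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def time_fraction_stats(samples_used, total_samples):
    """Mean, standard deviation, minimum and maximum of the percentage of each
    discharge that was needed before the class was decided."""
    fractions = [100.0 * u / n for u, n in zip(samples_used, total_samples)]
    return {
        "mean": statistics.mean(fractions),
        "std": statistics.pstdev(fractions),   # population std (assumption)
        "min": min(fractions),
        "max": max(fractions),
    }
```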
The experiments were carried out on a PC with an Intel Core i7-8750H at 2.2 GHz, 16 GB of RAM, and the Ubuntu 18.04.1 LTS operating system. For this configuration, the classification process time of each sliding window is less than 10 milliseconds (about 1 ms for feature extraction and less than 9 ms for the classification itself). Considering the nature of the rule-based model, this time could clearly be reduced by using embedded hardware such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).

Conclusions
This article proposes two approaches to classify five TJ-II waveforms using the ensemble algorithm AdaBoost. The first method is carried out offline, and the signals are resampled to obtain distinctive attributes in the feature extraction stage. These features are translated into AdaBoost rules to classify the signals. With this method, the classification achieves high success rates and the classifiers are built with explicit relationships between features and rules of the AdaBoost algorithm, which allows designers to better understand the underlying physical phenomenon. The second approach performs the classification online. First, the signal is preprocessed in consecutive sliding windows. Then, the feature extraction stage is performed to obtain the average, the minimum, the maximum, and the slope of the signal. These features are translated into rules of the AdaBoost algorithm, which is capable of classifying the signals. The main advantage of this approach is that we do not need to wait until the discharge has finished in order to classify, which means that the classification can be performed online. Almost all of the 27 tested discharges for each class are correctly classified. The average success rate is over 99%. The results show that the online classification can be performed by using only a very small fraction of the discharge.

Data Availability
The data used to support the findings of this study have not been made available.

Conflicts of Interest
The authors declare that they have no conflicts of interest.