Automated Flare Prediction Using Extreme Learning Machine

Extreme learning machine (ELM) is a fast learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). Compared with traditional neural networks, ELM offers fast learning and good generalization. Ordinal logistic regression (LR), in turn, is a statistical method that is conceptually simple and algorithmically fast. To improve real-time performance, this paper introduces a flare forecasting method that combines the LR model with the ELM algorithm. The predictive variables are three photospheric magnetic parameters: the total unsigned magnetic flux, the length of the strong-gradient magnetic polarity inversion line, and the total magnetic energy dissipation. The LR model maps these three magnetic parameters of each active region into four probabilities, and the ELM then maps the four probabilities into a binary label, which is the final output. The proposed model is used to predict whether flares of a certain level occur within the 24 hours following the time at which the magnetogram is recorded. The experimental results show that the cascade algorithm not only speeds up learning, enabling timely prediction, but also predicts X-class flares more accurately than the other methods compared.


Introduction
The sudden and intense release of energy stored in solar magnetic fields generates solar flares [1,2], which can strongly influence the near-Earth space environment (so-called space weather) [3]. Solar flares may cause hazards such as the interruption of radio communication, injury to astronauts, and damage to spacecraft devices. It is therefore important to predict solar flares accurately and quickly.
A solar flare prediction system has two main parts, namely, the selection of the predictors and the construction of the prediction model. Many predictors have been put forward. Among them, sunspots and photospheric magnetic parameters are the two most widely used. Recently, predictors derived from magnetic observations have aroused a great deal of interest [3][4][5][6][7][8][9][10]. On the other hand, flare prediction can be regarded as a classification task. As a result, several machine learning methods have been employed for solar flare forecasting, such as the Bayesian network [10], neural network [11], support vector machine (SVM) [3], k-nearest neighbors (KNN) [12], and C4.5 decision tree [6].
Of the proposed measures, the predictors proposed by Jing et al. [5] have been extensively investigated. The three predictive parameters are (1) the total unsigned magnetic flux, (2) the length of the strong-gradient neutral line, and (3) the total magnetic dissipation. These three parameters are moderately correlated with the flare productivity of active regions [3], and their utility in flare forecasting has been shown in [3,5,7], with promising results.
Based on these three predictive parameters, there are mainly two types of flare forecasting methods: pattern recognition-based methods and probability analysis-based methods. The disadvantage of the first type (such as the SVM-based method [3]) is that the output is only a label, which does not indicate how much confidence can be placed in each forecast [3]. A probability analysis-based method (such as ordinal logistic regression [7]), on the other hand, provides the probability that a flare event will occur, but it is not easy to choose a good threshold value [3]. Recently, Yuan et al. [3] proposed a method that combines an ordinal logistic regression model with an SVM classifier (denoted as LR and SVM) to predict the occurrence of a certain level of flares within 24 hours. It is well known, however, that parameter tuning in SVM is tedious and time-consuming [13]. An improvement in the accuracy of M- and X-class flare prediction has been obtained by employing the AdaBoost algorithm in [8], but the result of this method depends largely on the selection of the weak classifier, and training takes too much time. There is therefore a clear need for a fast and accurate predictive model for forecasting solar flares.
To develop a fast method for automatically forecasting the occurrence of solar flares, we employ a learning method named extreme learning machine (ELM) in this paper. ELM, proposed by Huang et al. [14], is a simple learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). Its learning speed can be thousands of times faster than that of traditional feedforward network learning algorithms such as the back-propagation (BP) algorithm, while obtaining better generalization performance [15]. In comparison with SVM, ELM is easy to use effectively because it avoids tedious and time-consuming parameter tuning [13]. Therefore, we combine ordinal logistic regression and ELM for the automatic prediction of the occurrence of solar flares. As in [3,5,7,8], the predictive variables are three photospheric magnetic parameters, that is, the total unsigned magnetic flux, the length of the strong-gradient magnetic polarity inversion line, and the total magnetic energy dissipation. The logistic regression model maps the three magnetic parameters of each active region into four probabilities; the ELM then maps the four probabilities into a binary label, which is the final output. Data from SOHO/MDI (Solar and Heliospheric Observatory/Michelson Doppler Imager) magnetograms are used as training samples to predict whether solar flares will occur in the next 24 hours. Experimental results show that the proposed method performs better for the prediction of X-class flares.
The rest of the paper is organized as follows: we introduce the predictors briefly in Section 2. The adopted method is described in Section 3, and some experimental results are provided in Section 4. In the last section, some concluding remarks are presented.

Predictors
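To be consistent with the works in [3,7], the same predictive variables are used in this paper. The predictors are composed of the following three parameters.
(1) Total unsigned magnetic flux, $T_{\mathrm{flux}}$, is the integration of pixel intensity over the area of an active region:
$$T_{\mathrm{flux}} = \int |B_z| \, dA,$$
where $B_z$ is the pixel intensity of MDI magnetograms.
(2) Length of the strong-gradient magnetic polarity inversion line, $L_{\mathrm{gpi}}$, which serves as a measure of the nonpotentiality of active regions. It is the total number of pixels on which the gradient $|\nabla_\perp B_z|$ is greater than 50 G Mm$^{-1}$ [7]. The value $|\nabla_\perp B_z|$ is defined as
$$|\nabla_\perp B_z| = \sqrt{\left(\frac{\partial B_z}{\partial x}\right)^2 + \left(\frac{\partial B_z}{\partial y}\right)^2}.$$
(3) Total magnetic energy dissipation, $E_{\mathrm{diss}}$, is as follows:
$$E_{\mathrm{diss}} = \int \varepsilon(B_z) \, dA, \qquad \varepsilon(B_z) = 4\left[\left(\frac{\partial B_z}{\partial x}\right)^2 + \left(\frac{\partial B_z}{\partial y}\right)^2\right] + 2\left(\frac{\partial B_z}{\partial x} + \frac{\partial B_z}{\partial y}\right)^2,$$
where the integration is performed over the area of an active region.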
For more details on these predictors, please refer to [5,7].
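As a rough illustration of how the three predictors could be computed from a magnetogram patch, the following Python sketch uses finite differences on a small grid. The function name `magnetic_predictors`, the pixel size `pixel_mm`, the use of central differences, and the specific form of the dissipation density are assumptions made for this example; they are not taken from the processing pipeline used in [5,7].

```python
import math

def magnetic_predictors(bz, pixel_mm=1.0, grad_threshold=50.0):
    """Return (total_flux, n_strong_gradient_pixels, total_dissipation).

    bz: 2-D list (at least 2 x 2) of line-of-sight field values in gauss.
    pixel_mm: assumed pixel size in Mm (hypothetical parameter).
    grad_threshold: gradient threshold in G/Mm (50 in the text).
    """
    rows, cols = len(bz), len(bz[0])
    pixel_area = pixel_mm * pixel_mm
    total_flux = 0.0
    strong_pixels = 0
    total_diss = 0.0
    for i in range(rows):
        for j in range(cols):
            # Unsigned flux: sum of |Bz| times the pixel area.
            total_flux += abs(bz[i][j]) * pixel_area
            # Central differences inside the grid, one-sided at the borders.
            i0, i1 = max(i - 1, 0), min(i + 1, rows - 1)
            j0, j1 = max(j - 1, 0), min(j + 1, cols - 1)
            dbdy = (bz[i1][j] - bz[i0][j]) / ((i1 - i0) * pixel_mm)
            dbdx = (bz[i][j1] - bz[i][j0]) / ((j1 - j0) * pixel_mm)
            # Count pixels whose horizontal gradient exceeds the threshold.
            if math.hypot(dbdx, dbdy) > grad_threshold:
                strong_pixels += 1
            # Assumed dissipation density: 4(gx^2 + gy^2) + 2(gx + gy)^2.
            eps = 4.0 * (dbdx ** 2 + dbdy ** 2) + 2.0 * (dbdx + dbdy) ** 2
            total_diss += eps * pixel_area
    return total_flux, strong_pixels, total_diss

# Toy patch with a sharp polarity change from left to right.
patch = [[100.0, 0.0, -100.0] for _ in range(3)]
flux, n_strong, diss = magnetic_predictors(patch)
```

In practice the pixel size and any masking of the active region would come from the instrument metadata, which this sketch does not model.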

Prediction Method
As mentioned above, the output of a pattern recognition-based method is only a label, so some crucial information cannot be provided [3]. On the other hand, it is very hard to choose a good threshold value in a probability analysis-based method. Yuan et al. [3] proposed to combine the two types of method, using LR and SVM to forecast flares. But parameter tuning in SVM is tedious and time-consuming [13]. In this paper, we propose to forecast flares by combining LR and ELM. Firstly, we use ordinal logistic regression to map the input parameters into probabilities that a solar flare event will occur. Secondly, these probabilities are fed into the ELM. The output of the ELM then tells us whether the given active region belongs to a given level or not.

Probability Prediction Based on Ordinal Logistic Regression.
Logistic regression is a probabilistic nonlinear regression model. The generalized logistic regression model always returns values between 0 and 1 and is only applicable to binary response variables [16]. When the response variable has more than two categories, and these categories have a natural order, more complex methods are required. Ordinal logistic regression may be appropriate, since the model is able to handle multiple categories and, ideally, account for their ordering. In fact, ordinal logistic regression is an extension of standard logistic regression. Hence, an ordinal regression model is more suitable for flare forecasting, since the response variables of flares in a given active region are ordered (from high to low).
Ordinal logistic regression was used by Song et al. to map predictive variables into probabilities of flare occurrence. It is shown in [7] that the forecasting results based on LR are better than those of the Solar Data Analysis Center and NOAA's Space Weather Prediction Center.
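The ordinal logistic regression model can be defined by [17]
$$y^{*} = \alpha + \sum_{k=1}^{K} \beta_k x_k + \varepsilon,$$
where $y^{*}$ is the observation response variable, which cannot be measured directly, $x_k$ are the $K$ predictive variables, $\alpha$ and $\beta_k$ are the regression coefficients, and $\varepsilon$ is error.
For a given $x$, the cumulative probabilities can be expressed as follows:
$$P(Y \le j \mid x) = P(y^{*} \le \mu_j) = F\Big(\mu_j - \alpha - \sum_{k=1}^{K} \beta_k x_k\Big), \qquad j = 1, \dots, J-1,$$
where $\mu_j$ are the thresholds separating the $J$ ordered categories and $F$ is the cumulative distribution function of $\varepsilon$. In this study, logit is used as the link function for the construction of the ordinal response variable. The cumulative logistic model for ordinal response data can be given as the following equation:
$$\operatorname{logit}\big[P(Y \le j \mid x)\big] = \ln \frac{P(Y \le j \mid x)}{1 - P(Y \le j \mid x)} = \mu_j - \alpha - \sum_{k=1}^{K} \beta_k x_k.$$
This is the well-known proportional odds (PO) model [18], which is also called the ordinal logistic model [19]. It follows that the cumulative probabilities can be measured by the following equation:
$$P(Y \le j \mid x) = \frac{\exp\big(\mu_j - \alpha - \sum_{k=1}^{K} \beta_k x_k\big)}{1 + \exp\big(\mu_j - \alpha - \sum_{k=1}^{K} \beta_k x_k\big)}.$$
Once the cumulative probabilities are calculated, the probability of any category ($P(Y = 1), P(Y = 2), \dots, P(Y = J)$) can be calculated. For example,
$$P(Y = 1) = P(Y \le 1), \qquad P(Y = j) = P(Y \le j) - P(Y \le j-1) \ \ (1 < j < J), \qquad P(Y = J) = 1 - P(Y \le J-1).$$
In addition, the condition $P(Y = 1) + P(Y = 2) + \cdots + P(Y = J) = 1$ should be satisfied.
In this paper, a four-state response variable is used in the LR method for solar flare forecasting. Level = 0 means that the active region only produces microflares (it is flaring-quiet or produces only A- or B-class flares) in the coming 24 hours. Level = 1 means that the active region produces C-class flares. Level = 2 means that the active region produces M-class flares, and level = 3 is for X-class flares. Therefore, the category number $J$ is set to 4. Furthermore, the model contains three logit functions, which can be estimated as follows:
$$\ln\frac{P_1}{1 - P_1} = \mu_1 - \alpha - \sum_{k=1}^{3} \beta_k x_k, \qquad \ln\frac{P_1 + P_2}{1 - P_1 - P_2} = \mu_2 - \alpha - \sum_{k=1}^{3} \beta_k x_k, \qquad \ln\frac{P_1 + P_2 + P_3}{1 - P_1 - P_2 - P_3} = \mu_3 - \alpha - \sum_{k=1}^{3} \beta_k x_k,$$
where $P_1$, $P_2$, $P_3$, and $P_4$ indicate the probabilities of the four-state category and $P_1 + P_2 + P_3 + P_4 = 1$. Equation (10) can be re-expressed as follows:
$$\ln\frac{P_1 + \cdots + P_j}{1 - (P_1 + \cdots + P_j)} = \alpha_{0j} - \sum_{k=1}^{3} \beta_k x_k, \qquad j = 1, 2, 3,$$
where $\alpha_{0j} = \mu_j - \alpha$ is the combination of $\alpha$ and $\mu_j$. The parameters $\alpha_{0j}$ and $\beta_k$ are calculated by the method of maximum likelihood [20]. The application of ordinal logistic regression in flare prediction is as follows.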
(1) Training: for each training sample, the three photospheric magnetic features of an active region are given, together with the level of that active region.
(2) Predicting: for a given active region, its three photospheric magnetic features are computed and fed into the ordinal logistic regression model. The output of the model contains four elements, which correspond to the probabilities that the given active region belongs to level 0, 1, 2, or 3.
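The prediction step can be sketched in Python as follows. This is a minimal illustration of a proportional odds model, assuming the cumulative logit for each level is a threshold minus a shared linear combination of the predictors; the function name `category_probabilities`, the sign convention, and the numeric thresholds and slopes are hypothetical and are not the fitted values from the paper.

```python
import math

def category_probabilities(x, thresholds, beta):
    """Map predictors x to ordered-category probabilities.

    thresholds: increasing cut points, one per cumulative logit
                (three of them for four flare levels).
    beta: one slope per predictor, shared by all cumulative logits.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities P(level <= j) from the logistic function.
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
    cumulative.append(1.0)  # the last cumulative probability is 1 by definition
    # Successive differences give the probability of each individual level.
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

# Hypothetical, standardized predictor values and coefficients.
probs = category_probabilities([0.4, 1.2, 0.3], [-1.0, 0.5, 2.0], [0.8, 0.5, 0.3])
```

Because the thresholds are increasing, the four values are non-negative and sum to one, which is what the second stage expects as its input vector.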

Binary Predicting Based on Extreme Learning Machine.
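As a learning technique, ELM has demonstrated good potential for solving regression and classification problems [14,15]. ELM techniques have also received considerable attention in the computational intelligence and machine learning communities. In this section, we briefly review the ELM algorithm. For more details, please refer to [14,15]. Given $N$ different random samples $(x_i, t_i)$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{in})^{T} \in R^{n}$ and $t_i = (t_{i1}, t_{i2}, \dots, t_{im})^{T} \in R^{m}$, a standard SLFN with $\tilde{N}$ hidden neurons and activation function $g(x)$ is mathematically modeled as [15]
$$\sum_{i=1}^{\tilde{N}} \beta_i \, g(w_i \cdot x_j + b_i) = o_j, \qquad j = 1, \dots, N,$$
where $w_i = [w_{i1}, w_{i2}, \dots, w_{in}]^{T}$ is the weight vector connecting the input neurons to the $i$th hidden node, $\beta_i = [\beta_{i1}, \beta_{i2}, \dots, \beta_{im}]^{T}$ is the weight vector connecting the $i$th hidden node to the output neurons, and $b_i$ is the threshold of the $i$th hidden neuron. $w_i \cdot x_j$ denotes the inner product of $w_i$ and $x_j$, and $o_j$ is the actual output of the network. Requiring the outputs to match the targets ($o_j = t_j$), the above $N$ equations can be written in the following matrix form:
$$H\beta = T,$$
where
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}}, \qquad \beta = \begin{bmatrix} \beta_1^{T} \\ \vdots \\ \beta_{\tilde{N}}^{T} \end{bmatrix}_{\tilde{N} \times m}, \qquad T = \begin{bmatrix} t_1^{T} \\ \vdots \\ t_N^{T} \end{bmatrix}_{N \times m}.$$
As named in [14,15], $H$ is the hidden layer output matrix of the neural network. Huang et al. have proved that, if the number of hidden nodes is large enough, an SLFN can approximate a continuous function even when the input weight vectors are assigned randomly. To get better generalization performance, however, the number of hidden nodes $\tilde{N}$ should satisfy the condition $\tilde{N} \ll N$ [15]. Therefore, once the input weights $w_i$ and the hidden layer biases $b_i$ are fixed, the hidden layer output matrix can be set up. Consequently, the training of an SLFN is simply equivalent to finding a least-squares solution $\hat{\beta}$ of the linear system $H\beta = T$:
$$\big\| H(w_1, \dots, w_{\tilde{N}}, b_1, \dots, b_{\tilde{N}}) \, \hat{\beta} - T \big\| = \min_{\beta} \big\| H(w_1, \dots, w_{\tilde{N}}, b_1, \dots, b_{\tilde{N}}) \, \beta - T \big\|.$$
According to (10), the smallest norm least-squares solution is
$$\hat{\beta} = H^{+} T,$$
where $H^{+}$ is the Moore-Penrose generalized inverse of matrix $H$.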
Thus, the outline of ELM is given as follows.
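Given a training set $\{(x_i, t_i)\}_{i=1}^{N} \subset R^{n} \times R^{m}$, hidden node number $\tilde{N}$, and activation function $g$, we have the following.
Step 1. Randomly assign the input weights $w_i$ and biases $b_i$, $i = 1, \dots, \tilde{N}$.
Step 2. Calculate the hidden layer output matrix $H$.
Step 3. Calculate the output weight $\hat{\beta} = H^{+} T$.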
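The outline can be made concrete with a small pure-Python sketch: random hidden parameters, the hidden layer output matrix, and a least-squares fit of the output weights. Several choices here are assumptions for illustration only: the output weights are obtained from ridge-regularized normal equations rather than an explicit Moore-Penrose inverse, the targets are scalar 0/1 labels, the decision threshold is 0.5, and the names `elm_train`, `elm_score`, and `elm_predict` are hypothetical. The triangular basis activation matches the one reported in the experiments.

```python
import random

def tribas(z):
    """Triangular basis activation: 1 - |z| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(z))

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def hidden_row(x, weights, biases):
    """One row of the hidden layer output matrix for input x."""
    return [tribas(sum(w * v for w, v in zip(wi, x)) + bi)
            for wi, bi in zip(weights, biases)]

def elm_train(samples, targets, n_hidden, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Random input weights and biases, never tuned afterwards.
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    # Hidden layer output matrix, one row per training sample.
    h = [hidden_row(x, weights, biases) for x in samples]
    # Output weights from (H^T H + ridge * I) beta = H^T T (assumed variant).
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    htt = [sum(row[i] * t for row, t in zip(h, targets)) for i in range(n_hidden)]
    return weights, biases, solve(hth, htt)

def elm_score(x, model):
    weights, biases, beta = model
    return sum(b * h for b, h in zip(beta, hidden_row(x, weights, biases)))

def elm_predict(x, model):
    """Binary label from the network output, thresholded at 0.5."""
    return 1 if elm_score(x, model) >= 0.5 else 0
```

The small ridge term only keeps the normal equations solvable when some hidden nodes are inactive on every sample; with a full-column-rank hidden matrix and a vanishing ridge the result approaches the pseudoinverse solution.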
We observe that, contrary to the common understanding of learning, the hidden layer of ELM need not be iteratively tuned. The typical implementation of ELM applies random computational nodes in the hidden layer: the hidden layer parameters are fixed after random assignment, and the output weights are then obtained by the least-squares method [13]. Therefore, the learning speed of ELM can be thousands of times faster than that of traditional feedforward network learning algorithms such as the back-propagation (BP) algorithm, while obtaining better generalization performance [15].

Experimental Results
In this section, the effectiveness of the proposed method is evaluated experimentally. The data set given in [7], which contains 230 active regions extracted from SOHO/MDI magnetograms, is used in this paper. We compare the proposed method with the LR [7], SVM [3], and LR and SVM [3] prediction models. All the methods are tested in the same way as in [3]. All the experiments in this paper are carried out in the MATLAB (2009a) environment running on a Core 2, 2.00 GHz CPU.
Using the same criteria as [7], active regions are categorized into four levels according to the most powerful flare produced: an active region is classified as level 0 if it is flaring-quiet or only produces A- and/or B-class flares; as level 1 if it produces at least one C-class flare but no M- or X-class flares; as level 2 if it produces at least one M-class flare but no X-class flares; and as level 3 if it produces at least one X-class flare.
To assess the performance of the proposed method, seven measurements are used: correctness, true positive, true negative, weighted true rate, positive accuracy, negative accuracy, and weighted accuracy. All seven measurements can be derived from Table 3. For a given contingency table, there are two kinds of mistakes in flare prediction: a miss, in which we predict no flare but a flare occurs, and a false alarm, in which we predict a flare but none occurs. Note that, among the seven measures, positive accuracy is the most important in flare forecasting, because a miss is worse than a false alarm. The higher the positive accuracy, the fewer events are missed. In practical applications, this matters most for the prediction of M-class and X-class flares (level two or level three).
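The two kinds of mistakes can be counted directly from paired binary labels, as in the following Python sketch. The function names and the dictionary keys are hypothetical, and the fraction of correct forecasts computed at the end is only one plausible reading of "correctness"; the exact definitions of the seven measures are not reproduced here.

```python
def contingency_counts(actual, predicted):
    """Count hits, misses, false alarms and correct rejections for 0/1 labels."""
    counts = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_rejection": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["hit"] += 1                # flare forecast, flare occurred
        elif a == 1 and p == 0:
            counts["miss"] += 1               # no flare forecast, flare occurred
        elif a == 0 and p == 1:
            counts["false_alarm"] += 1        # flare forecast, none occurred
        else:
            counts["correct_rejection"] += 1  # no flare forecast, none occurred
    return counts

def fraction_correct(counts):
    """Share of forecasts that matched the outcome (assumed definition)."""
    total = sum(counts.values())
    return (counts["hit"] + counts["correct_rejection"]) / total

# Toy labels for six active regions (1 = flare at the level of interest).
actual = [1, 1, 0, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1]
counts = contingency_counts(actual, predicted)
```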
In this paper, leave-one-out cross validation is applied in the experiments: of the 230 samples, 229 are used as training samples and the remaining one is used for testing. The process is repeated 230 times so that every sample is tested exactly once. The hidden node number of the ELM is 49, and the activation function is the triangular basis function.
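The leave-one-out procedure can be sketched generically in Python. The `train` and `predict` callables are placeholders; the nearest-class-mean classifier used below is a deliberately simple stand-in chosen for this example and is not one of the models evaluated in the paper.

```python
def leave_one_out(samples, labels, train, predict):
    """Return the held-out prediction for every sample."""
    predictions = []
    for i in range(len(samples)):
        # Train on every sample except the i-th one.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)
        # Test on the single sample that was left out.
        predictions.append(predict(model, samples[i]))
    return predictions

def train_nearest_mean(xs, ys):
    """Stand-in classifier: store the mean feature vector of each class."""
    means = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        means[label] = [sum(col) / len(members) for col in zip(*members)]
    return means

def predict_nearest_mean(means, x):
    """Assign x to the class whose mean is closest in squared distance."""
    return min(means, key=lambda label: sum((a - b) ** 2
                                            for a, b in zip(means[label], x)))

# Toy one-dimensional data with two well-separated classes.
samples = [[0.1], [0.2], [0.15], [0.9], [0.8], [0.85]]
labels = [0, 0, 0, 1, 1, 1]
held_out = leave_one_out(samples, labels, train_nearest_mean, predict_nearest_mean)
```

With 230 samples the loop trains 230 models, which is one reason a fast learner is attractive in this setting.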
Based on the above description, we compare the proposed method with the LR, SVM, and LR + SVM prediction models. The experimental results are recorded in Table 2, and the comparison of all the methods is shown in Table 3.
From these tables, we can see that no single method obtains the best results for all flare levels. Firstly, for level zero in Table 2, the value obtained by the LR method is higher than those obtained by the other methods. As shown in Table 3, the positive accuracy of the LR-based method (0.83) is higher than that of the other methods (0.73, 0.72, and 0.65).
Secondly, for level one in Table 2, the number of correctly predicted positive samples is larger for the proposed method than for the other methods. In Table 3, however, the LR-based method surpasses the other methods in almost all aspects except positive accuracy. The LR-based method is therefore well suited to predicting low-class flares.
For level two, the LR method does not work very well. The numbers of correct predictions of flare occurrence for LR, SVM, LR + SVM, and the proposed LR + ELM method are 10, 9, 15, and 19, respectively. In Table 3, we also observe that the positive accuracy of LR + ELM (0.28) is the highest.
Finally, for level three, the proposed method outperforms the LR-based, SVM-based, and LR + SVM-based methods. It correctly predicts 12 of the 34 X-class flares, whereas the LR, SVM, and LR + SVM methods predict only 1, 7, and 7, respectively. We may therefore conclude that the LR-based method is no longer effective for forecasting high-class flares. Besides, the numbers of false alarms and misses recorded in Table 1 are also important in evaluating the methods. These two values capture the mistakes in flare prediction; the lower they are, the higher the numbers of correct positive and negative predictions, and hence the higher the correctness. As illustrated in Table 3, the numbers of false alarms and misses for the proposed method are 11 and 22, both smaller than the results of the SVM method. Compared with the LR + SVM-based method, however, the proposed method has more false alarms (11 versus 9) but fewer misses (22 versus 27).

Conclusions
In this paper, magnetic parameters are selected as predictors on the basis of the relationship between flares and the magnetic field. A model combining LR and ELM is then built to predict flares. The LR model maps the three magnetic parameters of each active region into four probabilities, and the ELM then maps the four probabilities into a binary label, which is the final output. In comparison with some traditional computational intelligence techniques (such as SVM and BP), ELM provides better generalization performance at a much faster learning speed and with the least human intervention. However, the ELM used in this paper can still be regarded as an empirical risk minimization scheme and tends to generate an overfitted model [21]. Additionally, the output weights are obtained directly by the least-squares method, so they are sensitive to outliers in the predicted values, which may cause poor forecasting results. To address these disadvantages of ELM, an algorithm based on the structural risk minimization principle and weighted least squares will be developed. To demonstrate the effectiveness of the proposed method, some comparisons of experimental results have been provided. These results show that the proposed method performs better for the prediction of X-class flares. It should be noted that the proposed

Table 1: Indication of prediction results.