A Novel Approach to Ensemble Classifiers: FsBoost-Based Subspace Method

In this article, an algorithm is proposed for creating an ensemble classifier. The algorithm is called the F-score subspace method (FsBoost). In this method, features are selected with the F-score and classified with the same or different classifiers, and the ensemble classifier is then created from these classifiers. Two versions, FsBoost.V1 and FsBoost.V2, have been developed, depending on whether classification uses the same classifier or different classifiers. The results obtained are consistent with the literature, and the accuracy rate is higher than that of many algorithms in the literature. The algorithm is fast because it has few steps. Given these advantages, the algorithm is expected to be successful.


Introduction
An ensemble classifier is a method in which multiple classifiers are used together to improve classification performance [1,2]. For example, when three classifiers are used to classify an object, the ensemble works as follows: if the first classifier predicts cat, the second predicts dog, and the third predicts cat, the ensemble classifier generates its result by averaging these decisions, so the majority label (cat) is returned. There are many ways to create an ensemble classifier. Some of the most commonly used ones are (1) adaptive resampling and combining (boosting) [3], (1.1) AdaBoost (adaptive boosting) [4], (2) bagging (bootstrap aggregating) [5], and (3) random subspace [6]. The boosting method can create powerful classifiers by combining and training weak classifiers [3]. The most commonly used boosting method is AdaBoost [4]. The AdaBoost method tries to improve performance by focusing on misclassified instances [4]. In the bagging method, classifiers trained with different training sets randomly selected (random sampling method) from the dataset are combined [5]. The outputs of the classifiers are combined with majority voting or weighted voting [5]. In the random subspace method, feature subsets are generated by randomly selecting from N samples [6]. Each subset has M elements and m features [6]. In other words, a subset of features is created, not a subset of instances [6]. In this way, the training process is accelerated. Classifiers are trained on these subsets to form the ensemble classifier. The outputs of the classifiers are combined with majority voting or weighted voting. These algorithms have disadvantages. As the number of training stages in AdaBoost increases, the number of samples decreases and training becomes more difficult, so more samples are needed for training. AdaBoost is also quite slow because it has too many training stages [7,8]. The bagging method involves complex calculations [7]. Both methods require many iterations [1], so the success rate is usually lower than that of the random forest method [1]. These models cannot explain the dataset by modeling it as decision trees [7]. Given these disadvantages, these methods still need improvement. In this study, a new ensemble algorithm based on the F-score feature selection algorithm has been developed to reduce the processing load of existing ensemble algorithms and to increase the accuracy rate.
Feature selection algorithms are often used in the machine learning field to improve the performance of systems [9][10][11]. In machine learning, datasets come in a variety of sizes and types [12][13][14][15]. Large datasets lengthen the training time of a classifier. Feature selection algorithms have been developed to solve this problem [9,16,17]. They do this by removing irrelevant data while keeping relevant data [9]. Thus, data size, processing load, and training time decrease while classification accuracy increases [9,18]. Many feature selection algorithms have been developed in the literature [1,16]. In this study, the F-score feature selection algorithm is used because it is fast and performs well [16]. Feature selection algorithms can be used in many areas, such as health care [19][20][21][22].
In this study, two different methods, namely, FsBoost.V1 and FsBoost.V2, have been developed based on the F-score feature selection algorithm, which can enhance training performance for ensemble classifiers. The FsBoost.V1 method is similar to the random subspace method. However, the features are chosen with respect to the data labels, not at random. The selected datasets are classified with a single classifier, and then ensemble classifier 1 is created. This process is repeated for three or more different classifiers. Finally, the ensemble classifiers for the three different classifiers are merged. In this way, unnecessary data are removed from the training process. The process can also be stopped at the first ensemble classifier. In the FsBoost.V2 method, all data are first classified with different classifiers. In the second step, the feature subspace is created by the F-score feature selection algorithm, and the data are reclassified. The ensemble classifier is created from the results of this classification. This process is repeated a second time. Finally, the ensemble classifiers for the three different classifiers are merged. The use of a single classifier reduces the cost. Only relevant features are retained by the F-score feature selection algorithm, which accelerates the training process. The complexity is lower than that of other algorithms.

Materials and Methods
The operations were performed according to the flow in Figure 1. First, the records to be used in the study were collected. Then, features were selected with the F-score feature selection algorithm. Finally, the data were classified with different classifiers, and their performances were calculated. After these operations, ensemble classifiers were created, and their performances were calculated at different levels and formations.

Collection of Data.
The data used in the study were downloaded from the Machine Learning Repository website of the University of California, Irvine (UCI) [23,24]. The data consist of four groups (A/B/C/D) belonging to epilepsy patients (Table 1). The records are EEG recordings of individuals, and each record is 23.6 seconds long. 2300 EEG recordings were taken during an epileptic seizure. The other 2300 records (nonepilepsy) were recorded in a healthy condition, although these records also belong to epileptic patients. The epilepsy data in each set are the same, whereas the nonepilepsy records are different. The database contains 178 features for each EEG recording.

F-Score Feature Selection Algorithm.
The F-score is a feature selection algorithm that helps distinguish classes from each other [25]. To select features, an F-score value (F_i) is calculated for each feature (equation (1)). The F-score threshold value (F_E) is determined by taking the average of all F-score values. For the ith feature, if F_i > F_E, the ith feature is selected. This step is repeated for each feature.
In the study, the features of the A, B, C, and D datasets were selected with the F-score (Table 2). Feature selection was applied twice.
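The selection rule above can be illustrated with a short sketch. It assumes the commonly used two-class F-score definition: for feature i, the numerator is the sum of the squared differences between each class mean and the overall mean, and the denominator is the sum of the two within-class sample variances. Whether this matches equation (1) exactly is an assumption, and the function names f_scores, select_features, and keep_columns are hypothetical.

```python
from statistics import mean


def f_scores(X, y):
    """Return one F-score per feature column for two-class data (labels 0/1).

    Assumed definition (not taken from the source text):
    F_i = ((m_pos - m_all)^2 + (m_neg - m_all)^2) / (var_pos + var_neg),
    where var_* are sample variances with an (n - 1) denominator.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for i in range(len(X[0])):
        col_pos = [row[i] for row in pos]
        col_neg = [row[i] for row in neg]
        m_all = mean(row[i] for row in X)
        m_pos, m_neg = mean(col_pos), mean(col_neg)
        between = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
        var_pos = sum((v - m_pos) ** 2 for v in col_pos) / (len(col_pos) - 1)
        var_neg = sum((v - m_neg) ** 2 for v in col_neg) / (len(col_neg) - 1)
        within = var_pos + var_neg
        # A constant feature carries no class information; score it as 0.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def select_features(X, y):
    """Keep the indices of features whose F-score exceeds the mean F-score (F_E)."""
    scores = f_scores(X, y)
    threshold = mean(scores)
    return [i for i, s in enumerate(scores) if s > threshold]


def keep_columns(X, indices):
    """Project every row of X onto the selected feature indices."""
    return [[row[i] for i in indices] for row in X]
```

Applying feature selection twice, as described above, would then amount to calling select_features on the original data, reducing the data with keep_columns, and calling select_features again on the reduced data.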

Ensemble Classifier.
The ensemble classifier is a system created by combining different classifiers to produce more reliable and more stable estimates [26]. The system is built with N classifiers, where N can be odd or even. For each feature vector, each classifier generates an output value. The output values produced are counted, and the output of the ensemble classifier is determined by the number of votes. If the number of classifiers is even, the average of the decision values of the classifiers is rounded, and this rounded value is the decision of the ensemble classifier. This process applies to all feature vectors. The ensemble classifier was prepared in MATLAB using three different classifiers: kNN, PNN, and SVMs [27]. The kNN is a supervised machine learning classification method [28,29]. A new sample is classified according to its k nearest neighbors in the training dataset. In this study, k = 5 was selected, and ten distance measures were used: Spearman, Seuclidean, Minkowski, Mahalanobis, Jaccard, Hamming, Euclidean, Cosine, Correlation, and Cityblock.
PNN is a kernel-based Bayesian statistical classification algorithm [30]. The method is developed based on feedforward networks [30]. The classifier takes all class elements into account when processing [31]. The radial basis kernel function calculates the distance between class samples. The user can adjust the spread parameter of the PNN classifier. As the spread parameter approaches zero, the network begins to behave like the nearest neighbor classifier [32]. As the value moves farther from zero, the classifier takes several vectors into account when separating the data [32]. In the study, PNN networks were designed with a total of 500 different spread values ranging from 0.01 to 5 in steps of 0.01. At the end of the study, the best-performing network parameters were identified, and their performance criteria were calculated.
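The role of the spread parameter can be illustrated with a minimal sketch of a PNN-style decision rule. This is not the MATLAB network used in the study: the Gaussian kernel scaling, the function name pnn_predict, and the variable spread_grid are assumptions made for illustration only.

```python
import math


def pnn_predict(train_X, train_y, x, spread):
    """Classify x by the class with the largest average Gaussian kernel response.

    Assumed kernel: exp(-||x - x_i||^2 / (2 * spread^2)). A small spread lets
    the closest training samples dominate; a large spread lets many training
    samples contribute to the decision.
    """
    sums, counts = {}, {}
    for xi, label in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        k = math.exp(-d2 / (2.0 * spread ** 2))
        sums[label] = sums.get(label, 0.0) + k
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])


# The 500 candidate spread values described in the text: 0.01, 0.02, ..., 5.00.
spread_grid = [round(0.01 * i, 2) for i in range(1, 501)]
```

In a parameter search, each value in spread_grid would be evaluated on held-out data and the best-performing one kept. For very small spreads the kernel values can underflow to zero in floating point, so a practical implementation would need to guard against that.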
SVMs are among the best machine learning algorithms [33]. They can be used in regression analysis as well as classification [33]. SVMs try to separate datasets from each other with a linear or nonlinear boundary. The purpose of the SVM algorithm is to distinguish between the data with the minimum error [34]. A Gaussian (radial basis function, RBF) kernel (rbf) was used in the study. The BoxConstraint parameter was varied between 1 and 100 to find the best performance.
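The voting rule described at the start of this section can be sketched as follows. The sketch assumes binary decisions coded as 0 and 1 and assumes that a tie (an average of exactly 0.5, possible when N is even) is rounded up to class 1; the source does not state the tie direction. The function names are hypothetical.

```python
import math


def ensemble_vote(decisions):
    """Combine the 0/1 decisions of N classifiers for one feature vector.

    The average decision is rounded to the nearest class. Assumption: a tie
    (average exactly 0.5) rounds up to class 1.
    """
    avg = sum(decisions) / len(decisions)
    return math.floor(avg + 0.5)


def ensemble_predict(all_decisions):
    """all_decisions holds one list of decisions per classifier.

    Returns the ensemble decision for every feature vector.
    """
    return [ensemble_vote(votes) for votes in zip(*all_decisions)]
```

With an odd number of classifiers, rounding the average is equivalent to a plain majority vote, so the tie assumption only matters for even N.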

Ensemble Classifier Powered by the F-Score.
In this study, two different ensemble classifiers were developed, namely, the classifier-based FsBoost.V1 and the feature-based FsBoost.V2.

Classifier-Based Ensemble Classifier: FsBoost.V1.
The implementation steps of this method are shown in detail in Figure 2. First, a dataset (A) is classified with one classifier (kNN). In the second step, the first feature selection is performed, and the data are classified again with the same classifier (kNN). In the third step, the first and second feature selections are performed, and the data are classified again with the same classifier (kNN). Thus, the data are classified in three different steps, but with only one classifier (kNN). These three results are combined to form the kNN ensemble. The same process is repeated for PNN and SVMs. Finally, the kNN ensemble, the PNN ensemble, and the SVM ensemble are combined into a single ensemble classifier.
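The two-level combination in FsBoost.V1 can be sketched on precomputed predictions. The sketch assumes binary 0/1 labels and a simple majority vote at both levels; the function names and the dictionary layout are hypothetical, and training the classifiers is left out.

```python
def majority(votes):
    """Majority vote over 0/1 decisions (a tie goes to class 1 by assumption)."""
    return 1 if sum(votes) * 2 >= len(votes) else 0


def combine(prediction_lists):
    """Vote sample by sample across several equally long prediction lists."""
    return [majority(v) for v in zip(*prediction_lists)]


def fsboost_v1(stage_predictions):
    """stage_predictions maps a classifier name to its three prediction lists:
    all features, after the first selection, and after the second selection.

    Level 1 builds one ensemble per classifier; level 2 merges those ensembles.
    """
    level1 = {name: combine(stages) for name, stages in stage_predictions.items()}
    level2 = combine(list(level1.values()))
    return level1, level2


# Illustrative predictions for four samples (made-up values, not study results).
example = {
    "kNN": [[1, 0, 1, 0], [1, 0, 0, 0], [1, 1, 1, 0]],
    "PNN": [[1, 1, 0, 0], [1, 1, 0, 1], [0, 1, 0, 0]],
    "SVM": [[1, 0, 1, 1], [1, 0, 1, 0], [1, 0, 0, 1]],
}
level1, final = fsboost_v1(example)
```

Because each level-1 ensemble uses a single classifier type, the process could also stop at level 1 when only one classifier is available.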

Feature-Based Ensemble Classifier: FsBoost.V2.
The steps for this method are shown in Figure 3. First, a dataset (A) is classified by each classifier (kNN, PNN, and SVMs). These three classifiers are combined to obtain ensemble classifier 1. In the second step, the first feature selection is performed, and the process in the first step is repeated. In the third step, the first and second feature selection steps are performed together, and then the first process is repeated. Ensemble classifiers 1, 2, and 3 are combined to create the final ensemble classifier.
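For comparison, the FsBoost.V2 grouping can be sketched in the same way, again on precomputed binary predictions with a simple majority vote. Here the level-1 ensembles are formed per feature-selection stage rather than per classifier. The function names and the data layout are hypothetical.

```python
def majority(votes):
    """Majority vote over 0/1 decisions (a tie goes to class 1 by assumption)."""
    return 1 if sum(votes) * 2 >= len(votes) else 0


def combine(prediction_lists):
    """Vote sample by sample across several equally long prediction lists."""
    return [majority(v) for v in zip(*prediction_lists)]


def fsboost_v2(predictions_by_stage):
    """predictions_by_stage is a list with one dict per stage (all features,
    first selection, second selection), mapping classifier name to predictions.

    Level 1 builds one ensemble per stage; level 2 merges the stage ensembles.
    """
    level1 = [combine(list(stage.values())) for stage in predictions_by_stage]
    level2 = combine(level1)
    return level1, level2


# Illustrative predictions for four samples (made-up values, not study results).
stages = [
    {"kNN": [1, 0, 1, 0], "PNN": [1, 1, 0, 0], "SVM": [1, 0, 1, 1]},
    {"kNN": [1, 0, 0, 0], "PNN": [1, 1, 0, 1], "SVM": [1, 0, 1, 0]},
    {"kNN": [1, 1, 1, 0], "PNN": [0, 1, 0, 0], "SVM": [1, 0, 0, 1]},
]
ensembles, final_v2 = fsboost_v2(stages)
```

The same nine base predictions can lead to different final decisions depending on whether they are grouped by classifier (V1) or by stage (V2), which is the only structural difference between the two versions.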

Performance Evaluation Criteria and Distribution of Data for Classification.
Different performance evaluation criteria were used to test the accuracy rates of the proposed systems. These are accuracy rate, sensitivity, specificity, kappa value, receiver operating characteristic (ROC), area under the ROC curve (AUC), and k-fold (10-fold) cross-validation accuracy.
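Several of these criteria follow directly from the confusion matrix of a two-class problem. The sketch below uses the standard definitions of accuracy, sensitivity, specificity, and Cohen's kappa. It assumes labels coded as 0 and 1 with class 1 as the positive (epilepsy) class, which is an illustrative choice, and it assumes both classes occur in the true labels. The function name evaluate is hypothetical.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and Cohen's kappa for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }
```

ROC, AUC, and cross-validation need classifier scores and repeated training, so they are outside the scope of this sketch.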

Results
This work aims to develop a new algorithm to improve ensemble classifier performance. We have developed an algorithm (FsBoost) that is similar to the random subspace method but has a lower workload, runs faster, and performs better. This method, which is based on the F-score feature selection algorithm, has two versions (FsBoost.V1 and FsBoost.V2). The ensemble classifier is created with a single classifier in FsBoost.V1 (Figure 2, Level 1) and with at least three different classifiers in FsBoost.V2 (Figure 3, Level 1). The developed algorithms were tested with four two-class datasets (A, B, C, and D) (Table 3).
According to the FsBoost algorithm, the dataset features were selected twice using the F-score feature selection algorithm. For example, according to FsBoost.V1, the dataset (A) is classified with the same classifier after each feature selection (Figure 2, Level 1-kNN1, kNN2, and kNN3) (Table 4). The kNN ensemble was formed by combining the three kNN classifiers (Figure 2, Level 1) (Table 5). This process was repeated with the other classifiers to create the PNN ensemble and the SVM ensemble (Figure 2, Level 1) (Table 5). Then, the kNN ensemble, PNN ensemble, and SVM ensemble were combined to form the final ensemble classifier (Figure 2, Level 2) (Table 5). This process is repeated for each dataset (Tables 4-8).
In FsBoost.V2, the dataset (A) is classified with different classifiers after each feature selection (Figure 3, Level 1-kNN1, PNN1, and SVM1) (Table 4). These three classifiers were combined to create ensemble 1 (Figure 3, Level 1-ensemble 1) (Table 9). Then, ensemble 1, ensemble 2, and ensemble 3 were combined to form the final ensemble classifier (Figure 3, Level 2) (Table 9). This process is repeated for each dataset (Tables 4-7 and 9). Finally, the FsBoost ensemble algorithm is also compared with the ensemble algorithms available in the literature (Table 10).
Accuracy rates for FsBoost.V1 and FsBoost.V2 are higher than those for single classifiers (Table 10). The FsBoost algorithm ranks well compared with other boosting algorithms in the literature (Table 10). The FsBoost.V1-Level 1-SVM ensemble method is the best method when compared with the literature (Table 10, Rank). Three different datasets were used to reconfirm the results obtained. The distribution of these datasets is shown in Table 11.
To compare the FsBoost algorithm with boosting algorithms, the three different datasets were reanalyzed. The results of the analysis are summarized in Table 12. According to the results, the algorithm with the best average performance is the FsBoost.V1 Level 2 ensemble algorithm.

Discussion and Conclusion
FsBoost performs well compared with the ensemble algorithms developed to date [4][5][6][7]. The method has very few steps and therefore provides results faster. Its high accuracy rate is a distinct advantage. Algorithms with high accuracy and fast results are preferred in medical data classification, so FsBoost may be preferred in this area.
FsBoost contains fewer calculations and steps than the algorithms in the literature [4][5][6][7]. Its accuracy rate is very good compared with other algorithms (Table 10) [4]. Considering these advantages, FsBoost may soon become a commonly used algorithm. FsBoost algorithms are also suitable for use in biomedical signal processing, deep learning, and communication [35][36][37].
FsBoost can be used with three or more classifiers. In addition, FsBoost.V1 is a version of FsBoost that can be used with a single classifier. Achieving high performance with a single classifier is a distinct advantage of FsBoost.V1, and the F-score feature selection algorithm creates this advantage. In conclusion, FsBoost is an alternative method for creating an ensemble classifier. A high-performance ensemble classifier can be created with a powerful classifier and the F-score feature selection algorithm.
Data Availability
The datasets in our paper can be downloaded from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/index.html). The authors can send all the datasets to readers on request.

Conflicts of Interest
The authors declare no conflicts of interest.