Feature Subset Selection with Optimal Adaptive Neuro-Fuzzy Systems for Bioinformatics Gene Expression Classification

Recently, bioinformatics and computational biology applications such as gene expression analysis, cellular restoration, medical image processing, protein structure examination, and medical data classification have utilized fuzzy systems to offer effective solutions and decisions. Recent developments combining fuzzy systems with artificial intelligence techniques enable the design of effective microarray gene expression classification models. In this context, this study introduces a novel feature subset selection with optimal adaptive neuro-fuzzy inference system (FSS-OANFIS) for gene expression classification. The major aim of the FSS-OANFIS model is to detect and classify gene expression data. To accomplish this, the FSS-OANFIS model designs an improved grey wolf optimizer-based feature selection (IGWO-FS) model to derive an optimal subset of features. Besides, the OANFIS model is employed for gene classification, and the parameters of the ANFIS model are tuned by the coyote optimization algorithm (COA). The application of the IGWO-FS and COA techniques helps to accomplish enhanced microarray gene expression classification outcomes. The experimental validation of the FSS-OANFIS model was performed using the Leukemia, Prostate, DLBCL Stanford, and Colon Cancer datasets. The proposed FSS-OANFIS model attained a maximum classification accuracy of 89.47%.


Introduction
Microarray is an advanced technology that helps to recognize the expression patterns of many genes at a time at the genomic level. It enables researchers to investigate and analyze millions of genes in a single experiment [1]. It identifies many diseases connected to individual genes, such as anaemia and cancer. Gene expression analysis provides a method to recognize differentially expressed genes [2], which are responsible for the development of certain diseases. It also shows the difference between normal and abnormal genes through a mathematical model [3,4]. Many publicly accessible repositories such as ArrayExpress and Gene Expression Omnibus (GEO) make it easier to identify gene patterns of rare diseases. Classification of gene expression data separates cancer samples from healthy samples and is utilized in treatment-response prediction. Due to the small number of samples and the large number of features in gene expression data, standard ML methods fail to perform well for cancer classification [5].
Recently, there has been tremendous growth in the medical field around the world.
Several computational approaches, for example, data mining and pattern recognition, have been utilized in the bioinformatics field over the last few decades to deal with higher-dimensional problems, but they remain insufficient [6].
Thus, machine learning (ML), a branch of artificial intelligence, has recently received considerable attention from researchers in gene expression and genomics [7]. ML is also a branch of data science whose main goal is to allow a model, through training and learning, to make decisions by itself in the future. Machine learning is widely classified into supervised, unsupervised, and semisupervised learning [8]. For microarray data classification, ML-based feature selection (FS) techniques such as gene selection techniques assist in selecting the essential genes [9]. Feature selection helps to preserve useful attributes; it is mainly utilized for higher-dimensional data. Simply put, FS is a dimensionality reduction method. Feature selection is significantly helpful in fields that have relatively scarce samples and too many features, e.g., DNA microarray and RNA sequencing [10]. This approach assists in better understanding the feature space, preventing model overfitting, reducing the model training time, handling the dimensionality, and maximizing the prediction accuracy. The result of FS is the optimal set of features related to the provided class label that contribute to the prediction process.
This study introduces a novel feature subset selection with optimal adaptive neuro-fuzzy inference system (FSS-OANFIS) for gene expression classification.
The FSS-OANFIS model designs an improved grey wolf optimizer-based feature selection (IGWO-FS) model to derive an optimal subset of features. Besides, the OANFIS model is employed for gene classification, and the parameter tuning of the ANFIS model is performed using the coyote optimization algorithm (COA). The application of the IGWO-FS and COA techniques helps to accomplish enhanced microarray gene expression classification outcomes. To examine the enhanced outcomes of the FSS-OANFIS model, a comprehensive simulation analysis was performed on distinct datasets.

Related Works
In reference [11], a two-phase approach named the ML-integrated ensemble of feature selection (FS) techniques is used, followed by a survival study. In the primary stage, the best of seven ML approaches is chosen based on classifier accuracy, using the whole group of features (in this case, miRNAs). In the secondary stage, based on the classifier accuracy values, the top features across all FS approaches are assembled into an ensemble to offer a better categorization of miRNAs. Ayyad et al. [12] presented a novel classification approach for gene expression data; both of its variants are designed to improve the performance of KNN. The key idea is to use robust neighbors in the training data together with a novel weighting approach. The authors in reference [13] presented a recently developed classifier named forest deep neural network (fDNN), which integrates a DNN structure with a supervised forest feature detector. Using this built-in feature detector, the technique learns sparse feature representations and feeds them to the NN to mitigate the overfitting problem. Dwivedi [14] developed a structure of supervised ML approaches using an ANN for gene classification.
Shukla [15] established a novel gene selection (GS) approach by integrating minimum redundancy maximum relevance (mRMR) and teaching learning-based optimization (TLBO) for accurate cancer prediction. Primarily, in the presented method, mRMR is executed to determine the discriminative genes in the original feature set. The classification accuracy of an SVM is then used as the fitness function (FF) to select relevant features, which are used to estimate the prediction accuracy and classify cancer correctly. In reference [16], a novel social network analysis-based GS method was presented. The approach pursues two important objectives: relevance maximization and redundancy minimization of the chosen genes. At each iteration, a maximal community is chosen repeatedly; then, among the genes present in this community, the suitable genes are chosen using a node centrality-based condition.
In reference [17], an ensemble DL approach was presented to reduce the feature dimensionality. First, dimensionality reduction is performed with an autoencoder (AE) using several hidden layers; in the next step, a folded AE is also used to reduce the dimensionality of the same original data. Eventually, both are combined and the top features are chosen on the basis of the T-score value. Forestiero et al. [18] presented a multiagent technique to create a distributed approach to DNA microarray management. A group of agents, each one representing a microarray (or chip), executes in parallel a sequence of simple operations exploiting local data, and an organized virtual infrastructure is created at the global level. A word embedding method, capable of capturing the semantic context and representing a microarray as a vector, is used to map the chips, thus permitting advanced agent functions.

The Proposed Model
In this study, a new FSS-OANFIS model has been developed for microarray gene expression data classification. The presented FSS-OANFIS model encompasses a series of processes, namely, data preprocessing, IGWO-FS-based selection of features, ANFIS classification, and COA-based parameter optimization.
The application of the IGWO-FS and COA techniques helps to accomplish enhanced microarray gene expression classification outcomes. Figure 1 shows the overall process of the FSS-OANFIS technique.

Preprocessing.
The z-score is a normalization and standardization scheme that describes how many standard deviations (SD) a raw data point lies below or above the population mean [18]. It ideally lies in the range of −3 to +3. It standardizes the dataset to this common scale so that data with distinct scales are brought to a default scale, reflecting how many SDs a point is below or above the mean:

z = (x − μ) / σ,

where x refers to the value of a particular instance, μ signifies the mean, and σ depicts the SD.
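The standardization above can be sketched in a few lines of Python (a minimal illustration, not the paper's implementation; the sample expression values are invented):

```python
import statistics

def zscore_normalize(values):
    """Standardize raw values to z-scores: z = (x - mu) / sigma."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]

# Hypothetical raw expression values for one gene across samples
expr = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = zscore_normalize(expr)  # mean of z is 0, SD of z is 1
```

After this transform every feature is on the same scale, so no single gene dominates the distance computations used later in feature selection and classification.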

Steps Involved in IGWO-FS Technique.
Once the input data is preprocessed, the next stage is to choose an optimal subset of features. The GWO algorithm is inspired by the social leadership and hunting behavior of grey wolves [19]. The population of wolves is divided into alpha, beta, delta, and omega to establish the social hierarchy. The fittest solution is called alpha (α), whereas beta (β) and delta (δ) represent the second and third best solutions, respectively; omega (ω) denotes the remaining candidate solutions. The encircling behavior that readapts a wolf's position is expressed as

\vec{D} = |\vec{C} \cdot \vec{X}_p(t) - \vec{X}(t)|, \quad \vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D},

where \vec{X}_p denotes the position of the prey, \vec{X} the position of the current solution, t the iteration number, and \vec{A} and \vec{C} are coefficient vectors computed as

\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \quad \vec{C} = 2\vec{r}_2,

with \vec{r}_1 and \vec{r}_2 random vectors in [0, 1]. Usually, the alpha leads the chase. To update the position toward the best search agents, the three best solutions obtained so far (alpha, beta, and delta) compel the other search agents:

\vec{X}(t+1) = (\vec{X}_1 + \vec{X}_2 + \vec{X}_3) / 3,

where \vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot |\vec{C}_1 \cdot \vec{X}_\alpha - \vec{X}|, and \vec{X}_2 and \vec{X}_3 are defined analogously for beta and delta. The variable \vec{a} governs the balance between exploration and exploitation; it is decreased linearly from 2 to 0 over the iterations, a = 2(1 - u/m_i), where u is the current iteration and m_i is the total number of iterations allowed for the optimization.

A wolf's location encodes an attribute (feature) subset, and the solution space comprises every possible attribute selection. The fitness function of the IGWO-FS technique determines whether an attribute subset should be selected:

Fitness = \alpha \cdot \gamma_S(E) + \beta \cdot (|D| - |S|) / |D|,

where |S| is the length of the selected attribute subset, \gamma_S(E) denotes the classification quality of attribute set S with respect to decision E, |D| is the total number of attributes, and \alpha \in [0, 1] and \beta = 1 - \alpha are the respective weights for classification quality and subset length. The two terms have different importance for the attribute reduction task: setting \alpha = 0.9 and \beta = 0.1 makes the subset length less important than the classification quality, which helps to ensure that the best location is at least a rough-set reduct.
The fitness function evaluates the quality of a location. After the fitness is computed, important features are retained and unwanted features are removed. The performance of the GWO algorithm is improved in the IGWO design by including adaptive β-hill climbing (AβHC), a recently presented local search technique that is a modified version of β-hill climbing (βHC) [20]. Studies have established that AβHC outperforms several other well-known local search techniques. To boost the exploitation capability and the quality of the final solutions, AβHC is combined with the basic GWO in this study to search the neighborhoods of the best solution. AβHC refines an existing solution X_i using two control operators: the N-operator and the β-operator. The N-operator first moves X_i to a new neighboring solution by perturbing a randomly chosen decision variable within a bandwidth N; the β-operator then, with probability β, replaces decision variables with random values from their feasible range. In these operators, U(0, 1) denotes a random number in the interval [0, 1], x_ij is the value of the decision variable in the j-th dimension, t is the current iteration, Maxiter is the maximum number of iterations, N is the bandwidth distance between the existing solution and its neighbor, D is the spatial dimensionality, and K is a constant.
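The leader-guided position update and thresholding described above can be sketched as follows (a minimal Python illustration under simplifying assumptions: a toy fitness stands in for the classification-quality term γ_S(E), and the AβHC local search is omitted):

```python
import random

random.seed(0)

def gwo_feature_select(fitness, n_feats, n_wolves=8, iters=30):
    """Minimal grey wolf optimizer sketch for feature subset selection.
    Continuous positions in [0, 1]^n are thresholded at 0.5 to form a mask."""
    wolves = [[random.random() for _ in range(n_feats)] for _ in range(n_wolves)]

    def score(w):
        return fitness([1 if v > 0.5 else 0 for v in w])

    for t in range(iters):
        a = 2 * (1 - t / iters)  # a decreases linearly from 2 toward 0
        ranked = sorted(wolves, key=score, reverse=True)
        alpha, beta, delta = (list(r) for r in ranked[:3])  # copy the leaders
        for w in wolves:
            for j in range(n_feats):
                x = 0.0
                for leader in (alpha, beta, delta):
                    r1, r2 = random.random(), random.random()
                    A = 2 * a * r1 - a          # A = 2a*r1 - a
                    C = 2 * r2                  # C = 2*r2
                    D = abs(C * leader[j] - w[j])
                    x += leader[j] - A * D      # X_k = X_leader - A * D
                w[j] = min(1.0, max(0.0, x / 3))  # average of the 3 guided moves
    best = max(wolves, key=score)
    return [1 if v > 0.5 else 0 for v in best]

# Toy fitness (an assumption, not the paper's gamma_S(E)): reward a known subset
relevant = {0, 3, 5}

def fitness(mask):
    hits = sum(mask[i] for i in relevant)
    extras = sum(mask) - hits
    return 0.9 * (hits / len(relevant)) + 0.1 * (1 - extras / len(mask))

mask = gwo_feature_select(fitness, n_feats=10)
```

In the real IGWO-FS, the 0.9/0.1 weighting mirrors the α = 0.9, β = 0.1 setting above, and the fitness would call the downstream classifier instead of this toy reward.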

Optimal ANFIS-Based Classification.
At the final stage, the OANFIS model is employed for the detection and classification of gene expression data into multiple classes. A network with two inputs, x and y, and one output, f, is considered. ANFIS implements a Sugeno-type fuzzy system. To present the ANFIS structure, two fuzzy if-then rules based on a first-order Sugeno model are assumed as follows [21]:

(i) Rule 1: if x is A_1 and y is B_1, then f_1 = p_1 x + q_1 y + r_1
(ii) Rule 2: if x is A_2 and y is B_2, then f_2 = p_2 x + q_2 y + r_2

where x and y are the inputs, A_i and B_i are fuzzy sets, f_i is the output of the fuzzy system, and p_i, q_i, and r_i are design parameters determined during the training procedure. In the ANFIS structure implementing these two rules, a circle represents a fixed node and a square an adaptive node. The ANFIS architecture has five layers. Figure 2 shows the framework of ANFIS.

Layer 1. All nodes in layer 1 are adaptive nodes. The output of layer 1 is the fuzzy membership grade of the inputs:

O_{1,i} = \mu_{A_i}(x), \ i = 1, 2, \quad O_{1,i} = \mu_{B_{i-2}}(y), \ i = 3, 4,

where x and y are the inputs to node i, A_i and B_{i-2} are the linguistic labels, and \mu_{A_i}(x) and \mu_{B_{i-2}}(y) are fuzzy membership functions. Generally, \mu_{A_i}(x) is chosen as the generalized bell function

\mu_{A_i}(x) = 1 / (1 + |(x - c_i)/a_i|^{2b_i}),

where a_i, b_i, and c_i are the parameters of the bell-shaped membership function.

Layer 2. Each node in this layer is a fixed node labeled M, indicating that it acts as a simple multiplier. The output of this layer is the firing strength of each rule:

w_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y), \quad i = 1, 2.

Layer 3. It comprises fixed nodes that compute the ratio of each rule's firing strength to the sum of all firing strengths:

\bar{w}_i = w_i / (w_1 + w_2), \quad i = 1, 2.

Layer 4. In this layer, adaptive nodes are used. The outputs are calculated as

O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i),

where \bar{w}_i is the normalized firing strength from layer 3.

Layer 5. The single node of this layer sums all incoming signals, so the overall output of the model is

f = \sum_i \bar{w}_i f_i = \left( \sum_i w_i f_i \right) / \left( \sum_i w_i \right).

There are thus two adaptive layers in the ANFIS structure, the first and the fourth. In the first layer, there are three modifiable parameters {a_i, b_i, c_i} connected to the input membership functions; these are usually named the premise parameters. In the fourth layer, there are also three modifiable parameters {p_i, q_i, r_i} relating to the first-order polynomial; these are called the consequent parameters.
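The five-layer computation can be sketched end-to-end in a few lines (a hedged illustration with invented membership and consequent parameters, not the trained OANFIS model):

```python
def bell(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c)/a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def sugeno_two_rules(x, y, mf_params, consequents):
    """First-order Sugeno inference for two rules over inputs x and y."""
    w = []
    for (ax, bx, cx, ay, by, cy) in mf_params:
        # Layers 1-2: fuzzify each input, then multiply for the firing strength
        w.append(bell(x, ax, bx, cx) * bell(y, ay, by, cy))
    total = sum(w)
    # Layers 3-5: normalize strengths and sum the weighted first-order outputs
    out = 0.0
    for wi, (p, q, r) in zip(w, consequents):
        out += (wi / total) * (p * x + q * y + r)
    return out

# Invented premise parameters (a, b, c) per input, one tuple per rule
mf = [(2.0, 2.0, 0.0, 2.0, 2.0, 0.0),   # rule 1: memberships centered at 0
      (2.0, 2.0, 5.0, 2.0, 2.0, 5.0)]   # rule 2: memberships centered at 5
# Invented consequent parameters (p, q, r) per rule
cons = [(1.0, 1.0, 0.0), (0.5, 0.5, 1.0)]
f = sugeno_two_rules(2.5, 2.5, mf, cons)
```

At x = y = 2.5 both rules fire equally, so the output is the plain average of the two rule polynomials; in OANFIS, COA would tune the premise and consequent parameters instead of fixing them by hand.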
To tune the ANFIS parameters, the COA is applied. The COA is a mathematical model based on intelligent diversity [22]. Chasing, driving, attacking, and blocking are achieved by four distinct kinds of chimps: chasers, drivers, attackers, and barriers. These hunting steps are accomplished in two phases, exploration and exploitation. The exploration phase involves chasing, driving, and blocking the prey, while the exploitation phase attacks the prey. The driving and chasing are characterized as follows:

d = |c \cdot x_{prey}(t) - m \cdot x_{chimp}(t)|, \quad x_{chimp}(t+1) = x_{prey}(t) - a \cdot d,

where x_{prey} denotes the vector of the prey location, x_{chimp} the vector of the chimp location, t the current iteration, and a, c, and m are coefficient vectors calculated as

a = 2 \cdot f \cdot r_1 - f, \quad c = 2 \cdot r_2,

where f declines nonlinearly from 2.5 to 0 over the iterations, r_1 and r_2 are random values within [0, 1], and m is a chaotic vector.
The dynamic coefficient f is selected with distinct slopes and curves, so that chimps employ distinct capabilities in searching for the prey. Chimps also update their positions according to the positions of the other chimps, averaging the candidate positions guided by the attacker, barrier, chaser, and driver.
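A single driving/chasing update with the coefficients above can be sketched as follows (the exact nonlinear decay of f and the chaotic vector m are not fixed by the text, so the quadratic decay and m = 1 below are placeholders; the random draws are passed in explicitly to keep the step reproducible):

```python
def chimp_update(x_prey, x_chimp, t, max_iter, r1, r2, m=1.0):
    """One position update: d = |c*x_prey - m*x_chimp|, x' = x_prey - a*d."""
    f = 2.5 * (1 - (t / max_iter) ** 2)  # assumed nonlinear decay, 2.5 -> 0
    a = 2 * f * r1 - f                   # a = 2*f*r1 - f
    c = 2 * r2                           # c = 2*r2
    d = abs(c * x_prey - m * x_chimp)    # driving/chasing distance
    return x_prey - a * d                # attack step toward (or past) the prey

# One deterministic step with chosen random draws r1 = 0.75, r2 = 0.5
new_pos = chimp_update(x_prey=1.0, x_chimp=0.0, t=0, max_iter=100, r1=0.75, r2=0.5)
```

Early on, f is large and a can exceed 1 in magnitude, so the update can overshoot the prey (exploration); as f decays, steps shrink and the search exploits the neighborhood of the best ANFIS parameter vector.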

Experimental Validation
In this section, the experimental validation of the FSS-OANFIS model is performed using four benchmark datasets [23–26]. The details of the datasets are given in Table 1. The results are investigated and assessed in terms of different measures. For experimental validation, a 10-fold cross-validation process is utilized. Figure 3 illustrates the set of confusion matrices offered by the FSS-OANFIS model on the test datasets. The figure reports that the FSS-OANFIS model properly recognized the class labels on all datasets. For instance, on the Leukemia dataset, the FSS-OANFIS model identified 8 samples in class 0 and 2 samples in class 1. In addition, on the Prostate dataset, it identified 16 samples in class 0 and 9 samples in class 1. Also, on the DLBCL Stanford dataset, it identified 2 samples in class 0 and 9 samples in class 1. Besides, on the Colon Cancer dataset, it identified 12 samples in class 0 and 5 samples in class 1. Table 2 provides the overall classification outcomes of the FSS-OANFIS model on the test datasets; the experimental outcomes point out its effective performance. Figure 4 shows the precision-recall curves offered by the FSS-OANFIS model on the test Leukemia dataset. The figure indicates that the FSS-OANFIS model depicts effective precision-recall values in classifying the two classes, class 0 and class 1, on the test Leukemia dataset.
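The 10-fold cross-validation protocol used above can be sketched generically (the majority-label classifier and the data below are hypothetical stand-ins for the FSS-OANFIS pipeline, kept only to make the splitting and scoring logic concrete):

```python
import random

random.seed(42)

def kfold_indices(n, k=10):
    """Split n sample indices into k roughly equal, shuffled folds."""
    idx = list(range(n))
    random.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(samples, labels, train_and_predict, k=10):
    """Average accuracy over k train/test splits."""
    folds = kfold_indices(len(samples), k)
    accs = []
    for test in folds:
        train = [j for f in folds if f is not test for j in f]
        preds = train_and_predict([samples[j] for j in train],
                                  [labels[j] for j in train],
                                  [samples[j] for j in test])
        hits = sum(p == labels[j] for p, j in zip(preds, test))
        accs.append(hits / len(test))
    return sum(accs) / k

# Hypothetical stand-in classifier: always predict the majority training label
def majority(train_x, train_y, test_x):
    lab = max(set(train_y), key=train_y.count)
    return [lab] * len(test_x)

X = [[i] for i in range(40)]
y = [0] * 25 + [1] * 15
acc = cross_val_accuracy(X, y, majority, k=10)
```

In the actual experiments, `train_and_predict` would run IGWO-FS on the training fold and classify the held-out fold with the COA-tuned ANFIS, so that feature selection never sees the test samples.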

Results Analysis of Proposed Model.
Next, Figure 5 shows the precision-recall curves offered by the FSS-OANFIS model on the test Prostate dataset. The figure reveals that the FSS-OANFIS technique depicts effective precision-recall values in classifying the two classes, class 0 and class 1, on the test Prostate dataset.
Similarly, Figure 6 shows the precision-recall curves offered by the FSS-OANFIS system on the test DLBCL Stanford dataset. The figure exposes that the FSS-OANFIS model depicts effective precision-recall values in classifying the two classes, class 0 and class 1, on the test DLBCL Stanford dataset. Figure 7 shows the precision-recall curves offered by the FSS-OANFIS method on the test Colon Cancer dataset. The figure indicates that the FSS-OANFIS approach depicts effective precision-recall values in classifying the two classes, class 0 and class 1, on the test Colon Cancer dataset.
A brief ROC investigation of the FSS-OANFIS model on the four distinct datasets is described in Figure 8. The results indicate that the FSS-OANFIS technique exhibits the ability to categorize the two different classes, class 0 and class 1, on the four test datasets. Figure 9 shows the accuracy and loss graph analysis of the FSS-OANFIS technique on the four datasets. The results show that the accuracy value tends to increase and the loss value tends to decrease with an increase in epoch count. It is also observed that the training loss is low and the validation accuracy is high on all four datasets.

Discussion. Finally, a detailed comparative study of the FSS-OANFIS model with recent methods on the distinct datasets is shown in Table 3 and Figure 10 [27]. The experimental results indicate that the FSS-OANFIS model shows effectual outcomes under all datasets. For instance, with the Leukemia dataset, the DE and AHSA-GS models depict lower performance than the other methods.
At the same time, with the Prostate dataset, the DE and AHSA-GS models again depict lower performance than the other methods. Likewise, the PSO algorithm results in somewhat reasonable performance with accuracy, sensitivity, specificity, and G-measure of 68.78%, 63.63%, 70.15%, and 66.01%, whereas the FSS-OANFIS methodology attains superior values of 80.65%, 80%, 80%, and 81.37%, respectively. In addition, with the DLBCL Stanford dataset, the DE and AHSA-GS techniques showcase lesser performance than the other methods. Though the PSO algorithm results in slightly reasonable performance with accuracy, sensitivity, specificity, and G-measure of 72.80%, 60.82%, 60.17%, and 61.74%, the FSS-OANFIS approach attains higher values of 73.33%, 66.67%, 66.67%, and 70.47%, respectively.
Along with that, with the Colon Cancer dataset, the DE and AHSA-GS models portray lower performance than the other methods. Although the PSO approach results in slightly reasonable performance with accuracy, sensitivity, specificity, and G-measure of 59%, 43.02%, 58.34%, and 36.66%, the FSS-OANFIS system attains superior values of 89.47%, 87.80%, 87.82%, and 87.82%, respectively. After examining the results and discussion, it is apparent that the FSS-OANFIS model accomplishes maximum performance in the microarray gene expression classification process.

Conclusion
In this study, a new FSS-OANFIS model has been developed for microarray gene expression data classification. The presented FSS-OANFIS model encompasses a series of processes, namely, data preprocessing, IGWO-FS-based selection of features, ANFIS classification, and COA-based parameter optimization. The application of the IGWO-FS and COA techniques helps to accomplish enhanced microarray gene expression classification outcomes. To examine the enhanced outcomes of the FSS-OANFIS model, a wide range of simulations were performed on distinct datasets. The experimental results indicate that the FSS-OANFIS model results in enhanced performance over recent approaches. In future work, feature reduction and clustering approaches can be integrated to further enhance gene expression classification outcomes.

Data Availability
Data are available and can be provided upon direct request to the corresponding author.

Ethical Approval
This article does not contain any studies with human participants performed by any of the authors.

Consent
Not applicable.

Conflicts of Interest
The authors declare that they have no conflicts of interest.