High-Dimensional Feature Selection Based on Improved Binary Ant Colony Optimization Combined with Hybrid Rice Optimization Algorithm



Introduction
High-dimensional data are increasingly prevalent in diverse fields such as medical diagnosis and genomics, presenting significant challenges to the construction of intelligent systems. Data from medical imaging generally comprise a variety of features that delineate patient conditions. Similarly, microarray data from high-throughput gene expression experiments concurrently measure the expression levels of thousands of genes [1]. Such high dimensionality creates difficulties in terms of storage, computation, and analysis [2]. Specifically, the abundance of features may lead to the "curse of dimensionality," where data sparsity occurs and the distance between sample points loses significance.
This situation, in turn, can lead to the overfitting of data mining models and increased computation time. Therefore, selecting representative features that positively impact intelligent systems becomes critical. Feature selection (FS) is employed to minimize the feature space, thereby not only conserving storage space but also facilitating information discovery and mitigating the potential for model overfitting.
FS techniques directly identify the optimal feature subset (OFS) from the original feature space, preserving the interpretability of the chosen features and enhancing the comprehension of underlying data patterns and relationships. The broad applicability of FS spans areas such as text classification [3], image processing [4,5], finance [6,7], and other fields. As an NP-hard problem in nature, the search space of FS grows exponentially with the number of features, making traditional search methods ineffective in finding the OFS within polynomial time. Existing FS methods primarily fall into three categories: filter, wrapper, and embedded, each distinguished by its selection process. The filter method independently evaluates the relevance and redundancy of each feature using statistical or ranking measures, irrespective of the particular classifier employed in subsequent steps. The wrapper method integrates a specific learning algorithm within the FS process, treating FS as a search problem and seeking the OFS by assessing the performance of each feature subset on the classifier. Although wrappers entail a higher computational cost compared to highly efficient filter methods, they can handle nonlinear feature interactions and dependencies. Furthermore, wrappers can determine the OFS for a specific learning algorithm, potentially enhancing predictive performance [8]. The embedded method incorporates the FS process within the training process of the learning algorithm. The superior performance of the wrapper method has attracted increasing research interest, marking it as a popular FS technique. To tackle the computational intensity of wrapper methods, researchers are investigating the use of metaheuristics in FS to boost search efficiency and OFS quality.
Metaheuristics have emerged as promising alternatives for FS in high-dimensional data, efficiently balancing exploration and exploitation in the search space to converge towards near-optimal solutions. Various high-performance metaheuristics have been successfully employed across diverse FS tasks [9,10]. For instance, Jia et al. [11] enhanced the slime mould algorithm (SMA) [12] for FS, wherein the FS process and the parameter optimization of SVM occur concurrently. To improve population diversity, a composite mutation strategy was introduced, and a trial-based restart strategy was designed to circumvent local optima. Qu et al. [13] proposed a novel gene selection method based on Harris hawk optimization (HHO). The F-score is initially employed for preliminary feature filtering to condense the feature space, followed by a variable neighborhood learning strategy for balancing the exploration and exploitation of HHO. Ghosh et al. [14] studied eight different transfer functions from two series (S-shaped and V-shaped) and suggested an improved binary manta ray foraging optimization (MRFO) [15] for FS. Ewees et al. [16] modified the seagull optimization algorithm (SOA) using the Levy flight mechanism to overcome its linear search deficiency in the search space, expanding the search area and enhancing the capability of individuals to escape from local optima. To select the optimal gene combination effectively in microarray data, Pashaei [17] utilized mRMR to filter the top m promising genes in the initial stage to reduce the feature space and then applied the Aquila Optimizer (AO) with a mutation mechanism and TVMS transfer function to search for the OFS. Awadallah et al. [18] developed a new enhanced binary rat swarm optimizer (RSO), in which the local historical optimal strategy of PSO was introduced to enhance individual exploitation.
Furthermore, three crossover operators, namely, one-point crossover, two-point crossover, and uniform crossover, are incorporated into the individual update process and randomly selected with equal probability. A multi-strategy integrated grey wolf optimizer (MSGWO) was presented for biological data classification [19]. It adopted the convergence factor concept to explicitly adjust the transition between exploration and exploitation. Additionally, multiple exploration and exploitation strategies were employed to boost the global search and local exploitation processes.
The no-free-lunch (NFL) theorem suggests that no single algorithm can resolve all optimization problems due to the inability to strike a perfect balance between global search and local exploitation. Consequently, recent research has increasingly focused on hybrid metaheuristics to address the limitations of individual algorithms. For instance, Pashaei [2] introduced a hybrid FS method that integrates the dragonfly algorithm (DFA) and the black hole algorithm (BHA), in which the optimal solution derived from DFA serves as the initial solution for BHA. A hybrid FS method, merging the binary arithmetic optimization algorithm (BAOA) and simulated annealing (SA), was proposed in [20], utilizing SA as a local search operator to discover potential solutions near the optimal solution of BAOA. Stephan et al. [21] put forward a method combining the artificial bee colony (ABC) and whale optimization algorithm (WOA) to concurrently search the OFS of breast cancer data and optimize the artificial neural network parameters. In addition, several studies have attempted to integrate HHO with other metaheuristic algorithms [22][23][24]. Beyond metaheuristic hybrids, the combination of filters and wrappers has also been extensively researched, aiming to determine the OFS at a lower computational cost. Zhu et al. [25] combined the Fisher filtering method with an artificial immune optimizer for high-dimensional FS, introducing a lethal mutation mechanism and a Cauchy mutation operator with an adaptive adjustment factor to enhance population diversity. A new high-dimensional FS method in conjunction with mRMR was developed in [26]. This method improved the recently proposed COOT algorithm using a crossover operator and employed the hyperbolic tangent transfer function for continuous numerical binarization. The improved COOT was hybridized with SA and applied to FS of microarray data.
While these studies have demonstrated the performance of metaheuristic-based FS methods, the risk of local optima entrapment persists due to the stochastic strategies inherent in metaheuristics that ensure the global search of the algorithm. Therefore, the development of a computationally efficient and robust high-dimensional FS method remains a crucial and ongoing research endeavor.
The hybrid rice optimization (HRO) algorithm, proposed by Ye et al. [27], is a novel population-based metaheuristic inspired by the real-world breeding process of three-line hybrid rice. The heterosis theory suggests that the first generation of hybrids manifests superior physical, reproductive, and behavioral characteristics compared to their parent generation. Echoing this concept, HRO has exhibited several desirable properties, including high search efficiency, fewer control parameters compared to other metaheuristics, and ease of implementation. Its high flexibility and reduced parameter dependency have encouraged researchers to apply it to the 0-1 knapsack problem [28] as well as the FS problem [29]. While most metaheuristics necessitate conversion to binary form via a specific transfer function to indicate whether a feature is selected, this approach may not be appropriate for all situations, as these algorithms were initially designed to manage continuous optimization problems. As a classic swarm intelligence algorithm, ant colony optimization (ACO) is more suitable for FS tasks, given its origin as a combinatorial optimization algorithm. ACO strikes a well-balanced ratio between exploration and exploitation by dynamically adjusting pheromone density. Numerous studies have demonstrated the robustness and adaptability of ACO in resolving FS problems. Wang et al. [30] introduced a novel approach for FS, namely, the probabilistic sequence-based graphical representation ACO, incorporating symmetric uncertainty (SU) into the algorithm. Paniri et al. [31] presented an innovative multi-label FS method based on ACO, which used both unsupervised and supervised heuristic functions to seek features with minimal redundancy and maximal correlation with class labels.
Moreover, a recent study [32] put forth a semisupervised FS method based on ACO that employs a nonlinear heuristic function trained using temporal difference (TD) reinforcement learning instead of the traditional linear heuristic functions. Although the standard ACO and HRO have demonstrated commendable optimization performance in low-dimensional optimization problems, they may struggle to achieve ideal performance when dealing with high-dimensional FS problems. This challenge motivates us to combine the strengths of both algorithms, aiming to create a more robust and efficient method for high-dimensional FS.
This paper proposes a two-stage hybrid technique for solving high-dimensional FS problems, leveraging the strengths of both ACO and HRO. Considering the limitations of standard ACO, this paper seeks to enhance it in the first stage prior to hybridization. A novel problem-oriented heuristic factor assignment strategy based on the importance of the knee point feature is designed to augment the search capability of ACO in high-dimensional FS. In the second stage, two hybrid models merging the improved binary ACO (IBACO) and HRO are presented and applied to high-dimensional FS tasks. The hybridization manifests in two forms: low-level relay hybrid (LRH) and high-level teamwork hybrid (HTH) [33]. In the case of LRH, IBACO serves as an operator within HRO to guide the evolution of the maintainer line. This hybrid mode has been introduced due to the notable absence of suitable maintainer line update strategies within HRO, a factor of particular importance for high-dimensional FS tasks. In HTH, subpopulations of HRO and IBACO evolve independently and share their local search results after each iteration. A comprehensive list of acronyms and symbol annotations used throughout this paper is provided in Table 1. The main contributions of the paper are summarized as follows: (1) Two unique hybridizations of IBACO and HRO, namely, R-IBACO and C-IBACO, are presented to effectively leverage the advantages of the two algorithms and obtain a more promising solution.
(2) A new problem-oriented assignment strategy for the heuristic factor (HF) based on the feature correlation of the knee point feature is proposed. (3) The proposed methods are applied to high-dimensional FS tasks and compared with existing state-of-the-art metaheuristic-based FS methods. (4) The performance of the proposed methods is evaluated from multiple perspectives, and the results are subjected to the Wilcoxon signed-rank test and the Friedman test.
The remainder of this paper is structured as follows. Section 2 presents an overview of the related work. Section 3 discusses the mathematical model of FS and provides some theoretical background on BACO and HRO. The specifics of the proposed methods are elucidated in Section 4. Section 5 outlines the experiments conducted and analyzes the obtained results. Finally, Section 6 draws conclusions from this study and indicates potential areas for future research.

Metaheuristic-Based FS.
In recent years, the application of metaheuristics to FS problems has seen a significant rise. Notably, many of the proposed wrapper FS methods draw upon GWO due to its remarkable performance in solving continuous optimization problems. In the work of Hu et al. [34], an enhanced variant of the binary GWO was introduced, incorporating a novel strategy for updating the parameter governing exploration and exploitation, along with five transfer functions for mapping continuous values to their binary counterparts, thereby enhancing the quality of candidate solutions. Likewise, Abdel-Basset et al. [35] proposed three distinct binary GWO variants, each utilizing different transfer functions. In addition to GWO-based methods, the authors of [36] proposed an improved integer PSO with a fast correlation-guided feature clustering strategy that markedly reduces the computation cost of FS. The gaining-sharing knowledge (GSK) optimization algorithm is a newly devised metaheuristic that draws its inspiration from the human process of acquiring and sharing knowledge [37]. Its robustness and convergence properties have proven to be highly effective in solving continuous optimization problems. Some enhanced versions of GSK have been successfully employed for FS tasks. Since GSK is a continuous optimization algorithm, Agrawal et al. [38] implemented eight S-shaped and V-shaped transfer functions to map the individual codes of GSK into the binary search space. In addition, a dynamic population reduction strategy was introduced to facilitate the adjustment of population size during the pursuit of the OFS. Another improved GSK integrating chaotic map strategies was proposed in [39], which utilized a probability estimation operator to represent its binary variant. Hanbay [40] proposed an innovative standard error-based ABC for FS, incorporating new solution search mechanisms based on standard error into the original ABC algorithm and utilizing the Shannon conditional entropy value for FS.
Moreover, an advanced salp swarm algorithm (SSA) was also suggested in [41], utilizing opposition-based learning to augment population diversity and implementing a novel local search mechanism to circumvent the local optima.

ACO-Based FS.
ACO has displayed exceptional performance in tackling a wide range of discrete optimization challenges. For instance, Owuor et al. [42] performed a detailed evaluation of three population-based optimization techniques in the context of mining gradual patterns. The findings suggested that ACO surpassed the other two techniques and traditional counterparts. In another study, Zhang et al. [43] introduced a knowledge-based local search for multi-population ant colony systems to address the multi-objective supply chain configuration problem in supply chain management. By leveraging two independent ant colonies, the cost of goods sold and the lead time were concurrently minimized, facilitating an efficient search of the target space. Moreover, Zhao et al. [44] enhanced ACO using horizontal and vertical crossover search and applied it in the field of image segmentation. The superior combinatorial optimization capability of ACO has encouraged researchers to extend its application to various types of FS tasks, some of which have been successfully implemented [45]. The heuristic factor (HF) and the pheromone density (PD) are two crucial parameters in ACO that can significantly influence its performance. However, some FS methods based on ACO have not set the HF appropriately, even generating it randomly. Appropriately configuring the HF and PD parameters can enhance the global and local search capabilities of ACO. Recently, the impact of these parameters on the performance of ACO has been explored. Manbari et al. [46] introduced a circular graph-based ACO algorithm for FS, wherein each feature is interconnected with the subsequent one through a pair of select/deselect edges. The heuristic information pertaining to each feature is determined based on its corresponding term variance. Ghosh et al. [47] proposed an ACO-based wrapper-filter FS method that employs a filter to assess feature subsets instead of a wrapper, significantly reducing the computational complexity.
The HF is estimated by calculating the similarity between the last added feature and the feature whose addition probability is being calculated. Additionally, a novel multi-label FS method based on ACO was proposed in [48], which employed a heuristic learning approach instead of a static heuristic function. The heuristic function of ACO is learned from experience directly using TD reinforcement learning, which significantly improves the quality of the selected feature subset. Moreover, Hashemi et al. [49] proposed an ensemble FS approach based on ACO that utilized multiple heuristic information determined by a multi-criteria decision-making procedure. Such distinctive heuristic information supplies further insights about subsequent nodes. Experimental results demonstrated that the proposed method significantly outperforms other methods across various evaluation indicators.
Despite the proven effectiveness of ACO in solving FS problems, it still encounters issues such as limited global search capability and slow convergence rates, particularly when handling high-dimensional problems. To mitigate these limitations, the concept of hybridizing with complementary metaheuristics has emerged as a promising solution. As detailed by Talbi [33], there exists a two-tiered hybridization system encompassing low-level and high-level hybridization, each comprising two distinct mechanisms of hybridization. Building on this concept, Wan et al. [50] proposed VMBACO, a hybrid FS method that blends a modified binary ACO algorithm with a genetic algorithm (GA). This hybrid method employs the solution derived from GA as the assignment of the HF in BACO, resulting in substantial improvements in both the quality of feature subsets and classifier accuracy. In a similar vein, Li et al. [51] introduced a hybrid FS model that combines ACO with the antlion optimizer, which includes a novel mutation operator to enhance exploration capability. Furthermore, Ma et al. [52] suggested a two-stage hybrid ACO algorithm for high-dimensional FS. This method employs an interval strategy to determine the size of the OFS and integrates a hybrid model that harnesses the inherent relevance attributes of features and classification performance to direct the OFS search. This advanced hybrid ACO assigns the HF to a feature by calculating its correlation with the chosen feature subset after softmax normalization. Despite the improved performance of these hybrid algorithms, they remain susceptible to local optima and often exhibit relatively high computational complexity.
The heterosis-inspired HRO has demonstrated high search efficiency and robust global search capabilities, as illustrated by its successful application to the band selection problem [29]. Moreover, the diverse set of operators employed by HRO facilitates the effective maintenance of population diversity in high-dimensional problem spaces. Considering these benefits of HRO, it is hypothesized that integrating the improved, problem-oriented ACO with HRO may yield a more efficient and robust search for the OFS in high-dimensional FS tasks.

The Binary-Coded Ant Colony Optimization Algorithm.
Binary ant colony optimization (BACO) is a metaheuristic inspired by ant behavior, and it is designed to find optimal solutions for binary optimization problems. BACO is particularly useful in optimization scenarios such as FS and intrusion detection [53]. BACO commences with a randomly generated set of candidate solutions, depicted as binary strings. Throughout each iteration, the algorithm adjusts the probability of including each feature, drawing on the information acquired from previously established solutions. In BACO, the PD serves as a directive for subsequent ants to decide whether a feature should be incorporated into their solution. Figure 1 presents a binary directed acyclic graph, symbolizing the potential paths ants can follow to find a solution. Figure 2 elaborates on the path of an ant to construct a solution corresponding to a feature subset. In this example, five features (F_i, i = 1, 2, ..., 5) are considered, and the bold black solid line represents the path constructed by an ant. The selected feature subset is {F_2, F_3, F_5}, while features F_1 and F_4 are deselected.
The node selection for the next bit is determined by the state transition probability, a function derived from both the PD and HF. The mathematical expression for the state transition probability is given by equations (1) and (2), where p_{i,j}^k(1) denotes the probability that the k-th ant selects path 1 from bit i to bit j, and τ_{i,j}(1) and η_{i,j}(1) are the corresponding PD and HF values for this path. Similarly, p_{i,j}^k(0), τ_{i,j}(0), and η_{i,j}(0) represent the probability, PD, and HF values for the case where the ant selects path 0 from bit i to bit j. The parameters α and β control the relative importance of the PD and HF, respectively. Initially, the PD for each path is set to be equivalent and is then updated as ants make their selections. The update rule for the PD from bit i to bit j is described by equations (3) and (4).
The evaporation factor, denoted as ρ, signifies the rate of pheromone decay on unselected paths. Conversely, the chosen path experiences an increase in PD described by Δτ, which is defined by the fitness value of the optimal candidate solution, f_best, obtained in the current iteration. BACO also maintains a record of the optimal solution found to date, denoted as g_best. Additionally, each iteration records its own optimal solution, p_best, whose fitness value is compared against that of g_best. If the fitness value of p_best surpasses that of g_best, g_best is updated with p_best at iteration t. Upon the completion of all iterations, g_best serves as the global optimal solution. BACO is limited by the absence of a suitable HF for the FS task, especially when the dimension is high. This makes BACO tend to converge to a local optimum rather than the global optimum. To fully exploit the performance of BACO in FS, it is essential to introduce a suitable HF assignment strategy specifically for the FS task and to improve the relevant update strategy. The improved strategies are presented in detail in Section 4.2.
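Since the exact constants of equations (1)-(4) are not reproduced in this excerpt, the following Python sketch illustrates one plausible implementation of the state transition probability and the PD update; the values of α, β, ρ, and the Δτ deposit rule are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def transition_prob(tau, eta, alpha=1.0, beta=2.0):
    """Eqs. (1)-(2): probability of choosing path 1 for each bit.
    tau, eta: arrays of shape (D, 2) holding the pheromone and
    heuristic values for paths 0 and 1 of every bit."""
    w = (tau ** alpha) * (eta ** beta)
    return w[:, 1] / (w[:, 0] + w[:, 1])

def update_pheromone(tau, best_solution, f_best, rho=0.1):
    """Eqs. (3)-(4): evaporate the PD on every path, then deposit
    on the paths chosen by the iteration-best ant. The deposit
    delta is assumed to shrink as the best fitness worsens."""
    tau *= (1.0 - rho)                      # evaporation on all paths
    delta = 1.0 / (1.0 + f_best)            # assumed deposit rule
    bits = best_solution.astype(int)        # path (0/1) per bit
    tau[np.arange(tau.shape[0]), bits] += delta
    return tau
```

With uniform initial pheromone and heuristic values, every bit is selected with probability 0.5, matching the "equivalent initial PD" described above; after one update, the paths taken by the iteration-best ant carry more pheromone.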

Hybrid Rice Optimization Algorithm.
Hybrid rice optimization (HRO), proposed by Ye et al. [27], is an innovative metaheuristic that draws upon the advantages of heterosis, offering superior search efficiency and robust global search capabilities. HRO divides the population into three lines based on their fitness values. Let X = (X_1, X_2, ..., X_n) denote the population sorted by fitness value, where n is the size of the population. The first subpopulation, representing the best fitness within the entire population, is referred to as the maintainer line and is denoted by X_1, X_2, ..., X_p (p = ⌊n/3⌋). The second subpopulation, represented by X_{2p+1}, X_{2p+2}, ..., X_n, possesses the poorest fitness values and constitutes the sterile line, which requires hybridization with the maintainer line to enhance its fitness quality. The remaining subpopulation is the restorer line, denoted as X_{p+1}, X_{p+2}, ..., X_{2p}. It aims to evolve into the maintainer line through a process referred to as selfing. The principle of HRO is further elaborated in the following sections.
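The three-line division described above can be sketched as follows; a minimization problem is assumed, so lower fitness is better.

```python
def divide_population(population, fitness):
    """Sort individuals by fitness (ascending = better for
    minimization) and split them into the maintainer, restorer,
    and sterile lines, each of size p = n // 3."""
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    ranked = [population[i] for i in order]
    p = len(population) // 3
    maintainer = ranked[:p]          # best third of the population
    restorer = ranked[p:2 * p]       # middle third, updated by selfing
    sterile = ranked[2 * p:]         # worst third, crossed with maintainers
    return maintainer, restorer, sterile
```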
Hybridization.
During the hybridization process, a new gene for the sterile line is produced by crossing the sterile and maintainer lines, as modeled in equation (5): X_s^d(t) represents the d-th gene of a randomly selected individual from the sterile line, while X_m^d(t) is the d-th gene of an individual randomly selected from the maintainer line. r_1 is a random number generated from [0, 1].

Selfing.
The selfing process is designed to update the restorer line, with the intention of steering individuals towards the global optimal solution. This behavior is mathematically modeled in equation (6), which gives the new gene produced by selfing between the i-th and j-th restorers (j ≠ i): X_best^d(t) denotes the d-th gene of the best individual found so far, and X_{r(j)}^d(t) represents the d-th gene of a restorer randomly selected from the restorer line. The variable r_2 is a random number generated within the range [0, 1].
After a new individual is generated through hybridization or selfing, it is evaluated and compared to the original candidate solution. The substitution process, as defined in equation (7), replaces the old individual with the new one only if the fitness value of the new individual surpasses that of the old one.
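The hybridization, selfing, and substitution steps can be sketched as below. The exact forms of equations (5) and (6) are not reproduced in this excerpt, so the update rules here are plausible assumptions consistent with the descriptions (a convex combination for hybridization, a best-guided step for selfing); only the greedy substitution of equation (7) follows directly from the text.

```python
import random

def hybridize(sterile, maintainer, d):
    """Assumed form of eq. (5): convex combination of a sterile gene
    and a maintainer gene, weighted by r1 drawn from [0, 1]."""
    r1 = random.random()
    return r1 * sterile[d] + (1.0 - r1) * maintainer[d]

def selfing(best, restorer_j, d):
    """Assumed form of eq. (6): step from the best individual relative
    to a randomly chosen restorer, scaled by r2 drawn from [0, 1]."""
    r2 = random.random()
    return best[d] + r2 * (best[d] - restorer_j[d])

def substitute(old, new, fitness_fn):
    """Eq. (7): greedy replacement - keep the new individual only if
    its fitness is better (lower, assuming minimization)."""
    return new if fitness_fn(new) < fitness_fn(old) else old
```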

Renewal.
The selfing process in HRO involves tracking the number of iterations during which a restorer has not been updated, using a parameter called SC (self-crossing count). If a restorer's SC reaches the upper limit, denoted as t_max iterations without updates, a reset operation is performed on that individual. This reset behavior is mathematically encapsulated in equation (8), where X_{r(i)}^d(t) represents the d-th gene of the i-th restorer that has not been updated, and V_max^d and V_min^d represent the maximum and minimum values of the d-th dimension, respectively. r_3 is a random number drawn from [0, 1]. It is worth noting that the solution obtained by HRO is continuous and needs to be mapped into the solution space of FS through the application of a transfer function. In this study, the sigmoid function is employed as the binary map, as defined in equation (9).
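The renewal reset and the sigmoid binarization can be sketched as follows; the uniform re-initialization form of equation (8) is an assumption consistent with the bounds V_min and V_max described above, while the sigmoid transfer with stochastic thresholding is a standard reading of equation (9).

```python
import math
import random

def renew(v_min, v_max):
    """Assumed form of eq. (8): re-initialize a stagnant gene
    uniformly within its dimension bounds using r3 in [0, 1]."""
    r3 = random.random()
    return v_min + r3 * (v_max - v_min)

def to_binary(x):
    """Eq. (9): sigmoid transfer - map a continuous gene to a bit by
    comparing S(x) = 1 / (1 + e^(-x)) against a uniform random draw."""
    s = 1.0 / (1.0 + math.exp(-x))
    return 1 if random.random() < s else 0
```

A large positive gene value is almost surely mapped to 1 (the feature is selected), and a large negative value to 0, while values near zero leave the decision close to a coin flip.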

The Proposed Method
Although the effectiveness of traditional ACO-based methods in solving low-dimensional combinatorial optimization problems is well established, it is necessary to further improve the strategies employed by BACO to tackle high-dimensional FS problems. In pursuit of this aim, two hybrid models have been proposed, integrating IBACO and HRO to improve the efficiency of exploring the global optimal solution and to enhance the rate of convergence. Moreover, given the considerable impact of the HF assignment strategy on both the population update process and the overall performance, a novel problem-oriented HF assignment scheme is introduced, based on the significance of the knee point feature, to ensure that the population is updated effectively. Lastly, the objective function to be optimized has been refined to take into account not only the classification accuracy but also the size of the feature subset.

The Hybrid Models.
To fully harness the convergence rate and global search capability benefits of HRO, IBACO is melded with HRO in two unique hybrid models: the relay model (R-IBACO) and the collaborative model (C-IBACO). The details of these two hybrid models are discussed in the following.

The Relay Model.
As described in Section 3.2, the maintainer line in HRO symbolizes the optimal candidate solutions discovered by the population. The quality of the maintainer line is crucial in determining the evolution of the population and influences both the convergence rate and the final result. Moreover, the sterile line is updated via crossing with the maintainer line, underscoring the importance of enhancing the quality of the maintainer line in HRO. To this end, an LRH model is proposed, applying the update strategies of IBACO to the update process of the HRO maintainer line. Specifically, the k-th bit of the binary string corresponding to each maintainer candidate solution is evolved using the PD and HF of IBACO. Each individual of the maintainer line selects a path determined by the transition probability (equations (1) and (2)). This integration of the IBACO operator effectively enhances numerous dimensions in the maintainer line during the early iterations, enabling R-IBACO to concentrate on global search in discrete spaces and quickly converge towards the vicinity of the global optimum. Furthermore, the best individual from the maintainer line is employed to update the PD. The detailed implementation of the relay model is outlined in Algorithm 1.
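The relay step above can be sketched as follows; since Algorithm 1 is not reproduced in this excerpt, this is only an illustrative reading in which every maintainer bit is re-sampled from the IBACO transition probability of path 1.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded for reproducibility

def evolve_maintainer(maintainers, prob_one):
    """Relay step of R-IBACO (a sketch): each bit of every maintainer
    individual is re-sampled according to the IBACO transition
    probability of path 1, so the pheromone and heuristic information
    guide the evolution of the best line. prob_one has one entry per
    bit, e.g. the output of a transition-probability computation."""
    n, d = maintainers.shape
    return (rng.random((n, d)) < prob_one).astype(int)
```

After this step, the evolved maintainers would be evaluated, and the best of them used to deposit pheromone, closing the relay loop between HRO and IBACO.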

The Collaborative Model.
In this section, a co-evolutionary hybrid model, referred to as C-IBACO, is proposed. This model maintains the efficiency of IBACO in tackling high-dimensional FS problems while integrating the effective optimization performance of HRO. C-IBACO consists of two subpopulations, with IBACO and HRO independently executing their respective update strategies in each iteration. Specifically, ants select a path based on the transition probabilities specified in equations (1) and (2), while the k-th gene of the i-th rice individual is updated using equations (5), (6), and (8). The HF of the 0th path adopts the importance of the knee point to measure the potential of each feature to be selected. Features with a correlation greater than that of the knee point are more likely to be included in the feature subset, while those with a low feature correlation do not lose the opportunity to be selected.
To promote co-evolution between these two parallel algorithms, the search results from each subpopulation are shared after each iteration. The global optimal solution is determined by comparing the fitness values of the best candidate solutions from both subpopulations. If the best individual from HRO outperforms that of IBACO, the pheromone update originally performed with the best candidate solution from the ant colony is instead performed with the best individual from HRO. Conversely, if the best candidate solution from IBACO prevails, the worst solution of the maintainer line in HRO is supplanted by the best candidate solution of IBACO.
By incorporating these improvement strategies and information sharing mechanisms, C-IBACO can potentially identify a superior feature subset, selecting the most promising features effectively. The specifics of the C-IBACO procedure are detailed in Algorithm 2.
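The information sharing rule between the two subpopulations can be sketched as follows; minimization is assumed, and the function names are illustrative rather than taken from Algorithm 2.

```python
def share_results(hro_best, hro_fit, aco_best, aco_fit,
                  maintainer, m_fit):
    """Per-iteration sharing in C-IBACO (a sketch): the winner of the
    two subpopulations supplies the global best. If HRO wins, its best
    individual is the one used for the pheromone deposit; if IBACO
    wins, its best solution replaces the worst maintainer of HRO."""
    if hro_fit < aco_fit:
        # HRO wins: deposit pheromone on the HRO best individual
        return hro_best, hro_fit
    # IBACO wins: supplant the worst solution of the maintainer line
    worst = max(range(len(maintainer)), key=lambda i: m_fit[i])
    maintainer[worst] = aco_best
    m_fit[worst] = aco_fit
    return aco_best, aco_fit
```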

The Improved Heuristic Factor.
The random assignment of the HF in BACO has been identified as a limitation in its capability to optimize high-dimensional FS problems. This limitation calls for a task-specific HF assignment to enhance performance. Nevertheless, traditional FS methods, such as those based solely on high-correlation features, may be arbitrary and do not consider the potential influence of combined features with low correlation on the final classification result. To mitigate this issue, this paper proposes the use of the weight of the knee point [54] as a threshold in the HF assignment process. This approach obviates the need for complex experimental verification and avoids significant loss of class label information.
Initially, the proposed methods employ the random forest (RF) algorithm to calculate the importance of each feature. All features are then sorted based on their RF importance, and the knee point is defined as the feature with the maximum distance from the straight line connecting the features with the maximum and minimum importance. The detailed selection process for the knee point is described in Algorithm 3 and is graphically represented in Figure 3.
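The knee-point selection can be sketched as below: sort the importances in descending order and pick the point farthest from the line joining the largest and smallest importance values. This mirrors the description of Algorithm 3, though the exact implementation details in the paper may differ.

```python
import numpy as np

def find_knee_point(importance):
    """Return the importance of the knee-point feature: the point with
    maximum perpendicular distance to the straight line through the
    first (largest) and last (smallest) sorted importance values."""
    imp = np.sort(np.asarray(importance, dtype=float))[::-1]
    n = len(imp)
    x = np.arange(n, dtype=float)
    # line through (0, imp[0]) and (n - 1, imp[-1])
    x0, y0, x1, y1 = 0.0, imp[0], float(n - 1), imp[-1]
    # perpendicular distance of every (x, imp[x]) to that line
    dist = np.abs((y1 - y0) * x - (x1 - x0) * imp + x1 * y0 - y1 * x0)
    dist /= np.hypot(y1 - y0, x1 - x0)
    knee = int(np.argmax(dist))
    return imp[knee]        # importance threshold of the knee feature
```

In practice the `importance` array would come from a fitted random forest (e.g. its per-feature importance scores), with the returned threshold feeding the HF assignment of equation (11).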
Equation (11) presents the HF assignment strategy in the proposed IBACO method, where I_k denotes the importance of the k-th feature and I_knee point represents the importance of the knee point feature. η_k(0) indicates that the HF of the 0th path adopts I_knee point, while η_k(1) means that the HF of the 1st path adopts I_k. According to equations (1) and (2), this assignment strategy prioritizes selecting features with higher correlation while not disregarding less significant features.
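The assignment of equation (11) reduces to a simple pairing per feature, sketched below: the 1-path HF is the feature's own importance, and the 0-path HF is fixed at the knee-point importance, so features above the knee favor selection while weaker features still retain a nonzero chance of being chosen.

```python
def assign_heuristic_factors(importance, knee_importance):
    """Eq. (11): for feature k, eta_k(1) = I_k (select path) and
    eta_k(0) = I_knee_point (deselect path). Returns a list of
    (eta_k(0), eta_k(1)) pairs, one per feature."""
    return [(knee_importance, imp_k) for imp_k in importance]
```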

Time Complexity Analysis.
The time complexity of the proposed hybrid algorithms primarily hinges on four aspects: initialization, individual evaluation and update, population sorting, and pheromone update. Table 2 contrasts the time complexity of the hybrid algorithms with those of the standalone algorithms at each stage. In this context, N symbolizes the population size, while N_1 and N_2 denote the subpopulation sizes of HRO and IBACO within C-IBACO, respectively. D signifies the dimension of F. The detailed analysis reveals that, in comparison to ACO, R-IBACO increases the computational complexity merely at the population-sorting phase. Conversely, relative to HRO, it raises the computational overhead solely in updating the maintainer line and PD.
Owing to the cooperative updating of the two subpopulations in C-IBACO, it exhibits a higher time complexity than the single algorithms. This increased complexity emerges during population initialization, population sorting, and the individual update process. The total time complexities of R-IBACO and C-IBACO are summarized in Table 2, where T represents the maximum number of iterations. Overall, it can be suggested that the hybrid algorithms significantly improve the overall performance of the model within an acceptable increase in computational overhead.

The Objective Function.
The objective function serves as the guiding principle for the update process and determines the optimization direction of the algorithms. In the context of FS, the objective is to identify the most informative feature subset that can enhance the performance of classifiers. The primary criterion for evaluating the efficacy of a classifier is its classification accuracy, defined in equation (12). However, since the objective function is designed to be minimized, the classification error rate, which is the complement of accuracy (1 − accuracy), is employed as the primary factor in the objective function. Moreover, it is desirable to maintain a minimal feature count in the subset, prompting the inclusion of the selection ratio in the objective function, as specified in equation (13). By minimizing the objective function, the proposed methods can identify the least redundant and most informative feature subset with the highest classification accuracy.
In equation (12), T_P and T_N, respectively, denote instances where the classifier correctly identified a test sample as positive or negative, whereas F_P and F_N represent instances where the classifier incorrectly classified a test sample as positive or negative. In equation (13), the parameters n and N refer to the number of features in the selected feature subset and the total number of features, respectively. The fraction n/N signifies the feature selection rate. The fitness value, denoted by Fitness, is calculated by weighting the error rate and the selection rate with the weights λ and μ, respectively. As the classifier's accuracy is the primary component of the fitness value, λ is typically assigned a larger value, such as 0.9.
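Equations (12) and (13) combine into the fitness computation sketched below. The value μ = 0.1 is an illustrative complement to λ = 0.9, not a value stated in the text.

```python
def fitness(tp, tn, fp, fn, n_selected, n_total, lam=0.9, mu=0.1):
    """Objective of equations (12)-(13): a weighted sum of the
    classification error rate (1 - accuracy) and the feature-selection
    ratio n/N. Lower fitness is better."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return lam * (1.0 - accuracy) + mu * (n_selected / n_total)
```

For example, a subset with 90% accuracy that keeps 100 of 1000 features scores 0.9 × 0.1 + 0.1 × 0.1 = 0.10.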

Experimental Results and Discussion
In this section, the performance of the proposed methods is evaluated on fourteen high-dimensional biomedical datasets with feature sizes ranging from 2000 to 12533 dimensions. The primary performance metrics considered are classification accuracy, size of the feature subset, and running time. The K-nearest neighbor (KNN) classifier is used to evaluate the quality of the feature subset selected by each algorithm. The effect of two methods for computing feature importance on model performance is also investigated. Five-fold cross-validation is employed to avoid model overfitting due to limited sample sizes, providing a more accurate assessment of model performance. To thoroughly analyze the FS performance of the proposed methods, comparison experiments with thirteen other metaheuristic-based FS methods are conducted in three distinct groups. In the first group, the proposed methods are compared with standard HRO and IBACO, which are components of the proposed hybrid methods. The standard ACO is also included in this group to validate the effectiveness of the proposed heuristic factor. In the second group, the proposed methods are compared with five well-known FS methods based on basic metaheuristics, namely FPA, BQPSO, ABC, SSA [41], and GWO.
In the final group, the effectiveness of the proposed hybrid methods is validated against five state-of-the-art methods reported in recent studies, namely CMSRSSMA [11], MBAO [17], MSGWO [19], SCHHO [24], and HFSIA [25]. These advanced hybrid algorithms allow the proposed methods to be benchmarked against current leading solutions in the field. All algorithms employed in this research were implemented in Python 3.6.9. The experimental results presented in this paper were obtained on a personal computer equipped with an Intel(R) Core(TM) i7-8700 CPU operating at 3.2 GHz and 16.0 GB RAM, running Windows 10. Table 3 outlines the key characteristics of the datasets employed in this research.

Input: Dataset D and objective function Fitness
Output: OFS and corresponding classification accuracy
(1) Initialize population P and set the maximum number of iterations T
(2) Calculate feature importance and knee point feature using Algorithm 3
(3) Set heuristic factor and initialize PD
(4) while t < T do
(5)   Calculate the fitness of the population f = Fitness(P)
(6)   Sort P in descending order and divide it into three lines: M (Maintainer), R (Restorer), and S (Sterile)
(7)   for Individual X_i in P do
(8)     if X_i in M then
(9)       Generate trial solution X_new(i) using equations (1) and (2)
(10)    else if X_i in S then
(11)      Generate trial solution X_new(i) using equation (5)
(12)    else
(13)      if SC < SC_max then
(14)        Generate trial solution X_new(i) using equation (6)
(15)      else
(16)        Generate trial solution X_new(i) using equation (8)
(17)      end if
(18)    end if
(19)    Calculate the fitness of X_new(i)
(20)    Update X_i using equation (7)
(21)    if X_i is a binary vector with all bits set to 0 then
(22)      Reinitialize the binary vector corresponding to X_i
(23)    end if
(24)  end for
(25)  Update g_best, p_best, and PD
(26) end while
ALGORITHM 1: FS optimized by R-IBACO.
These datasets span a range of medical disciplines, predominantly serving binary or multi-class classification tasks. Notably, each dataset is characterized by a large number of features, varying from 2000 to 12533. A shared trait among these datasets is the presence of numerous redundant and irrelevant features, compounded by a typically limited sample size. It is imperative to apply dimensionality reduction to such data, as the presence of unnecessary features can compromise model performance.
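The evaluation protocol described above (five-fold cross-validation with a KNN classifier scoring a binary feature mask) can be sketched as follows. The helper names and the plain-Python KNN are our own simplifications of the setup, not the paper's code.

```python
import numpy as np
from collections import Counter

def knn_accuracy(X_tr, y_tr, X_te, y_te, k=3):
    """Plain KNN (Euclidean distance, majority vote) used to score a subset."""
    correct = 0
    for x, y in zip(X_te, y_te):
        d = np.linalg.norm(X_tr - x, axis=1)
        nearest = y_tr[np.argsort(d)[:k]]
        if Counter(nearest).most_common(1)[0][0] == y:
            correct += 1
    return correct / len(y_te)

def cv_score(X, y, mask, k=3, folds=5, seed=0):
    """Five-fold CV accuracy of a binary feature mask (hypothetical helper)."""
    X = X[:, np.asarray(mask, dtype=bool)]      # keep only selected features
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    parts = np.array_split(idx, folds)
    accs = []
    for i in range(folds):
        test = parts[i]
        train = np.concatenate([parts[j] for j in range(folds) if j != i])
        accs.append(knn_accuracy(X[train], y[train], X[test], y[test], k))
    return float(np.mean(accs))
```

In a wrapper method such as this one, `cv_score` would serve as the accuracy term inside the fitness function evaluated for every candidate feature subset.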

Parameter Settings.
The parameters for all algorithms are configured as follows. The maximum number of iterations is fixed at 100 for all algorithms. To ensure fairness in terms of the maximum number of fitness-function evaluations (FE_MAX = 3000), the population size of all algorithms is set to 30, except for HRO, C-IBACO, and HFSIA. Considering that only the sterile line is updated during the hybridization process of HRO, and that the subpopulation sizes in C-IBACO are required to satisfy equation (14), where N_p stands for the total population size (fixed at 30), the population size for the single standard HRO is adjusted to 45. Additionally, the subpopulation sizes for HRO and IBACO in C-IBACO are set to 27 and 12, respectively. For HFSIA, the population size is set at 14 and the crossover probability at 0.35 to maintain consistency in the number of fitness evaluations. The filter ratio of Fisher in HFSIA is set at 0.3 to preselect the top 30% of relevant features. Moreover, in the initial stage of MBAO, the number of features filtered by mRMR is fixed at 100, aligning with the existing literature. Lastly, to mitigate the effects of randomness and bolster the stability of the experimental outcomes, each algorithm is executed independently on each dataset ten times.

Input: Dataset D and objective function Fitness
Output: OFS and corresponding classification accuracy
(1) Initialize the subpopulations P_X and P_Y and set the maximum number of iterations T
(2) Calculate feature importance and knee point feature using Algorithm 3
(3) Set heuristic factor and initialize PD
(4) while t < T do
(5)   Calculate f_X = Fitness(P_X) and f_Y = Fitness(P_Y)
(6)   Find p_bestx and p_besty: individuals with minimum fitness in P_X and P_Y
(7)   Sort P_X in descending order and divide into three lines: M (Maintainer), R (Restorer), S (Sterile)
(8)   for Individual X_i in P_X do
(9)     if X_i in R then
(10)      if SC < SC_max then
(11)        Generate candidate solution X_new(i) using equation (6)
(12)      else
(13)        Generate candidate solution X_new(i) using equation (8)
(14)      end if
(15)    else if X_i in S then
(16)      Generate candidate solution X_new(i) using equation (5)
(17)    end if
(18)    Calculate the fitness of X_new(i)
(19)    Update X_i using equation (7)
(20)    Check if X_i is an all-zero binary vector
(21)  end for
(22)  for Individual Y_i in P_Y do
(23)    Update path of Y_i using equations (1) and (2)
(24)    Check if Y_i is an all-zero binary vector
(25)  end for
(26)  Update p_bestx and p_besty
(27)  Update the PD and the worst individual in the maintainer line
(28) end while
(29) g_best is the best of p_bestx and p_besty
ALGORITHM 2: FS optimized by C-IBACO.

The evaluation results, summarized in Table 4, underscore the efficiency and effectiveness of RF and ReliefF in determining feature correlation and eliminating redundant features. It is noted that ReliefF tends to select more features than RF across all datasets, indicating the superior selection efficiency of RF. For example, the selection rate of RF in the lymphoma dataset reaches 95.85%, which is 90.59% higher than that of ReliefF. A similar outcome is observed in the warpAR10P dataset, where the selection rate of RF outperforms that of ReliefF by 88.32%. Moreover, both methods exhibit varying degrees of improvement in classification accuracy compared to using the full feature set.

Input: Dataset D and feature space F
Output: Knee point and its weight I
(1) Set maximum vertical projection d_max ← 0
(2) Initialize index of knee point k and its weight I ← 0
(3) Compute feature correlation for each D_i using RF
(4) Sort the feature correlation in descending order, denoted by W = (w_1, w_2, ..., w_n)
(5) Connect features with largest and smallest correlations to form line L (shown as the red dotted line in Figure 3)
(6) for each feature f_j in F do
(7)   Calculate vertical projection distance from f_j to line L, denoted as d_j
(8)   if d_j > d_max then
(9)     k ← j
(10)    I ← w_j
(11)  end if
(12) end for
(13) The knee point is the k-th feature in F (as marked with a red circle in Figure 3) and its weight is I
ALGORITHM 3: The selection process of the knee point.

However, the selected feature subset does not invariably guarantee improved classification accuracy. For instance, in the Brain_Tumor_2 dataset, RF selects a small feature subset of 60 features, achieving a dimensionality reduction rate of 99.42%. However, the classification accuracy merely attains 64.82%, marking a decrease of 7.99% from the baseline. This trend is similarly observed in the lymphoma, Leukemia_2, and 11_Tumor datasets. The diminished classification accuracy underscores the limitations of FS methods that rely solely on feature-correlation ranking, as crucial information tied to features with lower correlations than the knee point may be overlooked. Consequently, it might be more judicious to use the correlation of all features as the criterion for high-dimensional feature ordering instead of discarding less correlated features.
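As a concrete sketch, the knee-point selection of Algorithm 3 might be implemented as below. The function name is our own, and the perpendicular point-to-line distance is used, which ranks the points identically to the vertical projection distance for a fixed line L.

```python
import numpy as np

def find_knee_point(importances):
    """Sketch of Algorithm 3: sort importances in descending order, draw
    line L between the largest and smallest values, and return the
    (original feature index, weight) of the point farthest from L."""
    imp = np.asarray(importances, dtype=float)
    order = np.argsort(imp)[::-1]          # feature indices, most important first
    w = imp[order]
    n = len(w)
    x1, y1, x2, y2 = 0.0, w[0], float(n - 1), w[-1]
    # Distance from point (j, w_j) to the line through (x1, y1) and (x2, y2):
    # |(y2 - y1) x - (x2 - x1) y + x2 y1 - y2 x1| / |L|
    num = np.abs((y2 - y1) * np.arange(n) - (x2 - x1) * w + x2 * y1 - y2 * x1)
    d = num / np.hypot(y2 - y1, x2 - x1)
    j = int(np.argmax(d))
    return int(order[j]), float(w[j])
```

The returned weight then serves as the threshold I_knee_point in the heuristic-factor assignment of equation (11).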
The classification experiment results, detailed in Tables 5 and 6, demonstrate the effectiveness of employing the feature correlation determined by RF and ReliefF as the heuristic factor in the proposed IBACO. The findings suggest that C-IBACO and R-IBACO surpassed standalone RF or ReliefF in classification accuracy. For instance, when K is set to 3, the classification accuracies of RF-based C-IBACO and R-IBACO on the Brain_Tumor_2 dataset register at 88.82% and 86.82%, respectively, which are 22.28% and 18.51% higher than the accuracy of single RF. Furthermore, the classification accuracy obtained on the DLBCL dataset by C-IBACO using the feature correlation calculated by RF reached 100%, indicating the ability of the proposed methods to identify samples that are challenging for both KNN and RF-based KNN to distinguish. Compared to K = 5, the proposed methods exhibit an enhanced search capability when K is fixed at 3. As a result, in subsequent comparative experiments, KNN (K = 3) serves as the final classifier to evaluate the performance of feature subsets.

Comparison of the Proposed Methods with HRO and IBACO.
To substantiate the effectiveness of the proposed heuristic-factor assignment and hybrid strategies, the proposed methods were contrasted with the standard single algorithms. Table 7 summarizes the performance of the proposed methods and the basic methods in terms of classification accuracy, average number of selected features, and average runtime over ten independent runs. The results distinctly demonstrate that the mean accuracy of the proposed methods surpasses that of the original single algorithms across all datasets. Moreover, the introduction of an improved problem-oriented HF propels IBACO to outperform the standard ACO on ten of the fourteen datasets. Regarding the average number of selected features, the proposed methods obtained the smallest feature subsets on nine datasets. Although the standard ACO recorded the shortest average runtime on all datasets except the lung dataset, exhibiting its efficient selection capability, its lack of enhanced strategies culminated in inferior classification outcomes.

Comparison of the Proposed Methods with Other Basic Metaheuristics.
To further investigate the superiority of the proposed methods, a fair comparison was conducted with five well-known basic metaheuristic-based FS methods, including FPA, BQPSO, ABC, ISSA [41], and GWO. The comparison maintained an equal maximum number of fitness evaluations, and the comparative outcomes of this group of methods are presented in Table 8. The proposed methods achieve the highest average and maximum classification accuracy across all datasets. BQPSO and GWO also deliver impressive results on most datasets, with classification accuracy on the Leukemia_1 dataset even surpassing that of C-IBACO, a pattern also observed on the 11_Tumor dataset.
Regarding the size of the selected feature subset, the proposed methods significantly outperform their counterparts. For instance, the feature selection rate of C-IBACO on the Prostate_Tumors dataset is remarkably low, at only 0.41%. This is in stark contrast to the substantially higher feature selection rate of 49.47% displayed by FPA and IBSSA, a difference of 99.17%, which underscores the superior effectiveness of the proposed methods. Furthermore, C-IBACO selected an average of 143.6, 32.2, 53.1, and 23.6 features on the warpAR10P, GLIOMA, Prostate_GE, and ALLAML datasets, respectively, with a minimum dimensionality reduction rate of 94.02%. R-IBACO selects the smallest average number of features on the Colon dataset, with only 57 features selected, a reduction of 94.09% compared to the average number of features chosen by IBSSA. The proposed methods record the shortest runtime on seven out of fourteen datasets. Notably, IBSSA demonstrated higher computational efficiency than other methods on high-dimensional datasets (the last four datasets, with over ten thousand dimensions). In general, as the problem dimension increases, the performance gap between the proposed methods and other algorithms widens, indicating that C-IBACO and R-IBACO exhibit greater robustness and are more suitable for high-dimensional FS tasks.

Comparative Study with the State of the Art.
Apart from comparing the proposed methods with single standard algorithms and basic metaheuristic-based FS methods, this study also probes the performance of C-IBACO and R-IBACO against five advanced metaheuristic-based FS methods recently expounded in the literature, including MBAO [17], HFSIA [25], SCHHO [24], MSGWO [19], and CMSRSSMA [11]. Table 9 presents the comparison results of multiple independent runs for each advanced algorithm on each dataset. As evidenced in the table, C-IBACO and R-IBACO surpassed their counterparts in terms of classification accuracy on thirteen datasets, the exception being the lymphoma dataset, on which MBAO achieved the best classification accuracy while selecting the fewest features. This hints at the limitations of the proposed methods in simultaneously eliminating redundant and irrelevant features and bolstering classification accuracy. Although the application of filters substantially diminished the time consumed in the FS process, it consequently led to a drop in classification performance. This is illustrated by the average classification accuracy of MBAO, amounting to only 65.4% on the 11_Tumor dataset; in this case, the classification accuracy of our proposed C-IBACO is 40.61% higher than that of MBAO. Similar trends were also observed in the Brain_Tumor_2 and warpAR10P datasets, where R-IBACO and C-IBACO were 24.45% and 23.30% higher than MBAO and SCHHO, respectively, which might be due to the filters eliminating features potentially useful for classification. While the proposed methods emphasize classification accuracy, the hybrid filter approaches managed to select the minimum number of features across all datasets, with MBAO achieving the minimum on thirteen datasets and HFSIA on one. This outcome can be ascribed to the ability of the filter to constrain the search space to a lower dimension, thereby significantly shrinking the size of the feature subset to be searched.
Regarding computational time, the hybrid filter methods demonstrate superior search efficiency, as they conduct the search within the filtered, low-dimensional feature space. Specifically, HFSIA registers the shortest computational time on the first nine datasets, whereas MBAO exhibits exceptional search efficiency on the final five datasets, which are characterized by higher dimensionality. In the context of search efficiency across the full feature space, both C-IBACO and R-IBACO perform slightly less effectively than MSGWO, which does not utilize a hybrid strategy. However, C-IBACO and R-IBACO prove to be more robust than SCHHO and CMSRSSMA, particularly as the feature dimension increases.
Although C-IBACO and R-IBACO may not match the computational efficiency and selected-subset size exhibited by MBAO and HFSIA, they still demonstrate acceptable performance and robustness in classification accuracy and stability, which is worth the slight sacrifice in time.

Figure 4 provides a visual representation of the comparative superiority and efficacy of the proposed methods against the single standard algorithms and other metaheuristic-based FS approaches. This figure encapsulates the overall maximum and average classification accuracy, along with the number of selected features, for all algorithms. Specifically, Figures 4(a) and 4(b) depict the overall maximum and average classification accuracies, respectively. As the figures suggest, the feature subsets selected by C-IBACO and R-IBACO deliver markedly superior classification performance compared to other methods. Higher overall classification accuracy underscores the superior search performance of the proposed methods and a heightened likelihood of discovering more promising candidate solutions. Conversely, MBAO did not exhibit optimal performance in overall classification accuracy, as discussed in the preceding section. This discrepancy could be due to the classifier used, which differs from those documented in the literature. Figure 4(c) displays the overall number of features selected by each algorithm. MBAO selected the fewest features, followed closely by R-IBACO and C-IBACO. It is worth noting that the number of features chosen by the proposed methods was less than that of HFSIA, even though the latter also employs a filtering method. This result reaffirms the effectiveness of the proposed methods in reducing the number of selected features. The convergence results, illustrated in Figure 5, demonstrate that both C-IBACO and R-IBACO converge quickly to the vicinity of the global optimum before the termination of iterations.
The hybrid algorithms possess an advantage in achieving a superior initial solution by properly selecting features with high correlation based on the modified heuristic factor in the first stage. As the convergence curves indicate, the proposed methods maintain a certain exploratory capability in the final stage of iterations, reducing the risk of getting trapped in local optima. In contrast, the convergence curves of MBAO reveal its relatively underwhelming overall performance. Specifically, on datasets such as lung, Brain_Tumor_2, Leukemia_2, and 11_Tumor, MBAO prematurely converges to a local optimum and fails to search further for a better feature subset. One possible explanation is that mRMR filters out informative feature combinations, resulting in a search space with poor performance.

Graphical Analysis.
To provide a more intuitive understanding of the computational complexity of different algorithms, Figure 6 displays the average running time of all algorithms for each dataset. Evidently, HFSIA attains the shortest running time for the first nine datasets, whereas MBAO demonstrates superior search efficiency for the last five datasets with higher dimensionality. It is noticeable that IBACO achieves a shorter average CPU running time than HRO and their hybrid counterparts across all datasets, which can be attributed to its ability to quickly select features with high correlation using the modified HF, thereby accelerating convergence. However, this advantage is accompanied by lower classification accuracy and unsatisfactory average fitness values compared to the hybrid algorithms. Nevertheless, C-IBACO and R-IBACO display exceptional robustness by generating acceptable candidate solutions without incurring a significant computational cost. Overall, the experimental results affirm the feasibility and potential of deploying the proposed methods for practical high-dimensional FS tasks.

Experimental Results of Nonparametric Tests.
To determine the significance of the difference between the proposed hybrid algorithms and the compared metaheuristic-based FS approaches, the experimental results were analyzed using the Wilcoxon signed-rank test and the Friedman test [55]. The results of the Wilcoxon signed-rank test are presented in Table 10. In this table, W+ denotes the sum of ranks where the proposed hybrid algorithm outperforms the comparative method, whereas W− symbolizes the reverse. The p value signifies the level of significance, with p < 0.05 indicating a significant difference between the two algorithms under comparison. The outcomes reveal that C-IBACO outperforms R-IBACO, suggesting that C-IBACO maintains a lower individual fitness value and exhibits superior performance in the FS process. Apart from the null hypothesis being rejected between C-IBACO and R-IBACO, both are significantly superior to the other basic and advanced methods, demonstrating significant differences.

Table 11 presents the results of the Friedman test, including average accuracy rankings and final ranks for each algorithm across all datasets. As demonstrated in Table 11, the proposed methods claim the top two positions in the final ranking. The Friedman test produces a p value of 2.61E−18, which is less than the preset significance level of 0.01, indicating a statistical difference among the algorithms. Table 12 provides the results of the Holm test, which serves as a post hoc test to determine significant differences between the control method and other algorithms in pairwise comparisons. The Holm test rejected the null hypothesis at a significance level of 0.05, indicating significant differences between R-IBACO and the other competing methods except for C-IBACO, which is consistent with the findings of the Wilcoxon signed-rank test.
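The rank sums W+ and W− reported in Table 10 can be computed as sketched below (helper name is our own; the p value would then be read from a signed-rank table or a library routine such as scipy.stats.wilcoxon).

```python
def signed_rank_sums(a, b):
    """W+ and W- of the Wilcoxon signed-rank test for paired per-dataset
    fitness values a, b (lower is better). W+ sums the ranks of datasets
    where method a beats method b; zero differences are dropped and tied
    |differences| share their average rank."""
    diffs = [y - x for x, y in zip(a, b) if y != x]   # positive -> a better
    absd = [abs(d) for d in diffs]
    order = sorted(range(len(absd)), key=lambda i: absd[i])
    ranks = [0.0] * len(absd)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and absd[order[j + 1]] == absd[order[i]]:
            j += 1
        for k in range(i, j + 1):          # average rank across the tie group
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

By construction W+ + W− equals n(n+1)/2 over the n nonzero differences, so a large imbalance between the two sums signals a consistent win for one method.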
Based on the statistical test results, it can be inferred that the hybrid algorithms incorporating IBACO and HRO exhibit superior optimization performance compared to their standalone counterparts. This emphasizes the effective combination of the optimization strategies of each single algorithm, leading to an enhancement in their competence in high-dimensional FS tasks.

Discussion and Biological Interpretation.
Based on the results of three comparative experiments and statistical analysis, it can be inferred that the hybrid models of HRO and IBACO integrate the advantages of both single algorithms, allowing them to converge more quickly to the most promising region in the search space while maintaining a certain level of global exploration capability in later iterations. The problem-oriented HF enhances the ability of ACO to select more important features while discarding irrelevant ones. Moreover, the optimal solution obtained by HRO guides the update of pheromones. This collaborative update process continues until the termination condition is achieved. By combining HRO and IBACO, the population benefits from greater diversity while reducing the probability of a single algorithm getting stuck in local optima.
Numerous metaheuristics have been developed to solve low-dimensional continuous optimization problems, but these often struggle when addressing high-dimensional discrete optimization issues. This study introduces two hybrid wrapper FS methods, and the experimental results on fourteen high-dimensional datasets demonstrate that the feature subsets obtained by the hybrid algorithms had higher classification accuracy and smaller size. Nevertheless, their optimization performance still needs improvement in some instances, a shortcoming attributable to the inherent traits of metaheuristics. To enhance the potential of locating the optimal solution, metaheuristics often adopt a global search strategy that depends on a certain randomness. This approach inevitably raises the risk of the algorithm getting stuck in local optima. For instance, the algorithms only managed to achieve an average classification accuracy of 83% on the warpAR10P dataset, a rate that may not satisfy the stringent standards demanded in practical medical scenarios. Furthermore, the introduction of advanced collaborative hybrid strategies can reduce the optimization efficiency of the algorithm, given that updating the maintainer line and independent subpopulations amplifies the complexity of the model.
The performance improvement of the hybrid algorithms can primarily be attributed to the fact that they leverage the outstanding traits of the individual algorithms. As illustrated in Table 7, BHRO exhibited superior classification accuracy, while IBACO excelled in optimization efficiency and robustness. Another crucial factor contributing to the performance enhancement is the hybrid strategy and the improved HF. These elements ensure adequate coverage of the search area, providing the proposed methods with a better opportunity to locate the global optimal solution within the entire search space. This results in the leading performance regarding maximum classification accuracy across all datasets. However, it is essential to note that the proposed methods still have limitations. For instance, setting the selfing upper-limit parameter SC_max for HRO can be challenging, since it is crucial for the transition between the selfing and renewal stages. Additionally, the inherent randomness in metaheuristics cannot guarantee the acquisition of the optimal feature subset in a single run.
The indices of the most frequently selected features (chosen in more than 7 out of 10 runs) by C-IBACO are presented in Table 13. The results demonstrate that C-IBACO is capable of selecting a small number of highly discriminative features on most datasets, except for warpAR10P, Brain_Tumor_1, and Brain_Tumor_2, which suggests its potential applications in disease diagnosis and gene expression problems. Notably, the feature subsets composed of high-frequency features on the Colon and Prostate_Tumors datasets exhibit superior classification performance compared to the results in Table 7. Moreover, the sizes of these subsets are only 31.12% and 20.5% of the original average number of selected features, respectively, which suggests that the proposed methods still have potential for enhanced removal of irrelevant features. With more appropriate and reasonable parameter settings in the future, the suggested approaches can be applied to other practical problems such as fault detection [56], scheduling problems [57,58], text classification [59], and sentiment analysis [60].

Conclusion
In this research, two innovative hybrid wrapper-based FS methods that integrate HRO and IBACO are proposed to identify the most informative features in high-dimensional disease-diagnosis and gene-expression data. The primary objective of hybridization is to enhance the performance of IBACO by harnessing the power of HRO to facilitate the exploration and exploitation of the high-dimensional search space. By combining the superior search efficiency and robustness of IBACO with the excellent performance of HRO in searching for global optima, the proposed hybrid methods manifest enhanced FS capabilities. Moreover, IBACO attempts to boost performance through a problem-oriented assignment strategy that employs the correlation of the knee-point feature, enabling the algorithm to exploit valuable latent information in the features. This strategy is also integrated into HRO to compensate for the absence of an update to the maintainer line. Two distinct forms of hybridization are presented in this study: R-IBACO and C-IBACO. In R-IBACO, IBACO plays a critical role in updating the maintainer line of HRO, with the best solution derived from HRO subsequently used to update PD at each iteration, while in C-IBACO, the subpopulations of HRO and IBACO perform the update process independently, and the local search results are shared to update PD and the maintainer line after each iteration. In the proposed methods, the KNN algorithm functions as the classifier, and RF is employed to calculate the feature importance required for assigning the HF. The proposed methods were evaluated on fourteen well-known biomedical datasets. Their performance was benchmarked against thirteen other algorithms, including the single standard algorithms that comprise them, as well as basic and advanced metaheuristic-based wrapper FS methods. The experimental results indicate that the proposed methods outperform the other techniques in terms of both the number of selected features and classification accuracy on most datasets.
Furthermore, the statistical results of the Wilcoxon signed-rank test and Friedman test reveal that the proposed methods achieved the top rank in terms of classification accuracy, which corroborates their effectiveness as practical strategies for selecting the most representative disease-related features.
Although the suggested approaches effectively enhance the exploration and exploitation capabilities of a single algorithm, it is important to note that optimization efficiency might decline as model complexity increases. Moreover, optimal performance may not be achievable in certain specific scenarios. Consequently, future research should prioritize more efficient hybrid strategies that can ensure the generalization capabilities of the algorithms across problems of varying scales and dimensions. One direction for further research involves exploring more efficient parameter settings and the integrated optimization of algorithm and classifier parameters, which will help determine synergistic settings that maximize overall performance. Additionally, it would be beneficial to explore different transfer functions and advanced classifiers, as these components have the potential to further bolster the performance of the algorithms in a variety of optimization scenarios.