Performance Optimization and Comprehensive Analysis of Binary Nutcracker Optimization Algorithm: A Case Study of Feature Selection and Merkle–Hellman Knapsack Cryptosystem



Introduction
The optimization problem (OP) is the problem of finding the optimal solution within the feasible region. Optimization problems arise in several fields and need to be optimized accurately to achieve better performance. These problems are divided into two categories: discrete and continuous. Discrete problems contain a set of decision variables with integer or binary values that need to be accurately optimized to reach a near-optimal solution. There are several discrete problems, such as DNA fragment assembly, bag selection, unit commitment, knapsack, ensembles of classifiers, and uncapacitated facility location [1]. Continuous problems, by contrast, have continuous decision variables. Each OP might have a single-, multi-, or many-objective function.
Binary optimization is a subset of discrete optimization where the only possible values for the decision variables are 0 or 1. There are a number of significant binary problems, including the 0-1 knapsack (KP01) [2], feature selection (FS) [3], and cryptanalysis of the Merkle-Hellman cryptosystem (MHKC) [4,5]. Feature selection (FS) is a crucial preprocessing technique introduced to minimize the dimensionality of machine learning datasets while keeping or improving their quality; it eliminates irrelevant, noisy, and redundant information in order to produce simpler classifiers with high performance in less time [6]. Simply put, FS is used to eliminate unnecessary features that could significantly impact the performance of machine learning techniques. Some FS applications are cancer detection [7], image classification [8], genomics [9], and text categorization [10]. FS techniques are divided into three categories: filter, wrapper, and embedding methods [11]. In filter methods, feature selection is done independently of any particular machine learning technique, while in wrapper methods, features are chosen using a machine learning approach. Although wrapper approaches have been demonstrated to outperform filters in experimental outcomes, they come with a high computational expense [12].
The KP01 problem aims to use resources efficiently to maximize profit and minimize cost. KP01 is applied in several fields to save resources and maximize profit, some of which are investment decisions [13], cryptography [14], real estate property maintenance optimization [15], energy minimization [16], principal budgeting [17], the available-to-promise problem [18], cargo loading problems [19], adaptive multimedia systems [20], resource allocation [18], the housing problem [21], computer memory [22], project portfolio selection [23], and the cutting-stock problem [24]. The public-key cryptosystem (PKC) is designed to encrypt any message using a public key before transmitting it to the recipient, who has a private key to decrypt the message and make it readable. The Merkle-Hellman knapsack cryptosystem (MHKC) is a common PKC that uses a two-key system (one for the sender to use in encrypting the message and another for the recipient to use in decrypting it) to protect the privacy of transmitted data [4].
Unfortunately, traditional techniques have difficulty finding near-optimal solutions for binary optimization problems because they require a huge computational cost, which increases exponentially with the number of dimensions, to observe all possible permutations that might contain the solution [25]. Over the last few decades, the need for strong modern techniques that tackle these problems in an acceptable time has significantly increased. Therefore, researchers have recently turned to metaheuristic algorithms, which can find acceptable solutions for several optimization problems in a reasonable time [26]. Several metaheuristic algorithms have been presented in the literature to handle binary optimization problems, some of which are the gaining-sharing knowledge-based algorithm [27], the binary sine cosine algorithm (BSCA) [28], the binary salp swarm algorithm (BSSA) [29], and others [30]. Although these strategies have the potential to provide strong results, they suffer from at least one of the following issues: a lack of diversity, an imbalance between exploration and exploitation operators, or a propensity to become mired in local minima.
Recently, a new swarm-based metaheuristic algorithm, namely the nutcracker optimization algorithm (NOA), inspired by the foraging, storage, and pine-seed retrieval behaviors of the nutcracker in nature, has been presented to address continuous optimization problems [31]. The great success of NOA in solving various challenging CEC benchmarks leads us to present a binary variant for various binary optimization problems. To make the classical NOA applicable to discrete problems, two well-known families of transfer functions, namely S-shaped and V-shaped, are investigated for the binary NOA to identify the one that maximizes its performance on a binary optimization problem. Additionally, a local search method based on efficiently integrating some genetic operators into BNOA's exploitation and exploration is used to enhance BNOA; this variant is termed BINOA. Three common binary optimization problems are used to verify the performance of BNOA and BINOA in comparison to several robust binary state-of-the-art algorithms in terms of statistical information, statistical tests, and convergence speed. These problems are feature selection, the 0-1 knapsack, and the Merkle-Hellman knapsack cryptosystem (MHKC). The empirical results demonstrate that BINOA is superior to the classical BNOA and other competing optimizers when solving the 0-1 knapsack and attacking the MHKC, and is on par with other algorithms such as the genetic algorithm for feature selection.
This paper presents the following contributions: (i) proposing the first binary variant of NOA (BNOA) using V-shaped and S-shaped transfer functions for binary optimization problems; (ii) effectively integrating BNOA with a local search strategy that borrows some genetic operators, to present a better variant able to handle the majority of binary problems, called BINOA; (iii) validating both BNOA and BINOA on three different binary problems, namely feature selection, the 0-1 knapsack problem, and the Merkle-Hellman knapsack cryptosystem (MHKC); (iv) showing, through results and comparisons under various statistical measures, that BINOA is a strong binary technique for several binary problems.

Paper Organization
The remaining sections of this paper are organized as follows: (i) Section 3 briefly reviews some of the techniques proposed for binary optimization problems; (ii) Section 4 discusses the three binary optimization problems solved in this study: feature selection, the 0-1 knapsack problem, and the Merkle-Hellman knapsack cryptosystem (MHKC); (iii) Section 5 overviews the classical nutcracker optimization algorithm; (iv) Section 6 presents the proposed BNOA and BINOA; (v) Section 7 presents the results and discussion for the three investigated binary problems.

Literature Review
Various binary optimization algorithms capable of overcoming several optimization problems, such as FS, KP01, and cryptanalysis of the Merkle-Hellman cryptosystem (MHKC), have been presented in the literature. Some of these algorithms are reviewed in the rest of this section. Dhiman [32] presented a binary variant of the emperor penguin optimizer (EPO), adapted using various transfer functions (V-shaped and S-shaped) for tackling the feature selection problem; this variant is named BEPO.
In [33], the crow search algorithm (CSA) was integrated with the opposition-based learning (OBL) strategy for estimating the near-optimal subset of features. In addition, to identify the ideal subset of features in datasets, a binary marine predator algorithm, namely BMPA-TVSinV, was presented in [34]; it uses time-varying sine and V-shaped transfer functions to become applicable to binary problems, and its performance is significantly enhanced by these transfer functions. The GWO and rough set (GWORS) method suggested in [35] was used for feature selection on recovered mammography images. The proposed GWORS was compared to various well-known rough-set and bioinspired feature selection methods and showed superior findings compared to the rival optimizers.
In [36], four binary variants of the slime-mould algorithm (SMA) were introduced. Eight different transfer functions, including V-shaped and S-shaped ones, were evaluated, and the one with the best overall performance was merged with the standard SMA. To further investigate promising solutions in the vicinity of the best-to-date solutions, the second variant, TMBSMA, combines BSMA with a two-phase mutation (TM). The third variant employs a new attacking-feeding strategy (AF), abbreviated AFBSMA, which balances exploration and exploitation by taking into account each individual's capacity for memory storage. Merging TM and AF with BSMA yields the final version, FMBSMA, which generates better solutions. In [37], three binary approaches based on symbiotic organisms search (SOS) were developed to deal with the feature selection issue. Moreover, two separate wrapper feature selection methods were introduced in [38], both based on the farmland fertility algorithm (FFA) [39]. The binary FFA algorithms (BFFAS and BFFAG) are simplified forms of the original FFA. In the first variant, the sigmoid function is applied. The second variant uses a dynamic mutation (DM) operator in addition to the binary global memory update (BGMU) and binary local memory update (BLMU) operators for binarization.
For solving the 0-1 knapsack problem (KP01), the quantum-inspired wolf pack algorithm (QWPA) was presented [61]. QWPA improves the wolf pack algorithm primarily through quantum rotation and quantum collapse, both used to avoid becoming stuck in local minima and to speed up arrival at the optimal solution. To guarantee its efficacy in practice, QWPA was compared to various optimization strategies. In addition, Gaussian perturbation and opposition-based learning (OBL) have been incorporated into a monarch butterfly optimization (MBO) technique for solving KP01 [13]. OBL was employed on half of the population late in the optimization process to speed up convergence toward the optimal solution, while Gaussian perturbation at each iteration helps weakly fit individuals avoid settling into local optima. The flower pollination algorithm (FPA) was also adapted for tackling KP01 [62]. Since FPA produces continuous outputs, the author used a sigmoid transfer function to discretize the results, a penalty function to ensure that infeasible solutions were not favored, and an enhancement-repair method to make previously infeasible solutions feasible.
A binary variant of equilibrium optimization (BEO) was presented for the discrete 0-1 knapsack problem [21]. Since the conventional equilibrium optimizer (EO) was proposed for continuous optimization problems, a discrete variant is necessary for solving binary optimization problems; to convert the continuous EO into the binary EO (BEO), eight transfer functions, including V-shaped and S-shaped ones, were used. A binary version of the marine predators algorithm (MPA) was also presented for solving KP01 [21]. To create the binary variant of MPA (BMPA), a variety of V-shaped and S-shaped transfer functions for converting continuous values to binary were examined. Several other recently published algorithms address KP01, including the binary slime-mould algorithm [36], binary salp swarm algorithm [63], binary light spectrum optimizer [38], hybrid rice optimization algorithm [64], hybrid harmony search algorithm [65], hybrid quantum genetic algorithm [66], binary gaining-sharing knowledge-based optimization algorithm [67], and improved binary pigeon-inspired optimization algorithm [68].
To determine whether MHKC is highly secure and no attacker could break it, recent works have used metaheuristic algorithms to predict the plaintext for a given ciphertext. This is done to demonstrate the vulnerability of MHKC systems and encourage researchers to develop more secure and reliable systems for encrypting messages. Below, we examine a number of these methods in greater detail. Abdel-Basset et al. [4] modified the whale optimization algorithm for this goal by introducing a mutation operation to improve the solutions and a penalty function to eliminate infeasible solutions; the resulting method, MWOA, outperformed several common optimization approaches. In addition, an improved version of ant-colony optimization for deciphering encrypted text was created by Grari et al. [69]. Some researchers have suggested breaking the MHKC using two binary variants of particle swarm optimization (BPSO) [70]. Two well-established metaheuristics, cuckoo search and the genetic algorithm, were investigated for their potential to automate cryptanalysis of a reduced knapsack cryptosystem [71]; the trials validated cuckoo search as the better technique for automating the cryptanalysis of a minimal cryptosystem. Differential evolution was employed by Sinha et al. [72] to attack the MHKC, and the results were compared to those obtained by a genetic algorithm (GA) to prove its efficacy. A genetic algorithm for the MHKC has been presented by Kochladze et al. [73]; the experimental findings clearly showed that the method outperformed the more common Shamir algorithm. The binary firefly algorithm (BFA) was used in a new attempt to show the insecurity of such MHKC systems [74], showing that the MHKC is more vulnerable to attacks using BFA than GA.

Merkle-Hellman Knapsack Cryptosystem.
This is an example of an asymmetric public-key cryptosystem, wherein the message is encrypted with the help of a public key and then decrypted with the help of a private key, so that only the intended receiver can recover the original message [75]. What follows is an explanation of how the cryptosystem encrypts and decrypts information.

Encryption.
Equation (1) is used to convert a superincreasing knapsack series into a trapdoor knapsack sequence. Take, for example, the following example, which assumes a superincreasing sequence with parameters q and r; after applying equation (2), the trapdoor knapsack sequence is formed as indicated below. Suppose that the trapdoor knapsack sequence A = {20, 30, 70, 150, 290, 143, 236, 482} is used to encrypt the message "CAT". Table 1 shows the 8-bit ASCII encoding applied to this word. The bits associated with each letter are multiplied by the corresponding trapdoor element, and the products are added to create the encrypted text (cipherText). For example, the cipherText for the letter "C" is obtained by multiplying the first bit (valued 0) by the first trapdoor element (valued 20) to produce 0, the second bit (valued 1) by the second trapdoor element (valued 30) to produce 30, and so on. The last column of Table 1 shows the cipherText for each character. The last stage is sending the cipherTexts to the receiver, who has the private key and can thus decrypt them to read the original message.
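The encryption step above can be sketched in Python (a minimal illustration, not the paper's implementation; it assumes the trapdoor sequence from the example and an MSB-first 8-bit ASCII encoding, as in Table 1):

```python
# Trapdoor knapsack sequence from the paper's example.
A = [20, 30, 70, 150, 290, 143, 236, 482]

def encrypt_char(ch, trapdoor):
    """Multiply each bit of the 8-bit ASCII code (MSB first) by the
    corresponding trapdoor element and sum the products."""
    bits = [int(b) for b in format(ord(ch), "08b")]
    return sum(b * a for b, a in zip(bits, trapdoor))

ciphertexts = [encrypt_char(c, A) for c in "CAT"]
print(ciphertexts)  # 'C' = 01000011 -> 0*20 + 1*30 + ... + 1*236 + 1*482 = 748
```

Each character thus maps to one integer, and the sequence of integers is the transmitted cipherText.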

Decryption.
In this stage, we work backward from the computed cipherText to identify the bit representation that produced it. The specifics of how to use a private key to decrypt the cipherText are covered in [4]. If the private key is not disclosed, is it still possible to approximate the original message? In this paper, we present a binary variant of a newly proposed metaheuristic algorithm, namely the nutcracker optimization algorithm (NOA), to show its ability to estimate the bit representation that could reproduce the cipherText from the given information [69,75].

MHKC's Objective Function.
Metaheuristic techniques always employ an objective function, also known as a fitness function, to evaluate the quality of each solution within the population; here, it measures how well a candidate plaintext reproduces the cipherText. We therefore utilize a penalty function as this problem's objective function, scaling each individual's fitness score between 0 and 1, where 1 indicates that the individual was able to reach the target sum over the trapdoor knapsack sequence. This function eliminates infeasible solutions whose total is bigger than the cipherText by assigning them lower fitness. For the proposed algorithm, the following mathematical expression represents the penalty function used as the objective function [4,76], where T indicates the target sum (cipherText) and x_i is a vector containing the binary values of the ith solution.
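As a hedged sketch, a penalty objective matching this description might look as follows in Python; the exact penalty form below is an assumption for illustration, not necessarily the formula of [4,76]:

```python
def mhkc_fitness(x, A, T):
    """Penalty-style objective for attacking MHKC (assumed form).
    Scores lie in [0, 1]; 1 means the candidate bit vector x reproduces
    the target cipherText T exactly, and totals exceeding T are penalized."""
    s = sum(a * b for a, b in zip(A, x))
    if s > T:                               # infeasible: total exceeds the cipherText
        return max(0.0, 1.0 - (s - T) / T)  # penalized toward 0
    return s / T                            # feasible: closer to T -> closer to 1
```

With the example trapdoor sequence, the bit vector of "C" (01000011) sums to exactly 748 and so scores 1.0.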

Feature Selection Problem.
Feature selection is a crucial stage for machine-learning techniques to prepare data prior to the learning process and achieve maximum precision. Due to their high processing cost and lack of precision, standard methods cannot be used to solve this problem. Thanks to their capacity to solve many NP-hard problems in an acceptable amount of time, metaheuristic algorithms are the most viable choice in this situation. Before beginning the optimization process, metaheuristic algorithms randomly distribute N solutions, each with d binary-valued features, where 0 indicates an unselected dimension and 1 a selected dimension. FS belongs to the multiobjective problem class since it has two conflicting objectives: maximizing accuracy and minimizing the number of selected features. However, the vast majority of metaheuristic algorithms, including the proposed approach, were designed for single-objective problems and are therefore inapplicable to multiobjective situations. Numerous studies approached this issue in two distinct ways: Pareto optimality, and weighting variables that combine the two aims into one. In our work, the second method is used to transform the multiobjective FS into a single objective by utilizing a weighting variable α, which takes a value between 0 and 1 based on the preference of one objective over the other. The objective function used with the suggested algorithm to locate the smallest subset of features that simultaneously maximizes accuracy is fitness = α · c_R(D) + β · |S|/|N|, where c_R(D) is the classification error rate returned by the K-nearest neighbor algorithm after separating the dataset into training and testing sets using the holdout method, |S| indicates the number of selected features, |N| refers to the total number of features in the dataset, and β is computed as β = (1 − α).
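The weighted objective above can be sketched as follows (a minimal Python illustration; the default α = 0.99 is a common choice in the FS literature, not necessarily this paper's setting):

```python
def fs_fitness(error_rate, mask, alpha=0.99):
    """Weighted single-objective FS fitness (to be minimized):
    fitness = alpha * error_rate + beta * |S| / |N|, with beta = 1 - alpha.
    'mask' is the binary feature-selection vector; |S| counts selected
    features and |N| is the total number of features."""
    beta = 1.0 - alpha
    n_selected = sum(mask)
    return alpha * error_rate + beta * n_selected / len(mask)
```

In a wrapper setting, `error_rate` would come from a classifier such as KNN evaluated on a held-out split; smaller fitness means higher accuracy with fewer features.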

0-1 Knapsack Optimization Problem (KP01).
KP01 assumes you have a knapsack of capacity c and want to carry n objects, each of which has its own profit pr and weight w. The goal is to maximize the total profit of the carried objects while keeping their total weight within the capacity c. The objective function for this problem is mathematically formulated as follows: maximize Σ_{z=1}^{n} pr_z · x_z, subject to Σ_{z=1}^{n} w_z · x_z ≤ c, where x_z ∈ {0, 1} indicates whether the zth object is selected.
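A minimal Python sketch of this objective, with an assumed zero-profit penalty for capacity violations (the paper may instead use a repair or penalty scheme):

```python
def kp01_value(x, profits, weights, c):
    """KP01 objective: total profit of the selected items; a candidate
    that violates the capacity constraint is scored 0 (assumed penalty)."""
    total_w = sum(w for w, xi in zip(weights, x) if xi)
    total_p = sum(p for p, xi in zip(profits, x) if xi)
    return total_p if total_w <= c else 0

# Tiny instance: three items (profit, weight), capacity 10.
profits, weights, c = [10, 7, 12], [6, 3, 7], 10
print(kp01_value([1, 1, 0], profits, weights, c))  # weight 9 <= 10 -> profit 17
```

A binary optimizer searches over the bit vectors x for the one maximizing this value.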

Nutcracker Optimization Algorithm (NOA)
Recently, a new optimization method, namely the nutcracker optimization algorithm (NOA), has been proposed for tackling continuous optimization problems [31]. NOA simulates the foraging, storage, and pine-seed retrieval behaviors of nutcrackers. In general, nutcrackers exhibit two main behavior patterns, each occurring at a different time of year. The first, prevalent in the warmer months, sees the nutcracker gathering seeds and storing them for the winter. The second, during the winter and spring months, is to search for the concealed caches, which were marked from different angles using various objects or markers as reference points. If the nutcrackers cannot reach the seeds in their caches, they forage for food by randomly probing the search area. The mathematical model of NOA, based on two main strategies, namely the foraging-and-storage strategy and the cache-search-and-recovery strategy, is formulated as follows.

Foraging and Storage Strategy.
This strategy can be broken down into two primary stages, called foraging and storing, respectively, which are explained in more detail as follows.

Foraging Stage: Exploration Phase 1.
At this phase, the nutcrackers begin by randomly assuming their initial positions inside the search space. Each nutcracker begins by examining the original position of the cone containing the seeds. If the nutcracker discovers viable seeds, it transports them to the storage region and buries them there. If it is unable to locate viable seeds, it searches for another cone at a different location within pine trees or other trees. This behavior can be mathematically represented using the following position update strategy: where X_i^{t+1} refers to the new solution of the ith nutcracker at iteration t + 1; X_{i,j}^t refers to the jth dimension of the ith solution at iteration t; L_j and U_j stand for the lower and upper bounds of the jth dimension of the optimization problem; c is a numerical value generated randomly using the Lévy flight; X_{best,j}^t represents the jth position of the best-to-date solution; A, B, and C are the indices of three solutions selected randomly from the population to search globally for a high-quality food source; τ_1, τ_2, r, and r_1 are real numbers generated randomly in the range [0, 1]; X_{m,j}^t represents the mean of all solutions in the current population for the jth dimension at iteration t; and μ is a numerical value estimated according to the following equation: where r_2 and r_3 are real numbers generated in the range [0, 1], τ_4 is a random number drawn from the normal distribution, and τ_5 is a random number based on the Lévy flight.

Storage Stage: Exploitation Phase 1.
Nutcrackers begin by carrying the food acquired in the previous phase, exploration phase 1, to temporary storage areas. In this first phase of exploitation, the nutcrackers harvest pine seed crops and store them. The mathematical expression for this behavior is as follows: where λ is a random number based on the Lévy flight, and l is a factor linearly decreased from 1 to 0 to diversify the exploitation operator. The following formula governs the exchange between the foraging and storage phases to preserve an equilibrium between the exploration and exploitation operators during the optimization process: where φ is a number generated randomly in the interval [0, 1], and Pa1 is a probability value that decreases linearly from 1 to 0.
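The exchange rule can be sketched as follows (Python; the linear decrease of Pa1 follows the description above, while the exact schedule is an assumption for illustration):

```python
def noa_phase(phi, t, t_max):
    """NOA phase switch sketch: Pa1 decreases linearly from 1 to 0 over
    the run; a random phi < Pa1 selects the foraging (exploration) phase,
    otherwise the storage (exploitation) phase."""
    pa1 = 1.0 - t / t_max  # linearly decreasing probability (assumed schedule)
    return "foraging" if phi < pa1 else "storage"

# Early in the run exploration dominates; late in the run exploitation does.
print(noa_phase(0.5, t=10, t_max=100))  # pa1 = 0.9 -> "foraging"
print(noa_phase(0.5, t=90, t_max=100))  # pa1 = 0.1 -> "storage"
```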

Cache-Search and Recovery Strategy.
This strategy can be broken down into two stages, the cache-search and recovery phases, which are discussed in depth in the next two sections.

Cache-Search Stage: Exploration Phase 2.
In NOA, each nutcracker in the population uses two reference points (RPs) for each cache; these signals are defined using the following matrix: where RP_{i,1}^t indicates the first RP of the ith nutcracker at iteration t. Two different equations were developed to produce the first and second RPs, respectively, to enhance the nutcracker's exploration while looking for concealed caches. The first RP is generated according to equation (12), while the second RP is computed according to the following equation: where X_i^t refers to the cache of the ith nutcracker at iteration t; L and U refer to the lower and upper boundaries, respectively; r_2 is a randomly generated vector between 0 and 1; X_A^t is a solution selected randomly from the current population; θ is a number generated randomly between 0 and π; P_rp is a probability used to determine the percentage of the exploration operator; and α is computed according to equation (15) to ensure that NOA converges regularly.
where T_max and t stand for the maximum number of generations and the current generation, respectively. In NOA, all nutcrackers utilize the exploration process to locate the most promising regions that may contain a near-optimal solution. The algorithm seeks and exploits areas surrounding caches with suitable RPs to avoid being stuck in local minima. The position of a nutcracker is updated based on the first RP using the following equation:

Recovery Stage: Exploitation Phase 2.
The first possibility is that a nutcracker recalls the position of its cache using the first RP. The mathematical model of this behavior is as follows: where X_{i,j}^t represents the jth dimension of the ith solution at iteration t; r_1, r_2, τ_3, and τ_4 are randomly generated numbers between 0 and 1; and C represents the index of a randomly chosen solution from the population. The second possibility is that the nutcracker forgets where it hid its food using the first RP, in which case it will use the second RP to search for it. A nutcracker memorizes many of the RPs laid down during early storage. Nutcrackers usually recover caches on the first attempt (with the first reference point), but the proposed algorithm also takes into account the likelihood of failing on that attempt. A nutcracker that cannot reach its cache based on nearby landmarks, which may vanish when the weather changes between autumn and winter, will activate its spatial memory of the second RP according to the following formula: In NOA, it is assumed that such a nutcracker will use the second RP to locate its cache, so equation (17) is updated based on the second RP: where r_1, r_2, τ_5, and τ_6 are numerical values generated randomly in the interval [0, 1]. The tradeoff between the first and second RPs within the recovery behavior is achieved according to the following equation: where τ_7 and τ_8 are numbers generated randomly between 0 and 1. A nutcracker that remembers the hidden store is represented by the first case in the previous equation, whereas one that forgets it is represented by the second case. Activating the spatial memory between the first and second RPs, as well as the current position, to search for the cache is achieved according to the following equation: equation (17), otherwise.
Finally, the recovery and cache-search stages are randomly exchanged according to the following formula to balance exploitation and exploration: where Pa2 is a predefined value between 0 and 1 that determines the probability of the exploitation stage within the optimization process. However, the nutcrackers in the NOA model remain in their current position if the quality of their current solution is superior to that of the new solution. In conclusion, the preceding concept can be expressed by equation (23). Finally, the pseudo-code of NOA is summarized in Algorithm 1.

The Proposed Algorithms: BNOA and BINOA
This section clarifies how the classical NOA is adapted for tackling binary optimization problems, and explains the improvement strategy integrated with the binary variant of NOA (BNOA). The main steps of the proposed algorithms are as follows. After randomly initializing N solutions with binary values, as represented in Figure 1, and evaluating each solution to determine the best-to-date solution with the fittest objective value, the optimization process begins searching for a superior solution. However, because the solutions created by metaheuristic algorithms are continuous, they cannot be directly applied to binary problems. Therefore, two well-known families of transfer functions are used to convert these continuous values into binary values so that they can be applied to these problems. These transfer functions are described in depth in the next section.

Transfer Functions.
The majority of proposed metaheuristic algorithms are designed to handle continuous problems, so they are inapplicable to binary problems like attacking the MHKC [77,78]. Therefore, two well-known transfer function families, S-shaped and V-shaped, mathematically given in Table 2, are used to normalize the continuous solutions between 0 and 1. Then, according to equation (24), these normalized values are mapped to 0 or 1 to become applicable to binary problems [78]. Figure 2 illustrates the distinction between S-shaped and V-shaped transfer functions. These transfer functions are used in conjunction with the classical NOA to present a binary variant, namely BNOA, applicable to binary optimization problems. The pseudo-code of BNOA is presented in Algorithm 2.
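As a hedged illustration, the sigmoid S2 function and one common V-shaped form, together with an equation (24)-style stochastic thresholding (assumed form, since the equation itself is not reproduced here), can be written as:

```python
import math

def s2(v):
    """S2 transfer function (standard sigmoid), the best performer in
    the paper's experiments."""
    return 1.0 / (1.0 + math.exp(-v))

def v1(v):
    """One common V-shaped transfer function, |tanh(v)| (assumed form;
    several V-shaped variants exist in the literature)."""
    return abs(math.tanh(v))

def binarize(v, tf, r):
    """Map a continuous value v to {0, 1}: normalize it with the transfer
    function tf, then threshold against a uniform random number r."""
    return 1 if tf(v) > r else 0

print(binarize(2.0, s2, 0.5))  # s2(2.0) ~ 0.88 > 0.5 -> 1
```

In BNOA, this binarization is applied dimension-wise to each continuous position vector before the fitness evaluation.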
6.3. Uniform Crossover Operator. In the proposed algorithm, this operator is based on two solutions: the first is the best-so-far solution, and the second, X_r^b, is selected randomly from the best binary local solutions of all individuals in the current population. It then generates two offspring based on a randomization process that defines which positions of the first solution will be exchanged with the identical positions of the second solution. This operator is depicted in Figure 3 for clarity's sake. In this figure, the binary values at the blue cells of each solution are transferred to the new offspring as they are, while the other cells are exchanged between the two solutions.
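A minimal Python sketch of this uniform crossover (the random mask mechanics follow the description above; variable names are ours, not the paper's):

```python
import random

def uniform_crossover(best, local_best, rng=random):
    """Uniform crossover between the best-so-far solution and a randomly
    selected binary local best. A random 0/1 mask decides, position by
    position, whether the parents keep their own bit (mask 1) or
    exchange bits (mask 0), producing two offspring."""
    mask = [rng.randint(0, 1) for _ in best]
    child1 = [b if m else l for b, l, m in zip(best, local_best, mask)]
    child2 = [l if m else b for b, l, m in zip(best, local_best, mask)]
    return child1, child2
```

Each offspring bit always comes from one of the two parents, so feasibility-preserving repairs (if any) would be applied afterwards.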
In the first fold, the solution obtained using either the DPS or the uniform crossover operator is mutated a predefined number of times by the flip and swap operators in an attempt to improve its quality. However, the first fold may not produce a superior solution, because the optimal solution may require only minor modifications to the parent. Therefore, the second fold is designed to be based only on the flip and swap operators. The second fold is applied only to the best-so-far solution and the local best binary solutions, based on our hypothesis that those solutions are the closest to the desired optimal solution. The exchange between the two folds is achieved randomly, as described in Algorithm 3. In general, Algorithm 3 accepts as inputs the best-to-date solution X_best, the randomly selected binary local solution X_r^b, the randomly selected binary solution X_r^cb, the maximum number of flip and swap trials for the first fold µ, and the maximum number of trials for the second fold µ1. In Lines 1-20 of this algorithm, two random numbers r and r1 are generated; if r is less than r1, the first fold is invoked to generate a new binary solution in an attempt to improve the exploration operator of BNOA. In Lines 21-39, if r is greater than r1, the flip and swap operators are executed, with a predefined maximum number of trials, on a solution selected randomly from X_r^b and X_best as an attempt to improve the convergence speed toward the near-optimal solution. This operator is integrated with BNOA to propose a new variant, namely BINOA. The final pseudo-code of this improved variant is presented in Algorithm 4 and depicted in Figure 5.
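The flip and swap operators themselves can be sketched as follows (assumed standard forms of these mutation operators; the paper's exact trial loop is given in Algorithm 3):

```python
import random

def flip(x, rng=random):
    """Flip mutation: invert one randomly chosen bit of the solution."""
    y = x[:]
    i = rng.randrange(len(y))
    y[i] = 1 - y[i]
    return y

def swap(x, rng=random):
    """Swap mutation: exchange the values at two randomly chosen
    positions (the multiset of bits is preserved)."""
    y = x[:]
    i, j = rng.randrange(len(y)), rng.randrange(len(y))
    y[i], y[j] = y[j], y[i]
    return y
```

In BINOA, such mutations would be retried up to µ (or µ1) times and a candidate kept only if it improves the fitness.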

Experimental Findings for Binary Problems
Three binary problems, namely feature selection (FS), the 0-1 knapsack problem (KP01), and cryptanalysis of the Merkle-Hellman cryptosystem (MHKC), are investigated in this section to illustrate how well the proposed BINOA performs in comparison to the classical BNOA and other optimization algorithms. All algorithms are implemented on the MATLAB platform with the same population size and maximum number of function evaluations, on a device with the following specifications: 32 GB of RAM, an Intel Core i7 2.40 GHz processor, and Windows 10.

Parameter Settings.
BINOA has five parameters that need to be identified accurately prior to starting the optimization process to maximize its performance. Two of these parameters belong to the newly proposed binary search strategy, namely μ and μ1, and are estimated here through extensive experiments under numerous values, as depicted in Figure 6. This figure discloses that the optimal value for both parameters is 1. The remaining parameters are inherited from the classical NOA and are also estimated here through several experiments under various values, as shown in Figure 7. This figure shows that the optimal values for these parameters, namely δ, Pa2, and Prp, are 0.1, 0.05, and 0.05, respectively.

Investigation of Various Transfer Functions.
This section examines the performance of the S-shaped and V-shaped transfer functions for the binary variant of NOA on small- and medium-scale KP01 instances in order to determine how well each one works. Accordingly, each transfer function is run 25 times independently, and the average fitness value is displayed in Figure 8. This figure illustrates that the transfer function S2 is the best, followed by S1 as the second best, while S4 and V1 are the worst.
In the subsequent experiments, S2 is used in conjunction with BINOA to demonstrate its effectiveness in comparison to a number of other binary optimizers on small- and medium-scale KP01 instances.
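The S-shaped and V-shaped families compared above are the transfer functions commonly used in the binarization literature. The sketch below shows the standard forms of S2 (the sigmoid) and V2, together with the usual binarization rules for each family; the helper names and the exact update rules are illustrative assumptions, not the paper's code.

```python
import math
import random

def S2(x):
    """S2: the standard sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def V2(x):
    """V2: absolute hyperbolic tangent transfer function."""
    return abs(math.tanh(x))

def binarize_s(x_cont, tf=S2):
    """S-shaped rule: set each bit to 1 with probability tf(x)."""
    return [1 if random.random() < tf(x) else 0 for x in x_cont]

def binarize_v(x_cont, x_bin, tf=V2):
    """V-shaped rule: flip the current bit with probability tf(x)."""
    return [1 - b if random.random() < tf(x) else b
            for x, b in zip(x_cont, x_bin)]
```

Note the qualitative difference: an S-shaped rule maps the continuous value directly to a bit, while a V-shaped rule decides whether to flip the bit the solution already has, which preserves more of the current solution when the continuous step is small.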

Performance Evaluation Using Small-and Medium-Scale Instances.
Here, we provide the results of the experiments conducted to assess BINOA's efficacy in solving the small- and medium-scale KP01 instances. Tables 4-6 report the results of 25 replicated runs of each technique on these instances, including the best, average, worst, and SD values. Tables 4 and 6 show that the best, average, worst, and SD values obtained by the proposed BINOA are competitive with those of the various competing optimizers for the small-scale instances. In contrast, for the medium-scale instances reported in Table 5, BINOA performs much better on the majority of the performance indicators across all investigated instances. In addition, the Wilcoxon rank-sum test was used to determine the p value between the outcomes of BINOA and those of each compared algorithm in order to examine the difference between them. Tables 4 and 5 detail the outcomes of this test. Examining these tables demonstrates competitive performance for the small-scale instances and BINOA's preeminence for the medium-scale instances. In addition, the convergence curve is used to analyze the effectiveness of the various approaches in reaching the optimal fitness value more quickly. The average convergence values over 25 replicated runs are displayed in Figure 10, which reveals that the proposed optimizer reaches the highest fitness value for all displayed KP01 instances faster than all competing optimizers. In summary, the proposed method is shown to be the most promising in terms of both the convergence curve and the final accuracy for small- and medium-scale instances.
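The Wilcoxon rank-sum comparison used above can be reproduced with a short self-contained routine. The sketch below (`ranksum_p` is a hypothetical helper, not a library call) uses the normal approximation with average ranks for ties; it omits the continuity and tie-variance corrections that a full statistics package would apply, so its p values are approximate.

```python
import math

def ranksum_p(a, b):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation
    (average ranks for ties; no continuity or tie-variance correction)."""
    n1, n2 = len(a), len(b)
    pooled = sorted(list(a) + list(b))
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0  # average 1-based rank of the tie group
        i = j
    w = sum(rank[v] for v in a)              # rank sum of the first sample
    mu = n1 * (n1 + n2 + 1) / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
```

Two identical samples give a p value of 1 (no detectable difference), while two clearly separated samples of 25 runs each, as used in the tables, yield a small p value indicating a significant difference.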

Performance Evaluation Using Large-Scale Instances.
This section first investigates the performance of the various transfer functions for the binary variant of NOA on the large-scale KP01 instances to show how well each one performs. Therefore, each transfer function is executed 25 independent times, and the average fitness value is reported in Figure 11. This figure reveals the robust performance of the V-shaped functions, especially V4, and the poor performance of the S-shaped functions, which could not satisfy the knapsack capacity constraint. As a result, in the following experiments, V2 is used in conjunction with BINOA to demonstrate its effectiveness in comparison to several other binary optimizers on large-scale instances.
For large-scale instances with varying correlations between profits and weights, additional experiments are provided here to verify the superiority of the proposed BINOA. The proposed algorithm is compared to four robust binary optimizers, namely BEO, BDE, BMRFO, and BWOA, to reveal its efficacy on these instances. These algorithms are chosen for comparison because they have already been proven to perform well on both small- and medium-scale instances. The average fitness value and the p value obtained by the Wilcoxon rank-sum test after 25 replications of each method on these datasets are shown in Figure 12 and Table 7, respectively. These results show that BINOA is statistically and practically superior to all the other optimizers. Unfortunately, however, for the large-scale datasets it could not reach the known optimal values due to a lack of population diversity, which is its main limitation and will be tackled in the future.

Cryptanalysis of the Merkle–Hellman Knapsack Cryptosystem (MHKC).
The efficiency of the proposed approach in breaking the MHKC with an 8-bit knapsack size is analyzed here. Each algorithm is run 30 times with a population size of 30 and a maximum number of iterations of 30 under the same environmental conditions as previously mentioned. The settings for the competing algorithms are taken from the cited works, and the proposed algorithm's parameters are set according to the previous recommendations. The performance of BNOA on the various binary optimization problems is maximized by selecting the transfer function best adapted to the nature of each problem. Consequently, each transfer function in conjunction with BNOA is run 30 times independently, and the average fitness value is displayed in Figure 13. This figure illustrates the superior performance of S4, V2, and V4. In the subsequent studies, S4 is used in conjunction with BINOA to compare its performance to that of several other binary optimizers for breaking the MHKC with an 8-bit knapsack size.

Application II: Merkle-Hellman Knapsack Cryptosystem
This message will be sent to its destination in the form of the ciphertext shown in Table 8 once it has been encrypted using the above information. After running each algorithm 30 independent times, the average fitness value, average SD, and convergence speed for each character were determined and displayed in Figure 14. The results in this figure show that the proposed algorithm provides far better performance on this message than any of the other approaches.

Test Case 2: MACRO Message under 8-Bit.
In addition, the suggested algorithm is evaluated using the "MACRO" message, which has been extensively utilized in the literature to test newly proposed optimization algorithms. The MHKC encrypts and decrypts this message using the key material given in [70]. The ciphertext for each character, as shown in Table 9, is sent to the recipient after the message has been encrypted using this information; the encrypted message would be unreadable even if the ciphertext were intercepted. Extensive experiments were conducted using the proposed algorithm and seven other metaheuristic algorithms to see whether the plaintext could be deduced from the ciphertext, thereby testing the effectiveness of the MHKC at protecting this message. The experimental results are shown in Figure 15, which reveals that BINOA performs better than the other methods in terms of average fitness, convergence curve, and SD.
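The encryption and the attack objective can be sketched as follows. Note that the key below is the well-known textbook Merkle-Hellman example, not the key material of [70], which is not reproduced here; the function names and the absolute-distance fitness are illustrative assumptions in line with common MHKC cryptanalysis formulations.

```python
import math

def mh_public_key(w, q, r):
    """Merkle-Hellman public key: b_i = r * w_i mod q, where w is a
    superincreasing private sequence, q > sum(w), and gcd(r, q) = 1."""
    assert q > sum(w) and math.gcd(r, q) == 1
    return [(r * wi) % q for wi in w]

def mh_encrypt_char(ch, pub):
    """Ciphertext of one 8-bit character: the sum of the public-key
    entries selected by the character's bits (MSB first)."""
    bits = [(ord(ch) >> (7 - i)) & 1 for i in range(8)]
    return sum(b * p for b, p in zip(bits, pub))

def attack_fitness(bits, pub, cipher):
    """Cryptanalysis objective (to be minimized): absolute distance
    between a candidate's ciphertext and the intercepted one."""
    return abs(sum(b * p for b, p in zip(bits, pub)) - cipher)
```

With the textbook key w = (2, 3, 7, 14, 30, 57, 120, 251), q = 491, r = 41, the character 'a' encrypts to 881, and only the true bit pattern attains a fitness of 0, which is exactly what the binary optimizer searches for per character.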
The recipient receives the ciphertext for each character of the message encrypted with the above data, as shown in Table 10. After running each technique 30 times on each character, the performance metrics were calculated and are shown in Figure 16, clearly indicating that BINOA is the best compared to the others.

Comparison between BNOA and BINOA.
Here, we discuss the outcomes of several experiments conducted to evaluate the effectiveness of the traditional BNOA and BINOA at attacking various messages encrypted under the MHKC, as depicted in Figure 17. This figure compares the average fitness of BNOA and BINOA on three messages to determine which algorithm is superior. As shown in this figure, BINOA outperformed the conventional BNOA in every reported instance.

Application III: Feature Selection Problem.
In this section, the binary variant of the classical NOA is compared to the improved one to disclose their potential for solving the feature selection problem using eleven datasets taken from the UCI repository and described in Table 11.

Investigation of Various Transfer Functions.
As described before, the performance of BNOA depends significantly on the transfer function used. Therefore, in this section, the various V-shaped and S-shaped transfer functions are investigated to see which one helps BNOA achieve better performance in feature selection. First, BNOA in conjunction with each transfer function is executed 20 independent times on three instances (ID#1, ID#2, and ID#3), and the average fitness value is computed and reported in Figure 18. This figure reveals that V3 fulfills better outcomes on two instances, followed by V2 as the second best. Therefore, in the next experiments, V3 is used in conjunction with both BNOA and BINOA to show their performance in feature selection. The comparison between the two variants on all datasets is reported in Figure 19. This figure confirms that BINOA outperforms the classical BNOA on all reported datasets, demonstrating the superior effectiveness of the improved variant. Therefore, the classical BNOA benefits from our enhancements.
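The fitness value minimized in these feature selection experiments is commonly a weighted combination of the classification error and the fraction of selected features. The sketch below uses this standard wrapper formulation with a plain 1-NN classifier; the function name, the choice of classifier, and the weight α = 0.99 are assumptions, as the paper does not specify them here.

```python
def fs_fitness(mask, X_train, y_train, X_test, y_test, alpha=0.99):
    """Wrapper FS objective (to be minimized):
    alpha * classification error + (1 - alpha) * selected / total,
    evaluated with a dependency-free 1-NN classifier."""
    feats = [i for i, b in enumerate(mask) if b]
    if not feats:
        return 1.0  # an empty feature subset is the worst possible
    def dist(a, b):
        # squared Euclidean distance over the selected features only
        return sum((a[i] - b[i]) ** 2 for i in feats)
    errors = 0
    for x, y in zip(X_test, y_test):
        nearest = min(range(len(X_train)), key=lambda j: dist(x, X_train[j]))
        errors += (y_train[nearest] != y)
    err = errors / len(y_test)
    return alpha * err + (1 - alpha) * len(feats) / len(mask)
```

The large α makes accuracy dominate, so two subsets with equal error are ranked by size, which is what drives the optimizer toward small, accurate feature subsets.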

Comparison Using Various Performance Indicators.
All algorithms are compared based on the average values of four performance metrics obtained over 20 independent runs, as reported in Figure 20. From observing this figure, BINOA is competitive with GAT in terms of classification accuracy and fitness value, and superior in terms of the other metrics. Compared to the other algorithms, BINOA significantly outperforms all of them in terms of classification accuracy and fitness value, but its performance is slightly poorer than that of BMPA for the average number of selected features and the average SD.

Conclusion and Future Work
For binary optimization problems, this article introduces a new binary variant of a nature-inspired metaheuristic algorithm called the nutcracker optimization algorithm (NOA). This variant, dubbed binary NOA (BNOA), uses two families of transfer functions, the S-shaped and V-shaped transfer functions, in order to bridge the gap between the continuous nature of the classical NOA and the discrete nature of binary problems. Another binary improvement of NOA is presented in this study by employing a local search approach predicated on efficiently integrating some genetic operators into the exploitation and exploration of BNOA; this improved variant is abbreviated BINOA. Three common binary optimization problems (feature selection, the 0-1 knapsack problem, and cryptanalysis of the Merkle-Hellman knapsack cryptosystem (MHKC)) are used to compare BNOA and BINOA to numerous state-of-the-art binary optimizers in terms of statistical information, statistical tests, and convergence speed. The experimental results demonstrate that BINOA is superior to the classical BNOA and the other competing optimizers when it comes to solving the 0-1 knapsack problem and attacking the MHKC, and is on par with other algorithms, such as the genetic algorithm, for feature selection. The main limitation of the proposed algorithm is that it could not reach the known optimal values for large-scale optimization problems due to the lack of population diversity, which prevented it from examining a sufficiently wide range of possible solutions. In the future, we will utilize Pareto optimality to address the feature selection problem and to optimize both the classification accuracy and the number of selected features, which are conflicting objectives.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.