Hybrid Hypercube Optimization Search Algorithm and Multilayer Perceptron Neural Network for Medical Data Classification

The hypercube optimization search (HOS) approach is a new efficient and robust metaheuristic algorithm that simulates the dove's movement in quest of new food sites in nature, utilizing hypercubes to depict the search zones. In medical informatics, the classification of medical data is one of the most challenging tasks because of the uncertainty and nature of healthcare data. This paper proposes the use of the HOS algorithm for training multilayer perceptrons (MLP), one of the most extensively used neural networks (NNs), to enhance its efficacy as a decision support tool for medical data classification. The proposed HOS-MLP model is tested on four significant medical datasets: orthopedic patients, diabetes, coronary heart disease, and breast cancer, to assess HOS's success in training MLP. For verification, the results are compared with eleven different classifiers and eight well-regarded MLP trainer metaheuristic algorithms: particle swarm optimization (PSO), biogeography-based optimizer (BBO), the firefly algorithm (FFA), artificial bee colony (ABC), genetic algorithm (GA), bat algorithm (BAT), monarch butterfly optimizer (MBO), and the flower pollination algorithm (FPA). The experimental results demonstrate that the MLP trained by HOS outperforms the other comparative models regarding mean square error (MSE), classification accuracy, and convergence rate. The findings also reveal that HOS helps the MLP produce more accurate results than other classification algorithms for the prediction of diseases.

Although various OAs have already been investigated for training MLP neural networks, because of the duality of OAs' exploration and exploitation capabilities, there is still room for new designs and upgrades to current ones [30]. Also, in training MLP, the issue of a slow convergence rate and trapping in local optima remains partially unsolved. The purpose of this study is to introduce a new optimization technique, called the hypercube optimization search (HOS) algorithm, for training MLP to present an improved classification approach for medical data by optimizing the MLP's weight and bias parameters. The HOS is recommended for training MLP to overcome the aforementioned challenges due to its outstanding performance in escaping local optima and its fast convergence speed [31,32]. Also, HOS has fewer parameters and is easy to use, simple in principle, and adaptable when compared to other swarm-based OAs.
This paper's contributions can be summed up as follows: (i) to propose a new stochastic learning approach for training MLPs, in order to boost the MLP's performance in the classification of health data; (ii) to evaluate HOS-MLP's performance on four important medical datasets: diabetes, breast cancer, coronary heart disease, and orthopedic patients, and compare its performance against eleven different classifiers and eight well-known OA-based MLP trainer techniques; (iii) to achieve better outcomes than previous studies with the suggested HOS-MLP in terms of mean square error (MSE), classification accuracy, and convergence rate.
This paper is structured as follows: Section 2 presents the MLP. The HOS algorithm is explained in Section 3, whereas the proposed HOS-MLP approach is introduced in Section 4. Section 5 shows the experimental results and discussion. Finally, Section 6 gives a conclusion as well as recommendations for further work.

Multilayer Perceptron Neural Network
The feedforward neural network (FNN) is one of the most prevalent forms of artificial neural network (ANN), and MLP is a well-known type of FNN that is widely used in solving realistic classification problems [10]. An MLP is made up of three groups of layers: input (i), hidden (j), and output (k). Each layer consists of a specific number of neurons, and each neuron has fully weighted connections with the neurons of the adjacent layer. A single-hidden-layer MLP network was used in this paper, as demonstrated in Figure 1.
Each neuron in the MLP carries out two functions: weighted summation and activation. The weighted sum for each hidden neuron j is calculated as

s_j = \sum_{i=1}^{n} w_{ij} x_i + \beta_j, (1)

where w_{ij} describes the connection weight, \beta_j is the bias term, x_i denotes input i, and n is the total number of inputs. In the second step, an activation function is applied to the outcome of equation (1) to calculate the neuron's output. The most commonly used sigmoid activation function was selected for the MLP [22,26]:

f(s_j) = \frac{1}{1 + e^{-s_j}}. (2)

Utilizing the results of the hidden neurons, the final outputs of the output neurons are computed as

y_k = f\left( \sum_{j=1}^{h} w_{jk} f(s_j) + \beta_k \right), (3)

where w_{jk} is the weight between hidden neuron j and output neuron k, \beta_k is the output-neuron bias, and h is the number of hidden neurons. MLP's performance depends highly on its weights and biases, and training an MLP aims to find their optimal values. MLP training is a challenging task that leads to a high-performance MLP [21]. The HOS algorithm is used as a training method for MLPs in the following sections.
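The two neuron functions described above (weighted summation, then sigmoid activation) can be sketched in a few lines of Python. This is a minimal illustration with hypothetical names and sizes, not the paper's implementation:

```python
import numpy as np

def sigmoid(s):
    """Sigmoid activation function used in the paper's MLP."""
    return 1.0 / (1.0 + np.exp(-s))

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a single-hidden-layer MLP:
    hidden: h_j = sigmoid(sum_i w_ij * x_i + beta_j)
    output: y_k = sigmoid(sum_j w_jk * h_j + beta_k)
    """
    h = sigmoid(W1 @ x + b1)      # hidden-layer outputs
    return sigmoid(W2 @ h + b2)   # output-layer values

# Tiny example: 2 inputs, 3 hidden neurons, 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)
y = mlp_forward(np.array([0.5, -0.2]), W1, b1, W2, b2)
```

Because the sigmoid squashes its input, each output lies strictly between 0 and 1, which is convenient for binary class labels.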

Hypercube Optimization Search Algorithm
The HOS algorithm, inspired by a dove's behavior in exploring new food zones, was proposed by Abiyev and Tunay for solving high-dimensional numerical problems [31]. The HOS algorithm is based on a randomly distributed set of points inside an m-dimensional hypercube (HC). HOS exhibits fast population convergence by shrinking the area of the HC at each iteration. The HOS algorithm consists of three stages: (A) the initialization process, (B) the displacement-shrink process, and (C) the searching-areas process. These stages can be described in detail as follows.

Stage A: Initialization Process.
The HOS algorithm begins with the initialization process, in which randomly generated points within a given HC form the candidate-solutions matrix. Several starting conditions should be computed in the initialization phase, including (1) the lower and upper boundaries (lb, ub), (2) the size (r_dim), (3) the central value (x_c), and (4) the dimension of the HC (m).
At the starting stage, the first HC is created by assigning random values to r_dim and x_c. The uniformly distributed N points x_i = (x_{i1}, x_{i2}, ..., x_{im}) are then randomly produced inside the HC. These points can also be represented as a matrix X of size (N × m). The upper and lower boundaries of the first HC are then calculated from the X matrix, and the r_dim and x_c of the next HC are determined using those boundaries. The X matrix is also utilized for evaluation, in which the best value of the fitness function F_best and the corresponding point x_best are determined within the population at the i-th iteration. The x_best point is then further improved by a local search governed by the fitness function F and a parameter ρ with 0 ≤ ρ ≤ 1 (the exact update formula is given in [31]).
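The initialization stage above can be sketched as follows. The function name and the sphere test function are illustrative assumptions; r_dim is treated as the per-dimension half-size of the hypercube:

```python
import numpy as np

def init_hypercube(x_c, r_dim, n_points, fitness, rng=None):
    """Initialization process of HOS (sketch): generate N uniformly
    distributed points inside the hypercube centred at x_c with
    per-dimension half-size r_dim, evaluate the fitness of each point,
    and return the population matrix X plus the best point/fitness."""
    if rng is None:
        rng = np.random.default_rng()
    m = len(x_c)
    X = x_c + rng.uniform(-1.0, 1.0, size=(n_points, m)) * r_dim
    F = np.array([fitness(x) for x in X])
    best = int(np.argmin(F))          # minimization problem
    return X, X[best].copy(), float(F[best])

# Example: 50 points in a 4-dimensional HC, sphere fitness function.
X, x_best, F_best = init_hypercube(
    np.zeros(4), np.full(4, 5.0), 50,
    lambda x: float(np.sum(x**2)),
    np.random.default_rng(1))
```

The returned x_best would then be refined by the local search before the displacement-shrink stage.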

Stage B: Displacement-Shrink Process.
The displacement-shrink phase aims to determine the center of the next (new) hypercube, x_c^new, and evaluate the fitness function. The center of the next hypercube is obtained as the average of the previous hypercube's center and the present best point x_best:

x_c^new = (x_c + x_best) / 2.

In this process, each iteration generates fresh data points, and the fitness function is evaluated. The hypercube size is then modified based on the evaluation results. This step is used as a conservative measure to reduce excessive variability in the search space: the size of the HC is decreased and the search space is reduced, which is called "shrinking." The density of the search points (population) increases as the hypercube size decreases. The movement of the best value governs the contraction; for smaller movements, the contraction is stronger. This ensures rapid convergence while also preventing the algorithm from becoming trapped at an undesirable (local) minimum. The algorithm cycles through a sequence of points starting from the current position to estimate the maximum distance. The value of F_best is first compared with F_mean = F((x_best + x_last-center) / 2). If F_mean is less than F_best in the given iteration, the displacement of x is computed and normalized twice at each iteration: to convert the displacement into unity-sided points, each element is first divided by the associated starting interval, and the result is then normalized again by dividing it by the diagonal of the hypercube, i.e., √m, which yields the normalized displacement d_nn. If F_mean in the specified iteration is greater than F_best, no x displacement occurs and d_nn is set to 1. If these conditions are not met, the searching-areas process is carried out in the next step.
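The centre update and the twice-normalized displacement described above can be sketched as below. The helper names are hypothetical, and `interval` stands for the initial per-dimension search interval (ub − lb):

```python
import numpy as np

def next_center(x_c, x_best):
    """Centre of the next hypercube: the average of the previous
    centre and the current best point."""
    return (x_c + x_best) / 2.0

def normalized_displacement(x_best, x_prev_best, interval, m):
    """Twice-normalised movement of the best point: each component is
    divided by the initial interval of its dimension (unity-sided
    points), and the resulting length is divided by the hypercube
    diagonal sqrt(m), giving a scalar d_nn in [0, 1]."""
    d = (x_best - x_prev_best) / interval
    d_nn = float(np.linalg.norm(d) / np.sqrt(m))
    return min(d_nn, 1.0)
```

A small d_nn means the best point barely moved, which (in Stage C) triggers a stronger contraction of the hypercube.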

Stage C: Searching Areas Process.
The searching-areas phase generates a new HC by assigning new values to r_dim and x_c according to the value of d_nn. If the condition 0 ≤ d_nn < 1 is satisfied, the convergence factor S is calculated and the values of r_dim and x_c are updated accordingly; the size of the HC is reduced by multiplying r_dim by the factor S. Otherwise, the size of the HC remains unchanged. HOS ensures the quick arrival of candidate solutions at a global minimum by reducing the area of the hypercube after each iteration. The entire procedure is repeated until particular termination criteria are met. The HOS algorithm is depicted in Figure 2. More details are provided in [31,32].
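A sketch of the shrinking rule follows. The exact formula for the convergence factor S is not reproduced from the source, so the form used here (S grows with d_nn, i.e., smaller movements contract harder) is an illustrative assumption:

```python
import numpy as np

def shrink(r_dim, d_nn, alpha=0.5):
    """Searching-areas step (sketch): when the normalised displacement
    d_nn lies in [0, 1), contract the hypercube size r_dim by a
    convergence factor S; otherwise leave it unchanged.

    ASSUMPTION: S = alpha + (1 - alpha) * d_nn is an illustrative
    choice, not the published formula -- it merely reproduces the
    stated behaviour that smaller movements give stronger shrinking."""
    if 0.0 <= d_nn < 1.0:
        S = alpha + (1.0 - alpha) * d_nn
        return r_dim * S
    return r_dim          # d_nn == 1: size unchanged
```

With alpha = 0.5, a stagnant best point (d_nn = 0) halves the hypercube, while d_nn = 1 leaves it untouched.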

HOS for MLP Training
The suggested HOS-MLP method, in which the HOS algorithm is utilized for training the MLP, is explained in detail in this section. Two important aspects are considered in the design of the method: (1) the representation of candidate solutions in HOS for training MLP and (2) the definition of a fitness function for solution assessment. The matrix-encoding approach is utilized in HOS-MLP to represent candidate solutions; each solution provides a set of values for the MLP's weight and bias parameters. A solution can be represented as

solution = [W_1, W_2', \beta_1, \beta_2],

where W_1 indicates the weight matrix of the linkages between the input and hidden neurons, and W_2' denotes the transpose of the weight matrix of the linkages between the hidden neurons and the output. The \beta_1 and \beta_2 represent the bias values for the hidden and output neurons, respectively. It is worth mentioning that the numbers of neurons in the input and output layers are specified by the dataset's total numbers of features and labels, while the Kolmogorov theorem is utilized to determine the number of neurons in the hidden layer (H):

H = 2 × n + 1,

where n is the number of input features. The MSE is utilized as the objective function for measuring the fitness of candidate solutions in the proposed HOS-MLP approach:

MSE = \frac{1}{N} \sum_{t=1}^{N} (y_t - \hat{y}_t)^2,

where y and \hat{y} symbolize the actual and predicted class labels, and N is the number of samples in the training data. The HOS-based MLP training approach is carried out in the following stages: (1) Initialization: within an HC, the initial solutions (points) are generated randomly; each solution represents possible values for the parameters of the MLP.
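The solution encoding and the MSE fitness can be sketched as below. The flat-vector layout [W1 | W2 | β1 | β2] and the helper names are assumptions for illustration; any fixed, consistent layout would work with HOS:

```python
import numpy as np

def decode(theta, n_in, n_hid, n_out):
    """Unpack one flat HOS solution vector into the MLP parameters
    (assumed layout: W1, then W2, then hidden biases, then output
    biases)."""
    i = 0
    W1 = theta[i:i + n_hid * n_in].reshape(n_hid, n_in); i += n_hid * n_in
    W2 = theta[i:i + n_out * n_hid].reshape(n_out, n_hid); i += n_out * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    b2 = theta[i:i + n_out]
    return W1, W2, b1, b2

def mse_fitness(theta, X, y, n_in, n_hid, n_out):
    """MSE objective: mean squared difference between the MLP's
    predictions and the actual class labels over the training set."""
    W1, W2, b1, b2 = decode(theta, n_in, n_hid, n_out)
    sig = lambda s: 1.0 / (1.0 + np.exp(-s))
    pred = sig(sig(X @ W1.T + b1) @ W2.T + b2)
    return float(np.mean((pred - y) ** 2))

# Example dimensions for a 4-feature binary dataset:
# H = 2*4 + 1 = 9 hidden neurons (Kolmogorov rule), 1 output.
n_in, n_hid, n_out = 4, 9, 1
dim = n_hid * n_in + n_out * n_hid + n_hid + n_out   # length of theta
```

HOS then searches this `dim`-dimensional space, calling `mse_fitness` as the fitness function F.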

Results and Discussions
In this section, the proposed HOS-MLP model is examined on four medical datasets: orthopedic patients (vertebral column) [33], diabetes [34], coronary heart disease (Saheart) [35], and Wisconsin breast cancer [36]. The characteristics of the medical datasets are summarized in Table 1.
All medical datasets are split into two parts: 66.66% of the data is used for the training set, and the remaining 33.33% is used for the test set. In this partitioning, stratified sampling is used to retain the initial class distribution in the training and testing sets. The algorithms were run 20 different times to produce statistically valid results. The min-max scaling method was utilized to standardize all feature values within the range [0, 1]:

x' = (x - x_min) / (x_max - x_min).

The suggested HOS-MLP is compared with eight well-known and recent OAs, including ABC [12], PSO [16], BAT [28], GA [37], BBO [38], the firefly algorithm (FF/FA) [39], monarch butterfly optimization (MBO) [40], and the flower pollination algorithm (FPA) [41]. For all OAs, the population size was set to 70, and the maximum number of iterations was set to 250 in all experiments. Two optional parameters in the HOS algorithm, tolF and tolX, were set to 1e-09 and 1e-01, respectively; they represent the relative tolerances on the fitness function and on the vector x used to stop the algorithm. The evaluation measures employed in this work are accuracy, MSE, box plots, and convergence rate. The rest of the parameters were set as suggested in [42].
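The preprocessing steps above (min-max scaling and a stratified 66.66/33.33 split) can be sketched as follows; the function names are illustrative, not from the paper:

```python
import numpy as np

def min_max_scale(X):
    """Rescale each feature column to [0, 1]:
    x' = (x - x_min) / (x_max - x_min)."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)  # guard constants

def stratified_split(y, train_frac=2/3, rng=None):
    """Return train/test index arrays such that each class keeps
    (approximately) the same proportion in both parts."""
    if rng is None:
        rng = np.random.default_rng()
    train, test = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        cut = int(round(train_frac * len(idx)))
        train.extend(idx[:cut]); test.extend(idx[cut:])
    return np.array(train), np.array(test)
```

Scaling before training keeps all features on a comparable scale, which matters for sigmoid-based MLPs; stratification prevents a class from being under-represented in either partition.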

Breast Cancer Dataset.
Many binary classification problems use accuracy and MSE metrics to show the model's ability to separate the two class labels. Table 2 summarizes the testing-set results for the suggested HOS-MLP model compared to other OA models from the literature. From Table 2 and Figures 4(a) and 4(b), it can be noticed that the suggested method performs considerably better than the other methods in terms of convergence rate. Although all algorithms achieved high ratios in terms of average accuracy, the suggested HOS-MLP shows reasonable and competitive results with the lowest average MSE (Figures 4(c) and 4(d)).

Diabetes Dataset.
The diabetes dataset evaluation results are illustrated in Table 3 and Figure 5. When the convergence curves in Figures 5(a) and 5(b) are compared to the other algorithms, the suggested strategy has a very high convergence rate, while most methods, such as GOA and ABC, stagnated after 98 iterations. The proposed approach displays the maximum ratios in terms of average and best accuracy (Table 3 and Figure 5(c)). The boxplot (Figure 5(d)) indicates that, while GOA has a more compact box, the proposed approach has the lowest error and acceptable stability.

Figure 3: The flowchart of the suggested HOS-MLP for medical data classification.

Saheart Dataset.
Comparing the HOS-MLP model with other OA models in Table 4, we obtained better accuracy and MSE. This observation proves that HOS-MLP can accurately model classification tasks. Figure 6 demonstrates the proposed HOS-MLP model's accuracy, MSE, convergence speed, and stability. In terms of convergence speed, Figures 6(a) and 6(b) illustrate that, relative to the other algorithms, the proposed MLP-based trainer has a very fast convergence rate and the smallest average MSE (see Figure 6(d)). The suggested strategy also produces improved performance relative to the other methods in terms of average accuracy (Figure 6(c)).

Vertebral Dataset.
The results of the evaluations for the vertebral dataset are shown in Table 5 and Figure 7. For this dataset, the evaluation results of all MLP trainers were very close and competitive, but our proposed approach showed much faster convergence, as can be seen in Figures 7(a) and 7(b). The boxplot (Figure 7(d)) also confirms that the proposed approach has the smallest MSE. Moreover, our suggested algorithm obtained outstanding performance in terms of worst, average, and best accuracy (Table 5 and Figure 7(c)). The average classification accuracy of eleven different classifiers on the four medical datasets is shown in Table 6 and Figure 8. These classifiers are naïve Bayes (NB), Bayes network learning (BayesNet), support vector machine (SVM) [43,44], MLP using backpropagation (MLP), K-nearest neighbor (KNN), AdaBoostM1 [45], bagging, the fuzzy lattice reasoning (FLR) classifier, random forest (RF) [46], the fuzzy unordered rule induction algorithm (FURIA), and the logistic model tree (LMT). As shown in Table 6 and Figure 8, the proposed algorithm has the best performance among the eleven algorithms on three of the medical datasets. For the diabetes dataset, the proposed HOS-MLP ranked 4th, after SVM, BayesNet, and LMT.
Overall, the experimental findings demonstrate that the MSE results of the proposed HOS-MLP are substantially better than those of other MLP-based optimization techniques for all medical datasets. The outstanding advantage of HOS is that it can achieve accurate results with a significantly higher convergence rate than other existing methods. However, some parameters in HOS must be adjusted, and some elements of HOS could be tweaked to increase the algorithm's classification accuracy on certain datasets.

Conclusion
This study introduced an improved classification approach, HOS-MLP, to increase the precision of medical diagnosis. The HOS algorithm was employed to adjust the MLP's weight and bias values. The high performance, simplicity, and fast convergence speed of the HOS algorithm were the inspiration behind the choice of HOS for training MLP. To evaluate the efficacy of the suggested HOS-MLP, its classification performance was assessed on four challenging real biomedical datasets: coronary heart disease, orthopedic patients, diabetes, and breast cancer.
The performance of the model was compared with eleven different classifiers and eight well-known OA-based MLP trainers, namely ABC, GA, BAT, BBO, PSO, FF, FPA, and MBO.
The experimental results of HOS on those biomedical classification problems are promising in terms of convergence rate compared to existing OAs, and it managed to demonstrate better classification accuracy in most cases. We conclude that HOS can train MLP well for classifying biomedical datasets, since the HOS-trained MLP presents a higher convergence speed and better classification accuracy than current MLP training techniques and existing state-of-the-art classifiers.
In future work, HOS can be utilized to find the optimal structure of the MLP neural network, including the number of hidden layers and nodes. HOS can also be employed to train other forms of ANNs, such as radial basis function (RBF) networks. Applying the proposed HOS-MLP to engineering classification problems may also be a valuable contribution.

Data Availability
No data were used to support this study.