Boosted Sine Cosine Algorithm with Application to Medical Diagnosis

The sine cosine algorithm (SCA) was proposed for solving optimization tasks; it searches for the optimal solution mainly by iterating its sine and cosine update formulas. However, SCA suffers from low population diversity and stagnation in local optima. We try to eliminate these problems by proposing an enhanced version of SCA, named ESCA_PSO. ESCA_PSO builds on SCA_PSO, the hybrid of SCA and particle swarm optimization (PSO), by incorporating multiple mutation strategies into it. To validate ESCA_PSO on global optimization problems, it was compared with well-regarded algorithms on various types of benchmark functions. In addition, ESCA_PSO was employed to tune the parameters of support vector machines for medical diagnosis tasks. The results indicate the efficiency of the proposed algorithm in solving optimization problems.

1. Introduction
1.1. Motivation. Many problems in real life can be summarized as global optimization problems. For increasingly complex optimization problems, traditional methods are generally unsatisfactory [1]. Therefore, many scholars began to explore new solutions. Metaheuristic algorithms (MAs) use learning strategies to gather and exploit information so as to find approximately optimal solutions effectively. MAs have been applied to many scenarios owing to their effective optimization capability [2]. For example, MAs have shown great potential in wind speed prediction [3], scheduling problems [4], parameter optimization [5], PID optimization control [6], gate resource allocation [7], fault diagnosis of rolling bearings [8], cloud workflow scheduling [9], energy vehicle dispatch [10], combination optimization problems [11], the traveling salesman problem [12], object tracking [13], neural network training [14], and multiattribute decision making [15].
In 2016, a new swarm intelligence algorithm named the sine cosine algorithm (SCA) [16] was proposed. SCA searches for the solution using the sine and cosine functions. It has strong global search ability and can steadily approach the global optimal solution given enough iterations. However, SCA also faces some problems, for instance, slow convergence, low convergence accuracy, and a tendency to fall into local optima. To overcome these problems, a hybrid SCA and PSO algorithm (SCA_PSO) was put forward by Nenavath et al. [17], aimed at optimization problems and target tracking. The search mechanism of PSO is added to the traditional SCA to guide the search for potential candidate solutions. It should be noted that although SCA_PSO has achieved promising results on object tracking, when dealing with complex problems it can still skip the true optimal solution and converge prematurely.
Motivated by the "No Free Lunch" theorem [18], we introduced the differential evolution (DE) algorithm and a combined mutation strategy into the original SCA_PSO to strengthen its local search capability and reduce the occurrence of premature convergence. The proposed ESCA_PSO was validated on a benchmark suite that includes various types of functions. Experimental results demonstrated that ESCA_PSO was significantly better than SCA_PSO and other competitive counterparts. In addition, ESCA_PSO was used to construct an optimal support vector machine (SVM) model to deal with medical diagnosis problems effectively. In general, ESCA_PSO significantly improves the performance of SCA_PSO.

1.2. Literature Review. SCA has been widely studied and applied in many fields because of its simple implementation and relatively excellent performance on complex problems. For example, SCA was employed to tackle the scheduling problem in [19]. A multiobjective SCA was proposed to solve engineering optimization problems in [20] and to forecast the wind speed in [21]. SCA was employed to predict time series by constructing an optimal support vector regression model in [22]. In [23], SCA was utilized to optimize the parameters of fuzzy k-nearest neighbor to build an optimized classifier for predicting students' intentions regarding a postgraduate entrance examination. In [24], SCA was applied to optimize the SVM's parameters, and the boosted classifier was trained to predict students' entrepreneurial intentions.
Pseudocode 1: The pseudocode of the SCA_PSO.
Computational and Mathematical Methods in Medicine
In addition to applications, scholars have also proposed many improved SCA variants. Issa et al. [25] proposed a new idea of SCA, that is, the combination of SCA and PSO. This algorithm was used to solve the pairwise local alignment problem of finding the longest continuous substring between two biological sequences, and it showed good performance in accuracy and calculation time. Nenavath and Jatoth [26] proposed to merge the DE algorithm into SCA and applied it to target tracking. Abd Elaziz et al. [27] enhanced SCA with opposition-based learning (OBL) mechanisms (OBSCA), which effectively boosted SCA's search efficiency and expanded its search scope. Qu et al. [28] improved SCA by adding three strategies. Kumar et al. [29] mixed SCA with Cauchy and Gaussian distributions; the resulting algorithm was named CGSCA. Simulations showed that the single-sensor tracking scheme based on CGSCA had better results in terms of tracking time and tracking efficiency. Long et al. [30] combined SCA with an inertia weight-based position updating equation and a nonlinear conversion parameter strategy to improve SCA's ability to solve high-dimensional problems. Tu et al. [24] adopted a chaotic local search enhanced SCA for training an optimal support vector machine to predict students' entrepreneurial intentions. Turgut [31] showed that mixing SCA with a backtracking search algorithm (BSA) was an efficient way to realize the optimal design of a shell and tube evaporator.
With the proposed optimization method, the optimal values of the total design cost of the heat exchanger and the total heat transfer coefficient were better than the results of the other optimizers in the literature. Guo et al. [32] introduced the Riesz fractional derivative and the OBL strategy into SCA and applied the method to engineering problems. Gupta and Deep [33] added a leading guidance mechanism and a simulated quenching algorithm to SCA and applied it to train multilayer perceptrons. Gupta and Deep [34] utilized the OBL strategy and a self-adaptive component to enhance SCA; the efficacy of this improved algorithm was confirmed on many benchmark problems and engineering problems. Tawhid and Savsani [20] proposed a novel and effective multiobjective version of SCA (MO-SCA). The difference between MO-SCA and SCA is mainly reflected in two aspects: one is to improve the nondominated level by adding an elite nondominated sorting strategy, and the other is to maintain the diversity of optimal solutions by adding the crowding distance method.
At present, machine learning methods are still a research hotspot [35-42]. However, the hyperparameters of a model have a crucial impact on its performance. Therefore, combining an improved version of SCA with machine learning methods to obtain the model with the best hyperparameter combination is also a novel research angle.
The rest of this paper is organized as follows: SCA and SCA_PSO are introduced in Section 2. Our proposed ESCA_PSO is presented in Section 3. The detailed experimental results and discussions are described in Section 4. Finally, in Section 5, conclusions and future works are summarized.
2. An Overview of SCA_PSO Algorithm
2.1. Standard SCA. In recent years, many new intelligent algorithms have been proposed to solve practical problems, such as hunger games search (HGS) [43], weighted mean of vectors (INFO) [44], Harris hawks optimization (HHO) [45], slime mould algorithm (SMA) [46], and Runge Kutta optimizer (RUN) [47]. These algorithms have shown great potential in various fields such as engineering, medicine, energy, finance, and education. In 2016, Mirjalili [16] put forward a novel swarm intelligence algorithm called SCA for global optimization tasks. Similar to other metaheuristics, it looks for the solution by random search in the search space. SCA approaches the optimal solution through the trigonometric sine and cosine functions [16].
The following position updating formulas are proposed for the two phases:

$$X_i^{t+1} = X_i^t + r_1 \sin(r_2) \left| r_3 P_i - X_i^t \right| \tag{1}$$

$$X_i^{t+1} = X_i^t + r_1 \cos(r_2) \left| r_3 P_i - X_i^t \right| \tag{2}$$

where $X_i^t$ is the position of the current solution in the $i$-th dimension at the $t$-th iteration, $r_1$, $r_2$, and $r_3$ are random numbers, $P_i$ is the position of the target point in the $i$-th dimension, and $|\cdot|$ indicates the absolute value. Combining the above, the following equation can be obtained:

$$X_i^{t+1} = \begin{cases} X_i^t + r_1 \sin(r_2) \left| r_3 P_i - X_i^t \right|, & r_4 < 0.5 \\ X_i^t + r_1 \cos(r_2) \left| r_3 P_i - X_i^t \right|, & r_4 \ge 0.5 \end{cases} \tag{3}$$

where $r_4$ is a random number in the range [0, 1].
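The sine and cosine position update described above can be illustrated with a minimal Python sketch for a single search agent. This is not the authors' MATLAB code: the function name `sca_update`, the optional `rng` argument, and the sampling range [0, 2] for r3 are assumptions made for illustration, while r2 is drawn from [0, 2π] and r4 from [0, 1] as stated in the text.

```python
import math
import random


def sca_update(position, target, r1, rng=random):
    """Return the next position of one agent under the sine/cosine rule."""
    new_position = []
    for x_i, p_i in zip(position, target):
        r2 = rng.uniform(0.0, 2.0 * math.pi)  # angle passed to sine or cosine
        r3 = rng.uniform(0.0, 2.0)            # random weight on the target (assumed range)
        r4 = rng.random()                     # selects the sine or the cosine branch
        distance = abs(r3 * p_i - x_i)
        if r4 < 0.5:
            step = r1 * math.sin(r2) * distance
        else:
            step = r1 * math.cos(r2) * distance
        new_position.append(x_i + step)
    return new_position


# Usage: move one 3-dimensional agent toward a target located at the origin.
moved = sca_update([1.0, -2.0, 3.0], [0.0, 0.0, 0.0], r1=2.0, rng=random.Random(0))
```

Because the sine and cosine factors lie in [-1, 1], the step in each dimension is bounded by r1 times the weighted distance to the target, which is what lets r1 control how far an agent can move.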

As the above formulae reveal, $r_1$, $r_2$, $r_3$, and $r_4$ are four important parameters with different meanings. $r_1$ mainly decides whether the next position lies within the region between the solution and the destination or outside it. $r_2$ represents the distance of the movement towards or away from the destination. The coefficient $r_3$ gives a random weight to the destination to stochastically emphasize ($r_3 > 1$) or deemphasize ($r_3 < 1$) the effect of the destination in defining the distance. $r_4$ switches between the sine and cosine functions in Eq. (3). The values of $r_2$ were set in $[0, 2\pi]$ in this study.
To balance global and local search, the range of the sine and cosine functions in Eqs. (1)-(3) is altered adaptively according to the following formula:

$$r_1 = a - t \frac{a}{T} \tag{4}$$

where $t$ means the current iteration, $T$ means the maximum number of iterations, and $a$ is a constant with the value of 2.
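The adaptive range described above, with a = 2 decreasing linearly over T iterations, can be written as a one-line schedule. The sketch below is only an illustration; the function name `r1_schedule` and the choice T = 10 in the usage line are assumptions.

```python
def r1_schedule(t, max_iter, a=2.0):
    """Linearly shrink the sine/cosine amplitude from a (at t = 0) to 0 (at t = max_iter)."""
    return a - t * a / max_iter


# Usage: amplitude at the start, the midpoint, and the end of a 10-iteration run.
values = [r1_schedule(t, 10) for t in (0, 5, 10)]  # [2.0, 1.0, 0.0]
```

Large early values of r1 allow steps beyond the destination (exploration), while values below 1 late in the run keep the agents between their current position and the destination (exploitation).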

2.2. SCA_PSO Algorithm. To eliminate the shortcomings of SCA, Nenavath et al. [17] came up with an improved hybrid SCA_PSO for dealing with optimization tasks. SCA_PSO integrates PSO's strengths in exploitation and SCA's strengths in exploration to approach the global optimal solution. By adding internal memory to SCA, each individual can keep track of the coordinates in the search space that are associated with its fitness values. The personal historical best solution of each search agent in the present population is stored in the form of a matrix, SCA-P_best, which plays the same role as P_best in PSO at each iteration. In addition, each solution also keeps track of the best value achieved so far by any nearby solution (G_best). As new concepts in SCA, P_best and G_best enhance the exploitation ability of SCA. The pseudocode of SCA_PSO is shown in Pseudocode 1.
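The memory mechanism described above can be sketched as a personal-best and global-best bookkeeping step. This is an illustrative Python sketch for a minimization problem, not the implementation of [17]; the names `update_memory`, `pbest`, and `pbest_fit` are hypothetical.

```python
def update_memory(population, fitness_fn, pbest, pbest_fit):
    """Update personal bests in place and return (gbest, gbest_fit) for minimization."""
    for i, candidate in enumerate(population):
        fit = fitness_fn(candidate)
        if fit < pbest_fit[i]:
            pbest[i] = list(candidate)  # remember the best position seen by agent i
            pbest_fit[i] = fit
    best_index = min(range(len(pbest)), key=lambda i: pbest_fit[i])
    return list(pbest[best_index]), pbest_fit[best_index]


# Usage on the sphere function with three 2-dimensional agents.
def sphere(x):
    return sum(v * v for v in x)


agents = [[1.0, 1.0], [0.5, 0.0], [2.0, 2.0]]
pbest = [list(a) for a in agents]
pbest_fit = [float("inf")] * len(agents)
gbest, gbest_fit = update_memory(agents, sphere, pbest, pbest_fit)
```

Keeping the personal bests in a separate structure means that a poor move in the current iteration never erases a good position found earlier.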

3. Proposed ESCA_PSO
ESCA_PSO combines two efficient strategies. First, a DE with the "random variation" mode is combined with SCA_PSO to strengthen its global search capability. Then, a combined mutation strategy that mixes the Gaussian, Cauchy, and Lévy distributions is added, which can further improve the accuracy of the solution.
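As a rough illustration of the DE step, the sketch below assumes that the "random variation" mode corresponds to the common DE/rand/1 mutation with binomial crossover; the paper does not spell out the variant, so this mapping, the scale factor `f`, the crossover rate `cr`, and the function name are all assumptions. The greedy replacement keeps a trial vector only when its fitness is better than that of its parent.

```python
import random


def de_rand_1_step(population, fitness_fn, f=0.5, cr=0.9, rng=random):
    """One DE/rand/1/bin generation with greedy selection (minimization)."""
    n = len(population)
    dim = len(population[0])
    next_population = []
    for i, parent in enumerate(population):
        # Pick three distinct individuals other than the parent.
        a, b, c = rng.sample([j for j in range(n) if j != i], 3)
        forced = rng.randrange(dim)  # at least one dimension comes from the mutant
        trial = []
        for d in range(dim):
            if rng.random() < cr or d == forced:
                trial.append(population[a][d] + f * (population[b][d] - population[c][d]))
            else:
                trial.append(parent[d])
        better = fitness_fn(trial) < fitness_fn(parent)
        next_population.append(trial if better else list(parent))
    return next_population
```

The population needs at least four individuals so that three distinct partners exist for every parent.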
3.1. Combined Mutation of Gaussian, Cauchy, and Lévy. The Gaussian distribution (GD) [48] is a significant probability distribution in many subjects such as engineering and mathematics. GD has many excellent features. Plenty of random variables and objects in nature can be approximately described by GD, and many probability distributions can be approximated by or derived from it. The probability density function of the GD can be expressed according to Eq. (5):

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \tag{5}$$

where $\mu$ and $\sigma$ represent the mean and standard deviation, respectively.
The Cauchy distribution [49] is also called the Cauchy-Lorentz distribution. It is a continuous probability distribution named after Augustin-Louis Cauchy and Hendrik Lorentz. It is similar in shape to the normal distribution and is also widely used in statistics. Its expectation and variance do not exist, and it has the property of additivity. The probability density function of the Cauchy distribution can be expressed according to Eq. (6):

$$f(x; x_0, \gamma) = \frac{1}{\pi\gamma\left[1 + \left(\frac{x - x_0}{\gamma}\right)^2\right]} \tag{6}$$

where $x_0$ is the location parameter defining the position of the distribution peak, and $\gamma$ is the scale parameter defining the half-width at half-maximum.
Lévy flights [50], based on the Lévy distribution, are consistent with the search behavior of many organisms in nature and are widely used in optimization algorithms and optimal search processes. Moreover, this stochastic process can maximize the efficiency of resource search under uncertain conditions. In the search process, Lévy flights can make the whole search more effective and stable, balancing the proportion of local search and global search. Because of the random process of the Lévy flight, short-range exploratory local search alternates with occasional long-distance walks. Thus, the algorithm's local search is faster, and solutions near the current optimum are found more easily.
The long-distance walks can search far away from the current optimal value, thus ensuring that the algorithm does not fall into a local optimum (LO). In the Lévy formula used, $\Gamma(x)$ is the continuation of the factorial, that is, $\Gamma(x+1) = x!$ when $x$ is a positive integer; $\gamma$ is the flight step; $v$ follows the standard normal distribution; and $\mu$ follows a normal distribution with a mean of 0 and a variance of $\sigma^2$.
Parameter values for SCA-PSO: M = 4; N = 9; vmax = 6; wmax = 0.9; wmin = 0.2.
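The three random steps can be sampled with the Python standard library as sketched below. Several choices here are assumptions rather than the paper's exact formulas: the Lévy step uses Mantegna's algorithm with a stability index `beta = 1.5`, which matches the quantities described in the text (a gamma function, a standard normal v, and a zero-mean normal μ with variance σ²) but is not confirmed by the source; the Cauchy step uses inverse-CDF sampling; and the multiplicative form in `mutate` is a common convention chosen only for illustration.

```python
import math
import random


def gaussian_step(rng=random):
    """Standard normal step (mean 0, standard deviation 1)."""
    return rng.gauss(0.0, 1.0)


def cauchy_step(rng=random, x0=0.0, gamma=1.0):
    """Cauchy step via the inverse CDF: x0 + gamma * tan(pi * (u - 0.5))."""
    return x0 + gamma * math.tan(math.pi * (rng.random() - 0.5))


def levy_sigma(beta):
    """Scale of the numerator normal in Mantegna's algorithm (assumed variant)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(rng=random, beta=1.5):
    """Heavy-tailed step mu / |v|^(1/beta) with mu ~ N(0, sigma^2), v ~ N(0, 1)."""
    mu = rng.gauss(0.0, levy_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return mu / abs(v) ** (1 / beta)


def mutate(position, step_fn, rng=random):
    """Perturb every coordinate multiplicatively (an assumed mutation form)."""
    return [x * (1.0 + step_fn(rng)) for x in position]
```

The Gaussian step mostly produces small perturbations, whereas the Cauchy and Lévy steps occasionally produce very large ones, which is the property the text relies on for escaping local optima.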

3.2. Combined Mutation Strategy. As mentioned in the foregoing, the DE algorithm can enhance the global search capability of SCA_PSO, while the Gaussian, Cauchy, and Lévy mutations help the algorithm perform better in local search. A combined mutation strategy is proposed in this study that combines the different characteristics of the three mutations, so that SCA_PSO strikes a better balance between explorative and exploitative search. The steps of the whole algorithm are as follows:
Step 1. Search using the SCA described above.
Step 2. Mutate the SCA_PSO population by the DE algorithm; a new individual is retained if its fitness value is better than that of the original one.
Step 3. Update using the velocity and position update formula of SCA_PSO, in which c1 = c2 = 2 and r2 and r3 are random numbers in the range [0, 1].
Step 4. Use the combined mutation of Gaussian, Cauchy, and Lévy to mutate the current optimal individual, select the candidate with the smallest fitness among the three results, and update the fitness value and the corresponding individual. Here X_m_Lévy, X_m_gaus, and X_m_cauchy denote the values obtained by the Lévy, Gaussian, and Cauchy strategies, respectively.
In summary, the current optimal individual SCA G_best is first obtained by SCA. Then, the particle swarm population is initialized with the help of SCA G_best and mutated with the DE strategy. Next, the population is updated using SCA_PSO. Finally, it is updated by the Gaussian, Cauchy, and Lévy flight strategies.
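Step 4 above can be sketched as follows: the current best individual is mutated once by each of the three strategies, and the fittest candidate is kept. Whether the unmutated best is retained when all three candidates are worse is not stated in the source; this sketch assumes it is. The step functions, the rounded Lévy scale 0.6966 (Mantegna's algorithm with index 1.5), and the multiplicative mutation form are likewise assumptions for illustration.

```python
import math
import random


def gaussian_step(rng):
    return rng.gauss(0.0, 1.0)


def cauchy_step(rng):
    return math.tan(math.pi * (rng.random() - 0.5))


def levy_step(rng, beta=1.5, sigma=0.6966):
    # Mantegna-style heavy-tailed step; sigma is the rounded scale for beta = 1.5.
    return rng.gauss(0.0, sigma) / abs(rng.gauss(0.0, 1.0)) ** (1 / beta)


def combined_mutation(best, fitness_fn, step_fns, rng=random):
    """Mutate `best` with every step function and keep the fittest result (minimization)."""
    winner, winner_fit = list(best), fitness_fn(best)
    for step_fn in step_fns:
        candidate = [x * (1.0 + step_fn(rng)) for x in best]
        fit = fitness_fn(candidate)
        if fit < winner_fit:
            winner, winner_fit = candidate, fit
    return winner, winner_fit


# Usage on the sphere function.
def sphere(x):
    return sum(v * v for v in x)


new_best, new_fit = combined_mutation(
    [0.5, -1.0, 2.0], sphere, [gaussian_step, cauchy_step, levy_step], random.Random(42)
)
```

Because the winner is chosen by fitness, a function on which one mutation type performs poorly can still benefit from the other two, which is the behavior examined in the mutation comparison experiment.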

4. Experimental Results
To confirm the effectiveness of ESCA_PSO, the proposed algorithm is first compared with other competitive metaheuristic algorithms on the 30 functions of CEC2014. Then, a comparative experiment on the mutation mechanisms of ESCA_PSO is carried out. Finally, ESCA_PSO is used to tune the parameters of SVM for medical diagnosis purposes.
The benchmark functions are listed in Table 1, where range means the boundary of the search space for the relevant functions. A unimodal function has a single global optimum, so it can be employed to benchmark exploitation capability. Conversely, a multimodal function possesses many LO solutions, which can trap an algorithm in an LO. Such functions test the exploration ability of a method and its capability to avoid stagnation. Moreover, both the hybrid functions and the multimodal functions have only one global optimum but multiple LO solutions. The structures of the composition functions are more complex.
All the algorithms in the following experiments are coded on MATLAB 2014b. And to be fair, the experimental verification is carried out under the unified condition, i.e., the population size is set to 30, the maximum evaluation

Comparison with Other Algorithms.
In this experiment, ESCA_PSO was compared with SCA, GWO [51], MFO [52], BA [53], and PSO [54] on the functions presented in Table 1. To further validate the proposed ESCA_PSO, two improved SCA variants, SCA_PSO and SCADE [26], were included in the comparison, together with LSHADE [55], the champion algorithm of CEC2014. The parameter configuration of the algorithms is shown in Table 2. The detailed comparison results, including the average value (Avg) of the best solution and the standard deviation (Std) of every approach over 30 independent runs, are displayed in Table 3.
We can see that the advantages of ESCA_PSO are not very obvious on the unimodal and multimodal functions. On these functions, ESCA_PSO is slightly better than the original SCA and its variants CGSCA and SCADE, but there is still a certain gap compared with high-quality algorithms such as LSHADE. However, ESCA_PSO performs very well on the structurally complex composition functions: on F23-F30, it ranks first or second among the compared algorithms.
The Friedman test gives the average ranking of the tested algorithms and is commonly used to detect differences among multiple sets of results. To further analyze the experimental results, the Wilcoxon signed-rank test was also adopted for statistical analysis.
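The average ranking used in the Friedman test can be computed as sketched below: the algorithms are ranked on every function (rank 1 for the lowest error, tied values sharing the mean of their ranks), and the ranks are then averaged over the functions. The function name `average_ranks` and the small score table are made up for illustration and are not results from this paper.

```python
def average_ranks(scores):
    """scores[f][a] is the error of algorithm a on function f (lower is better)."""
    n_alg = len(scores[0])
    totals = [0.0] * n_alg
    for row in scores:
        order = sorted(range(n_alg), key=lambda a: row[a])
        i = 0
        while i < n_alg:
            j = i
            # Extend j over the run of algorithms tied with the one at position i.
            while j + 1 < n_alg and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared_rank = (i + j) / 2.0 + 1.0  # ties share the mean of their ranks
            for k in range(i, j + 1):
                totals[order[k]] += shared_rank
            i = j + 1
    return [total / len(scores) for total in totals]


# Usage with invented errors for three algorithms on three functions.
toy_scores = [
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 3.0],
    [1.0, 1.0, 5.0],
]
ranks = average_ranks(toy_scores)  # [1.5, 1.5, 3.0]
```

A lower average rank indicates a better overall algorithm; the significance of pairwise differences is a separate question that the Wilcoxon signed-rank test addresses.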
In Table 4, all experimental results were taken from the two tests mentioned above. AVG in the table represents the average ranking of the algorithms obtained by the Friedman test, and "+/-/=" represents the performance of each algorithm compared with ESCA_PSO. Specifically, "+" means ESCA_PSO is better than this algorithm, "-" indicates that ESCA_PSO is inferior to this algorithm, and "=" means that the performance is similar to that of ESCA_PSO. In the Wilcoxon signed-rank test, a p value of less than 0.05 means that the difference in performance between the two algorithms is significant; the test was used to evaluate the significance of ESCA_PSO versus the other approaches. It can be seen from the table that ESCA_PSO ranks second on average, which is better than the other algorithms overall. Compared with SCA, SCADE, CGSCA, GWO, and MFO, it is significantly better on 20 functions. However, it is indeed weaker than the champion algorithm LSHADE on the multimodal and unimodal functions. Figure 2 shows nine selected convergence graphs. As shown in Figure 2, ESCA_PSO has a good convergence rate on these functions and quickly converges to a lower point. ESCA_PSO also improves significantly over the original SCA. Of course, it is undeniable that some algorithms converge faster than ESCA_PSO, but ESCA_PSO obtains higher-quality solutions.
Despite the great potential of the proposed ESCA_PSO, sacrificing some time complexity in exchange for an increase in accuracy is a shortcoming of the approach. Nevertheless, the algorithm is still competitive with LSHADE on the unimodal and multimodal functions.

Comparison of Mutation Mechanism.
As mentioned earlier, three mutation mechanisms were added to ESCA_PSO. To further analyze ESCA_PSO, we conducted comparative experiments on its mutation mechanisms in this section.
To compare the mutation mechanism, we construct three algorithms, namely, ESCA_PSO1, ESCA_PSO2, and ESCA_PSO3. Compared with ESCA_PSO, ESCA_PSO1 only uses Gaussian mutation while others remain unchanged. By analogy, ESCA_PSO2 only uses the Cauchy mutation while ESCA_PSO3 only uses the Lévy mutation. The population dimension and the total number of iterations of this experiment are the same as those in the previous experiment settings.
The results obtained from the experiments are shown in Table 5. Comparing the whole data shows that there is not much difference between these four algorithms. This is because most components of the four algorithms are the same, and only the mutation mechanism has changed. In terms of the numerical values obtained, ESCA_PSO does not often achieve the best result on a function, but it is rarely ranked last. This is because ESCA_PSO integrates three mutation mechanisms, which makes it applicable to more functions. Figure 3 shows several convergence graphs from this experiment. From the figure, we can see that on F2, F17, and F19, the convergence curves of the four algorithms are relatively similar, with no big difference in general. On F27 and F28, the performance of ESCA_PSO3 is not as good as that of the other three algorithms. On F29 and F30, ESCA_PSO2 is quite different from the other three algorithms. However, ESCA_PSO keeps a good level on all of these functions. This shows that the combination of three different mutation mechanisms can help the algorithm adapt to more functions.
However, the proposed ESCA_PSO is not perfect, and it has certain limitations. In the benchmark function experiment, it can be seen that there is still a gap between the performance of this algorithm and that of champion algorithms on the unimodal and multimodal functions.
SVM [56] has many advantages such as "simple structure," "overcoming dimension disaster," and "small sample," which can overcome the weaknesses of conventional neural networks such as poor learning and generalization ability [57]. Since its introduction, SVM has found application in many practical problems. Practice shows that the penalty factor C and the kernel function parameter g have the key influence on the recognition accuracy of an SVM model based on the radial basis kernel function. When the penalty factor C is small, the recognition rate of both training and test samples is low, and the SVM underlearns. When C is too large, the accuracy on the training samples is higher, the recognition rate on the test samples is lower, and the SVM overlearns. The smaller the kernel function parameter g is, the higher the recognition rate on the training samples and the lower the accuracy on the test samples. When g is larger, the accuracy on both training and test samples becomes lower, and the SVM underlearns. Traditional methods such as trial and error and grid search cannot meet the accuracy requirements of practical applications.
Currently, with the development and maturity of MAs, good results have been achieved in improving the performance of the SVM model. For example, Li et al. [58] proposed moth-flame optimization (MFO) to tune the best parameters of SVM and applied it to the diagnosis of tuberculous pleural effusion. Li et al. [59] proposed a chaotic enhanced gravitational search algorithm (GSA) for optimizing the parameters of SVM. Das et al. [60] proposed to use teaching-learning-based optimization (TLBO) for parameter optimization of SVM, and the good performance was validated by a financial case. Tang et al. [61] proposed a Lévy flight-based shuffled frog-leaping algorithm for determining the best parameters of SVM. Ahmadi et al. [62] developed the imperialist competition algorithm (ICA) to determine the best parameters of SVM for stock market timing. Li et al. [22] proposed SCA to tune the best parameters of SVM, and the good results were verified on several benchmark datasets. Rojas-Dominguez et al. [63] proposed to use several metaheuristics to search for the best parameters of SVM, and the results showed that the estimation of distribution algorithms can achieve the best results. Tharwat et al. [64] proposed a chaotic antlion optimizer for tuning the best parameters, and the effectiveness was validated on an array of well-known datasets. Bablani et al. [65] proposed to use the bat algorithm (BA) to simultaneously determine the optimal parameters of SVM and the best subset of features and applied the model to electroencephalography (EEG) data.
In this study, we applied ESCA_PSO to search for the best parameters of SVM, and the resultant model was called ESCA_PSO-SVM, as shown in Figure 4. ESCA_PSO-SVM was applied to two different medical problems, namely the Bupa liver and the Cleveland heart datasets.
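One way to connect an optimizer to SVM tuning is to treat each agent's position as an encoded (C, g) pair and to use the cross-validated error as its fitness. The sketch below is only an illustration of that idea: the base-2 exponent encoding, the function names, and the `train_and_score` callback (which would wrap an actual SVM trainer and return validation accuracy) are all hypothetical and not taken from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cv_error(position, n_samples, train_and_score, k=10):
    """Fitness to minimize: 1 - mean validation accuracy of the encoded (C, g)."""
    c = 2.0 ** position[0]  # assumed encoding: position stores base-2 exponents
    g = 2.0 ** position[1]
    accuracies = []
    for held_out in k_fold_indices(n_samples, k):
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        # train_and_score is a user-supplied callable (hypothetical interface).
        accuracies.append(train_and_score(c, g, train, held_out))
    return 1.0 - sum(accuracies) / k
```

Searching over exponents rather than raw values is a common convention for C and g because useful values span several orders of magnitude.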
The Bupa liver disorders dataset has a total of 345 samples and 7 features. Table 6 shows the detailed results obtained by ESCA_PSO-SVM via 10-fold cross-validation. As shown in Table 6, ESCA_PSO-SVM obtained an average accuracy (ACC) of 73.04%, an average sensitivity of 59.18%, an average specificity of 83.81%, and an average Matthews correlation coefficient (MCC) of 0.4404.
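The four evaluation indexes can be computed from the entries of a binary confusion matrix, as in the sketch below. The function name and the counts in the usage line are invented for illustration and do not correspond to any fold of the reported experiments.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity, MCC) from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, sensitivity, specificity, mcc


# Usage with made-up counts: 40 true positives, 10 false negatives,
# 45 true negatives, and 5 false positives.
acc, sens, spec, mcc = binary_metrics(tp=40, fn=10, tn=45, fp=5)
```

Unlike accuracy, the MCC uses all four entries of the confusion matrix, so it remains informative when the two classes are imbalanced.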
From Figure 5, it is clear that ESCA_PSO-SVM performs better than SCA-SVM on all four indexes. Moreover, in terms of prediction accuracy, ESCA_PSO-SVM has the best precision. In terms of specificity, ESCA_PSO-SVM was the best, followed by SVM, SCA-SVM, BP, KNN, and CART. In terms of the MCC, ESCA_PSO-SVM provided the best value, followed by SVM, SCA-SVM, CART, BP, and KNN. This suggests that ESCA_PSO-SVM is more advantageous and stable in solving the Bupa liver problem.
The Cleveland heart data was obtained from the UCI repository, and it includes 303 samples and 76 features. Table 7 shows the detailed results of ESCA_PSO-SVM through 10-fold cross-validation on this dataset. From Table 7, ESCA_PSO-SVM obtained an average ACC of 82.81%, a sensitivity of 76.88%, a specificity of 86.38%, and an MCC of 0.6486.
In Figure 6, ESCA_PSO-SVM is superior to SCA-SVM in terms of the four evaluation indexes. Concerning the classification accuracy, it can be seen clearly that ESCA_PSO-SVM obtained the best ACC, whereas BP has the lowest precision. In terms of the sensitivity metric, SVM has the same value as ESCA_PSO-SVM, and the two share first place. As for the specificity metric, although SCA-SVM ranked first, it was only slightly superior to ESCA_PSO-SVM. According to the MCC metric, ESCA_PSO-SVM provided the best value, followed successively by SCA-SVM, SVM, KNN, CART, and BP. These results demonstrate the robustness and stability of ESCA_PSO-SVM on the Cleveland heart problem. In the near future, ESCA_PSO can also be applied to many other problems awaiting optimization, such as disease module identification [66], molecular signatures identification for cancer diagnosis [67], drug-disease associations prediction [68], drug discovery [69], and pharmacoinformatic data mining [70].

5. Conclusions and Future Directions
To make up for the deficiency of SCA_PSO, this paper proposed ESCA_PSO, an enhanced version of SCA_PSO. The chance of premature convergence was effectively reduced by introducing DE and joint mutation mechanisms. To verify its performance, it was compared with seven advanced algorithms on 30 benchmark functions.
The experimental results showed that the proposed algorithm performed better than the traditional optimization algorithms and was competitive with LSHADE to a certain degree. Inspired by the "No Free Lunch" theorem, this paper further explored the application of ESCA_PSO in medical diagnosis and successfully applied it to hyperparameter optimization of the support vector machine.
The results showed that the support vector machine model combined with the proposed algorithm outperformed the other five existing models and achieved an average accuracy of 82.81% on the Cleveland heart dataset. In conclusion, the proposed algorithm can be regarded as a reliable technique for solving practical problems.
For future work, there are still many problems worthy of study. First of all, we will continue to improve the algorithm, by means such as trying to introduce other metaheuristic algorithms or optimizing the time complexity while ensuring the effect. In addition, we will try to apply ESCA_PSO to other fields such as image segmentation, clustering optimization, and discrete optimization.

Data Availability
The data involved in this study are all public data, which can be downloaded through public channels.

Conflicts of Interest
The authors declare that they have no conflicts of interest.