Metaheuristics with Deep Learning-Enabled Parkinson's Disease Diagnosis and Classification Model

Parkinson's disease (PD) affects movement, with symptoms that include changes in handwriting and speech, tremor, and muscle stiffness. Detecting PD at an early stage is important so that the person can maintain quality of life for longer. Advanced PD is high risk because patients develop progressive stiffness that can leave them unable to stand or walk. Earlier studies have focused on detecting PD from voice, speech, and handwriting tests. This study presents an improved sailfish optimization algorithm with deep learning (ISFO-DL) model for PD diagnosis and classification. The ISFO-DL technique combines the ISFO algorithm with a DL model to detect PD and thereby improve the survival rate of the person. ISFO is a metaheuristic algorithm inspired by the group hunting behavior of sailfish, which it uses to search for the optimum solution to a problem. First, the ISFO algorithm is applied to derive an optimal subset of features, with a fitness function that maximizes classification accuracy. At the same time, the rat swarm optimizer (RSO) with the bidirectional gated recurrent unit (BiGRU) is employed as a classifier to determine the existence of PD. The ISFO-DL model is validated on a benchmark Parkinson's dataset, and the results are examined from several perspectives. The experimental results highlight the enhanced classification performance of the ISFO-DL technique, and the proposed model can therefore be employed for the earlier identification of PD.


Introduction
Parkinson's disease (PD) is a brain disorder that occurs as a consequence of the loss of brain cells. It mainly affects body mobility, and its symptoms become evident gradually. Symptoms that appear at early stages include tremors, slowness of movement, poor posture, muscle rigidity, changes in speech and handwriting strokes, and imbalance [1]. In this disorder, a person's nerve cells gradually lose their ability to communicate with one another, which results in nervous system disorders such as depression. This disease must be diagnosed at an early stage because it is incurable. When the accurate symptoms of PD are recognized along with their relative weightage, doctors can suggest a pathology lab test for these features, and diagnosis might take place at the initial consultation itself, resulting in an earlier diagnosis of Parkinson's disease. Symptoms such as changes in speaking patterns and handwriting strokes might assist in an earlier diagnosis of this disorder [2]. Erdogu Sakar and team recently collected a speech dataset by examining the pronunciation of the vowels "a" and "o" by persons affected by the disease. Besides speaking patterns, handwriting stroke patterns might help in detecting the disorder [3]. Factors studied for distinguishing a patient from a healthy person are the individual's age, handedness (right/left), the maximum and mean distances recorded in the test, the handwriting strokes noted in the drawing, and the test time duration.
Recently, datasets have grown in both the number of instances and the number of features, which makes the data noisier [4]. Noisier datasets can cause a model to lose predictive accuracy, increase the computational cost and complexity, and slow down training. Therefore, feature selection has become an essential task in machine learning (ML) before training the models [5]. Feature selection (FS), also known as attribute selection, is a method that focuses on finding a subset of the complete set of features with little degradation of system performance; that is, the subset predicts the target with accuracy comparable to that of the original feature set at a reduced computational cost. FS methods are categorized into wrapper-based and filter-based algorithms. The filter-based method uses a statistical measure to find the importance of each feature (attribute). The wrapper-based method uses an ML model to evaluate candidate subsets and is computationally costly compared with the filter-based method [6]. Wrapper methods are further classified into heuristic search algorithms and sequential search algorithms.
An evolutionary algorithm is a part of artificial intelligence (AI) that is primarily inspired by biological evolution. Biological evolution includes four major procedures: selection, reproduction, mutation, and recombination [7]. Unlike conventional optimization models, evolutionary algorithms depend on random sampling. These procedures are applied repeatedly to a set of candidate solutions, referred to as the population, and a fitness function (FF) is employed to determine the quality of the solutions. The solutions change through this evolutionary procedure, which finally helps to discover the global solution to the problem [8]. Evolutionary methods are recognized for performing well under distinct scenarios because they make no assumption about the underlying fitness landscape. Even a simple evolutionary algorithm can resolve difficult problems [9]. The main drawback of evolutionary algorithms is their computational cost, which is dominated by the fitness function calculation.
This study presents an improved sailfish optimization algorithm with deep learning (ISFO-DL) model for PD diagnosis and classification. The ISFO-DL technique designs an ISFO-based feature selection technique to derive an optimal subset of features with a fitness function of maximum classification accuracy. At the same time, the rat swarm optimizer (RSO) with the bidirectional gated recurrent unit (BiGRU) is employed as a classifier to determine the existence of PD. The experimental validation of the ISFO-DL model is carried out using a benchmark Parkinson's dataset, and the results are inspected under several dimensions. The rest of the paper is arranged as follows. Section 2 offers the related works, Section 3 provides the proposed model, Section 4 inspects the performance validation, and Section 5 draws the conclusion.

Related Works
Huseyn [10] presented a DL methodology for distinguishing among healthy people, PD, and multiple system atrophy. Oh et al. [11] employed the EEG signals of 20 PD and 20 normal subjects. Their 13-layer CNN framework removes the need for the traditional feature representation phases. Wang et al. [12] introduced a novel deep learning model for the earlier detection and classification of PD using premotor features; to diagnose PD at earlier stages, various symptoms were taken into account. Shahid and Singh [13] developed a DNN method with a reduced input feature space of Parkinson's telemonitoring datasets for predicting PD progression. PD is a progressive and chronic nervous system disorder that impacts body motion, and it is measured using the unified PD rating scale (UPDRS).
Kaur et al. [14] proposed a feasible medical decision-making method that assists medical professionals in detecting persons affected by PD. In this study, a grid search optimization method built around a particular architecture is presented for developing an enhanced DL algorithm for the early diagnosis of PD, in which various hyperparameters must be tuned and set for the assessment of the DL algorithm. The grid search optimization covers the model's performance, the optimization of the DL method, and the hyperparameters. In the study by Sivaranjini S. and Sujatha [15], an effort was made to classify the MR images of healthy control and PD subjects with a DL neural network model. The CNN framework AlexNet is used to refine the detection of PD. The network is trained on the MR images through transfer learning and then tested to provide the accuracy measures.
Quan et al. [16] presented a Bi-LSTM method for capturing the time-series dynamic features of a speech signal for PD diagnosis. The dynamic speech features are evaluated on the basis of the energy content in the transition from voiced to unvoiced segments (offset) and the transition from unvoiced to voiced segments (onset). Sigcha et al. [17] proposed a novel methodology based on an RNN and a single waist-worn triaxial accelerometer for enhancing the recognition accuracy of freezing of gait (FOG) for use in a real home environment.
Leung et al. [18] focused on developing a DL ensemble method for outcome prediction in persons with PD. The first and second stages of the method extract features from DaTscan images and from medical measures of motor symptoms, respectively. Then, an ensemble of DNN models is trained on distinct subsets of the extracted features to predict the person's outcome 4 years after the initial baseline screening. Masud et al. [19] introduced an ACSA- and DL-based optimal FS technique. The presented method integrates CROW Search and DL (CROWD) with an SSAE-NN. A PD dataset was used for the experiments.

The Proposed ISFO-DL Model
In this study, the ISFO-DL technique has been developed for PD detection and classification. The proposed ISFO-DL technique is mainly intended to detect PD and thereby enhance the survival rate of the person. It involves three major processes, namely ISFO-based feature selection, BiGRU-based classification, and RSO-based hyperparameter optimization. These three processes are elaborated in the succeeding sections.

Design of ISFO-Based Feature Selection Technique.
At this stage, the ISFO algorithm is employed to choose an optimal subset of features and thereby boost the classifier results. Research has established that group hunting is a major social behavior in groups of fish, birds, mammals, and arthropods. Compared with individual hunting, group hunting saves the hunters' energy in catching prey. In the algorithm, the sailfish hold the best solutions found so far, while the sardines move through the search space to find better solutions. The mathematical formulation of the model is given as follows.
The population locations of the sardines and sailfish are initialized randomly, and every sailfish and sardine is allocated a random location $X^k_{SF(i)}$ and $X^k_{SD(j)}$, respectively, where $i \in \{\text{sailfish}\}$, $j \in \{\text{sardines}\}$, and $k$ represents the iteration count. The updated location of the sailfish is given by equation (1), in which $X^k_{SF(i)}$ is the preceding location of the $i$th sailfish and $\mu_k$ is a coefficient generated at the $k$th iteration using equation (2). To conserve the optimum solution of each iteration, the sailfish and sardine with the best fitness values are known as the "elite" sailfish and the "injured" sardine, respectively, and their locations at iteration $k$ are represented as $X^k_{elite}$ and $X^k_{injured}$. $P_d$ denotes the density of prey sardines, which indicates the number of prey at each iteration, as in equation (3). $NumSF$ and $NumSD$ stand for the populations of sailfish and sardines [20], related by $NumSF = NumSD \times percent$, in which $percent$ is the initial sailfish population expressed as a percentage of the sardine population.
The new location of a sardine at iteration $k$ is given by equation (4), in which $X^k_{SD(j)}$ signifies the preceding location of the $j$th sardine and $iter$ denotes the current iteration number. $ATK$ is the attack strength of the sailfish, which decreases linearly over the iterations as given by equation (5), with $A = 4$ and $\varepsilon = 0.001$. If $ATK < 0.5$, the number of sardines that update their location ($\alpha$) and the number of their variables that are updated ($\beta$) are evaluated by equations (6) and (7). When $ATK \geq 0.5$, every sardine is updated.
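For reference, the update rules described above can be written out in the form used by the standard sailfish optimizer. The block below is a reconstruction under that assumption, using this section's symbols. The mapping to equation numbers (1) to (7) is inferred from the order in which the text cites them, and $d$ (the number of decision variables) is a symbol introduced here for illustration only.

```latex
% Assumed reconstruction from the standard sailfish optimizer,
% not a verbatim copy of the paper's equations.
\begin{align}
X^{k+1}_{SF(i)} &= X^{k}_{elite} - \mu_k \left( \mathrm{rand}(0,1)\,
    \frac{X^{k}_{elite} + X^{k}_{injured}}{2} - X^{k}_{SF(i)} \right) \tag{1}\\
\mu_k &= 2\,\mathrm{rand}(0,1)\,P_d - P_d \tag{2}\\
P_d &= 1 - \frac{NumSF}{NumSF + NumSD} \tag{3}\\
X^{k+1}_{SD(j)} &= \mathrm{rand}(0,1)\left( X^{k}_{elite} - X^{k}_{SD(j)} + ATK \right) \tag{4}\\
ATK &= A\,(1 - 2\,iter\,\varepsilon) \tag{5}\\
\alpha &= NumSD \times ATK \tag{6}\\
\beta &= d \times ATK \tag{7}
\end{align}
```

Under this form, with $A = 4$ and $\varepsilon = 0.001$, $ATK = 4(1 - 0.002\,iter)$, so $ATK$ starts at 4 and drops below 0.5 only once $iter$ exceeds 437; before that point every sardine is updated.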
To simulate the sailfish catching sardines, when $f(SD_j) < f(SF_i)$, the location of sailfish $i$ is replaced with the location of sardine $j$.

Chaotic maps combine randomness with determinism, exhibiting stochastic-looking behavior and nonlinear motion. Chaos theory is the study of such dynamic systems. Their notable property is that a slight change in the system affects the behavior of the whole system. Research has shown that when the population of a metaheuristic model is initialized from a chaotic sequence, the diversity of the population is preserved effectively and the premature convergence of traditional optimization methods is overcome. Figure 1 illustrates the process flow of the SFO technique. The population initialization of sardines and sailfish in the SFO is stochastic, and the search for an optimum solution depends on this initial population. To enhance the global search capacity of the model and to prevent the diversity of the sardine and sailfish populations from shrinking late in the search, we propose initializing the sailfish and sardine populations using the tent chaotic operator. In the tent map, $T_i$ denotes the sequence value at the $i$th iteration ($T_i \in (0, 1)$), and the tent chaotic sequence $T_n$ is generated from the initial value $T_0 = 0.9$ over 200 iterations. The sailfish and sardine populations are then initialized from this sequence, where $X_{SF(i+1)}$ and $X_{SD(j+1)}$ indicate the location values of individual sailfish and sardines, and $X_{ub}$ and $X_{lb}$ represent the upper and lower bounds of the individual sailfish and sardines in each dimension.
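The tent-chaotic initialization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the map is taken to be the usual piecewise-linear tent map, and the scaling `lower + T * (upper - lower)` is assumed from the mention of $X_{lb}$ and $X_{ub}$. The slope `mu = 1.99` is also an assumption; the classical slope of 2 tends to collapse to zero after a few dozen iterations in binary floating point.

```python
def tent_sequence(n, t0=0.9, mu=1.99):
    """Return n successive values of the tent map started from t0.

    mu = 1.99 (assumed) instead of the classical 2.0 avoids the
    floating-point collapse of the slope-2 map to exactly 0.
    """
    values = []
    t = t0
    for _ in range(n):
        # Rising branch below 0.5, falling branch at or above 0.5.
        t = mu * t if t < 0.5 else mu * (1.0 - t)
        values.append(t)
    return values


def tent_init_population(num_agents, dim, lower, upper, t0=0.9, mu=1.99):
    """Scale a tent sequence into [lower, upper] to place each agent."""
    seq = tent_sequence(num_agents * dim, t0, mu)
    population = []
    for a in range(num_agents):
        row = [lower + seq[a * dim + d] * (upper - lower) for d in range(dim)]
        population.append(row)
    return population


# Hypothetical usage: 3 sailfish in a 4-dimensional search space.
sailfish = tent_init_population(num_agents=3, dim=4, lower=-5.0, upper=5.0)
```

The same helper can be called a second time with a different `t0` to place the sardines, so that the two populations do not start from identical points.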
Let the original feature set be $F = \{f_1, f_2, \ldots, f_D\}$, where $D$ is the total number of features, i.e., the dimension of the feature set, and let the class labels be $C = \{c_1, \ldots, c_l\}$, where $l$ is the number of classes. The FS technique determines a subset $S = \{s_1, \ldots, s_m\}$, where $m < D$ and $S \subset F$, such that $S$ has a lower classification error rate than any other subset of the same size or any proper subset of $S$. FS is a binary optimization problem in which the solution is restricted to the binary values 0 and 1. The solution is represented by a binary vector in which 1 means that the corresponding feature is chosen and 0 means that it is not. The size of this vector equals the number of features in the original dataset.
The ISFO was presented for solving continuous optimization problems in which the solution contains real values. To map the continuous search space of the standard ISFO to a binary one, a transfer function can be used [21]. Here, a sigmoid transfer function is used, which yields the probability value in equation (11); using this probability, the current position of the sailfish is updated to a binary value. FS is usually a multiobjective problem with two objectives: (a) achieving maximum classification accuracy (a maximization problem) and (b) selecting a minimal number of features (a minimization problem). Using equation (15), these two objectives are combined and the FS problem is converted into a single-objective problem, where $S$ stands for the chosen feature subset, $|S|$ is its cardinality, i.e., the number of chosen features, $c(S)$ signifies the classification error rate of $S$, $D$ refers to the original dimension of the dataset, and $\omega \in [0, 1]$ is a weight.
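The binary mapping and the combined objective can be illustrated with a short Python sketch. It assumes the form commonly used in wrapper-based FS, namely a sigmoid transfer $1/(1 + e^{-x})$ compared against a uniform random draw, and a fitness of $\omega \, c(S) + (1 - \omega)\,|S|/D$ to be minimized. The function names and the default `omega=0.9` are hypothetical, and the error rate is passed in as a number rather than computed by a classifier.

```python
import math
import random


def sigmoid(x):
    """Numerically stable sigmoid transfer function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position, rng):
    """Map a continuous position to a 0/1 feature mask.

    A feature is selected (1) when a uniform draw falls below
    sigmoid(x); this thresholding rule is an assumed, common choice.
    """
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


def fs_fitness(mask, error_rate, omega=0.9):
    """Assumed single-objective form: omega*c(S) + (1-omega)*|S|/D.

    Lower is better: it rewards a low error rate and few features.
    """
    total = len(mask)        # D, the original number of features
    selected = sum(mask)     # |S|, the number of chosen features
    return omega * error_rate + (1.0 - omega) * selected / total


# Hypothetical usage with a fixed seed and a made-up error rate.
rng = random.Random(0)
mask = binarize([-2.0, 0.3, 4.1, -0.7, 1.5], rng)
score = fs_fitness(mask, error_rate=0.06)
```

With $\omega$ close to 1 the error rate dominates and subset size acts mainly as a tie-breaker; lowering $\omega$ pushes the search toward smaller subsets.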

Design of the RSO-BiGRU-Based Classification Model.
The RSO-BiGRU model carries out the classification. A learned continuous representation is effective for handling sequential data, and an RNN is well suited to encoding such data. Figure 2 demonstrates the framework of the BiGRU, which is used here for learning [22]. The computation of the BiGRU is separated into two parts: forward and reverse order data passes. Given a sentence $X = (x_1, x_2, \ldots, x_n)$, $x \in \mathbb{R}^k$, where $x$ is the concatenated vector of the current word and its position, the forward GRU is computed from the following quantities: $W_*$ and $b_*$ signify the weight matrices and bias vectors, respectively; $\sigma$ is the sigmoid function; and $\odot$ stands for element-wise multiplication. $x_t$ is the input word vector at time step $t$, and $h_t$ signifies the hidden state at the current time step $t$. $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$ denote the outputs of the forward and backward GRUs, respectively, which together form the BiGRU output.

To effectively tune the hyperparameters of the BiGRU model, the RSO is applied to it. Rats are territorial animals that live in groups of both males and females. Their behavior is very aggressive in many cases, which can result in the death of some animals. This aggressive behavior while chasing and fighting with prey is the main inspiration for this work. The chasing and fighting behavior of the rats can be modeled as the RSO algorithm and used to solve optimization problems. This subsection explains the behavior of rats, namely chasing and fighting, after which the RSO technique is summarized.
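Returning to the BiGRU introduced earlier in this subsection, the gate computations it refers to are commonly written as below. This is the textbook GRU formulation in the section's notation ($W_*$, $b_*$, $\sigma$, $\odot$), offered as an assumed reference form rather than the paper's exact equations. The gate names $z_t$ and $r_t$, the candidate state $\tilde{h}_t$, and the use of concatenation for the bidirectional output are assumptions; some references swap the roles of $z_t$ and $1 - z_t$.

```latex
% Textbook GRU cell (assumed form); [a, b] denotes concatenation.
\begin{align}
z_t &= \sigma\left(W_z\,[h_{t-1}, x_t] + b_z\right) && \text{update gate}\\
r_t &= \sigma\left(W_r\,[h_{t-1}, x_t] + b_r\right) && \text{reset gate}\\
\tilde{h}_t &= \tanh\left(W_h\,[r_t \odot h_{t-1}, x_t] + b_h\right) && \text{candidate state}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{hidden state}\\
h_t^{\mathrm{BiGRU}} &= [\overrightarrow{h}_t \,;\, \overleftarrow{h}_t] && \text{forward and backward states}
\end{align}
```

The forward pass reads $x_1, \ldots, x_n$ in order and the backward pass reads the same sequence reversed, so each $h_t^{\mathrm{BiGRU}}$ carries context from both sides of position $t$.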

Chasing the Prey.
In general, rats are social animals that chase prey in a group through their social agonistic behavior. To define this behavior mathematically, it is assumed that the best search agent knows the location of the prey. The other search agents update their positions with respect to the best search agent obtained so far. In the formula for this process, $\vec{P}_i(x)$ denotes the positions of the rats and $\vec{P}_r(x)$ signifies the best solution found so far. In the calculation of the parameters $A$ and $C$, $R$ and $C$ are random numbers in $[1, 5]$ and $[0, 2]$, respectively. The parameters $A$ and $C$ are responsible for good exploration and exploitation over the course of the iterations.

Fighting with Prey.
To define the fighting of rats with prey mathematically, an update formula is used in which $\vec{P}_i(x + 1)$ denotes the updated next position of the rat. The algorithm stores the best solution and updates the positions of the other search agents with respect to the best search agent. A rat at $(A, B)$ can update its position toward the position of the prey $(A^*, B^*)$. By adjusting the parameters as shown in equations (20) and (21), a number of different positions can be reached from the current position [23]. This concept can also be extended to an $n$-dimensional environment. Consequently, exploration and exploitation are ensured by the values of the parameters $A$ and $C$. The RSO technique stores the optimum solutions with many operators.
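The chasing and fighting steps can be sketched together as one optimization loop. The sketch below assumes the update rules of the published rat swarm optimizer: chasing as $\vec{P} = A\,\vec{P}_i(x) + C\,(\vec{P}_r(x) - \vec{P}_i(x))$ with $A = R - x\,(R / Max_{iter})$ and $C = 2\,\mathrm{rand}()$, and fighting as $\vec{P}_i(x+1) = |\vec{P}_r(x) - \vec{P}|$. The function name, the per-iteration draw of $R$, the clamping to the bounds, and the sphere objective are illustrative assumptions; in the paper the objective would be the BiGRU validation performance over its hyperparameters.

```python
import random


def rso_minimize(objective, dim, lower, upper, num_rats=20, max_iter=100, seed=0):
    """Minimize `objective` with a basic rat swarm optimizer sketch."""
    rng = random.Random(seed)
    rats = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(num_rats)]
    best = min(rats, key=objective)[:]
    best_fit = objective(best)
    history = [best_fit]          # best fitness after each iteration
    for x in range(max_iter):
        r = rng.uniform(1.0, 5.0)             # R drawn from [1, 5]
        a = r - x * (r / max_iter)            # A shrinks toward 0 over time
        for i, rat in enumerate(rats):
            c = 2.0 * rng.random()            # C drawn from [0, 2]
            new_pos = []
            for d in range(dim):
                # Chasing: move relative to the best agent found so far.
                p = a * rat[d] + c * (best[d] - rat[d])
                # Fighting: next position is the distance to the prey.
                v = abs(best[d] - p)
                # Keep the agent inside the search bounds (assumed step).
                new_pos.append(min(max(v, lower), upper))
            rats[i] = new_pos
            fit = objective(new_pos)
            if fit < best_fit:
                best, best_fit = new_pos[:], fit
        history.append(best_fit)
    return best, best_fit, history


def sphere(v):
    """Toy objective used only for illustration."""
    return sum(c * c for c in v)


best, best_fit, history = rso_minimize(sphere, dim=3, lower=-5.0, upper=5.0)
```

Because $A$ decays to zero, early iterations take large steps (exploration) and later iterations stay close to the best agent (exploitation), which matches the role the text assigns to $A$ and $C$.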

Performance Validation
This section inspects the PD classification results of the presented ISFO-DL technique. The results are investigated on four datasets, namely HandPD Spiral, HandPD Meander, Speech PD, and Voice PD [24][25][26]. Table 1 and ...

The loss graph analysis of the ISFO-DL technique is investigated in Figure 5. The figure shows that the ISFO-DL technique achieves a lower validation loss than training loss. Table 3 provides a detailed comparative analysis of the ISFO-DL technique with recent approaches on the test HandPD Meander dataset. The results show that the MGWO-KNN and MGOA-KNN systems obtained the lowest accuracies of 0.728 and 0.748, respectively. The MGOA-DT method gained a moderate accuracy of 0.890. The MGOA-RF, MGWO-RF, and MGWO-DT systems accomplished reasonable accuracies of 0.937, 0.930, and 0.880, respectively. However, the ISFO-DL method outperformed the other methods, with accuracy, DR, and FAR of 0.940, 1.000, and 0.135, respectively. Figure 6 reveals the accuracy graph analysis of the ISFO-DL technique on the test HandPD Meander dataset. The figure shows that the ISFO-DL technique reached improved training and validation accuracies, with the validation accuracy exceeding the training accuracy.
The loss graph analysis of the ISFO-DL system is studied in Figure 7. The figure shows that the ISFO-DL technique achieves a lower validation loss than training loss. Table 4 provides a brief comparative analysis of the ISFO-DL system with recent approaches on the test Speech PD dataset.

Dataset          Total features  MGOA  MGWO  OCFA  ISFO-DL
HandPD Spiral    13              5     7     8     4
HandPD Meander   13              8     8     7     6
Speech PD        23              11    12    13    10
Voice PD         26              8     9     17    7

The loss graph analysis of the ISFO-DL algorithm is explored in Figure 9. The figure shows that the ISFO-DL technique achieves a lower validation loss than training loss. Figure 10 exhibits the accuracy graph analysis of the ISFO-DL system on the test Voice PD dataset. The figure shows that the ISFO-DL technique reached increased training and validation accuracies, with the validation accuracy exceeding the training accuracy.
The loss graph analysis of the ISFO-DL approach is examined in Figure 11. The figure shows that the ISFO-DL method achieves a lower validation loss than training loss. Figure 12 shows the accuracy analysis of the ISFO-DL technique and other recent techniques on the four test datasets [27]. The figure shows that the ISFO-DL technique attained the maximum accuracy values on all the test datasets. Figure 13 illustrates the DR analysis of the ISFO-DL algorithm and other recent methods on the four test datasets. The figure shows that the ISFO-DL technique achieved the maximal DR values on all the test datasets. Figure 14 depicts the FAR analysis of the ISFO-DL method and other recent approaches on the four test datasets. The figure shows that the ISFO-DL system reached effective outcomes with higher FAR values on all the test datasets. From the above tables and figures, it is apparent that the ISFO-DL technique is an effective tool for PD detection and classification.
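The three measures reported in this section can be computed from a confusion matrix, as in the sketch below. The text does not define DR and FAR, so the sketch assumes the usual readings: DR as detection rate, $TP / (TP + FN)$, and FAR as false alarm rate, $FP / (FP + TN)$, with PD as the positive class. The function names and the sample labels are hypothetical and are not taken from the paper's datasets.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, DR, and FAR under the assumed definitions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Detection rate: share of PD cases that were flagged.
        "dr": tp / (tp + fn) if (tp + fn) else 0.0,
        # False alarm rate: share of healthy cases flagged as PD.
        "far": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Made-up labels: 1 = PD, 0 = healthy control.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
```

Under these definitions a higher DR and a lower FAR are both desirable, which is worth keeping in mind when reading the FAR comparison.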

Conclusion
In this study, the ISFO-DL technique has been developed for PD detection and classification. The proposed ISFO-DL technique is mainly intended to detect PD and thereby enhance the survival rate of the person. It involves three major processes, namely ISFO-based feature selection, BiGRU-based classification, and RSO-based hyperparameter optimization. The design of the ISFO and RSO algorithms proves useful for significantly enhancing the PD classification performance. The experimental validation of the ISFO-DL model is carried out using a benchmark Parkinson's dataset, and the results are inspected under several dimensions. The experimental results highlight the enhanced classification performance of the ISFO-DL technique, and the proposed model can therefore be employed for the earlier identification of PD. In the future, the PD classification performance can be boosted by the use of outlier detection and clustering approaches.

Data Availability
The dataset used in this study is publicly available via the following link: https://wwwp.fc.unesp.br/~papa/pub/datasets/Handpd/.

Ethical Approval
This article does not contain any studies with human participants performed by any of the authors.

Conflicts of Interest
The authors declare that they have no conflicts of interest.