Optimization is the process of finding the best solution to a problem. This paper proposes an SVM model built in LabVIEW whose parameters are tuned with a hybrid CS and PSO method. PCA is used as a preprocessor for the SVM to reduce the dimension of the data and extract features from the training samples. The SVM parameters are then optimized for Parkinson’s disease data by combining CS and PSO. The designed system is compared with the standalone PSO and CS optimization methods, and the hybrid CS-PSO method is found to perform better, reaching an accuracy of 97.4359%. The classification results obtained with the optimized SVM parameters are also measured by precision, recall,
Parkinson’s disease (PD) is a neurological disorder that affects the quality of life of patients and their relatives. PD is more widespread in countries with a large elderly population. According to statistics, about one million people in the US will be affected by PD by 2020, and more than 10 million people worldwide will be living with PD. Men are 1.5 times more likely to have PD than women [
Before the SVM is applied, a “feature transformation” operation must be performed, which maps the data into a new, lower-dimensional representation that needs fewer features. This reduces the dimension of the data and removes the many unimportant features. In this study, PCA is used for dimension reduction.
When the values in the reduced-dimension data differ too widely in magnitude, normalization is performed to bring the data onto a single scale. Normalization also uses mathematical functions to move data from different scaling systems into a common system so that they become comparable. In this study, the
Particle swarm optimization (PSO) is an optimization technique designed to find the best solution for a system. The cuckoo search (CS) algorithm is another optimization method; it has few parameters, is easy to implement, and is efficient. In this paper, both are used to optimize the SVM parameters.
Accurate and reliable diagnosis is very important for human health. In this study, different optimization algorithms have been used to obtain the best SVM parameters for predicting Parkinson’s disease. The proposed hybrid CS-PSO-SVM model provided an accuracy of 97.4359% and is superior to the PSO-SVM and CS-SVM models.
A different work environment for researchers has been proposed using LabVIEW, a visual programming language, instead of only text-based programming languages.
Hybrid optimization methods are used for obtaining the best SVM parameters.
The paper is organized as follows: In Section
Appropriate C value parameter is chosen [
Attributes of used data.
Number | Attributes | Information |
---|---|---|
1 | MDVP:Fo (Hz) | Average vocal fundamental frequency |
2 | MDVP:Fhi (Hz) | Maximum vocal fundamental frequency |
3 | MDVP:Flo (Hz) | Minimum vocal fundamental frequency |
4 | MDVP:Jitter (%) | Several measures of variation in the fundamental frequency |
5 | MDVP:Jitter (Abs) | |
6 | MDVP:RAP | |
7 | MDVP:PPQ | |
8 | Jitter:DDP | |
9 | MDVP:Shimmer | Several measures of variation in amplitude |
10 | MDVP:Shimmer (dB) | |
11 | Shimmer:APQ3 | |
12 | Shimmer:APQ5 | |
13 | MDVP:APQ | |
14 | Shimmer:DDA | |
15 | NHR | Two measures of ratio of noise-to-tonal components in the voice status |
16 | HNR | |
17 | RPDE | Two nonlinear dynamic complexity measures |
18 | D2 | |
19 | DFA | Signal fractal scaling exponent |
20 | Spread1 | Three nonlinear measures of the fundamental frequency variation |
21 | Spread2 | |
22 | PPE | |
Dimension reduction and normalization were performed to extract features from the data and to bring them onto a single scale. Optimization methods were then applied to the SVM. Figure
The diagram of the used techniques.
PCA is a widely used technique for removing insignificant features from data. The underlying idea is to represent the data along orthogonal axes so that it can be expressed by a small number of linear combinations of the original features. In other words, PCA reduces the data dimension in order to extract features. Figure
Dimension reduction program for PCA on LabVIEW.
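As a text-based counterpart to the LabVIEW program, the following minimal sketch illustrates the idea of PCA on toy data. It is not the authors' implementation: the helper names, the toy values, and the use of power iteration to find only the first principal component are all assumptions made for illustration.

```python
import math
import random


def center(rows):
    """Subtract the column mean from every feature."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=200, seed=0):
    """Leading eigenvector of cov, found by power iteration."""
    rng = random.Random(seed)
    v = [rng.random() for _ in cov]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """One-dimensional scores of each row along the given component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Toy 2-D data lying close to the line y = 2x (not the paper's dataset).
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
centered = center(data)
pc1 = top_component(covariance(centered))
scores = project(centered, pc1)  # the data reduced from 2 features to 1
```

Here the two correlated features collapse onto a single axis that points along the direction of largest variance, which is the sense in which PCA reduces dimension while keeping most of the information.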
Statistical normalization is performed to bring the data onto a single scale when the values differ widely. Another objective is to use mathematical functions to translate data from different systems into a common system and make them comparable. In the
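A minimal sketch of such a normalization is given here, assuming that "statistical normalization" refers to the usual z-score, in which each feature is shifted to zero mean and scaled to unit standard deviation. The function name and the toy values are hypothetical.

```python
import statistics


def zscore_columns(rows):
    """Scale every column to zero mean and unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stdevs = [statistics.stdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stdevs)] for row in rows]


# Toy rows with two features on very different scales (illustrative values only).
rows = [[120.0, 0.004], [150.0, 0.007], [200.0, 0.010]]
normalized = zscore_columns(rows)
```

After this step both columns are on a comparable scale, so neither dominates distance-based computations such as a Gaussian kernel.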
SVM is an important tool for machine learning (ML) derived from statistical learning theory [
Nonlinear SVM.
Transfer of input data to the property plane.
Theoretically, any linearly separable dataset can be classified correctly by an SVM. For a linearly separable dataset, there are
These data can be separated from each other by the separator function given by
The following equations are used for correct classification:
The appropriate values of
In this function,
Polynomial kernel, sigmoid kernel, and Gaussian kernel functions are used commonly to find the optimal hyperplane to distinguish linearly nonseparable data. Gaussian kernel: Polynomial kernel:
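For illustration, the three kernels named above can be sketched in their common textbook forms. The parameter names (`sigma`, `degree`, `coef0`, `gamma`) and their default values are assumptions and may differ from the parameterization used in the paper.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 * sigma^2))"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree"""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """tanh(gamma * x . y + coef0)"""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)
```

Each function returns a similarity score for two feature vectors; the SVM uses these scores in place of plain dot products, which is what allows a linear separator in the transformed space to separate data that are not linearly separable in the original space.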
In order to classify with SVM, the first thing to do is to select a kernel function and related parameters that allow linear separation of the data. For classification of data, the following equation is obtained:
Appropriate
Provided the 0 ≤
Optimization is the process of achieving the best solution for a problem. Because classical methods for optimization problems defined by mathematical functions are not flexible and often fail to reach the desired result, new methods inspired by natural phenomena have been developed, and PSO is the most common of these algorithms. Inspired by fish and insects moving in swarms, Kennedy and Eberhart developed PSO in 1995 [
In the basic PSO algorithm, every individual in the swarm is a candidate solution and is represented by a position vector:
The velocity of each individual in the swarm is generated randomly. Each individual has a velocity as given in equation (
The best local and global positions are determined. Here, the position of each individual is defined as follows:
Each individual in PSO adjusts its position according to its own best position (pbest) and the global best position (gbest). The velocity and position of the individuals are updated with the following equations:
Here,
The pseudocode of the PSO [
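The following sketch shows one common form of the PSO loop in Python. It is not the authors' LabVIEW code: the inertia weight `w`, the velocity initialization, and the clipping to bounds are assumptions, and the values 1.3 and 1.87 from the parameter settings table are assumed here to be the acceleration coefficients `c1` and `c2`. A simple sphere function stands in for the SVM error that the paper minimizes.

```python
import random


def pso(objective, bounds, n_particles=18, n_iter=120,
        w=0.7, c1=1.3, c2=1.87, seed=0):
    """Minimize objective over box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions and small random initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.1 * rng.uniform(-(hi - lo), hi - lo) for lo, hi in bounds]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward pbest + pull toward gbest.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, best_val = pso(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

To tune an SVM in this way, the objective would instead train the classifier with the candidate parameters and return its validation error.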
CS is a next-generation optimization method based on the brood parasitism of cuckoo birds [
Eggs laid in a host nest can be discovered by the host with a probability of
The global random walk is carried out with a Lévy flight according to equation (
The local random walk is performed with equation (
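The two random walks can be sketched in Python as follows. This is an illustrative version under stated assumptions, not the authors' implementation: Lévy steps are drawn with Mantegna's algorithm, the discovery probability `pa`, the step scale `alpha`, and the exponent `beta` are hypothetical defaults, and a sphere function stands in for the SVM error.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One heavy-tailed step drawn with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def clip(x, bounds):
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]


def cuckoo_search(objective, bounds, n_nests=18, n_iter=120,
                  pa=0.25, alpha=0.01, seed=0):
    """Minimize objective with a basic cuckoo search (greedy replacement)."""
    rng = random.Random(seed)
    nests = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]

    for _ in range(n_iter):
        best = nests[min(range(n_nests), key=fitness.__getitem__)][:]
        # Global random walk: Levy flight scaled by the distance to the best nest.
        for i in range(n_nests):
            cand = clip([x + alpha * levy_step(rng) * (x - b)
                         for x, b in zip(nests[i], best)], bounds)
            val = objective(cand)
            if val < fitness[i]:
                nests[i], fitness[i] = cand, val
        # Local random walk: with probability pa a component is perturbed
        # by the difference of two randomly chosen nests.
        for i in range(n_nests):
            j, k = rng.sample(range(n_nests), 2)
            cand = clip([x + rng.random() * (xj - xk) if rng.random() < pa else x
                         for x, xj, xk in zip(nests[i], nests[j], nests[k])],
                        bounds)
            val = objective(cand)
            if val < fitness[i]:
                nests[i], fitness[i] = cand, val
    g = min(range(n_nests), key=fitness.__getitem__)
    return nests[g], fitness[g]


def sphere(x):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


box = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_val = cuckoo_search(sphere, box)
```

Because a candidate replaces a nest only when it improves the fitness, the best value found never gets worse from one iteration to the next.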
Emerging technology created a need for object-oriented programming languages in place of purely text-based ones, which made visual programming without writing code possible. With National Instruments’ development of LabVIEW (Laboratory Virtual Instrument Engineering Workbench), a model can be programmed graphically from ready-made functions, so programs can be built more quickly and with less lost time. LabVIEW generally uses a dataflow model instead of text code and can run multiple processes in parallel [
LabVIEW consists of two components: the first one is the front panel that is the user interface and the second one is the block diagram in which graphical codes are shown. Both of them are shown in Figures
LabVIEW front panel.
LabVIEW block diagram.
In this study, PSO-SVM, CS-SVM, and CS-PSO-SVM methods are compared with each other. The created hybrid program in the LabVIEW environment is shown in Figure
The created hybrid program.
The classification performance results obtained by using SVM parameters determined by optimization are measured by accuracy, precision, recall,
Confusion matrix.
Prediction | Actual | |
---|---|---|
| Positive | Negative |
Positive | TP | FP |
Negative | FN | TN |
Accuracy is the ratio of correctly classified samples to all samples:
Precision is the proportion of positive predictions that are actually positive:
Recall is the proportion of actual positive cases that are predicted correctly:
FPR, sometimes called the fall-out, is the ratio of misclassified events (FP) to all actual negative events:
FDR is the proportion of false positives among all positive predictions:
FNR, sometimes called the miss rate, is the proportion of individuals with a known positive condition for which the test result is negative:
NPV is the proportion of individuals with a negative test result for which the true condition is negative:
MCC is a reliable metric for assessing the quality of binary classifiers because it takes TP, TN, FN, and FP into account. In fact, MCC is a correlation coefficient between the actual and predicted labels and takes a value between −1 and +1. A coefficient of +1 means a perfect prediction, 0 indicates that the classifier is no better than random guessing, and −1 means total disagreement between the actual and predicted values [
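The measures defined above can be computed directly from the four entries of the confusion matrix, as in the sketch below, which uses their standard definitions. The function name is hypothetical, and the harmonic mean of precision and recall (`f1`) is included as an assumption about the unnamed fifth measure. The counts in the usage line are not reported in the paper; they are inferred values that reproduce the figures listed for CS-PSO-SVM in the results table.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix measures; assumes no denominator is zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),   # fall-out
        "fdr": fp / (fp + tp),
        "fnr": fn / (fn + tp),   # miss rate
        "npv": tn / (tn + fn),
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Inferred (hypothetical) counts for a 39-sample test set.
m = binary_metrics(tp=10, fp=0, fn=1, tn=28)
```

With these counts the accuracy is 38/39, which rounds to the reported 97.4359%, and the MCC rounds to 0.9369.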
The dataset used in the study is obtained from [
Dataset information.
Number of instances | Number of attributes | Normal | PD |
---|---|---|---|
195 | 22 | 8 | 23 |
Parameter settings.
Method | Population size | Iteration | | |
---|---|---|---|---|
PSO-SVM | 18 | 120 | 1.3 | 1.87 |
CS-SVM ( | 18 | 120 | — | — |
CS-PSO-SVM ( | 18 | 120 | 1.3 | 1.87 |
Obtained results.
Method | Accuracy (%) | Precision (%) | Recall (%) | | FPR | FDR | FNR | NPV | MCC |
---|---|---|---|---|---|---|---|---|---|
PSO-SVM | 82.05 | 88.89 | 57.14 | 69.57 | 0.04 | 0.1111 | 0.4286 | 0.80 | 0.6051 |
CS-SVM | 92.3077 | 83.33 | 90.91 | 86.96 | 0.0714 | 0.1667 | 0.0909 | 0.9630 | 0.8167 |
CS-PSO-SVM ( | 97.4359 | 100 | 90.91 | 95.24 | 0 | 0 | 0.0909 | 0.9655 | 0.9369 |
The
As can be seen in Table
The population average fitness value for the used dataset is shown in Figure
Population average fitness value.
Error rate of each method.
Accurate and reliable diagnosis is very important for human health. In this paper, different optimization algorithms have been used to optimize the SVM parameters. The aim is to find the best SVM parameters with the hybrid CS-PSO optimization method and to obtain the best classification accuracy. To analyze the performance of the methods, the programs were run several times, and the results are presented in tables. Table
The proposed model achieves a classification accuracy of 97.4359%, while this rate is 92.3077% for CS-SVM and 82.05% for PSO-SVM. The MCC takes all entries of the confusion matrix into account. The higher MCC value indicates that the proposed classification method is successful. As shown in Table
The results of this study show that hybrid models, created by combining the good characteristics of different optimization algorithms, can be used to find the parameters of classification methods and can increase the success rate of the model.
In the increasingly widespread LabVIEW environment, subprograms can be created and results obtained quickly.
The data that support the findings of this study are available from the authors upon reasonable request.
The author declares that there are no conflicts of interest.