Computer-Aided Diagnosis of Parkinson’s Disease Using Complex-Valued Neural Networks and mRMR Feature Selection Algorithm

Parkinson’s disease (PD) is a neurological disorder which has a significant social and economic impact. PD is diagnosed by clinical observation and evaluations, coupled with a PD rating scale. However, these methods may be insufficient, especially in the initial phase of the disease. The processes are tedious and time-consuming, and hence systems that can automatically offer a diagnosis are needed. In this study, a novel method for the diagnosis of PD is proposed. Biomedical sound measurements obtained from continuous phonation samples were used as attributes. First, a minimum redundancy maximum relevance (mRMR) attribute selection algorithm was applied for the identification of the effective attributes. After conversion to a complex number, the resulting attributes are presented as input data to the complex-valued artificial neural network (CVANN). The proposed novel system might be a powerful tool for effective diagnosis of PD.


INTRODUCTION
Parkinson's disease (PD) is a central nervous system disorder that causes partial or complete loss of motor reflexes, speech, and other vital functions [1]. PD affects a significant portion of the world population: it affected approximately 1% of people over 50 years of age in 2005, and this percentage is expected to increase as people live longer [2]. [...] method combined with an SVM-based classifier, and achieved an accuracy rate of 92.75%. Ozcift and Gulten [27] proposed a method that combined 30 machine learning algorithms with a rotation forest (RF) group classifier. In that study, where a correlation-based feature selection (CFS) algorithm was used for feature selection, an 87.13% classification accuracy was obtained. Chen et al. [28] proposed a system for the detection of PD using a fuzzy k-nearest neighbor approach and principal component analysis (PCA), and achieved a 96.07% accuracy. Rouzbahani and Daliri [29] used an SVM-based feature selection method for the diagnosis of PD from sound signals. SVM, KNN and some discrimination-function-based (DBF) algorithms were used as the classification algorithms. The highest accuracy rate, 93.82%, was obtained with the KNN algorithm. Ma et al. [30] used a kernel-based extreme learning machine with a subtractive clustering features weighting approach and obtained high accuracy rates. Some previous studies have also been carried out using gait variability extracted from the sound recordings. For instance, Khorasani and Daliri [31] developed a method based on a hidden Markov model (HMM) with Gaussian mixtures using the raw gait data for diagnosis of PD. In that study, they obtained a 90.3% classification accuracy. Daliri [32] also used gait dynamics along with SVM for the automatic diagnosis of PD (a neurodegenerative disease) and obtained a classification accuracy of 89.33%.
In this study, a new hybrid approach consisting of feature selection and complex-valued neural networks is proposed. For model building and PD diagnosis, a rich dataset consisting of features extracted from speech sound samples is used. To develop the prediction models, a minimum redundancy maximum relevance (mRMR)-based feature selection algorithm is first applied to the raw data to determine the effective features. By eliminating the irrelevant and redundant features, the aim was to improve the prediction accuracy of the classification algorithm and to reduce the computational burden. The mRMR algorithm was preferred because of its superior performance reported in a large number of previous studies where it was compared to other feature selection methods. After the feature selection, the real-valued attributes were converted to complex values. These complex-valued attributes were then presented to the complex-valued neural network as the input vector. The proposed method was named mRMR + CVANN (complex-valued artificial neural network). In the final stage, the classification results obtained from the proposed method were compared to the results of previous studies found in the literature.
The rest of the paper is organized as follows. In section 2, a brief description of the individual methods used in this study is provided. In section 3, the details of the proposed diagnosis system are given. In section 4, the experimental design is described in detail. In section 5, the experimental results are summarized and a comparative analysis of these results is presented. In section 6, the findings are summarized and the paper is concluded with final remarks.

METHODS
This section provides a short description of the feature selection algorithms used to process the sound data samples. The details of CVANN architecture developed and used to build the prediction/classification models are then explained.

Minimum Redundancy Maximum Relevance (mRMR) Algorithm
In classification type applications, features that are extracted from the original/raw dataset are used as the inputs to the classification method. In some applications, the number of features may be limited to just a few, while in others the number of features may be too many. The features that are extracted for each application are stored in an attribute matrix. Thus, both number of rows (sample size) and the number of features define the size of the data table and affect the processing time [33]. Features that can distinguish between classes more effectively are called high-level features and are particularly important in terms of the performance of a classifier [34,35]. Instead of using all features, using only the high-level features (a subset of the total number of features) can reduce processing time and potentially can improve the prediction performance.
The mRMR feature selection algorithm is employed in this study. mRMR is essentially a filtering algorithm that tries to select the features that are most relevant to the class labels and to filter out the rest. While identifying the most relevant features, the algorithm also tries to minimize the redundancy among the selected features [36]. Specifically, the mRMR algorithm treats each feature and the class vector (response variable or output variable) as a discrete random variable. To measure the similarity between two features, or between one feature and the class vector, it uses the mutual information measure $I(x, y)$, which is defined as:

$$I(x, y) = \sum_{i}\sum_{j} p(x_i, y_j) \log \frac{p(x_i, y_j)}{p(x_i)\,p(y_j)} \qquad (1)$$

For features x and y, $p(x_i)$ and $p(y_j)$ are the marginal probability functions, and $p(x_i, y_j)$ is the joint probability distribution. The mutual information value is 0 when the two random variables are completely independent [37]; the measure is symmetric and cannot be negative ($I(X, Y) \geq 0$, $I(X, Y) = I(Y, X)$).
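As an illustration of the mutual information measure described above, the following minimal Python sketch estimates I(x, y) from empirical frequencies of two already-discretized sequences. The function name `mutual_information` and the use of the natural logarithm are assumptions made for this example; the paper does not state which logarithm base or estimator was used.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Estimate I(x; y) in nats for two equal-length sequences of discrete values.

    Probabilities are plain empirical frequencies (an assumption of this sketch).
    """
    n = len(x)
    count_x = Counter(x)          # marginal counts of x
    count_y = Counter(y)          # marginal counts of y
    count_xy = Counter(zip(x, y))  # joint counts of (x, y) pairs

    mi = 0.0
    for (xi, yj), c in count_xy.items():
        p_xy = c / n
        # p(xi, yj) / (p(xi) * p(yj)) simplifies to c * n / (count_x * count_y)
        mi += p_xy * math.log(c * n / (count_x[xi] * count_y[yj]))
    return mi
```

For two identical binary sequences with balanced classes the estimate equals log 2, and for two independent sequences it is 0, which matches the properties stated in the text.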
Let S be the feature set to be selected, and let |S| denote the number of elements in this set. According to the mRMR algorithm, two conditions have to be met when selecting attributes. The first one is the maximum relevance, MaxMR:

$$\max MR, \quad MR = \frac{1}{|S|} \sum_{f_i \in S} I(h, f_i)$$

The second one is the minimum redundancy, minMV:

$$\min MV, \quad MV = \frac{1}{|S|^2} \sum_{f_i, f_j \in S} I(f_i, f_j)$$

where $f_i$ denotes a feature in S and $h = \{h_1, h_2, \ldots, h_K\}$ is the class variable of a dataset with K possible classes. $\Omega$ indicates the whole feature set, and $\Omega_S$ indicates all of the features except the selected ones ($\Omega_S = \Omega - S$). There are two approaches to combining the two conditions above: Mutual Information Difference (MID), defined as max(MR − MV), and Mutual Information Quotient (MIQ), defined as max(MR/MV) [38]. In this study, the feature selection is carried out using MID because of its superior performance.

To determine the effectiveness of mRMR, a variety of other feature selection algorithms are also applied to the original dataset and their results are compared to those obtained with mRMR. These feature selection algorithms are Fisher score, Chi-square, sequential forward selection (SFS), sequential forward floating search (SFFS) and ReliefF. The Fisher score feature selection algorithm uses productive statistical models that can distinguish the most appropriate features [39]. The Chi-square feature selection algorithm is also one of the most commonly used methods for determining the effective features. With this method, the information value of a feature is measured by calculating the chi-square statistic [40]. In the SFS method, the feature selection process begins with an empty subset; at each subsequent step, the feature that maximizes the classification accuracy is added to the current feature set. This process is repeated until all the features have been tested and evaluated. The subset that maximizes classification accuracy is selected as the best feature set [41]. In the SFFS algorithm, subsets are evaluated using a forward-and-backward motion. Specifically, if a subset produces better results than the previous one, one backward step is applied. If performance is not improved, the backward step is not applied. In this way, backtracking is carried out without the need for dynamic parameter settings [42]. ReliefF is a simple yet effective algorithm that estimates the value of features by measuring their interdependencies. Specifically, this algorithm updates the weight (value) of each feature using the nearest neighbor algorithm [43].
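The MID criterion of mRMR described above can be sketched as a greedy search: start with the most relevant feature, then repeatedly add the feature whose relevance minus mean redundancy with the already selected features is largest. This incremental search is a common way to apply MID and is an assumption here; the paper does not give its exact search procedure. The names `mrmr_mid` and `mutual_information` are hypothetical, and all features are assumed to be discretized beforehand.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (nats) between two discrete sequences."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (cx[a] * cy[b]))
               for (a, b), c in cxy.items())


def mrmr_mid(features, labels, k):
    """Greedy mRMR with the MID criterion (relevance minus mean redundancy).

    features: dict mapping feature name -> list of discrete values.
    labels:   list of class labels (the class variable h).
    Returns the names of the k selected features in selection order.
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

In a toy setting where two features are exact copies of each other, the sketch picks only one of them and prefers a less redundant third feature, which is the behaviour the minimum redundancy condition is meant to produce.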

Complex Valued Neural Network (CVANN)
CVANN is a type of artificial neural network architecture whose network parameters are complex numbers. These network parameters include the weights and threshold values as well as the inputs and outputs. A number of studies in the literature emphasize the advantages of CVANNs compared to ordinary real-valued ANNs [44][45][46][47][48]. The use of complex-valued inputs/outputs, weights and activation functions enables CVANN to boost the functionality (and the resulting performance) of a single neuron and of the network of neurons (the neural network), and can also decrease the model building/training time [49,50]. Figure 1 graphically illustrates a simple comparison between an ordinary and a complex-valued neural network. This example shows that a 2-input ordinary neural network can be reduced to a single input by using a complex-valued neural network. Simplifying and using the input values in this way was first proposed by Chen et al. [51]. According to Chen et al. [51], this type of input representation provides a significant reduction in the complexity of larger networks and hence faster training and model building. The advantages of this method are largely credited to the use of complex numbers, which have both real and imaginary parts and can therefore carry two-dimensional information as a single dimension. As mentioned above, this representation leads to downsizing of the network and faster training of the prediction/classification model.
In addition to the advantages mentioned above, CVANNs also offer high-level functionality, better plasticity and greater flexibility compared to real-valued neural networks. They tend to learn faster and achieve better generalization [52]. The capability of a single neuron in a complex-valued neural network is enhanced by its flexibility: it can learn complex and nonlinear input/output mappings at both linear and nonlinear levels. That is, these complex-valued neurons have the ability to learn without generating higher degree inputs and progressing to a higher dimensional space. In a comparative study, Nitta et al. [53] showed that the XOR problem, which cannot be solved using two-layer, real-valued neural networks, can easily be solved using a two-layer CVANN.
In this study, a complex-valued, back-propagation (CBP), feed-forward learning algorithm is used to train the CVANN models. A simple representation of a single neuron used in the CBP algorithm is shown in Figure 2.
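The activity value $Y_n$ of neuron n can be defined as follows:

$$Y_n = \sum_{m} W_{nm} X_m + V_n$$

where $W_{nm}$ is the complex-valued connection weight between neurons n and m, $X_m$ is the complex-valued input signal from neuron m, and $V_n$ is the complex-valued threshold value of neuron n. To obtain a complex-valued output signal, the activity value $Y_n$ is split into its real and imaginary parts, as shown below:

$$Y_n = x + iy = z$$

where x and y indicate the real and imaginary parts of $Y_n$, respectively, and $i = \sqrt{-1}$. Considering the output function of each neuron, the overall output function can be defined using the following equation, where $f_C$ and $f_R$ represent complex-valued and real-valued functions, respectively:

$$f_C(z) = f_R(x) + i\,f_R(y)$$

Figure 1. The representation of a neural network with one input and one weight in complex values (right part of the figure), which is normally realized with 2 inputs and 2 weights in real values (left part of the figure). RVN: real-valued neuron; CVN: complex-valued neuron; z = a + ib and w = w1 + iw2.

One of the difficulties encountered in CBP applications is the selection of the most appropriate activation function. The activation function must be suitable for practical applications of the complex multilayer perceptron. Detailed information about the properties that a complex activation function needs to have can be found in [54]. Several activation functions have been proposed for CBP in the literature. In this study, the preferred complex activation function is a superposition of real and imaginary logarithmic sigmoids [55], referred to as the complex sigmoid activation function. It can be defined using the following equation:

$$f_C(z) = \frac{1}{1 + e^{-x}} + i\,\frac{1}{1 + e^{-y}}$$

The CVANN used in this study has three layers (input, hidden and output). Figure 3 illustrates the three-layered CVANN structure used in this study. A detailed description of the underlying mathematical model of the three-layered CVANN can be found in [56,57].

Figure 3. Three-layered CVANN structure (input layer, hidden layer, output layer).

In Figure 3, $W_{ml}$ is the weight between the input layer neuron l and the hidden layer neuron m, $V_{nm}$ is the weight between the hidden layer neuron m and the output layer neuron n, $\theta_m$ is the threshold value for the hidden layer neuron m, and $\gamma_n$ represents the threshold value for the output layer neuron n.

Journal of Healthcare Engineering · Vol. 6 · No. 3 · 2015

This study uses a squared error function, which can be expressed as:

$$E_p = \frac{1}{2} \sum_{n=1}^{N} \left| T_n - O_n \right|^2 = \frac{1}{2} \sum_{n=1}^{N} \left| \delta_n \right|^2 \qquad (12)$$

where N is the number of neurons in the output layer, and $\delta_n = T_n - O_n$ is the error between the actual pattern $O_n$ and the target pattern $T_n$ of output neuron n. The learning rule for the complex-valued back-propagation model follows the equations given in [58]. The goal is to minimize the squared error $E_p$. The weights and threshold values are updated using the following equations (where $\eta > 0$ is the learning rate):

$$\Delta V_{nm} = -\eta \frac{\partial E_p}{\partial \mathrm{Re}[V_{nm}]} - i\eta \frac{\partial E_p}{\partial \mathrm{Im}[V_{nm}]} \qquad (13)$$

$$\Delta \gamma_n = -\eta \frac{\partial E_p}{\partial \mathrm{Re}[\gamma_n]} - i\eta \frac{\partial E_p}{\partial \mathrm{Im}[\gamma_n]} \qquad (14)$$

$$\Delta W_{ml} = -\eta \frac{\partial E_p}{\partial \mathrm{Re}[W_{ml}]} - i\eta \frac{\partial E_p}{\partial \mathrm{Im}[W_{ml}]} \qquad (15)$$

$$\Delta \theta_m = -\eta \frac{\partial E_p}{\partial \mathrm{Re}[\theta_m]} - i\eta \frac{\partial E_p}{\partial \mathrm{Im}[\theta_m]} \qquad (16)$$

With the complex sigmoid activation function, the expressions in eqns. 13 to 16 can be rewritten as follows, where $I_l$ is the input of input layer neuron l, $H_m$ is the output of hidden layer neuron m, and the overbar denotes complex conjugation:

$$\Delta V_{nm} = \overline{H_m}\,\Delta \gamma_n$$

$$\Delta \gamma_n = \eta \left( \mathrm{Re}[\delta_n](1 - \mathrm{Re}[O_n])\mathrm{Re}[O_n] + i\,\mathrm{Im}[\delta_n](1 - \mathrm{Im}[O_n])\mathrm{Im}[O_n] \right)$$

$$\Delta W_{ml} = \overline{I_l}\,\Delta \theta_m$$

$$\Delta \theta_m = \eta \Big[ (1 - \mathrm{Re}[H_m])\mathrm{Re}[H_m] \sum_{n} \big( \mathrm{Re}[\delta_n](1 - \mathrm{Re}[O_n])\mathrm{Re}[O_n]\mathrm{Re}[V_{nm}] + \mathrm{Im}[\delta_n](1 - \mathrm{Im}[O_n])\mathrm{Im}[O_n]\mathrm{Im}[V_{nm}] \big) - i\,(1 - \mathrm{Im}[H_m])\mathrm{Im}[H_m] \sum_{n} \big( \mathrm{Re}[\delta_n](1 - \mathrm{Re}[O_n])\mathrm{Re}[O_n]\mathrm{Im}[V_{nm}] - \mathrm{Im}[\delta_n](1 - \mathrm{Im}[O_n])\mathrm{Im}[O_n]\mathrm{Re}[V_{nm}] \big) \Big]$$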
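To make the single-neuron computation described above concrete, the following minimal NumPy sketch evaluates the activity value of one complex-valued neuron and applies the complex sigmoid, which acts on the real and imaginary parts separately. The function names `complex_sigmoid` and `neuron_output` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def complex_sigmoid(z):
    """Apply the logistic sigmoid separately to the real and imaginary parts."""
    z = np.asarray(z, dtype=complex)

    def f_r(u):
        return 1.0 / (1.0 + np.exp(-u))

    return f_r(z.real) + 1j * f_r(z.imag)


def neuron_output(weights, inputs, threshold):
    """Output of one complex-valued neuron.

    activity = sum_m W_nm * X_m + V_n (plain complex products, no conjugation),
    followed by the complex sigmoid.
    """
    activity = np.dot(weights, inputs) + threshold
    return complex_sigmoid(activity)
```

For example, with the single weight 1 + i, the input 1 − i and a zero threshold, the activity value is 2 + 0i, so the output has real part 1/(1 + e^{-2}) and imaginary part 0.5.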
Summary of the CVANN algorithm:
1. Initialization: Assign all the weights and threshold values as small complex-valued numbers (greater than zero).
2. Presenting inputs and outputs (target): Present the complex-valued input vectors ($I_1, I_2, I_3, \ldots, I_N$) and the corresponding complex-valued output vectors (target) ($T_1, T_2, T_3, \ldots, T_N$) to the network, where N is the number of patterns to be used in training.
3. Calculating the actual output ($Y_n$): The actual output is calculated using eqn. 5.
4. Calculating the stopping criterion according to eqn. 21: The algorithm is stopped when the condition in the equation is met.
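The steps summarized above can be sketched as a small training loop for a three-layer complex-valued network. This is an illustrative sketch, not the authors' MATLAB implementation: the update rules are the gradient-descent rules for the squared error with a split complex sigmoid, the function name `train_cvann` and its default parameters are hypothetical, and because the stopping criterion of eqn. 21 is not reproduced here, a fixed number of epochs is used instead (an assumption).

```python
import numpy as np


def csig(z):
    """Complex sigmoid: logistic function on real and imaginary parts."""
    def s(u):
        return 1.0 / (1.0 + np.exp(-u))
    return s(z.real) + 1j * s(z.imag)


def train_cvann(inputs, targets, n_hidden=3, eta=0.5, epochs=200, seed=0):
    """Train a 3-layer complex-valued network with online back-propagation.

    inputs:  complex array of shape (patterns, n_in)
    targets: complex array of shape (patterns, n_out), parts within (0, 1)
    Returns ((W, theta, V, gamma), per-epoch summed squared error).
    """
    rng = np.random.default_rng(seed)
    n_in, n_out = inputs.shape[1], targets.shape[1]

    def small(*shape):
        # Step 1: small complex values with positive real and imaginary parts
        return 0.1 * (rng.random(shape) + 1j * rng.random(shape))

    W, theta = small(n_hidden, n_in), small(n_hidden)
    V, gamma = small(n_out, n_hidden), small(n_out)
    history = []
    for _ in range(epochs):  # assumption: fixed epoch budget, not eqn. 21
        total = 0.0
        for I, T in zip(inputs, targets):  # Step 2: present one pattern
            H = csig(W @ I + theta)        # hidden layer outputs
            O = csig(V @ H + gamma)        # Step 3: actual outputs
            delta = T - O
            total += 0.5 * np.sum(np.abs(delta) ** 2)
            A = delta.real * (1 - O.real) * O.real
            B = delta.imag * (1 - O.imag) * O.imag
            d_gamma = eta * (A + 1j * B)
            d_V = np.outer(d_gamma, np.conj(H))
            re = (1 - H.real) * H.real * (A @ V.real + B @ V.imag)
            im = (1 - H.imag) * H.imag * (A @ V.imag - B @ V.real)
            d_theta = eta * (re - 1j * im)
            d_W = np.outer(d_theta, np.conj(I))
            # All deltas are computed before any parameter is changed
            V += d_V
            gamma += d_gamma
            W += d_W
            theta += d_theta
        history.append(total)
    return (W, theta, V, gamma), history
```

On a tiny toy problem the summed squared error recorded in `history` should shrink over the epochs; on real data, the learning rate, hidden layer size and stopping rule would need to be tuned as described in the experimental sections.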

THE PROPOSED mRMR + CVANN DIAGNOSIS SYSTEM
This study proposes a novel hybrid method for PD diagnosis. A block diagram of the proposed system is presented in Figure 4. As shown in the block diagram, the attribute set containing the sound measurements of people (both healthy and PD) is first presented to the system. Next, a normalization process is applied to the data to scale the variable values between 0 and 1, so that the classification process is not biased towards any variable and the learning process is more efficient. The min-max method, arguably the most preferred method, is adopted for variable normalization. Specifically, eqn. 22 was used to convert the variable values to the 0-1 range as per the min-max method:

$$x' = \frac{x_i - x_{min}}{x_{max} - x_{min}} \qquad (22)$$

In this equation, $x'$ represents the normalized value, $x_i$ represents the input value, $x_{min}$ represents the smallest number within the input set, and $x_{max}$ represents the largest number within the input set.
After the normalization process, the mRMR algorithm was used for the determination/selection of the most effective attributes. After being converted to complex numbers, the resulting attributes are presented to the CVANN as the input dataset.
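The min-max scaling step described above can be sketched in a few lines of plain Python. The helper names `min_max_normalize` and `normalize_columns` are hypothetical, and mapping a constant column to 0.0 is an assumption of this sketch, since the formula is undefined when the maximum equals the minimum.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] with the min-max method."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant attribute carries no information, map it to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_columns(rows):
    """Apply min-max scaling to every attribute (column) of a data table."""
    columns = list(zip(*rows))
    scaled = [min_max_normalize(list(col)) for col in columns]
    return [list(row) for row in zip(*scaled)]
```

Each attribute is scaled independently, so an attribute measured in large units does not dominate one measured in small units.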

EXPERIMENTAL DESIGN
Data Description
The voice/speech dataset used in this study was originally obtained by Max Little from the University of Oxford in cooperation with the National Voice and Speech Center, Denver, Colorado. The data consists of the speech samples (continued phonation test records) obtained from people with and without PD [59]. The complete dataset consisted of 195 biomedical sound measurements of 23 PD patients and eight non-PD/healthy people. The dataset also contained a status column defined as 1 for PD patients, and 0 for non-PD/healthy people. Table 1 presents the statistical values of all variables in the dataset [3,4].

Experimental Setup
All of the experiments were conducted within the MATLAB environment using a PC with an Intel Core i7-2670 QM (2.2 GHz) microprocessor and 8 GB RAM. For all experiments, the selection of training and testing data samples was performed using a 10-fold cross-validation (CV) methodology. With 10-fold CV, the data splitting process is carried out as follows. First, the complete dataset is randomly divided into 10 disjoint subsets, of which nine subsets are used for training while the remaining subset is used for testing the trained prediction model. This process is repeated 10 times; each time a different subset is used for testing while the remaining nine are used for training. The prediction results of all 10 trials are then combined to estimate the accuracy of the prediction model. Compared to a simple split with one training and one test dataset, this CV methodology tends to provide a less biased measure of accuracy with a certain degree of reliability and validity. The dataset used for this study contains 195 sound measurement samples. In each of the 10 folds, 175-176 samples were selected as the training dataset and the remaining 19-20 samples were selected as the test dataset. The data in the training and testing datasets are also stratified on the output variable to maintain the proportional representation of PD and non-PD samples.
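The stratified 10-fold splitting described above can be sketched as follows. The experiments in the paper were run in MATLAB, so this Python generator is only an illustration; the name `stratified_kfold` and the round-robin dealing scheme are assumptions, chosen because they keep both the fold sizes and the per-class counts within one sample of each other.

```python
import random


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for k stratified folds.

    Samples of each class are shuffled and dealt round-robin into the folds,
    so every fold keeps roughly the class proportions of the full dataset.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    folds = [[] for _ in range(k)]
    position = 0  # keeps dealing where the previous class stopped
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for idx in members:
            folds[position % k].append(idx)
            position += 1

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train, test
```

With 195 samples this produces test folds of 19 or 20 samples and training sets of 175 or 176 samples, and every sample is used for testing exactly once.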
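In order to assess the performance of the proposed method, five different statistical accuracy measures were evaluated: the accuracy (i.e., hit rate), sensitivity, specificity, F-measure and Kappa coefficient. Formulas for these measures are shown in eqns. 23-27.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (23)$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \qquad (24)$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \qquad (25)$$

$$F\text{-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (26)$$

where TP (true positive) is the number of PD patients that are accurately classified as PD, TN (true negative) is the number of non-PD patients that are accurately classified as non-PD, FN (false negative) is the number of PD patients that are inaccurately classified as non-PD, and FP (false positive) is the number of non-PD patients that are inaccurately classified as PD. Precision is TP / (TP + FP), and recall is identical to sensitivity.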
The F-measure, calculated as the harmonic mean of the precision and recall, is often used as a complementary performance evaluation metric to assess classification methods. The F-measure takes numerical values in the range of 0 to 1, where values close to 1 denote higher classification performance. The Kappa coefficient (KC) is another alternative to the ordinary classification performance metrics. Generally speaking, KC is used to measure the degree of consistency between two observers [60]. In the field of machine learning, this criterion is used to compare the accuracy of a classifier with the accuracy of a random classifier (i.e., random chance) [61]. This measure is defined as:

$$KC = \frac{P_0 - P_c}{1 - P_c} \qquad (27)$$

where $P_0$ is the accuracy of the classifier, and $P_c$ is the accuracy obtained with random estimation/chance on the same dataset. The Kappa statistic produces values in the range of −1 to 1, where values close to −1 indicate a low level of consistency (higher rate of misclassification) while values close to 1 indicate a high level of consistency (higher rate of accurate classification).
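The five performance measures discussed above can all be computed from the four confusion matrix counts. The sketch below is a minimal illustration with a hypothetical function name, `classification_metrics`; the chance agreement is computed from the row and column totals as in the standard Cohen's kappa formulation, which is assumed here to be the variant the paper uses.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, F-measure and Kappa from counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Chance agreement from the marginals (standard Cohen's kappa, assumed)
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "kappa": kappa,
    }
```

For instance, with 40 true positives, 45 true negatives, 5 false positives and 10 false negatives, the accuracy is 0.85, the chance agreement is 0.5, and the Kappa coefficient is therefore 0.7. The sketch does not guard against empty classes, where some of the ratios are undefined.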

EXPERIMENTAL RESULTS AND DISCUSSIONS
In the execution of our proposed methodology, the most effective features were determined by applying six different feature selection algorithms: mRMR, Fisher score, Chi-square, SFS, SFFS, and ReliefF. The best feature sets obtained using each of these selection algorithms are presented in Table 2.
Using the specified order, attribute values are converted into complex number format before being submitted to the classifier as inputs. Accordingly, an input set was created by obtaining 1 complex value from 2 real values. The complex-valued attributes given as input to the CVANN are shown in Table 3. The examples given in the table were prepared using the mRMR-ranked attributes.
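A minimal sketch of the real-to-complex conversion described above is given below. It assumes that consecutive attributes in the ranked order are paired, the first as the real part and the second as the imaginary part, and that a leftover attribute is padded with a zero imaginary part; both choices, as well as the name `to_complex_features`, are assumptions of this example, since the exact pairing is given only in Table 3.

```python
def to_complex_features(ranked_values):
    """Combine consecutive real attribute values into complex numbers.

    ranked_values: attribute values of one sample, in feature-ranking order.
    """
    values = list(ranked_values)
    if len(values) % 2 == 1:
        values.append(0.0)  # assumption: pad an unpaired attribute with 0
    return [complex(values[i], values[i + 1])
            for i in range(0, len(values), 2)]
```

With this scheme, 12 selected real-valued attributes yield 6 complex-valued inputs, which halves the size of the input layer.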
In order to achieve a high level of efficiency for the CVANN algorithm, the required values for all parameters were identified using an in-depth trial-and-error methodology. Specifically, the most effective parameter values were determined via a 10-fold cross-validation-based experimentation method from the training data, and these parameter values were used during the testing phase. For a fair comparison, the same set of parameter values was used in all experiments. Accordingly, an optimal network structure (input-hidden-output) was determined as [the number of input variables]-10-2 (representing the optimal number of neurons to use in the input, hidden and output layers).
The learning coefficient was determined as 0.9, and eqn. 21 was used as the stopping criterion. A complex sigmoid function was selected as the activation function. Figure 5 illustrates the relationship between the number of features and the classification accuracy, showing that the classification accuracy increases with the number of attributes selected/ranked by each of the six feature selection algorithms. The figure also shows that the classification accuracy stabilizes beyond 12 features, and among all six feature selection methods, mRMR produces the best prediction accuracy at the rate of 98.12% with only 12 features. The best complex feature combinations obtained using the six different feature selection algorithms, and the best accuracy rates obtained with those feature combinations, are given in Table 4. According to Figure 5 and Table 4, the best results were obtained using the mRMR + CVANN method. The ReliefF + mRMR hybrid method produced the second best results. In general, the lowest accuracy rates were produced with the feature sets obtained using the SFFS algorithm. It is somewhat surprising that SFFS produced lower accuracy rates than SFS, as SFFS is presumably the improved version of SFS. According to the literature, depending on the dataset and its properties, SFS can perform better in rare cases, while in most other cases SFFS produces better accuracy results [62][63][64][65].
The results obtained in terms of the performance evaluation criteria mentioned above are presented in Table 5. Also included in this table are the results obtained using all the features presented to the classifier. In addition, the results obtained using ANN are also presented for a direct comparison with CVANN.
As shown in Table 5, the mRMR + CVANN model achieved the highest results of 98.12%, 99.24% and 98.96% in terms of accuracy (ACC), sensitivity and specificity, respectively, and obtained the highest F-measure of 0.9905 and Kappa statistic value of 0.9896. Compared to the CVANN without feature selection, mRMR + CVANN improved the average performance by 3.77%, 3.91%, and 7.85% in terms of ACC, sensitivity and specificity, respectively. Also, the CVANN algorithm produced better results than the traditional ANN. When mRMR + CVANN and mRMR + ANN are compared, mRMR + CVANN improved the average performance by 3.84%, 3.94%, and 7.94% in terms of ACC, sensitivity, and specificity, respectively. Better results were also obtained with mRMR + CVANN for the Kappa and F-measure values. The standard deviation of the mRMR + CVANN method was lower than that of ANN. This shows that the proposed method is more robust and more reliable than the other methods mentioned above. Table 5 also shows that the combination of feature selection and CVANN produces better results in terms of computation time: after the application of the feature selection method, the computing time decreases. As a result, the proposed method is deemed to be a fast, accurate and reliable prediction method for this application domain.

Table 4. Feature rankings and complex combinations obtained using different feature selection algorithms. (Columns: Method, Best complex combination, Accuracy.)
The classification accuracy rates obtained in this study and in previous studies on the same dataset are compared in Table 6. Only the studies that used the same dataset were included, for a reliable and fair comparison. As shown in Table 6, previous prediction methods provided fairly good results, with accuracy levels ranging between 80% and 97%. Our proposed method produced a better prediction performance than previous studies, with 98.12% overall accuracy on the test dataset. The two methods with accuracy results closest to the present method were those proposed in [23] and [25]. Polat et al. [23] adopted a 50%-50% training-test data selection for cross validation. They obtained a classification accuracy of 97.93% for the diagnosis of PD. In order to perform a fair comparison with their method, the proposed method of the current study was re-run with a 50%-50% training-test data split. In this re-run, the classification accuracy of our method was 98.25%, which is slightly better (by 0.32 percentage points) than that of Polat et al.

In previous studies, we observed that CVANN produces higher accuracy than the traditional real-valued ANN applied to the same problem and the same dataset [50]. Especially for systems that naturally work with complex values, CVANN provides significantly better prediction results [44,66]. For important applications such as critical diagnostics and diagnostic systems in medicine, even a slight increase in accuracy rate makes a significant difference. The present study further corroborates that CVANN is a viable (and perhaps superior) alternative tool for building and deploying highly accurate medical diagnostic systems.
There are a number of possible reasons behind the success of CVANN, such as the following:
• Mapping capability of CVANN: A neuron has two main functions to perform: an aggregation function and an activation function. The aggregation function maps a multidimensional input space into the neuron's net-input space, which is one dimensional for a real-valued network and two dimensional for a complex-valued network [68]. The activation function partitions the net-input space into discrete clusters that represent different classes, using a threshold operation on the output provided by the aggregation function. In the mapping by the aggregator, each input is multiplied by a connection weight and the resulting weighted inputs are then added. If we consider $\Im_R$ as the set of all possible mappings for a real-valued network and $\Im_C$ as the set of all possible mappings for a complex-valued network, it can be seen that $\Im_R \subset \Im_C$. This is because a complex multiplication scales and rotates an input by an arbitrary amount, whereas a real multiplication scales by an arbitrary amount but rotates only by 0 or π [68]. In other words, the mapping capability of a complex-valued network is superior to that of a real-valued network, and this may be one of the main reasons for its superior performance.
• High functionality: This is the ability of a single neuron to learn linearly inseparable input/output mappings. That is, a neuron can learn these mappings directly, without first generating higher-order inputs or transforming the data into a higher dimensional space. Studies showed that a single neuron with complex-valued weights can solve linearly inseparable problems such as the exclusive or (XOR) classification problem. This ability suggests that a single CVANN neuron has a higher functionality than a single ANN neuron [52].
• Two-dimensional inputs: In ANNs, input variables are single values (i.e., real numbers), while in CVANNs, input variables are complex values (complex numbers consisting of real and imaginary parts). Therefore, in CVANN, two-dimensional data inputs are possible. As described in Section 2.2, this multi-dimensional data representation and the complex multiplication operations may be among the main factors that improve the accuracy and thus increase the popularity of CVANN.

In summary, the main reason for CVANN to achieve better diagnosis performance than its traditional counterparts is its superior mapping capabilities coupled with its high functionality.
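The scale-and-rotate argument made above for the aggregation function can be checked numerically with Python's standard `cmath` module. In this small sketch, the helper name `scale_and_rotation` is hypothetical; it recovers the scaling factor and rotation angle that a weight applies to an input by dividing the weighted input by the original input.

```python
import cmath
import math


def scale_and_rotation(weight, z):
    """Return (scale factor, rotation angle in radians) applied to z by weight."""
    ratio = (weight * z) / z  # numerically equal to the weight itself
    return abs(ratio), cmath.phase(ratio)


z = 1 + 1j

# A complex weight can rotate by any angle, here a quarter turn, and scale by 2
complex_weight = cmath.rect(2.0, math.pi / 2)
c_scale, c_angle = scale_and_rotation(complex_weight, z)

# A real weight can only keep the direction (angle 0) or flip it (angle pi)
r_scale, r_angle = scale_and_rotation(-3.0, z)
```

The complex weight doubles the magnitude of the input and turns it by π/2, whereas the negative real weight triples the magnitude and can only flip the input by π.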

Limitations and Future Research Directions
As is the case in any developmental research, there are some limitations to the proposed method. First, the usability of the method needs to be improved. In order to use the developed software system, domain experts may need an intuitive, somewhat automated graphical interface. In the near future, before real-world deployment, we plan to develop a graphical user interface that encapsulates the prediction models and improves user-friendliness. Additionally, in order to increase the applicability, the program's processing time (time to build and calibrate the prediction models and to use the model for diagnosis purposes) needs to be reduced and its efficiency increased. Experimental results showed that the computation time of the proposed method is longer than desired. A significant improvement in the computational efficiency is possible and is part of our near-future development plans. A program capable of processing user requests at the level of milliseconds will increase the usefulness and adoption of the prediction system.
In this study, 22 features are selected in the feature selection process. Additional features, including socio-demographic and medical/diagnostic characteristics, may have significant impacts on the accurate diagnosis of PD. A more comprehensive study with a significantly extended feature set is among the future development paths of our research efforts. A database with a limited number of features, frequently used in the extant literature, is preferred in this study in order to compare and contrast the results of the study with those presented by previous studies. Even though the size of this dataset is small, we propose to apply the same method to a more comprehensive/extensive dataset to develop more robust prediction models.

ANN, generally considered a popular member of the family of black box models, is a biologically inspired mathematical method capable of generating solutions based on historical cases, i.e., previously recorded input and output data. Even though it creates highly efficient models, it is not capable of explaining its inner structure (how and what it does). In other words, ANN cannot explain how the inputs are used to generate results. This black box designation applies not only to ANN but also to CVANN. A variety of research streams are dedicated to shedding light on the black box, that is, to better understanding the internal structure of the prediction system. Among these streams, sensitivity analysis has received significant interest: the input variables of a trained neural network are perturbed one variable at a time, and the impact of each perturbation is recorded and translated into a rank-ordered variable importance measure for all input variables.
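The one-variable-at-a-time sensitivity analysis described above can be sketched as follows. This is a generic illustration rather than a method used in the paper: `predict` stands for any trained model that maps a list of real-valued inputs to a single number, and the function name `sensitivity_ranking`, the additive perturbation and the step size `delta` are all assumptions.

```python
def sensitivity_ranking(predict, samples, delta=0.05):
    """Rank input variables by how much perturbing each one changes the output.

    predict: callable taking a list of feature values and returning a number.
    samples: list of feature-value lists used as reference points.
    Returns (feature indices from most to least influential, mean impacts).
    """
    n_features = len(samples[0])
    impact = [0.0] * n_features
    for row in samples:
        base = predict(row)
        for j in range(n_features):
            perturbed = list(row)
            perturbed[j] += delta  # perturb one variable, keep the rest fixed
            impact[j] += abs(predict(perturbed) - base)
    mean_impact = [v / len(samples) for v in impact]
    order = sorted(range(n_features), key=lambda j: mean_impact[j], reverse=True)
    return order, mean_impact
```

For a CVANN, the same idea would apply to the real-valued attributes before they are combined into complex inputs, so that each original attribute receives its own importance score.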

CONCLUSION
In this study, we proposed a new approach for accurately diagnosing PD that can help medical personnel make better and faster decisions. The proposed approach is capable of automatically analyzing data related to PD to develop prediction/diagnostic models with a high degree of accuracy in a relatively short time. The main novelty of the study is the use of a hybrid methodology, referred to as mRMR + CVANN, which integrates an effective feature selection method and a strong classifier. In this methodology, an effective feature set was obtained using the mRMR algorithm. Application of this algorithm resulted in a smaller feature set by eliminating less relevant features. Complex-valued features were then obtained from the reduced feature set. The complex-valued feature combinations produced and used in this study are among the most important contributions of the proposed method. A CVANN algorithm with high functionality and a very good classification capability was designed and developed for the classification stage of the proposed method. The prediction results obtained were very promising. Thus, a prediction system that can be used as a part of a computer-aided diagnosis system was developed. This system has the potential to help doctors and other medical professionals in diagnosis-related decision processes for different diseases.