Heart Disease Diagnosis Utilizing Hybrid Fuzzy Wavelet Neural Network and Teaching Learning Based Optimization Algorithm

Among the various diseases that threaten human life is heart disease. This disease is considered to be one of the leading causes of death in the world. The medical diagnosis of heart disease is a complex task and must be made accurately. Therefore, software based on advanced computer technologies has been developed to assist doctors in the diagnostic process. This paper intends to use the hybrid teaching learning based optimization (TLBO) algorithm and fuzzy wavelet neural network (FWNN) for heart disease diagnosis. The TLBO algorithm is applied to enhance the performance of the FWNN. The hybrid TLBO algorithm with FWNN is used to classify the Cleveland heart disease dataset obtained from the University of California at Irvine (UCI) machine learning repository. The performance of the proposed method (TLBO-FWNN) is estimated using K-fold cross validation based on mean square error (MSE), classification accuracy, and the execution time. The experimental results show that TLBO-FWNN has an effective performance for diagnosing heart disease with 90.29% accuracy and superior performance compared to other methods in the literature.


Introduction
Heart disease is a term that refers to any disturbance that makes the heart function abnormally [1]. When the coronary arteries are narrowed or blocked, the blood flow to the myocardium is decreased. This represents the main reason for the emergence of heart disease in humans. There are several risk factors for this disease, including diabetes, smoking, obesity, a family history of heart disease, high cholesterol, and high blood pressure [1-3].
The incidence rate of heart disease is on the rise. Every year about 720,000 Americans have a heart attack. Among them, 515,000 have their first heart attack and 205,000 people have a second (or third, etc.) heart attack. Heart disease causes the death of about 600,000 people in the United States every year, which makes it responsible for one in every four deaths [4].
Due to the large number of patients with heart disease, it became necessary to have a powerful tool that works accurately and efficiently for diagnosing this disease and helping physicians make decisions about their patients. This is because the process of diagnosis and decision making is a difficult task and needs a lot of experience and skill.
Recently, a lot of research has been published in the field of medical diagnosis of heart disease. In 2008, Kahramanli and Allahverdi designed a hybrid system that combines a fuzzy neural network (FNN) and an artificial neural network (ANN), trained by a backpropagation (BP) algorithm. The results have shown that the proposed method is comparable with other methods [5]. Besides, Das et al. in 2009 proposed a system for diagnosing heart disease by using the neural network ensemble model, which combines several individual neural networks that are trained on the same task. However, this method increased the complexity and therefore the execution time [3]. In 2011, Khemphila and Boonjing presented a classification approach to diagnose heart disease using a multilayer perceptron (MLP) with a backpropagation learning algorithm, as well as a feature selection algorithm to use 8 features instead of 13. The results have shown that the accuracy rate was enhanced by 1.1% in the training data set and by 0.82% in the testing data set [6]. In 2013, Beheshti et al. applied the centripetal accelerated particle swarm optimization (CAPSO) to evolve the learning of the artificial neural network (ANN), which was used to classify a heart disease data set. The results have shown that the diagnosis rate still needs to be improved [7]. In all of these studies, the classification accuracy did not reach a desirable level, although their main objective was to make the diagnosis process of heart disease more accurate and efficient.
Advances in Artificial Neural Systems
Moreover, fuzzy neural network (FNN) is the combination of neural network and fuzzy logic in one system that contains both the interpretability property and inference ability of fuzzy logic to handle the uncertainty and the self-learning ability of a neural network to improve the approximation accuracy [8-10]. In spite of the fact that the FNN/NN has many advantages, it still suffers from some drawbacks, such as slow training speed, high approximation error, and poor convergence [8].
In recent years, many researchers have emphasized the use of wavelet neural network (WNN), which combines the wavelet function and a neural network. WNN integrates the learning capability of NN with the decomposition capability [8, 11], orthogonality [12], and time frequency localization properties [10, 12-14] of the wavelet function. The main advantages of the WNN are better generalization capability [15], faster learning property [8, 15], and smaller approximation errors and size of networks than NN [8]. Therefore, it is able to overcome the obstacles of FNN/NN, especially in highly nonlinear systems [8].
According to the mentioned properties of the WNN and fuzzy system, fuzzy wavelet neural network (FWNN) will be considered in our work. FWNN has been presented in some application areas: function learning [16], chaotic time series identification [14], and the approximation of functions, identification of nonlinear dynamic plants, and prediction of chaotic time series [17]. FWNN combines the main advantages of a fuzzy system, wavelet function, and neural networks and, therefore, could bring the low level learning and good computational ability of WNN into a fuzzy system and the high humanlike thinking of a fuzzy system into the WNN [8].
The training process of FWNN is a crucial task and requires robust optimization techniques to make the performance of this network more accurate and efficient. According to [18-21], which compare several nature-inspired optimization algorithms, such as particle swarm optimization (PSO) [22], differential evolution (DE) [23], teaching learning based optimization (TLBO) [24], artificial bee colony (ABC) [25], and firefly algorithm (FA) [26], the TLBO algorithm is accurate, effective, and efficient and shows superior performance in comparison to the others. The attractive property of the TLBO algorithm is that it is a simple mathematical model and is one of the most powerful tools to find the optimal solution in a shorter computational time period [18]. In addition, TLBO has balanced exploration and exploitation abilities so it does not get stuck in local minima [27].
In accordance with the above-mentioned advantages of both FWNN and TLBO, in this paper a new method (TLBO-FWNN) is proposed to increase the efficiency of the heart disease diagnosis process.
The rest of this paper is organized as follows: section two presents a background about FWNN; section three explains the TLBO algorithm; in section four, the proposed method of TLBO-FWNN is illustrated; section five describes the heart disease data set; section six demonstrates the K-fold cross validation; and finally section seven presents and discusses the experimental results.

The Background of Fuzzy Wavelet Neural Network (FWNN)
One of the most commonly used methods for diagnosing heart disease is artificial neural network (ANN) [6, 7]. ANN is considered an effective artificial intelligence method if it has enough training data. The description of an ANN structure is a difficult task [28]. Fuzzy logic [29] has the ability to deal with cognitive uncertainties in a manner like humans [30]. Fuzzy logic is used for improving the ability of the neural network and increasing its learning rate [31]. So, the combination of a neural network and fuzzy logic in one system leads to the creation of another powerful computational tool called fuzzy neural network (FNN), which combines the advantages of both approaches [30, 31]. Most fuzzy neural networks use the sigmoid function as the activation function in the hidden layer, but this type of function leads the training algorithm to converge to local minima [32] and generally decreases the convergence speed of the network. Therefore, the wavelet function is used as an alternative to the sigmoid function [33, 34]. The wavelet function is a waveform that has limited duration and a mean value of zero. This function has two parameters, the dilation parameter and the translation parameter [33].
The combination of the wavelet theory with a fuzzy system and neural network generates the so-called fuzzy wavelet neural network (FWNN) [13]. The structure of the FWNN is described in Figure 1, which contains seven layers [17, 33, 34].
(i) In the first layer, the number of nodes is equal to the number of input variables.

(ii) In the second layer, each node represents one fuzzy set, in which the membership value of the input variable to the fuzzy set is calculated.

(iii) In the third layer, each node represents one fuzzy rule. Therefore, the number of nodes is equal to the number of fuzzy rules. The output of this layer can be calculated using the following:

$$\mu_j(x) = \prod_{i=1}^{n} A_{ij}(x_i), \tag{1}$$

where $\prod$ refers to the AND operation and $A_{ij}$ represents the membership function used to calculate the membership degrees of the input variables. In this study, a Gaussian membership function is used that can be calculated using

$$A_{ij}(x_i) = \exp\left(-\frac{(x_i - c_{ij})^2}{\sigma_{ij}^2}\right). \tag{2}$$

In (2), $c_{ij}$ and $\sigma_{ij}$ represent the center and the width of the membership function, respectively.

(iv) The fourth layer contains the wavelet functions $\psi_j$ in its neurons, which represent the consequent part of the fuzzy rules. In FWNN, fuzzy rules can be represented by the following equation:

$$R_j:\ \text{if } x_1 \text{ is } A_{1j} \text{ and } x_2 \text{ is } A_{2j} \text{ and } \ldots \text{ and } x_n \text{ is } A_{nj}, \text{ then the output is } y_j, \tag{3}$$

where $R_j$ refers to the $j$th rule ($1 \le j \le m$) and $m$ is the number of rules, $x_i$ refers to the $i$th input ($1 \le i \le n$) and $n$ is the number of input parameters, while $y_j$ refers to the $j$th output from the wavelet neural network (WNN).

In the wavelet neural network, which is described in Figure 2, the output can be calculated using

$$y = \sum_{k} w_k\, \psi_k(x), \tag{4}$$

where $w_k$ refers to the weight coefficients and $\psi_k$ represents a set of wavelet functions, which is called a wavelet family that can be defined computationally using

$$\psi_k(x) = |a_k|^{-1/2}\, \psi\left(\frac{x - b_k}{a_k}\right), \tag{5}$$

where $a_k$ and $b_k$ represent the dilation and translation parameters, respectively, and $\psi$ refers to the mother wavelet, for which the Mexican hat wavelet function is used:

$$\psi(z) = (1 - z^2)\exp\left(-\frac{z^2}{2}\right). \tag{6}$$

(v) In the fifth layer, the outputs of the third layer are multiplied by the outputs of the fourth layer. The output of this layer ($O^{5}_{j}$) can be calculated using

$$O^{5}_{j} = \mu_j(x)\, y_j, \tag{7}$$

where $1 \le j \le m$ and $m$ refers to the number of fuzzy rules and wavelet functions.

(vi) In the sixth layer, the output involves two parts. The first part ($O^{6}_{a}$) aggregates the outputs of the fifth layer, which can be represented using

$$O^{6}_{a} = \sum_{j=1}^{m} O^{5}_{j}. \tag{8}$$

The second part ($O^{6}_{b}$) aggregates the outputs of the third layer, which can be represented using

$$O^{6}_{b} = \sum_{j=1}^{m} \mu_j(x). \tag{9}$$

(vii) The seventh layer represents the defuzzification process, which is used to calculate the overall output $u$ of the FWNN by using

$$u = \frac{O^{6}_{a}}{O^{6}_{b}}. \tag{10}$$
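The seven layers can be traced in a few lines of code. The following is a minimal sketch, not the authors' implementation: the function names and the parameter layout are hypothetical, and the wavelet consequent of each rule is assumed to be one weight times a sum of Mexican hat wavelets, one per input. With 13 inputs and 3 rules that layout would give 3 + 39 + 39 = 81 trainable values, which happens to match the dimension reported in the experiments, but the source does not spell the layout out.

```python
import math


def gaussian_membership(x, center, width):
    """Gaussian membership degree of x for a fuzzy set (layer 2)."""
    return math.exp(-((x - center) ** 2) / (width ** 2))


def mexican_hat(z):
    """Mexican hat mother wavelet in its unnormalized form."""
    return (1.0 - z * z) * math.exp(-0.5 * z * z)


def fwnn_forward(x, centers, widths, dilations, translations, weights):
    """Forward pass of a small FWNN (assumed layout, see lead-in).

    x                                   : list of n input values
    centers[j][i], widths[j][i]         : Gaussian parameters, rule j, input i
    dilations[j][i], translations[j][i] : wavelet parameters, rule j, input i
    weights[j]                          : weight of the wavelet part of rule j
    """
    numerator = 0.0    # layer 6, first part: sum of strength * rule output
    denominator = 0.0  # layer 6, second part: sum of rule strengths
    for j in range(len(weights)):
        # Layer 3: AND (product) of the membership degrees of all inputs.
        strength = 1.0
        for i, xi in enumerate(x):
            strength *= gaussian_membership(xi, centers[j][i], widths[j][i])
        # Layer 4: wavelet consequent of rule j (assumption: sum over inputs).
        wavelet_sum = 0.0
        for i, xi in enumerate(x):
            a, b = dilations[j][i], translations[j][i]
            wavelet_sum += mexican_hat((xi - b) / a) / math.sqrt(abs(a))
        y_j = weights[j] * wavelet_sum
        # Layer 5: rule strength times rule output.
        numerator += strength * y_j
        denominator += strength
    # Layer 7: defuzzification; return 0.0 if no rule fires at all.
    return numerator / denominator if denominator > 0.0 else 0.0
```

A dilation of zero would cause a division by zero, so a practical implementation would need to keep the dilation parameters away from zero.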

Review of Teaching Learning Based Optimization (TLBO) Algorithm
Teaching learning based optimization algorithm (TLBO) is an optimization algorithm proposed by Rao et al. in 2011 [24]. This algorithm is inspired by learners receiving teaching from the teacher in a class. The teaching process is done either by means of the teacher, who is considered to be a highly learned person and has a great influence on the output of students (teacher phase), or through the interaction among learners (learner phase) [20, 21, 24, 35-37].
TLBO is a population-based algorithm in which the population is considered as a class of $n$ learners. Each learner represents a solution, and the dimensions of each solution, which are considered as the different subjects offered to the learners, represent the parameters involved in the objective function of the given optimization problem. The evaluation of each solution (fitness value) of the optimization problem represents the student's grade. The solution with the best fitness value is considered to be the teacher [36, 38-40].
The main characteristic of this algorithm is that it does not contain any algorithm-specific parameters; it includes common parameters only [27, 41]. The implementation of the TLBO algorithm is explained in the following steps [24, 38].
(1) Define the optimization problem and initialize the common parameters, which are the population size (ps), which represents the number of learners ($n$), and the dimension of each learner ($D$), which represents the subjects offered to the learners. In addition, set the maximum number of iterations and the values of the constraint variables (lb, ub), which denote the lower and upper boundaries, respectively.

(2) Generate the initial population randomly with $n$ rows and $D$ columns within [lb, ub] and then calculate the objective function value of each solution, $f(X_i)$, where $i$ is 1, 2, 3, ..., $n$. Sort the results in ascending order corresponding to ps (ascending order is convenient for finding the minimum value; a maximum can be obtained by multiplying the objective by −1):

$$\text{population} = \begin{bmatrix} x_{1,1} & \cdots & x_{1,D} \\ \vdots & & \vdots \\ x_{n,1} & \cdots & x_{n,D} \end{bmatrix}. \tag{11}$$

(3) In the teacher phase, calculate the mean of the population column-wise:

$$X_{\text{mean}} = [m_1, m_2, \ldots, m_D], \tag{12}$$

where $m_j$ is the mean of the $j$th column.

(4) The teacher tries to improve the grade average of the students using

$$X_{\text{new},i} = X_i + r_i\,(X_1 - T_F\, X_{\text{mean}}), \tag{13}$$

where $X_{\text{new},i}$ represents the improved learners, $X_i$ represents the current learners, $r_i$ is a random number in the interval [0, 1], $X_1$ is the desired mean, $X_{\text{mean}}$ is the current mean [42], and $T_F$ is a teaching factor that is not a parameter of the TLBO algorithm: it is calculated randomly using (14), which decides the value of the mean to be changed [43]:

$$T_F = \operatorname{round}\left[1 + \operatorname{rand}(0, 1)\right]. \tag{14}$$

In $X_{\text{new},i}$, if the value of any variable is less than lb or bigger than ub, it is set equal to lb or ub, respectively [35]:

$$X_{\text{new},i} = \min\left(\max\left(X_{\text{new},i}, \text{lb}\right), \text{ub}\right). \tag{15}$$

(5) In the learner phase, a learner interacts randomly with other learners to enhance his or her knowledge.

(6) Randomly select two learners $X_i$ and $X_j$, where $i \neq j$:

$$X_{\text{new},i} = \begin{cases} X_i + r_i\,(X_i - X_j), & f(X_i) < f(X_j), \\ X_i + r_i\,(X_j - X_i), & \text{otherwise}. \end{cases} \tag{16}$$

In $X_{\text{new},i}$, if the value of any variable is less than lb or bigger than ub, it is set equal to lb or ub, respectively:

$$X_{\text{new},i} = \min\left(\max\left(X_{\text{new},i}, \text{lb}\right), \text{ub}\right). \tag{17}$$

If $f(X_{\text{new},i}) < f(X_i)$, then $X_i = X_{\text{new},i}$; otherwise $X_i$ is kept unchanged.

(7) The duplicate solutions are modified in order to avoid trapping in local optima by applying a mutation process to randomly selected dimensions of the duplicate solutions before executing the next iteration.

(8) Sort the results in ascending order corresponding to ps.

(9) Repeat steps (3) to (8) until the termination condition is satisfied.
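The steps above can be sketched as a short, self-contained routine. This is an illustrative simplification under stated assumptions, not the authors' code: the function name `tlbo_minimize` and the `sphere` test objective are hypothetical, the best learner is looked up directly instead of sorting the population, a greedy acceptance test is assumed in the teacher phase as well as in the learner phase, and the duplicate-removal mutation of step (7) is omitted.

```python
import random


def tlbo_minimize(objective, dim, lb, ub, pop_size=20, max_iter=100, seed=0):
    """Minimize `objective` over [lb, ub]^dim with a basic TLBO loop."""
    rng = random.Random(seed)

    def clip(value):
        # A value outside [lb, ub] is set to the violated boundary.
        return min(max(value, lb), ub)

    # Step (2): random initial population and its objective values.
    population = [[rng.uniform(lb, ub) for _ in range(dim)]
                  for _ in range(pop_size)]
    fitness = [objective(learner) for learner in population]

    for _ in range(max_iter):
        # Teacher phase: the best learner plays the role of the desired mean.
        best = min(range(pop_size), key=lambda i: fitness[i])
        teacher = list(population[best])
        mean = [sum(learner[d] for learner in population) / pop_size
                for d in range(dim)]
        for i in range(pop_size):
            r = rng.random()
            tf = rng.randint(1, 2)  # teaching factor: 1 or 2, like round(1 + rand)
            candidate = [clip(population[i][d] + r * (teacher[d] - tf * mean[d]))
                         for d in range(dim)]
            value = objective(candidate)
            if value < fitness[i]:  # greedy acceptance (assumed in this phase)
                population[i], fitness[i] = candidate, value

        # Learner phase: each learner interacts with one random other learner.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            r = rng.random()
            sign = 1.0 if fitness[i] < fitness[j] else -1.0
            candidate = [clip(population[i][d]
                              + r * sign * (population[i][d] - population[j][d]))
                         for d in range(dim)]
            value = objective(candidate)
            if value < fitness[i]:
                population[i], fitness[i] = candidate, value

    best = min(range(pop_size), key=lambda i: fitness[i])
    return population[best], fitness[best]


def sphere(x):
    """Simple test objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)


best_learner, best_value = tlbo_minimize(sphere, dim=3, lb=-1.0, ub=1.0,
                                         max_iter=200)
```

Apart from the population size and the iteration limit, the routine has no tuning parameters, which is the property the text highlights.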

The Proposed Method of TLBO-FWNN
To increase the accuracy of the diagnosis process of heart disease using FWNN, one of the robust optimization algorithms should be used for FWNN training. Thus, the methodology of this study includes two important procedures. The first one is constructing the structure of FWNN for the heart dataset that will be used in both the training and testing phases.
The second procedure represents the training process for the constructed FWNN by utilizing the TLBO algorithm. For conducting the training process, a sample of data related to heart disease, which is called training data, is used as the input variables to the FWNN. Then, the mean square error (MSE) is calculated, which represents the difference between the actual output and the desired output of the FWNN. MSE is computed using

$$\text{MSE}(t) = \frac{1}{N}\sum_{i=1}^{N}\left(y_i^{d} - y_i\right)^2, \tag{18}$$

where $i$ ($1 \le i \le N$) indexes the input patterns and $N$ is their number, $t$ ($1 \le t \le T$) refers to the iteration number, $y_i^{d}$ represents the desired output, and $y_i$ represents the actual output of the FWNN.
The MSE represents the objective function of the TLBO algorithm, which is used to calculate the objective function value of each individual in the population. The population in this algorithm is represented by a set of solutions; each solution refers to one learner, and that solution has a number of values equal to the number of parameters (subjects) to be updated in the FWNN. In this study, the parameters are the linkage weights in the FWNN and the wavelet parameters (dilation and translation). The values of these parameters are initialized randomly. Then, these values are updated using the TLBO algorithm to obtain optimal values with the minimum error rate and the highest classification accuracy.
In the testing phase, the optimal values obtained from the training phase are used together with the testing data as the input variables to test the FWNN trained by the TLBO algorithm. The output of the FWNN is calculated and compared with the desired output to investigate the learning ability of the FWNN to classify the heart dataset.
The main steps of training the FWNN using the TLBO algorithm are as follows.
(1) Initialize randomly the values of each learner (i.e., weights, dilation parameters, and translation parameters) within the interval [−1, 1], which represents the lower and upper boundaries, respectively. Then, initialize the common parameters of the TLBO algorithm, which are population size, maximum iteration number, and the dimension.
(3) Evaluate each learner by calculating its objective function value based on the FWNN, which gives the error rate of each learner.
(4) Update the weight, dilation, and translation parameters using the TLBO algorithm.
(5) Keep the best learner, which represents the teacher (the best values of weights, dilation, and translation parameters).
(7) Repeat steps (3) to (6) until the maximum iteration number is reached.
Figure 3 represents the flowchart of the TLBO-FWNN method.
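To make the link between a learner and the network concrete, the sketch below splits one flat learner vector into weights, dilations, and translations and computes the MSE that serves as the objective. The layout (one weight per rule, then one dilation and one translation per rule and input) and both function names are assumptions chosen for illustration; with 13 inputs and 3 rules it gives 3 + 39 + 39 = 81 values, but the source does not state how its 81 parameters are ordered.

```python
def unpack_learner(learner, n_inputs=13, n_rules=3):
    """Split a flat learner vector into (weights, dilations, translations).

    Assumed layout: n_rules weights, then n_rules * n_inputs dilations,
    then n_rules * n_inputs translations.
    """
    n_wavelet = n_rules * n_inputs
    expected = n_rules + 2 * n_wavelet
    if len(learner) != expected:
        raise ValueError(f"expected {expected} values, got {len(learner)}")
    weights = list(learner[:n_rules])
    dilations = [list(learner[n_rules + j * n_inputs:
                              n_rules + (j + 1) * n_inputs])
                 for j in range(n_rules)]
    offset = n_rules + n_wavelet
    translations = [list(learner[offset + j * n_inputs:
                                 offset + (j + 1) * n_inputs])
                    for j in range(n_rules)]
    return weights, dilations, translations


def mean_squared_error(desired, actual):
    """Mean of the squared differences between desired and actual outputs."""
    if len(desired) != len(actual) or not desired:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(desired)
```

An objective for the optimizer would unpack each learner this way, run the network on every training instance, and return the MSE between the network outputs and the 0/1 labels.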

The Heart Disease Dataset
The Cleveland heart disease dataset was obtained from the Cleveland Clinic Foundation, collected by Robert Detrano. This data was used to predict the presence or the absence of heart disease. Of the 303 instances in the dataset, 297 were used in this study because 6 instances were missing some of their attributes. The instances used comprise 160 normal instances and 137 abnormal instances. Each instance has 76 attributes, but all published experiments prefer to use 14 of them. Tables 1 and 2 briefly illustrate the properties of these attributes [44].

𝐾-Fold Cross Validation
During the training process of a neural network, the use of K-fold cross validation makes the results of the testing process more reliable [45] because it guarantees that all data is used for training and testing. In K-fold cross validation, the data is randomly divided into K parts called folds, where all folds are of equal size. Among the K folds, one fold is selected for testing and the other K − 1 folds are used for training. This process is repeated K times. Finally, all testing results are averaged to produce a single estimation result [45, 46]. In this study, fivefold cross validation is used.
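The splitting procedure can be illustrated with a small generator. This is a minimal sketch with a hypothetical function name; it shuffles the instance indices once and deals them into K folds, so fold sizes differ by at most one when the number of instances is not divisible by K (as is the case for 297 instances and K = 5).

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for test_fold in range(k):
        test = folds[test_fold]
        train = [idx for f, fold in enumerate(folds) if f != test_fold
                 for idx in fold]
        yield train, test


# Example: five folds over 297 instances; each instance is tested exactly once.
for train, test in k_fold_indices(297, k=5):
    print(len(train), len(test))
```

The per-fold test results would then be averaged to give the single estimate described above.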

Experimental Results and Discussion
In this study, the performance of the proposed system for diagnosing the presence (1) or the absence (0) of heart disease is investigated using a common heart dataset (the Cleveland heart disease dataset). We measured the error rate, the classification rate, and the time taken. The heart dataset is divided into 5 folds: one fold is used for testing and the other 4 folds are used for training. Each fold has about 60 instances and each instance has 13 attributes, as shown in Table 1. This process is repeated 5 times. As mentioned previously, the TLBO algorithm has common parameters, which are the population size (ps) and the dimension of each solution ($D$). In this research, the value of the $D$ parameter is equal to 81, which represents the weight, dilation, and translation parameters. However, the value of the ps parameter is varied because the user of this algorithm does not have adequate knowledge about the appropriate value of this parameter. In this study, the training process is repeated in three separate experiments with three different population sizes: 50, 100, and 150. The maximum iteration number, which is the stopping condition, is set to 500. In FWNN, the classification of the heart dataset is based on the number of fuzzy sets, fuzzy rules, and wavelet functions, which are each equal to three, in addition to the value of each attribute.
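The classification rate can be read as the share of instances whose network output, mapped to 0 or 1, matches the label. The sketch below assumes a decision threshold of 0.5 on the network output; the source does not state how outputs are mapped to the two classes, so both the threshold and the function name are hypothetical.

```python
def classification_accuracy(outputs, labels, threshold=0.5):
    """Percentage of instances whose thresholded output equals the 0/1 label."""
    if len(outputs) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = 0
    for output, label in zip(outputs, labels):
        predicted = 1 if output >= threshold else 0  # assumed decision rule
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Three of the four illustrative outputs fall on the correct side of 0.5.
print(classification_accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 0]))  # 75.0
```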
Table 3 illustrates the error rate, which represents the percentage of incorrect classifications, for training the FWNN using the TLBO algorithm on the Cleveland heart disease dataset. The results show that the TLBO-FWNN reached the minimum error rate (0.0585) when the population size was equal to 150.
Moreover, Table 4, which represents the correct classification percentage, shows that the best classification accuracy (94.1422) is obtained when the population size is equal to 150. Increasing the population size leads to an increase in the training duration, as shown in Table 5, where the average training time is 138.31 minutes when the population size is equal to 150 and 45.39 minutes when the population size is equal to 50. Also, the classification accuracy and the error rate are very close to each other for the population sizes 100 and 150, as shown in Tables 3 and 4.
Tables 6 and 7 illustrate the MSE and the classification accuracy of testing the FWNN for unseen heart data by using the optimal parameter values obtained from the TLBO algorithm, respectively.In Table 6, the minimum average error rate of 0.0970 is obtained when the population size is equal to 100.In addition, the highest average classification rate (90.2909) is acquired when the population size is equal to 100.
Moreover, the input variables of the FNN are automatically normalized during the fuzzification process. So, the input variables of the WNN should be normalized too, to be more homogeneous with the input variables of the FNN. The data normalization process is done by finding the maximum value of each column of input variables and then dividing each value in that column by this maximum value.
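The normalization described above amounts to a column-wise division by the column maximum. The following is a minimal sketch with a hypothetical function name; the sample rows are illustrative values, not records from the dataset, and a column whose maximum is zero is mapped to zeros to avoid a division by zero (a choice made here, not taken from the source).

```python
def normalize_by_column_max(rows):
    """Divide every value by the maximum of its column."""
    n_cols = len(rows[0])
    col_max = [max(row[c] for row in rows) for c in range(n_cols)]
    return [[row[c] / col_max[c] if col_max[c] != 0 else 0.0
             for c in range(n_cols)]
            for row in rows]


# Two illustrative rows with two attributes each.
sample = [[63.0, 145.0], [37.0, 130.0]]
normalized = normalize_by_column_max(sample)
```

For non-negative attributes this maps every column into the interval [0, 1], with the largest value in each column becoming exactly 1.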
The error rate of training the FWNN utilizing the TLBO algorithm for the normalized heart disease dataset is illustrated in Table 8.In this table, the results have shown that the TLBO-FWNN reached the minimum error rate when the population size was equal to 100, which is 0.0489 in all experiments.
In addition, as shown in Table 9, which represents the correct classification percentage, the highest average classification accuracy (95.0964) is obtained when the population size equals 100.
The average time taken was only 88.498 minutes when the population size was equal to 100, as shown in Table 10.
Tables 11 and 12 report the MSE and the classification accuracy, respectively, when testing the FWNN on normalized unseen heart data. In Table 11, the lowest average testing error rate (0.0997) is obtained when the population size is equal to 50. In Table 12, the highest average classification rate (90.0213) is acquired when the population size is equal to 50.
In conclusion, in comparing the maximum classification rate acquired during the testing of the FWNN on normalized and nonnormalized data, the results have shown that the classification accuracy on nonnormalized data (90.2909) is better than the classification accuracy on normalized data (90.0213).Also, to investigate the performance of the proposed TLBO-FWNN method, a comparison with eight recently proposed methods from the literature was carried out on the same dataset.
As shown in Table 13 and Figure 4, the proposed method, TLBO-FWNN, has the best performance for diagnosing heart disease in terms of classification accuracy (90.2909) compared to the results obtained by the other methods. The GSA+MLP method had the lowest accuracy among the compared methods.

Conclusion and Future Work
In this paper, a hybrid fuzzy wavelet neural network and teaching learning based optimization algorithm were used to classify the presence or the absence of heart disease. The teaching learning based optimization (TLBO) algorithm has been proposed for training fuzzy wavelet neural networks (FWNN). The simulation results have shown that when the population is of a medium size (100), TLBO-FWNN gives good results in a somewhat short amount of time. In addition, these results demonstrate that the TLBO-FWNN method has superior performance compared to other published methods, giving the highest classification accuracy.
In addition, there are some suggestions that can be applied to enhance the performance of the TLBO-FWNN method in the future, such as using a TLBO algorithm to evolve the structure of the FWNN or utilizing another optimization algorithm to enhance the learning of the FWNN.

Figure 1: The general structure of the fuzzy wavelet neural network.

Figure 2: The structure of the wavelet neural network.

Figure 4: The classification accuracy for the proposed and existing methods.

Table 1: Properties of the input heart dataset's attributes.

Table 2: Properties of the output heart dataset's attributes.

Table 3: Error rate of training the FWNN using the TLBO on the heart dataset.

Table 4: Classification rate of training the FWNN using the TLBO on the heart dataset.

Table 5: Time taken for training the FWNN using the TLBO on the heart dataset.

Table 6: Error rate of testing the FWNN on the heart dataset.

Table 7: Classification rate of testing the FWNN on the heart dataset.

Table 8: Error rate of training the FWNN using the TLBO on the normalized heart dataset.

Table 9: Classification rate of training the FWNN using the TLBO on the normalized heart dataset.

Table 10: Duration time for training the FWNN using the TLBO on the normalized heart dataset.

Table 11: Error rate of testing the FWNN on the normalized heart dataset.

Table 12: Classification rate of testing the FWNN on the normalized heart dataset.

Table 13: A comparison of the proposed and existing methods based on classification accuracy.