Fuzzy Rule-Based Classification System for Assessing Coronary Artery Disease

The aim of this study was to determine the accuracy of a fuzzy rule-based classification system that could noninvasively predict coronary artery disease (CAD) based on the myocardial perfusion scan test and clinical-epidemiological variables. This was a cross-sectional study in which the characteristics, the results of myocardial perfusion scan (MPS), and the coronary artery angiography of 115 patients, 62 (53.9%) of them male, were collected at Mazandaran Heart Center in the north of Iran. We defined membership functions for the medical variables by reviewing the related literature. To improve the classification performance, we used the methods of Ishibuchi et al. and Nozaki et al., which adjust the grade of certainty $CF_j$ of each rule. The system includes 144 rules, and the antecedent part of every rule has more than one part. The coronary artery disease data used in this paper contained 115 samples. The data were classified into four classes, namely, class 1 (normal), class 2 (stenosis in one single vessel), class 3 (stenosis in two vessels), and class 4 (stenosis in three vessels), which had 39, 35, 17, and 24 subjects, respectively. The accuracy of the fuzzy classification based on if-then rules was 92.8 percent when the classification result was based on rule selection by an expert, whereas it was 91.9 percent when the classification result was obtained according to the equation. To increase the classification rate, we deleted the extra rules to reduce the number of fuzzy rules after introducing the membership functions.


Introduction
In past years, fuzzy if-then rule-based systems were applied mainly to control problems, whereas nowadays they are mainly applied to classification tasks [1][2][3][4][5][6][7]. There are many methods for automatically generating and learning fuzzy if-then rules from numerical data for pattern classification problems [2,4,5,8,9].
The concept of linguistic statements was introduced by Zadeh [10], and it is crucial to treat each attribute as a linguistic value represented by fuzzy numbers with trapezoidal membership functions [11,12].
After the candidate rules are generated, a set of rules must be chosen to form the rule base of the classifier. In this paper, we assume a number of prespecified fuzzy sets given by the expert's knowledge for each input attribute [13]. The performance of the resulting classifiers can be enhanced by using rule weighting [14]. Our study therefore concerns generating, weighting, and selecting rules based on expert opinions or system outputs. The total number of fuzzy if-then rules generated by partitioning each attribute into $K$ fuzzy subsets in an $n$-dimensional pattern classification problem is $K^n$ [4,5,15]. In this work, we adjusted the grade of certainty of the fuzzy if-then rules with an error-correction method whenever an input pattern was misclassified [15].
Computational and Mathematical Methods in Medicine
Coronary artery disease (CAD) is the most common cause of mortality worldwide and has a high mortality rate in both developing and developed countries [16]. Many risk factors, such as age, sex, high blood pressure, diabetes, obesity, smoking, family history of coronary artery disease, and cholesterol (LDL), play an essential role in CAD [17]. Some risk factors, such as sex and diabetes, are crisp, whereas others are fuzzy sets.
The gold standard method for the diagnosis of CAD is coronary angiography (CA). Since CA is a costly and invasive procedure that requires advanced technology and a high level of technical experience, it cannot be used to screen large populations [18][19][20]. Therefore, noninvasive alternatives to coronary angiography are necessary. A number of noninvasive CAD diagnosis methods have been proposed in the literature [21], namely, the exercise stress test, Single Photon Emission Computed Tomography (SPECT) or scintigraphy, and echocardiography. However, the diagnostic accuracy of these tests is not as high as that of coronary angiography. Some studies have shown that these medical tests do not give accurate results and that fuzzy set theory can enhance their accuracy [22][23][24][25]. Moreover, the diagnostic accuracy obtained by combining the results of noninvasive clinical tests with other clinical-epidemiological attributes has been improved by using fuzzy models [26]. Because of the ambiguity and uncertainty of the diagnostic process, fuzzy systems have been proposed and used to diagnose cardiovascular diseases and to assess risk factors [27,28].
The fuzzy rule-based classification system is one such fuzzy system. In recent years, fuzzy classifiers have been commonly used in medical diagnosis, and many studies have classified medical data (on cancer and cardiology) using fuzzy classifiers [29,30]. In 2004, Vig et al. employed fuzzy set theory for the diagnosis of CAD [31]. Allahverdi et al. designed a fuzzy expert system to determine coronary heart disease risk [32], P. Srivastava and A. Srivastava conducted a similar study in India [33], and other investigators have studied ways of improving fuzzy decision systems for CAD diagnosis.
The clinical importance or relevance of features (input variables) in cardiology diagnostic tests can provide weights for interpretable fuzzy rule selection. CAD diagnosis is a complex and important problem. Some rules are clinically acceptable and desirable to physicians. In this paper, we added the result of MPS to the other attributes to generate fuzzy rules according to physician knowledge and supervised classification with labeled data. The data set is classified using the weighted fuzzy rule-based classifier of Ishibuchi et al. [34][35][36] to diagnose CAD and its 3 levels of severity.
The aim of this study was to determine the accuracy of a fuzzy rule-based classification system that could noninvasively predict CAD based on the myocardial perfusion scan test and clinical-epidemiological variables.
Following this introduction (Section 1), Section 2 describes the generation, learning, and weighting of fuzzy if-then rules. Sections 3 and 4 present the application results and the discussion.

Methods: Weighting Fuzzy Classification System
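Fuzzy if-then rules of the following type are used for an $n$-dimensional pattern classification problem:
$$\text{Rule } R_j\text{: If } x_1 \text{ is } A_{j1} \text{ and } \ldots \text{ and } x_n \text{ is } A_{jn} \text{ then Class } C_j \text{ with } CF_j, \quad j = 1, \ldots, N, \tag{1}$$
where $R_j$ is the label of the $j$th fuzzy if-then rule, $N$ is the total number of fuzzy if-then rules, $x = [x_1, \ldots, x_n]$ is the $n$-dimensional pattern vector, $A_{ji}$ represents the antecedent fuzzy set for the $i$th attribute, $C_j$ represents the consequent class (i.e., one of the $M$ classes), and $CF_j$ is the certainty grade of the fuzzy if-then rule [2,5,9,13,15,34,35]. As antecedent fuzzy sets we employ trapezoidal fuzzy sets; other partitions of the unit interval into fuzzy sets are also possible [2,36].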
Using (1), a fuzzy if-then rule is generated in two steps. The first specifies the membership functions of the antecedent fuzzy sets, and the second determines the consequent class and the certainty grade of the fuzzy rule [2,5,15]. The antecedent part of the fuzzy if-then rules is initialized manually [2]. A weight is assigned to each training pattern; the weight of a misclassified or rejected pattern is regarded as the cost of that misclassification or rejection. When a training pattern is misclassified, the fuzzy rules are adjusted. In this study, the consequent class and the grade of certainty of every rule are determined by the following procedure.
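Step 1. Calculate the compatibility grade of each training pattern $x_p = (x_{p1}, \ldots, x_{pn})$ with the fuzzy if-then rule $R_j$, using the product as the T-norm:
$$\mu_j(x_p) = \mu_{j1}(x_{p1}) \times \cdots \times \mu_{jn}(x_{pn}), \tag{2}$$
where $\mu_{ji}(\cdot)$ is the membership function of the fuzzy set $A_{ji}$.
Step 2. For each class $h$, calculate $\beta_{\text{Class } h}(R_j)$ according to
$$\beta_{\text{Class } h}(R_j) = \sum_{x_p \in \text{Class } h} \mu_j(x_p) \cdot \omega_p, \quad h = 1, \ldots, M, \tag{3}$$
where $\omega_p$ is the weight of the training pattern $x_p$.
Step 3. Find the class $\hat{h}$ that has the largest sum $\beta_{\text{Class } h}(R_j)$:
$$\beta_{\text{Class } \hat{h}}(R_j) = \max\{\beta_{\text{Class } 1}(R_j), \ldots, \beta_{\text{Class } M}(R_j)\}. \tag{4}$$
Note that if two or more classes take the maximum value in (4), the consequent class $C_j$ of the fuzzy rule cannot be uniquely determined; in that case $C_j$ is specified as 0 and the grade of certainty is set to $CF_j = 0$. If a single class takes the maximum value, let $C_j$ be Class $\hat{h}$. The grade of certainty is then assigned as follows:
$$CF_j = \frac{\beta_{\text{Class } \hat{h}}(R_j) - \bar{\beta}}{\sum_{h=1}^{M} \beta_{\text{Class } h}(R_j)}, \tag{5}$$
$$\bar{\beta} = \frac{\sum_{h \neq \hat{h}} \beta_{\text{Class } h}(R_j)}{M - 1}, \tag{6}$$
where $M$ is the total number of classes.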
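As a minimal sketch of the rule-generation steps above, the Python fragment below computes the consequent class and the certainty grade of a single rule from weighted training patterns. The function names, the toy membership functions `low` and `high`, and the sample data are illustrative assumptions rather than material from this study. The certainty grade assumes the definition commonly used in this family of methods: the largest class sum minus the mean of the remaining class sums, divided by the total.

```python
from math import prod

def compatibility(antecedents, pattern):
    """Step 1: product T-norm of the per-attribute membership grades."""
    return prod(mu(x) for mu, x in zip(antecedents, pattern))

def consequent_and_certainty(antecedents, patterns, labels, weights, n_classes):
    """Steps 2 and 3 for one rule: return (class index, certainty grade).

    The class is None and the grade 0.0 when the maximum is tied
    or when no training pattern is compatible with the rule.
    """
    beta = [0.0] * n_classes
    for x, h, w in zip(patterns, labels, weights):
        beta[h] += compatibility(antecedents, x) * w  # weighted class sum
    best, total = max(beta), sum(beta)
    if total == 0.0 or beta.count(best) > 1:
        return None, 0.0
    h_hat = beta.index(best)
    beta_bar = (total - best) / (n_classes - 1)  # mean of the other classes
    return h_hat, (best - beta_bar) / total

# Hypothetical one-attribute example on the unit interval.
low = lambda x: 1.0 - x
high = lambda x: x
patterns = [(0.1,), (0.2,), (0.9,)]
labels = [0, 0, 1]            # class indices start at 0 in this sketch
weights = [1.0, 1.0, 1.0]     # assumed pattern weights
rule_class, rule_cf = consequent_and_certainty([low], patterns, labels, weights, 2)
```

In this toy case the rule "x is low" is assigned to class 0, because the two class-0 patterns are far more compatible with it than the single class-1 pattern.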
After the fuzzy if-then rules are generated by (1) and the consequent class and the grade of certainty are determined for all rules, a new pattern $x_p = (x_{p1}, \ldots, x_{pn})$ can be classified by the following procedure.
Step 1. For each class $h$ ($h = 1, \ldots, M$), calculate $\alpha_{\text{Class } h}(x_p)$ as
$$\alpha_{\text{Class } h}(x_p) = \max\{\mu_j(x_p) \cdot CF_j \mid C_j = \text{Class } h,\ j = 1, \ldots, N\}. \tag{7}$$
Step 2. Find the class $h^*$ that has the largest value of $\alpha_{\text{Class } h}(x_p)$:
$$\alpha_{\text{Class } h^*}(x_p) = \max\{\alpha_{\text{Class } 1}(x_p), \ldots, \alpha_{\text{Class } M}(x_p)\}. \tag{8}$$
When multiple classes take the same (maximum) value in (8), $x_p$ cannot be classified (i.e., $x_p$ is left as an unclassifiable pattern); otherwise, $x_p$ is assigned to Class $h^*$ determined by (8).
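The classification procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes single-winner reasoning, in which each class is scored by the largest product of compatibility grade and certainty grade among its rules. The rule representation (a tuple of antecedent functions, class index, and certainty grade) and the toy rules are hypothetical.

```python
def classify(rules, pattern, n_classes):
    """Return the winning class index, or None if the pattern is unclassifiable.

    rules: list of (antecedents, class_index, certainty_grade) tuples, where
    antecedents holds one membership function per attribute.
    """
    alpha = [0.0] * n_classes
    for antecedents, cls, cf in rules:
        if cls is None:          # rule without a consequent class is skipped
            continue
        mu = 1.0
        for member, x in zip(antecedents, pattern):
            mu *= member(x)      # product T-norm compatibility grade
        alpha[cls] = max(alpha[cls], mu * cf)
    best = max(alpha)
    # No rule fires, or several classes share the maximum: reject the pattern.
    if best == 0.0 or alpha.count(best) > 1:
        return None
    return alpha.index(best)

# Hypothetical two-rule classifier on one attribute in [0, 1].
low = lambda x: 1.0 - x
high = lambda x: x
rules = [([low], 0, 0.9), ([high], 1, 0.8)]
label_a = classify(rules, (0.2,), 2)
label_b = classify(rules, (0.9,), 2)
```

Returning `None` for ties keeps rejected patterns distinguishable from misclassified ones, which matters when rejection and misclassification are both counted as costs.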

Learning Fuzzy If-Then Rules.
In this section, we use a method for improving classification performance that adjusts the grade of certainty of each rule, following Nozaki et al. [37]. Distributed representation of fuzzy rules is a general concept that is not restricted to the field of pattern classification [38]. The total number of distributed fuzzy rules can be reduced by removing unnecessary rules of type (1), although such a selection of fuzzy rules may require a complicated procedure.
When a fuzzy rule is found, the training patterns covered by this rule are removed or their weights are set to zero. Then another fuzzy rule is found and added to the rule set using the modified training set [39].
After the rules were generated by the system, each rule was evaluated by an expert to reduce the dimension of the rule space. The system induced 545 rules, which were then reduced to 144 by expert rule selection. We assigned weights to the rules in two cases separately.
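When a training pattern $x_p$ was not successfully classified by the fuzzy if-then rule $R_j$, the grade of certainty of that rule was decreased as
$$CF_j^{\text{new}} = CF_j^{\text{old}} - \eta \cdot \omega_p \cdot CF_j^{\text{old}}. \tag{9}$$
Conversely, if the training pattern was correctly classified, the grade of certainty was reinforced in the following manner:
$$CF_j^{\text{new}} = CF_j^{\text{old}} + \eta \cdot \omega_p \cdot (1 - CF_j^{\text{old}}), \tag{10}$$
where $\eta$ is a positive constant with $0 \le \eta \le 1$ and $\omega_p$ is the weight of the pattern $x_p$ [36].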
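The error-correction update can be illustrated with a short Python sketch. The function name and the numeric values are assumptions made for illustration; the update form (shrinking the certainty grade toward 0 after a misclassification and moving it toward 1 after a correct classification, scaled by the learning constant and the pattern weight) is one common reading of this kind of rule and is not confirmed in detail by the text.

```python
def update_certainty(cf, correct, eta, weight):
    """Return the adjusted certainty grade of the rule that classified a pattern.

    cf      current certainty grade of the rule, in [0, 1]
    correct True if the training pattern was classified correctly
    eta     learning constant, assumed to lie in [0, 1]
    weight  weight of the training pattern, assumed to lie in [0, 1]
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    if correct:
        return cf + eta * weight * (1.0 - cf)   # reinforce toward 1
    return cf - eta * weight * cf               # penalize toward 0

# Hypothetical values: a rule with certainty 0.8 and a pattern of weight 0.5.
penalized = update_certainty(0.8, correct=False, eta=0.5, weight=0.5)
reinforced = update_certainty(0.8, correct=True, eta=0.5, weight=0.5)
```

With the learning constant and the weight both in [0, 1], either update keeps the certainty grade inside [0, 1], so no clipping is needed.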

Cost Function.
This section describes how the performance of classification systems is evaluated. Under the assumption that a weight is assigned to each training pattern, we used the weight of a training pattern as its cost of misclassification. The cost function $\text{cost}(S)$ of a fuzzy classification system $S$ is defined as
$$\text{cost}(S) = \sum_{p=1}^{m} \omega_p \cdot z_p(S), \tag{11}$$
where $z_p(S)$ is a binary variable describing the classification result of the pattern $x_p$ by $S$: if $x_p$ is correctly classified by $S$, then $z_p(S) = 0$, and if $x_p$ is misclassified or rejected, $z_p(S) = 1$; $m$ is the total number of training patterns. This cost function is used as a performance measure alongside the classification rate [36]. We employed four types of weight, which were introduced by Nakashima et al. in 2005 [36].
In this paper, the weights of the training patterns took the values $\omega_p = 0.25, 0.5, 0.7, 0.9$, and a weight was randomly assigned to each training pattern. For example, we specified $\omega_p = 0.25$ if the training pattern belonged to class 1 and $\omega_p = 0.25, 0.5, 0.7, 0.9$ if it belonged to classes 2, 3, and 4; then $\omega_p = 0.5$ if the training pattern belonged to class 1 and $\omega_p = 0.25, 0.5, 0.7, 0.9$ if it belonged to classes 2, 3, and 4; and so forth.
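The cost function described in this section amounts to summing the weights of the training patterns that are misclassified or rejected. A minimal Python sketch follows; the function name and the three-pattern example are hypothetical, and a rejected pattern is represented here by a prediction of `None`.

```python
def classification_cost(weights, predicted, actual):
    """Sum the weights of patterns that are misclassified or rejected.

    A prediction of None stands for a rejected (unclassifiable) pattern.
    """
    return sum(
        w
        for w, p, a in zip(weights, predicted, actual)
        if p is None or p != a
    )

# Hypothetical example: one correct, one rejected, one misclassified pattern.
weights = [0.25, 0.5, 0.9]
predicted = [1, None, 2]
actual = [1, 2, 3]
cost = classification_cost(weights, predicted, actual)
```

When every weight equals 1, this cost reduces to the number of patterns that are not classified correctly.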

Patients.
This was a cross-sectional study in which the characteristics, the results of myocardial perfusion scan (MPS), and the coronary artery angiography of 115 patients, 62 (53.9%) of them male, were collected at Mazandaran Heart Center (Iran). The coronary artery disease data used in this paper contained 115 samples classified into four classes, namely, class 1 (normal), class 2 (stenosis in one single vessel), class 3 (stenosis in two vessels), and class 4 (stenosis in three vessels), with 39, 35, 17, and 24 subjects, respectively. The dataset included ten input variables (age, sex, diabetes, cholesterol level, triglyceride level, low density lipoprotein (LDL), systolic blood pressure, summed stress score (SSS), smoking, and genetic factor) and one output. We defined membership functions for the medical variables according to a literature review. We used 3 fuzzy sets for age (young age, middle age, and old age), 3 fuzzy sets each for the cholesterol level and the triglyceride level (normal, borderline, and high), 3 fuzzy sets for LDL (normal, middle, and high), 4 fuzzy sets for systolic blood pressure (low, middle, high, and very high), and 4 fuzzy sets for SSS (normal, mild, moderate, and severe).
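Fuzzy sets such as those listed above can be represented with trapezoidal membership functions, the shape adopted in this paper. The sketch below is illustrative only: the breakpoints for the three age sets are hypothetical placeholders and are not the values given in Table 1.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership grade of x, with feet a, d and shoulders b, c.

    Requires a <= b <= c <= d. Setting a == b or c == d gives a
    left- or right-open shoulder.
    """
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge

# Hypothetical breakpoints in years, chosen only for illustration.
AGE_SETS = {
    "young": (0, 0, 30, 40),
    "middle": (30, 40, 50, 60),
    "old": (50, 60, 120, 120),
}

def fuzzify_age(age):
    """Return the membership grade of an age in each of the three fuzzy sets."""
    return {name: trapezoid(age, *params) for name, params in AGE_SETS.items()}

grades = fuzzify_age(35)
```

With these placeholder breakpoints, an age of 35 belongs partly to "young" and partly to "middle", which is the kind of overlap that crisp cut-offs cannot express.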
The trapezoidal membership function was used for every fuzzy set, as shown in Figure 1, and the membership functions of the input variables are defined in Table 1. The input variables used were age, cholesterol (CL), triglyceride (TG), systolic blood pressure (SY), LDL, SSS, sex, diabetes (DI), genetic factor (GF), and smoking (SM).

Fuzzy Rule Base.
The main part of a fuzzy system is the rule base, and the results of a fuzzy classification system depend on the fuzzy rules. This system includes 144 rules. The antecedent part of every rule has more than one part.
Parts of the fuzzy rules are shown in Table 2. A series of 144 rules are formed.
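To put the size of the 144-rule base in context, the short Python calculation below counts the antecedent combinations in the full grid implied by the fuzzy partitions described in this paper. It assumes, as an illustration only, that the four crisp variables (sex, diabetes, smoking, and genetic factor) are binary and that every rule uses all ten variables; the paper does not state either point.

```python
from math import prod

# Number of fuzzy sets (or crisp levels) per input variable.
# The counts for the crisp variables are assumed to be 2.
partitions = {
    "age": 3,
    "cholesterol": 3,
    "triglyceride": 3,
    "LDL": 3,
    "systolic blood pressure": 4,
    "SSS": 4,
    "sex": 2,
    "diabetes": 2,
    "smoking": 2,
    "genetic factor": 2,
}

# Full grid: one rule per combination of antecedent sets.
grid_size = prod(partitions.values())
retained_fraction = 144 / grid_size
```

Under these assumptions the full grid is several orders of magnitude larger than the 144 rules that were kept, which indicates how much data-driven generation and expert selection narrow the rule space.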
After the fuzzy if-then rules were learned from the training patterns, we used different weights and constant values. Assigning different weights to the rules decreased the correct classification rates, and the highest classification rate was obtained with $\omega_p = 0.25$ for all classes. The correct classification rates are displayed in Table 3. With the 545 rules generated by the system, the classification rate was 92.4%; after the rules were reduced to 144 by expert rule selection, the accuracy rate was 92.8%.
Multiple logistic regression (MLR) was used to detect the effects of the risk factors on CAD (presence or absence). The sensitivity and specificity of MLR were 88.15% and 64.1%, respectively. The clinical test MPS, in comparison with angiography, had a sensitivity of 84% and a specificity of 92%, the specificity being higher than that of the MLR model. All patients with CAD by angiography were detected by the proposed fuzzy rule-based classification (Se = 100%), and the overall accuracy was 92.8%.
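For readers who want to reproduce this kind of comparison, sensitivity and specificity can be computed from the counts of a binary confusion matrix as sketched below. The group sizes (76 patients with CAD, i.e., 35 + 17 + 24, and 39 normal subjects) follow from the class sizes reported in this paper, but the split into true and false results is an assumption chosen to be roughly consistent with the percentages reported for MLR; the paper does not report these counts.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased patients that the test detects."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of disease-free subjects that the test clears."""
    return true_neg / (true_neg + false_pos)

# Assumed counts, not reported in the source:
# 67 of 76 CAD patients detected, 25 of 39 normal subjects cleared.
mlr_sensitivity = sensitivity(true_pos=67, false_neg=9)
mlr_specificity = specificity(true_neg=25, false_pos=14)
```

A sensitivity of 100%, as reported for the fuzzy classifier, corresponds to zero false negatives among the 76 patients with CAD.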

Discussion and Conclusion
The most important result of this study is the classification of coronary artery disease (CAD) with high accuracy. The correct classification rate depends on the number of input variables or characteristics and on the type and combination of fuzzy rules. We defined membership functions for the medical variables by reviewing the related literature. These membership functions lead to high precision in many fields of medicine. We used a fuzzy inference method that combines rules weighted by an expert with supervised pattern classification. This is similar to the study of Chen and Chang, which achieved a high accuracy rate (96.88%) in classification [40]. Chen and Fang studied a fuzzy classification method for the Iris data with a 96.72% classification accuracy rate [41]. They generated weighted fuzzy rules for the Iris data with a genetic algorithm [42]. The study conducted by Allahverdi et al. to determine the risk factors for CAD had a high correct classification rate [32]. Vig et al. [31], P. Srivastava and A. Srivastava [33], and Kaya et al. [43] also studied fuzzy sets and systems for CAD, cardiac analysis, and congenital heart disease by using fuzzy expert systems. This study showed that the sensitivity and specificity of MPS compared with angiography were acceptable by MLR, but when MPS was combined with other clinical-epidemiological variables, the fuzzy rule-based classification model improved the classification rate.
According to the findings of this study, interpretable fuzzy rule-based classification can determine the most important risk factors for CAD, correctly detect the patients who do not need invasive tests such as coronary artery angiography, and achieve a high classification accuracy rate.