The Study of Misclassification Probability in Discriminant Model of Pattern Identification for Stroke

Background. Pattern identification (PI) is the basic system for diagnosing patients in traditional Korean medicine (TKM). The purpose of this study was to identify misclassified objects in the discriminant model of PI in order to improve the classification accuracy of PI for stroke. Methods. The study included 3306 patients with stroke who were admitted to 15 TKM hospitals from June 2006 to December 2012. We derived four measures (D, R, S, and C scores) based on the patterns of the profile graphs according to classification types. The proposed measures were applied to the data to evaluate how well they detect misclassified objects. Results. In the 10% and 20% filtered data, the misclassification rate of the C score was the highest among the four scores (42.60% and 41.15%, resp.). In the 30% filtered data, the misclassification rate of the R score was the highest (40.32%). In the 40–90% filtered data, the misclassification rate of the D score was the highest. Additionally, the same result as that of the C score can be derived from a multiple regression model with two independent variables. Conclusions. The results of this study should assist the development of diagnostic standards in TKM.


Introduction
Due to the development of modern medicine, the average human lifespan is anticipated to rise beyond 85 years within the next 20 years [1]. Meanwhile, since the rate of aging in South Korea is expected to surge to 35.1% by 2050, ranking second in the world close behind Japan (37.7%), geriatric diseases and the health of the elderly have emerged as one of the most critical social problems for improving the quality of life in the future [2]. In particular, stroke is one of the representative geriatric diseases, along with dementia, and the personal and social insecurities caused by the disease have continued to grow. In addition, stroke is the leading cause of death from a single disease among Koreans and accounts for more than 70% of the inpatients at traditional Korean medical hospitals [3,4]. In traditional Korean medicine (TKM), specific or nonspecific symptoms of patients are assessed by observing, listening, asking, and feeling the pulse under the diagnostic system of pattern identification (PI) in order to determine the cause, nature, treatment method, and treatment drugs of a disease [5][6][7]. PI diagnosis collects the specific or nonspecific symptoms of a patient and classifies them into one of hundreds of symptom classes. It is the core technology forming the backbone of diagnosis and treatment in oriental medicine. However, PI diagnosis has limited objectivity and reproducibility due to the lack of standardized measurement indices, and objectification problems have always arisen from personal deviations among TKM physicians in knowledge and experience [6][7][8].
As the necessity for the standardization of diagnostic systems has recently come to the fore, studies have been underway to objectify diagnosis.
In the study titled "Fundamental Study for the Standardization and Objectification of Pattern Identification in Traditional Korean Medicine for Stroke (SOPI-Stroke)," which was conducted over 9 years from 2005 to 2013, the Korea Institute of Oriental Medicine (KIOM) proposed a standardization plan for PI/syndrome differentiation of stroke, established stroke PI diagnostic indices, built a database system relating to TKM clinical technologies by setting up a clinical index database, and founded a scientific basis for stroke and PI by discovering stroke and PI biological indices, to which the latest research methods, such as OMICS, were applied. Studies were also carried out to discover biological indices that could help prevent stroke by identifying stroke risk factors [9][10][11][12][13][14][15][16].
Evidence-Based Complementary and Alternative Medicine
Consequently, the purpose of this study was to identify misclassified objects in the discriminant model of PI in order to improve the classification accuracy of PI for stroke patients. Although the current TKM PI diagnostic tools for stroke were developed after several years of research and prepared for public release, they still need corrections and modifications in many aspects [17][18][19]. In this study, the key topic for discussion is the choice of appropriate statistical methods to reduce the probability of diagnostic misclassification.

Subjects.
The study included 3306 patients with stroke who were admitted to 15 oriental medical university hospitals from June 2006 to December 2012. All procedures were approved by the Institutional Review Boards (IRB) of the respective institutions, and informed consent was obtained from every patient after a thorough explanation of the details. Stroke patients were enrolled within 30 days of the onset of their symptoms, provided that their diagnosis was confirmed by imaging such as computerized tomography (CT) or magnetic resonance imaging (MRI). Patients with traumatic stroke such as subarachnoid, subdural, and epidural hemorrhage were excluded from the study.

Measured Variables.
Each patient was seen by two experts from the same department within each site. All participating experts were well trained in the standard operating procedures (SOPs) and had at least three years of clinical experience with stroke after completing six years of regular college education in TKM. The examination parameters were extracted from parts of a case report form (CRF) for the standardization of stroke diagnosis that had been developed by an expert committee organized by the KIOM [7,11,12].

The Korean Standard PI for Stroke-3.
The PI process differentiates stroke into four TKM types: the Fire-heat (FH) pattern, Dampness-phlegm (DP) pattern, Yin deficiency (YD) pattern, and Qi deficiency (QD) pattern [11,12]. The FH pattern is characterized by any symptom of heat or fire that is contracted externally or engendered internally. The DP pattern is characterized by impeded Qi movement and by turbidity, heaviness, stickiness, and downward-flowing properties. The QD pattern is characterized by Qi deficiency with diminished internal organ function, which is marked by shortness of breath, lassitude, listlessness, spontaneous sweating, a pale tongue, and a weak pulse. The YD pattern is characterized by Yin deficiency with diminished moistening and the inability to restrain Yang, which is usually manifested as fever [7,9,10,11,12,13,20].

Statistical Methods.
After determining 12 different types of misclassification through discriminant analysis, we plotted them on profile graphs according to type. We then derived four measures (D, R, S, and C scores) based on the pattern analysis of the profile graphs. The proposed measures were applied to the stroke data to evaluate how well they detect misclassified objects.

Types of Misclassification.
According to the results from the discriminant model classification, 2,209 of the total of 3,306 patients were correctly classified (66.82%) (Table 1). The remaining 1,097 patients were misclassified (33.18%), and the misclassification types are summarized in Table 2. To analyze the misclassification types, the 44 clinical indices of the Korean Standard PI for Stroke-3 were grouped into four upper-class variables (QD, DP, YD, and FH pattern indices). In addition, the average and standard deviation of each upper-class variable were used to obtain standardized scores, after which the misclassification types were analyzed (Figure 1).
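The standardization step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `standardize`, the toy rows, and the choice of the sample standard deviation (n - 1 denominator) are assumptions, since the text only states that the mean and standard deviation of each upper-class variable were used.

```python
from statistics import mean, stdev

PATTERNS = ("QD", "DP", "YD", "FH")

def standardize(rows):
    """Convert raw upper-class pattern scores to z-scores, pattern by pattern.

    rows: list of dicts mapping each pattern name to a raw score.
    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    centers = {p: mean([r[p] for r in rows]) for p in PATTERNS}
    spreads = {p: stdev([r[p] for r in rows]) for p in PATTERNS}
    return [
        {p: (r[p] - centers[p]) / spreads[p] for p in PATTERNS}
        for r in rows
    ]

# Hypothetical raw scores for three patients (not study data).
toy = [
    {"QD": 2, "DP": 7, "YD": 1, "FH": 6},
    {"QD": 5, "DP": 3, "YD": 2, "FH": 1},
    {"QD": 8, "DP": 5, "YD": 6, "FH": 2},
]
z_rows = standardize(toy)

# Classification rates reported in the text: 2,209 correct out of 3,306.
accuracy = 2209 / 3306
print(round(100 * accuracy, 2), round(100 * (1 - accuracy), 2))
```

The final line recomputes the reported rates from the counts in the text (66.82% correct and 33.18% misclassified).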

The Profile Graphs.
With the 12 misclassification types and 4 correct classification types categorized by the discriminant analysis, the profile graphs were drawn. Specifically, two of the 4 patterns were selected at a time, and the correct classification types and misclassification types for each pattern, as collected from the TKM physicians, were divided. For instance, as described in Figure 2, patients belonging to the two misclassification types FHQD and QDFH were grouped together.

Derived Four Measures (D, R, S, and C Scores).
In the profile graphs, misclassified observations in most of the 6 cases displayed a bathtub (U-shaped) pattern: the pattern score corresponding to the actual pattern is relatively high, and misclassification becomes highly probable when a relatively high score is also observed for another pattern. In contrast, correctly classified observations showed an L-shaped (or flipped-L-shaped) pattern. Although the actual pattern of a new patient is unknown without a direct diagnosis from TKM physicians, if the patient's 4 upper-class pattern scores form a bathtub-shaped profile (two high scores and two low scores), this patient is likely to be misclassified by the discriminant model. Criteria were therefore designed to assess how close a pattern score profile is to a bathtub shape through various arrangements and simple calculations of the four pattern scores, and they were applied to the already discriminated data. In this way, we compared how well misclassification was estimated and how much the discrimination rates improved. In what follows, $Z_{(1)} \ge Z_{(2)} \ge Z_{(3)} \ge Z_{(4)}$ denote the four standardized pattern scores of a patient sorted from largest to smallest.
(1) D Score. Analyzing the correct classification and misclassification types with profile graphs, the D value was derived considering that the difference between the maximum value ($Z_{(1)}$) and the second-largest value ($Z_{(2)}$) was smaller for misclassification than for correct classification, and classification by the D value was attempted (Figure 8). Namely, under the hypothesis that the smaller the D value was, the closer the profile graph was to a bathtub shape and the higher the probability of the respective observations corresponding to misclassification was, the D values were applied to the clinical stroke data.
After sorting the data by the D value in descending order and investigating the frequency and rate of misclassification over 10% intervals (Figure 9), the misclassification probability of the 10% filtered data (n = 331) reached 40.79% (n = 135, mean = 0.058), which was 7.61 percentage points higher than the misclassification probability of the total data (33.18%). The misclassification probabilities of the data filtered from 20% to 90% were lower than that of the 10% filtered data but higher than that of the total data (33.18%). In the data filtered at 10%, 20%, 40%, and 50%, the average D values of the misclassifications and correct classifications barely differed, although those of the misclassifications tended to be higher. In the other data groups, the average D values of the correct classifications were higher than those of the misclassifications (Table 4). Meanwhile, examining the frequencies and rates of correct classification in the data selected by the D value, the proportion of correct classifications in the 90% selected data (n = 2975) was 67.66% (n = 2013; misclassification, 32.34%), which was 0.86 percentage points higher than that of the total data (66.8%). In the 80% selected data (n = 2645), the proportion of correct classifications reached 68.28% (n = 1806; misclassification, 31.72%), which was 0.62 percentage points higher than that in the 90% selected data. In the data selected from 70% to 10%, the proportion of correct classifications gradually increased (Table 4).
(2) R Score. Analyzing the correct classification and misclassification types with profile graphs, the R value was derived considering that the difference between the maximum value ($Z_{(1)}$) and the minimum value ($Z_{(4)}$) was smaller for misclassification than for correct classification, and classification by the R value was attempted (Figure 10). Namely, under the hypothesis that the larger the R value was, the closer the profile graph was to an L-shaped or flipped-L-shaped pattern and the higher the probability of the respective observations corresponding to correct classification was, the R values were applied to the clinical stroke data in the same way as before (Table 5).
(3) S Score. Analyzing the correct classification and misclassification types with profile graphs, the S value was derived considering that the second-largest value ($Z_{(2)}$) was higher for misclassification than for correct classification, and classification by the S value was attempted (Figure 11). Namely, under the hypothesis that the larger the S value was, the closer the profile graph was to a bathtub (or U) shape and the higher the probability of the respective observations corresponding to misclassification was, the S values were applied to the clinical stroke data. In this case, the frequency and rate of misclassification over 10% intervals were investigated after sorting the data by the S value in ascending order (Table 6).
(4) C Score. Analyzing the correct classification and misclassification types with profile graphs, the C value was derived considering that the difference between the sum of $Z_{(1)}$ and $Z_{(2)}$ and the sum of $Z_{(3)}$ and $Z_{(4)}$ was larger for misclassification than for correct classification, and classification by the C value was attempted (Figure 12). Namely, under the hypothesis that the larger the C value was, the closer the profile graph was to a bathtub (or U) shape and the higher the probability of the respective observations corresponding to misclassification was, the C values were applied to the clinical stroke data in the same way as before (Table 7).
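As a rough illustration of the four measures, the sketch below computes them for one patient from the four standardized pattern scores. The function name `four_scores` and the two toy profiles are hypothetical. D, R, and C follow the differences described in the text; taking S to be exactly the second-largest value is an assumption, because the text only says that S was derived from that value.

```python
def four_scores(z):
    """Compute the D, R, S, and C scores for one patient.

    z: iterable with the patient's four standardized pattern scores
       (QD, DP, YD, FH in any order).
    """
    z1, z2, z3, z4 = sorted(z, reverse=True)  # z1 >= z2 >= z3 >= z4
    return {
        "D": z1 - z2,                # gap between the two largest scores
        "R": z1 - z4,                # range of the profile
        "S": z2,                     # ASSUMPTION: S is the second-largest score itself
        "C": (z1 + z2) - (z3 + z4),  # two highest minus two lowest
    }

# Hypothetical profiles (not study data).
bathtub = four_scores([1.2, -0.9, 1.0, -1.1])   # two high, two low scores
l_shape = four_scores([1.8, -0.5, -0.6, -0.7])  # one dominant score

print(bathtub)
print(l_shape)
```

Under the hypotheses stated above, the bathtub profile has the smaller D and the larger S and C, so it would be flagged as the more likely misclassification.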

Estimated Misclassification Probability and Discrimination Rate according to Proposed Four Scores.
Table 8 summarizes the misclassification probabilities obtained after sorting the data according to the 4 criteria and investigating the misclassification probability over 10% intervals. When 10% and 20% of the data were filtered, the C score yielded 42.60% and 41.15%, respectively, the highest misclassification probabilities among the criteria. When 30% of the data were filtered, the R score stood at 40.32% and one of the other scores at 39.92%. When 40–90% of the data were filtered, the misclassification probability of the D score was the highest.
For the data previously selected by the 4 scores (D, R, S, and C), the discrimination rates were compared. With the 4 patterns (QD, DP, YD, and FH) set as response variables for the entire clinical stroke data and the 44 clinical indices of the Korean Standard PI for Stroke-3 as independent variables, discriminant analysis was conducted to calculate the discrimination accuracy (Table 9). When the data were selected at 90%, the best-performing score reached a discrimination rate of 68.2%, the largest increase among the four scores. When the data were selected at 80%, the best-performing score reached 69.0%, and when the data were selected at 70%, it reached 70.0%, in each case the largest increase in the discrimination rate among the four scores. When the data were selected at 60–10%, a single score recorded the largest increase in the discrimination rate among the four scores.
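The filtering procedure behind these comparisons can be sketched as below. This is a simplified illustration under stated assumptions: the function name `rates_by_filter_fraction`, the `suspect_high` flag, and the toy data are hypothetical, and the retained group is scored with the existing classification results, whereas the study reran the discriminant analysis on the selected data.

```python
def rates_by_filter_fraction(scores, misclassified, fractions, suspect_high=True):
    """For each fraction f, flag the f most suspicious observations.

    scores:        one score value (e.g. C) per patient
    misclassified: one bool per patient, True if the model misclassified them
    suspect_high:  True if larger scores indicate misclassification
                   (hypothesized for S and C), False otherwise (D)
    Returns {f: (misclassification rate among flagged,
                 correct-classification rate among retained)}.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=suspect_high)
    out = {}
    for f in fractions:
        k = round(f * n)  # e.g. round(0.1 * 3306) == 331
        flagged, kept = order[:k], order[k:]
        flagged_miss = (
            sum(misclassified[i] for i in flagged) / len(flagged) if flagged else None
        )
        kept_correct = (
            sum(not misclassified[i] for i in kept) / len(kept) if kept else None
        )
        out[f] = (flagged_miss, kept_correct)
    return out

# Hypothetical C scores and classification outcomes for ten patients.
c_scores = [4.2, 3.9, 3.5, 3.0, 2.8, 2.6, 2.2, 2.0, 1.5, 1.0]
missed = [True, True, False, True, False, False, True, False, False, False]

result = rates_by_filter_fraction(c_scores, missed, fractions=(0.2, 0.5))
print(result)
```

A score is useful to the extent that the first number in each pair exceeds the overall misclassification rate and the second exceeds the overall accuracy.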

Estimation of Secondary Curvature.
Consider the quadratic regression model passing through the four points $(1, Z_{(1)})$, $(2, Z_{(3)})$, $(3, Z_{(4)})$, and $(4, Z_{(2)})$,

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon.$$

The coefficient of $x^2$ is the secondary curvature value of interest. Namely, the larger $\beta_2$ is, the stronger the bathtub shape becomes, boosting the misclassification probability. Denoting the estimates of $\beta_0$, $\beta_1$, and $\beta_2$ by $b_0$, $b_1$, and $b_2$, these estimates satisfy the following normal equation [21]:

$$\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}. \tag{2}$$

According to Neter et al. [21], a general two-variable regression model,

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i,$$

has a normal equation which is equal to

$$\mathbf{X}'\mathbf{X}\,\mathbf{b} = \mathbf{X}'\mathbf{Y},$$

and the following normal equations,

$$\begin{aligned} \sum Y_i &= n b_0 + b_1 \sum X_{i1} + b_2 \sum X_{i2}, \\ \sum X_{i1} Y_i &= b_0 \sum X_{i1} + b_1 \sum X_{i1}^2 + b_2 \sum X_{i1} X_{i2}, \\ \sum X_{i2} Y_i &= b_0 \sum X_{i2} + b_1 \sum X_{i1} X_{i2} + b_2 \sum X_{i2}^2, \end{aligned}$$

are obtained. In this case, $X_{i1} = i$ and $X_{i2} = i^2$ for $i = 1, 2, 3, 4$, with $(Y_1, Y_2, Y_3, Y_4) = (Z_{(1)}, Z_{(3)}, Z_{(4)}, Z_{(2)})$. Now, if these values are substituted, the sums become $n = 4$, $\sum X_{i1} = 10$, $\sum X_{i1}^2 = \sum X_{i2} = 30$, $\sum X_{i1} X_{i2} = 100$, and $\sum X_{i2}^2 = 354$, and, ultimately, we obtain

$$b_2 = \frac{1}{4}\left[\left(Z_{(1)} + Z_{(2)}\right) - \left(Z_{(3)} + Z_{(4)}\right)\right]. \tag{10}$$

The values of $b_0$ and $b_1$ can also be obtained but are omitted here because they are not needed. In (10), $Z_{(1)}$ and $Z_{(2)}$ are symmetric, and so are $Z_{(3)}$ and $Z_{(4)}$. Namely, the curvature of the profile formed by the 4 points does not change even if the largest and the second-largest scores are switched. This also holds true for the smallest and the second-smallest scores. Moreover, the $b_2$ value equals 1/4 of the C score among the 4 criteria. Namely, the C score used previously, obtained by simply subtracting the sum of $Z_{(3)}$ and $Z_{(4)}$ from the sum of $Z_{(1)}$ and $Z_{(2)}$, is equivalent to the secondary curvature created by the 4 scores.

Figure 10: Derived R values based on the pattern analysis of the profile graphs. Under the hypothesis that the larger the R value was, the closer the profile graph was to an L-shaped or flipped-L-shaped pattern and the higher the probability of the respective observations corresponding to correct classification was.

Figure 11: Derived S values based on the pattern analysis of the profile graphs. Under the hypothesis that the larger the S value was, the closer the profile graph was to a bathtub (or U) shape and the higher the probability of the respective observations corresponding to misclassification was.

Figure 12: Derived C values, $C = (Z_{(1)} + Z_{(2)}) - (Z_{(3)} + Z_{(4)})$, based on the pattern analysis of the profile graphs. Under the hypothesis that the larger the C value was, the closer the profile graph was to a bathtub (or U) shape and the higher the probability of the respective observations corresponding to misclassification was.
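The relation between the quadratic coefficient and the C score can be checked numerically. The sketch below builds the normal equations for the points x = 1, 2, 3, 4 with the ordered scores placed as largest, third, fourth, second, and solves them exactly with rational arithmetic. The helper names `solve3` and `quadratic_curvature` and the toy profile are hypothetical and not part of the study.

```python
from fractions import Fraction

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination (exact with Fractions)."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, 3) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(3):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][3] for r in range(3)]

def quadratic_curvature(z):
    """Return b2 of the least-squares fit y = b0 + b1*x + b2*x^2."""
    z1, z2, z3, z4 = sorted((Fraction(str(v)) for v in z), reverse=True)
    xs = [Fraction(i) for i in (1, 2, 3, 4)]
    ys = [z1, z3, z4, z2]  # high scores at both ends, low scores in the middle

    def sx(p):   # sum of x^p
        return sum(x ** p for x in xs)

    def sxy(p):  # sum of x^p * y
        return sum(x ** p * y for x, y in zip(xs, ys))

    lhs = [[sx(0), sx(1), sx(2)],
           [sx(1), sx(2), sx(3)],
           [sx(2), sx(3), sx(4)]]
    rhs = [sxy(0), sxy(1), sxy(2)]
    _b0, _b1, b2 = solve3(lhs, rhs)
    return b2

profile = [1.2, -0.9, 1.0, -1.1]  # hypothetical standardized scores
c_score = (Fraction("1.2") + Fraction("1.0")) - (Fraction("-0.9") + Fraction("-1.1"))
b2 = quadratic_curvature(profile)
print(b2, c_score / 4)
```

For any four scores, `b2` equals one quarter of the C score, so ranking patients by curvature and ranking them by C give the same order.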

Discussion
In TKM, the PI diagnostic system, one of the core technologies in the diagnosis and treatment of oriental medicine, is used to determine the cause and nature of a disease, treatment methods, and treatment drugs for patients [5][6][7]. However, PI diagnosis has limited objectivity and reproducibility due to the lack of standardized measurement indices, and objectification problems have always arisen from personal deviations among TKM physicians. As the demand for the reestablishment and development of TKM has increased, studies on establishing a scientific basis for PI and on its standardization have been actively conducted [7,12].
In this study, the clinical data of PI diagnosis for stroke were used to analyze and quantify the profile patterns of the misclassification types, and the proposed scores were compared. The aim was to boost the correct classification of objects by detecting those with a high probability of actual misclassification and deferring their discrimination. Misclassification types were discerned by a discriminant analysis on the actual clinical data of PI diagnosis for stroke and quantified by a profile pattern analysis. The proposed criteria were applied to the data already discriminated by the previous discriminant analysis in order to compare how well the misclassification had been estimated and how much the discrimination rate had improved when the estimated misclassification observations were removed in advance. In particular, the C score delivered the same results as the discrimination of misclassification observations through a secondary curvature.
Going forward, the following studies must be performed. First, the 4 criteria for estimating misclassification proposed in this study, when applied to the actual clinical data, showed the possibility of better estimating part of the misclassification. Nonetheless, it was difficult to notably enhance the discrimination rates, and additional research appears to be necessary. In addition, the 4 pattern groups used in this study had different sample sizes. Hence, the effects of different sample sizes need to be investigated.

Figure 13: Curvature created by scores ($Z_{(1)}$, $Z_{(2)}$, $Z_{(3)}$, and $Z_{(4)}$).