A New Method of Diagnosing Constitutional Types Based on Vocal and Facial Features for Personalized Medicine

The aim of the present study is to develop an accurate constitution diagnostic method based solely on the individual's physical characteristics, irrespective of psychological traits, characteristics of clinical medicine, and genetic factors. In this paper, we suggest a novel method for diagnosing constitutional types using only speech and face characteristics. Based on 514 subjects, the area under the receiver operating characteristic curve (AUC) values of the classification models in the age and gender groups ranged from 0.64 to 0.89. We identified significant features showing statistical differences among three constitutional types by performing statistical analysis. We also selected a compact and discriminative feature subset for constitution diagnosis in each age and gender group. Our method may support improved diagnosis prediction and may serve as a basis for personal and automatic constitution diagnosis software that improves the effectiveness of prescribed medications and advances personalized medicine.


Introduction
Due to the development of medicine and advances in biotechnology and information technology, the focus of medical treatment has shifted away from common treatments of a certain disease to personalized medicine [1][2][3][4][5][6][7][8]. Consistent with this paradigm, there has been an explosion of interest in alternative oriental medicines and in a fusion of oriental and western medicines [9][10][11][12][13][14][15][16]. One of the core research areas of personalized medicine in western and oriental medicine is to understand the psychological characteristics, morphological traits, genetic characteristics, and constitution of individuals. The human constitution has been researched in western and oriental medicine for a long time. For example, Hippocrates suggested that the human constitution could be attributed to four kinds of substances (blood, phlegm, choler, and black bile) [4]. Wang classified humans into seven constitutional groups using physiological and physical status [4,17]. Similarly, Lee classified humans into four Sasang constitutional types, TE (Taeeumin), TY (Taeyangin), SE (Soeumin), and SY (Soyangin), based on physiological, psychological, and physical characteristics [2,11,12,13,15].
Personal constitution diagnosis is important for several reasons. Firstly, people have vulnerability to particular diseases according to their individual psychological characteristics, genetic characteristics, and morphological traits. Therefore, risk factors for particular diseases can be identified according to an individual's constitution in the early stages of disease progression [18]. Secondly, drug response to prescribed medicine varies with personal constitution [11]. As such, the efficiency of prescribed medicine can be improved if we know the patient's constitution.
Many studies on Sasang constitution have been conducted. To examine the association between Sasang constitutions and diseases and the differences among constitutional types, many researchers have introduced constitution analysis methods [1,4,5,8,12,13,15,18,19,20,21,22,23]. Song et al. [22] classified TE and SE among the constitutional types using skin elasticity of the hand and showed that elasticity of the TE type was higher than that of the SE type. Their constitution diagnosis measured hand skin elasticity based on a questionnaire survey of thickness and elasticity of the skin. The limitations of the study were that experiments were performed in only TE and SE types and not in the TY and SY types, and elasticity measurements were performed only on the back of the hand. Choi et al. [18] studied the distribution of insulin resistance using multivariate logistic regression analysis and features such as age, cholesterol level, smoker/nonsmoker, diastolic blood pressure, and insulin in subjects of each constitutional type. They demonstrated that prevalence of insulin resistance differs according to Sasang constitution type and suggested that personal constitution type can act as an independent risk factor for insulin resistance. Yin et al. [8] conducted an association study between genome-wide SNP (single nucleotide polymorphism) profiles and Sasang constitution types, aiming at a more accurate Sasang constitution diagnosis, using 353,202 SNPs from 60 DNAs. They observed that 5,692 SNPs differed significantly in the TE versus SE association analysis, 7,542 SNPs in SE versus SY, and 4,083 SNPs in SY versus TE. The detailed contents of Sasang constitutional medicine are described in references [2,12], and the research on face or speech signals is described in references [24][25][26][27][28].
Previous studies that used face, SNPs, skin, body shape, and speech signals have focused primarily on analyzing differences among constitutional types; studies of diagnosis prediction are rare. In this study, we focus on Sasang constitution diagnosis using morphological characteristics that are easily accessible to researchers and doctors.
The motivations of this study are as follows: first, how can we obtain essential and useful features that show relationships between morphological characteristics and corresponding constitutional types? Second, how does one use these features to build an efficient and accurate diagnosis model?
We make the following contributions to the field of constitution diagnosis.
(i) We propose a readily available and novel method for an accurate and detailed constitution diagnosis using the combination of facial characteristics and speech signals in age- and gender-specific groups. Our method may support improved diagnosis prediction and may serve as a basis for a personal and automatic constitution diagnosis tool that improves the effectiveness of prescribed medications and advances personalized medicine.
(ii) We suggest discriminative and meaningful features for constitution diagnosis via statistical analysis, and identify a compact and useful feature subset in accordance with age and gender. Analysis of the results will serve to create a better discriminative feature set in this field.

Data Preparation.
Speech and facial feature extraction from 514 subjects in several hospitals and the Korea Institute of Oriental Medicine in the Republic of Korea was carried out. Constitutional types of all subjects were determined by specialists and drug responses [21].

Table 1: Extracted features and brief descriptions. X is one of five vowels (A, E, I, O, and U).
xF0: Basic pitch of X
xF1: Formant of 1st in 4 frequency periods of X
xF2: Formant of 2nd in 4 frequency periods of X
xJITA: Mean ratio of change in pitch period of X
xRF60 120 F240 480: (Frequency band of 60∼120 Hz)/(frequency band of 240∼480 Hz) of X
xRF240 480 960 1960: (Frequency band of 240∼480 Hz)/(frequency band of 960∼1960 Hz) of X
aRF2 F1: Relative ratio between frequencies of A (aF2/aF1)
iDF0 aF0: Difference of frequencies (iF0-aF0)
uDF0 oF0: Difference of frequencies (uF0-oF0)
xMFCC4: Mel frequency cepstral coefficients of X
SITS: Amplitude average
SISTD: Standard deviation of amplitude average
SSPD: Time to read one sentence
FD n1 n2: Distance between point n1 and n2 in a frontal (or profile) image
FDH n1 n2: Horizontal distance between n1 and n2 in an image
FDV n1 n2: Vertical distance between n1 and n2 in a frontal (or profile) image
FA n1 n2 n3: Angle of three points n1, n2, and n3 in an image
FA n1 n2: Angle between the line through 2 points n1 and n2 and a horizontal line
Nose Angle n1 n2: Angle between the line through 2 points n1 and n2 and a horizontal line
Nose Angle n1 n2 n3: Angle of 3 points n1, n2, and n3
SA n1 n2: Angle between the line through 2 points n1 and n2 and a horizontal line
Fh Cur Max R79 69: FD(77,9)/FD(6,9)
Nose Area n1 n2 n3: Area of the triangle formed by 3 points n1, n2, and n3 in an image
EUL L el1∼el7: Slope of the tangent at a point (el1∼el7) in an image
EUL R er1∼er7: Slope of the tangent at a point (er1∼er7) in an image
Speech recording conditions were as follows: no resonance, noise intensity from 40 to 50 dB, room temperature of 20 °C ± 5 °C, and humidity of 40% ± 5%. Recordings were made with a Sennheiser e-835s microphone, a Blaster Live 24bit external sound card, and the GoldWave recording program. The distance between the subject's mouth and the microphone was 4-6 cm, and features were extracted using five vowels (A, E, I, O, U) and one sentence. The extracted features consisted of pitch, average ratio of pitch period, Jita (absolute jitter), MFCC (Mel frequency cepstral coefficients) [29,30], and so forth. We took photographs of the side and front of the subject's face, with a ruler, using a digital camera (Nikon D700 with an 85 mm lens). Based on feature points identified in the side- and front-face images, we obtained features such as distances, distance ratios, angles, and areas from the forehead, nose, mouth, face shape, and eyes [20]. Doctors designated the feature points (Figure 1). The height and weight of each subject were measured by a digital scale (LG-150; G Tech International Co., Ltd, Republic of Korea). A total of 82 features were used in this study (29 features from speech signals, 51 features from the face, and the weight and height features). All feature measurements were made with an in-house tool written in MATLAB on Windows XP. The extracted features are described in Table 1.
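The geometric face features described above (distances, distance ratios, angles, and triangle areas computed from designated feature points) can be illustrated with a short sketch. This is not the authors' MATLAB tool: the function names are hypothetical, points are assumed to be (x, y) pixel coordinates, and the angle conventions (unsigned angle at a vertex, acute angle to the horizontal) are assumptions made for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # assumed (x, y) pixel coordinates of a feature point


def distance(p: Point, q: Point) -> float:
    """Euclidean distance between two feature points (an FD-style feature)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def horizontal_distance(p: Point, q: Point) -> float:
    """Horizontal distance between two points (an FDH-style feature)."""
    return abs(q[0] - p[0])


def vertical_distance(p: Point, q: Point) -> float:
    """Vertical distance between two points (an FDV-style feature)."""
    return abs(q[1] - p[1])


def angle_at(p: Point, vertex: Point, q: Point) -> float:
    """Unsigned angle in degrees (0 to 180) formed at `vertex` by p and q."""
    a1 = math.atan2(p[1] - vertex[1], p[0] - vertex[0])
    a2 = math.atan2(q[1] - vertex[1], q[0] - vertex[0])
    deg = abs(math.degrees(a1 - a2)) % 360
    return 360 - deg if deg > 180 else deg


def angle_to_horizontal(p: Point, q: Point) -> float:
    """Acute angle in degrees (0 to 90) between line pq and a horizontal line."""
    deg = abs(math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])))
    return 180 - deg if deg > 90 else deg


def triangle_area(p: Point, q: Point, r: Point) -> float:
    """Area of the triangle formed by three points (cross-product formula)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) / 2


# Made-up points purely for illustration (not landmarks from the study).
a, b, c = (0.0, 0.0), (3.0, 4.0), (3.0, 0.0)
ratio = distance(a, b) / distance(a, c)  # a distance-ratio feature
```

Ratios of distances are independent of image scale, which is one reason they are convenient when photographs are taken at slightly different distances; absolute distances and areas need the ruler in the image for calibration.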
Since the face and speech signals are influenced by age and gender [21,31], experimental data were divided into five categories based on age and gender: Female-20, Female-30, Female-50, Male-30, and Male-50.

Experiment Configurations.
The goals of our experiment were to measure the ability to distinguish constitutional types and to identify a more discriminative and compact feature set through feature selection. We conducted classification experiments on the TE, SE, and SY constitution types with our five age- and gender-specific data sets, with and without feature selection. To investigate the detailed performance of each feature type, a speech feature set, a face feature set, and a hybrid feature set (combining face and speech) were used in this experiment. We normalized all data sets to the range 0∼1. Feature subset selection used the wrapper approach, with LIBLINEAR [32] as the learner and forward best-first search. All experiments were performed using LIBLINEAR (L2-loss SVM, dual) in the Weka software [33], with 10-fold cross validation for statistical evaluation of the learning algorithm. For optimal parameter selection (tuning), the value of the C parameter was selected from a range of candidate values. The area under the receiver operating characteristic curve (AUC) was used as the major evaluation criterion. AUC is widely used to quantify the quality of a prediction or classification model in medical science, bioinformatics, medical statistics, and biology [34][35][36]. We also evaluated performance using sensitivity, specificity, and F-measure for detailed evaluation. Statistical analyses were conducted with SPSS version 19 for Windows (SPSS Inc., Chicago, IL, USA).
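The normalization step and the evaluation criteria named in this section can be sketched in plain Python. This is an illustrative sketch only, not the Weka and LIBLINEAR pipeline used in the study: the function names are hypothetical, the data in the usage lines are made up, and treating one constitutional type as the positive class against the rest is an assumption about how a three-class problem is reduced to binary metrics.

```python
def min_max_scale(column):
    """Rescale one feature column to the range 0 to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: map every value to 0
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def auc(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def _ratio(num, den):
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(predictions, labels):
    """Return (sensitivity, specificity, F-measure) for 0/1 predictions."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    sensitivity = _ratio(tp, tp + fn)  # recall on the positive class
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return sensitivity, specificity, f_measure


# Made-up toy values: 1 = the type of interest, 0 = any other type.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why AUC is reported on a 0 to 1 scale rather than as a percentage.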

Comparison of Experimental Results.
To summarize the performance evaluation, the AUC values for the five groups with and without feature selection are shown in Table 2.
In experiments using the full feature sets without feature selection, the hybrid feature set (Hybrid-FF) performed better than the individual face and speech feature sets (Face-FF and Speech-FF), except that the face feature set performed better in the Female-30 group. The AUC values of the age- and gender-specific classification models using the hybrid feature set without feature selection ranged from 0.59 to 0.69.
After application of feature subset selection, the number of remaining features was small, whereas the AUC values of constitution classification were greater than those obtained without feature selection. However, the performance of the Face-FS method was higher than that of Hybrid-FS and Speech-FS in the Female-30 and Male-30 groups. Thus, it is preferable to use the Face-FS method in these two groups. In theory, the performance of the Hybrid-FS method should be better than or equal to that of Face-FS and Speech-FS, because the hybrid feature set includes all the speech and face features. In practice, however, many feature selection methods cannot guarantee better performance, because a greater number of features makes it harder to build a classifier and leads to the curse of dimensionality and an NP-hard search problem [36][37][38]. A detailed performance evaluation of the experiments using feature selection is shown in Table 3.
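The wrapper idea behind feature subset selection can be sketched as a greedy forward search. Note that this is a simplification: the study used forward best-first search in Weka, which can backtrack, whereas the sketch below only ever adds the single best feature and stops at the first step without improvement. The function name, the feature names, and the toy scoring function are all hypothetical; in a real wrapper, `score` would train a classifier on the candidate subset and return its cross-validated AUC.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_forward_selection(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Add one feature at a time, keeping the addition that most improves
    the score, and stop when no single addition improves it."""
    selected: List[str] = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining:
        # Evaluate every one-feature extension of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials)
        if top_score <= best_score:
            break  # no extension helps: stop searching
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Toy stand-in for "cross-validated AUC of a model trained on this subset".
# The gains and the per-feature penalty are invented numbers, not results.
TOY_GAIN = {"a": 0.30, "b": 0.20, "c": 0.05}


def toy_score(subset: List[str]) -> float:
    return 0.5 + sum(TOY_GAIN[f] for f in subset) - 0.06 * len(subset)


chosen, chosen_score = greedy_forward_selection(["a", "b", "c"], toy_score)
```

With these toy numbers the search keeps `a` and `b` and rejects `c`, whose small gain does not offset the cost of an extra feature. Because such a search explores only a small part of the space of subsets, starting from a larger pool (for example, hybrid instead of face-only features) does not guarantee a better final subset, which is consistent with the behaviour reported for the Female-30 and Male-30 groups.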

Statistical Analysis of Meaningful Features.
For the statistical analysis of the features obtained from feature selection in the Hybrid-FS method, we carried out a one-way ANOVA test, and for the post-hoc analysis, we performed Scheffé's multiple comparison test. The ANOVA test indicated that there was a significant difference among constitutional types. All features showing P values of <0.05 are shown in Table 4.
In this statistical analysis, we did not find features that were broadly applicable for predicting constitutional types across the age- and gender-specific classifications. However, we identified several features with an obvious propensity for classification of constitutional type.
In the analysis of weight and facial features using results from Scheffé's multiple comparison test, there was a statistically significant difference in weight among constitutional types. Weights between TE and the other types were significantly different in the Female-20 group (F = 19.85, P < 0.0001) and in the Female-50 group (F = 25.64, P < 0.0001). Weights among TE, SE, and SY were significantly different in the Male-30 group (F = 26.298, P < 0.0001). There was a significant difference in FD43 143 between TE and the other types in the Female-30 group (F = 23.558, P < 0.0001) and between TE and SE in the Female-50 group (F = 3.384, P = 0.0374). FArea03 between TE and the other types was significantly different in the Male-50 group (F = 14.231, P < 0.0001).
For speech features, there was a significant difference in SITS between TE and SE in the Male-50 group (F = 5.088, P = 0.0088). The uF2 value between SY and the other types was significantly different in the Female-20 group (F = 4.975, P = 0.0100), and aRF2 F1 between TE and SY was significantly different in the Female-20 group (F = 3.772, P = 0.0286).
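The F values reported in this section come from one-way ANOVA, which compares the variation between group means with the variation within groups. The sketch below shows how such an F statistic is computed; it is not the SPSS procedure used in the study, the function name is hypothetical, and the three toy groups are made-up numbers, not measurements from the subjects. Converting F to a P value requires the F distribution, and Scheffé's post-hoc comparison is a further step; both are outside this sketch.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up values for three groups (for example, one feature in TE, SE, SY).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_b, df_w = one_way_anova_f(toy_groups)
```

For the toy groups, the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, giving F = 13. A large F indicates that group means differ by more than within-group scatter would explain.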

Limitations and Future Work.
Constitution diagnosis is very difficult in this area, because constitution type decisions are dependent on qualitative judgments of doctors and inspectors [22]. We think that it is possible to develop an accurate diagnosis method or standardization for constitution diagnosis after collecting more diagnosis information by doctors or inspectors.
Our experimental and statistical analyses identified important and useful features for better diagnosis in each age and gender group. Because diagnosis performance and the selected features differ according to age and gender, constitution diagnosis is difficult in real-world settings. Features showing an obvious propensity for constitution diagnosis are not yet sufficient, and the available data sets are too small to achieve more accurate diagnoses. Accordingly, more research on constitution diagnosis is needed.
In future work, we will investigate the relationship between constitution and the effectiveness of medications and explore the role of constitutional types in certain diseases. We think that this is very important research in clinical medicine, because the efficiency of prescribed medicine can be improved if we know the patient's constitution. For instance, Jeong et al. [9,10] investigated changes in cytokine production in the acute stage of SY constitution CI (cerebral infarction) patients after oral administration of Yangkyuk-Sanhwa-Tang water, and revealed that Yangkyuk-Sanhwa-Tang had a good effect on anti-inflammatory cytokines and a good CI treatment effect. The results of these studies may help to improve the effectiveness of prescribed medications in Sasang constitutional types.

Conclusions
This study describes a novel prediction method for constitution diagnosis as an essential prerequisite for personalized medicine or alternative medicine. We demonstrated the possibility and usefulness of constitution diagnosis using the combination of face and speech feature sets in age-and gender-specific groups, identified a compact and discriminative feature subset, and included supporting statistical analysis of significant features. Our results could be used for developing an automatic constitution diagnostic tool for improving the effectiveness of prescribed medications and could be used in the fields of speech and face recognition.