Application of Metabolomics in Traditional Chinese Medicine Differentiation of Deficiency and Excess Syndromes in Patients with Diabetes Mellitus

Metabolic profiling is widely used as a probe in diagnosing diseases. In this study, the metabolic profiling of urinary carbohydrates was investigated using gas chromatography/mass spectrometry (GC/MS) and multivariate statistical analysis. A kernel-based orthogonal projections to latent structures (K-OPLS) model was established and validated to distinguish between subjects with and without diabetes mellitus (DM). The model was combined with subwindow permutation analysis (SPA) to extract novel biomarker information. Furthermore, the K-OPLS model visually represented the alterations in urinary carbohydrate profiles of excess and deficiency syndromes in patients with diabetes. The combination of GC/MS and K-OPLS/SPA analysis allowed the urinary carbohydrate metabolic characterization of DM patients with different traditional Chinese medicine (TCM) syndromes, including biomarkers that differ from those of non-DM subjects. The method presented in this study might complement, or offer an alternative approach to, TCM syndrome research.


Introduction
Diabetes mellitus (DM) is a complex metabolic disorder characterized by chronic hyperglycemia, hypoinsulinemia, and ketosis. In 2000, around 171 million people were affected by DM. By 2030, this number is estimated to increase to 366 million [1]. Current statistics show that over 10% of the world's aged population (60 years and above) suffers from this disease, and 90% of these patients have type 2 diabetes mellitus (T2DM) [2]. Diabetes is associated with high morbidity and mortality rates due to chronic microvascular complications (e.g., retinopathy, nephropathy, or neuropathy) and macrovascular complications (e.g., ischemic cardiac problems, cerebral vascular accidents, and peripheral vascular disorders) [3].
In ancient China, DM was recognized as xiaokezheng, a disease with symptomatic polydipsia. Traditional Chinese medicine (TCM) has a long history of treatments for xiaokezheng [4]. According to TCM theory, Yin (things associated with the physical form of an object), Yang (things associated with energetic qualities), Qi (life force that animates the forms of the world), and Xue (dense form of body fluids that have been acted upon and energized by Qi) [5] are in an unbalanced state when people are suffering from a disease. Similarly, patients with DM could be classified as having deficiency syndrome or excess syndrome, which refers to the organs' insufficiency or excess in Qi, Xue, Yin, and Yang.
Metabolic profiling is defined as the quantitative measurement of the dynamic multiparametric response of a living system to pathophysiological stimuli or genetic modification [6]. The objective of metabolomics is to gain new insight into the pathophysiology of a disease and identify individual metabolites or profiles of metabolites as potential biomarkers that can distinguish between normal and 2 Evidence-Based Complementary and Alternative Medicine pathological states [7]. Metabolomics has been used in the diagnosis and evaluation of diabetic patients [8] because of its effectiveness in evaluating systemic responses to any subtle metabolic perturbation. In addition, it has also been used in the identification of potential biomarkers [9].
Recent animal and human metabolomic studies have investigated the metabolic effects of oral glucose challenge [10][11][12], insulin resistance [13][14][15][16][17][18], and type 1 [19,20] or type 2 DM [20][21][22][23][24][25][26][27][28]. Previous studies investigated the metabolic profiling of plasma phospholipids in T2DM using liquid chromatography/mass spectrometry (LC/MS) coupled with multivariate statistical analysis [29]. Methods based on plasma fatty acid profiles analyzed via GC/MS were also developed to investigate the differences between T2DM patients and healthy volunteers [30]. A multianalytical platform method using GC/MS and ultra performance liquid chromatography-mass spectrometry (UPLC/MS) was developed to obtain the global metabolite profiles of DM in rat models [31]. An imbalance between carbohydrate and lipid metabolisms is involved in the etiology and pathophysiology of diabetes. Therefore, a metabolic analysis is necessary to visualize the alteration of globally circulating metabolites in a person suffering from diabetes. In the present study, metabolic profiling of urinary carbohydrates was performed using GC/MS in subjects with and without DM.
Partial least squares linear discriminant analysis (PLS-LDA) is currently the common method for supervised linear modeling in the field of metabolomics. However, the relationship between the disease and metabolic data displays nonlinear characteristics in some cases. Therefore, nonlinear modeling has been applied in metabolomics [32,33]. Recently, the "kernel trick" has proven efficient in dealing with nonlinear problems. Kernel-based orthogonal projections to latent structures (K-OPLS) [34,35] can considerably improve the predictive performance in situations where a strong nonlinear relationship exists. Model population analysis (MPA) was developed based on the idea of statistically analyzing the outputs of a "population" of models derived by Monte Carlo sampling (MCS). MPA-based methods are expected to provide comprehensive insights into the data because they allow the statistical analysis of outputs of interest across many models. One typical MPA-based method identifies important variables by examining the distribution of prediction errors of all the submodels [36]. Subwindow permutation analysis (SPA) was used in the present study to reveal informative metabolites by incorporating the Monte Carlo technique and strictly implementing the idea of MPA [37,38].
Several diabetes-related studies have been reported in recent years. However, the metabolic profiles involved in the pathological processes of diabetes are yet to be addressed. Thus, the identification of biomarkers is needed for the adequate screening and diagnosis of diabetes. Syndrome differentiation is an important element in TCM theories and is the basis for the treatment of all diseases, including DM. Therefore, the TCM syndromes of patients with DM need to be characterized. However, previous studies have not revealed the differences among the urinary carbohydrate metabolites in the TCM syndromes of these patients.
In the present work, we conducted a comparative analysis of 366 subjects using GC/MS combined with K-OPLS/SPA analysis to (1) compare the urinary carbohydrate profiles of subjects with and without DM, (2) compare the relationship between urine carbohydrate levels and TCM syndromes in subjects with DM, and (3) determine the characteristics and differences in TCM syndrome distribution between excess and deficiency syndromes.

Chemicals.
Carbohydrate standards (C4 sugar 1, inositol C, talose, mannose, inositol D, glucose, inositol A, arabinose, xylose, and C4 sugar 2) were purchased from Sigma (St. Louis, MO, USA). Acetonitrile (HPLC grade), methanol (HPLC grade), and methylimidazole were purchased from Fisher/Aldrich (NJ, USA). Sodium borohydride (NaBH4), dimethyl sulfoxide, trifluoroacetic acid, acetic acid, acetic anhydride, and chloroform (analytical grade) were purchased from Sinopharm Chemical Reagent Co. Ltd. (Shanghai, China). Water was obtained from a Milli-Q ultra-pure water system (Millipore, Billerica, USA).
Patients were required to abstain from greasy and sweet food before the study to avoid interference with metabolism. The study protocol was approved by the Ethics Committee of the Hospital, and written informed consent was obtained from each respondent. Each blood sample, collected in a fasting condition, was immediately centrifuged at 3000 × g for 10 min, and the plasma was transferred into a clean tube. All urine samples (collected in a fasting condition) and plasma samples were stored at −80 °C until analysis.

Inclusion and Syndrome Differentiation Criteria.
Based on the criteria formulated by the World Health Organization in 1999, DM is characterized by a fasting plasma glucose (FPG) of ≥7.0 mmol/L, a postload plasma glucose (2h PG) of ≥11.1 mmol/L, or a history of oral hypoglycemic or insulin use, or both [39]. TCM syndromes, including deficiency and excess syndromes, were differentiated according to the guidelines [40]. The information gathered from inspection, auscultation, and inquiry was obtained on the day of admission. Manifestations and other diagnostic information were determined independently by three experienced physicians to ensure an objective evaluation. If the three physicians were in agreement, the subject was included in the study. Otherwise, he/she was excluded.
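The inclusion rules above can be sketched as two small checks. This is an illustrative sketch only: the function and parameter names are hypothetical, and the thresholds are simply those quoted in the text.

```python
def meets_dm_criteria(fpg_mmol_l=None, pg_2h_mmol_l=None, on_hypoglycemic_or_insulin=False):
    """Return True if any of the criteria quoted in the text is met.

    Thresholds (FPG >= 7.0 mmol/L, 2h PG >= 11.1 mmol/L) are taken from the text;
    the argument names are hypothetical.
    """
    if fpg_mmol_l is not None and fpg_mmol_l >= 7.0:
        return True
    if pg_2h_mmol_l is not None and pg_2h_mmol_l >= 11.1:
        return True
    return bool(on_hypoglycemic_or_insulin)


def include_by_syndrome_agreement(physician_labels):
    """Include a subject only if all three independent physicians agree."""
    return len(physician_labels) == 3 and len(set(physician_labels)) == 1
```

For example, `include_by_syndrome_agreement(["deficiency", "deficiency", "excess"])` is `False`, so that subject would be excluded.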

Exclusion Criteria.
Patients suffering from other serious diseases involving major organs or infective diseases were excluded from the study. Moreover, those who could not or were not willing to complete the study or those who had psychiatric disorders or intellectual dysfunctions were also excluded.

Clinical and Laboratory Assessment.
Clinical data including date of birth, height, weight, body mass index (BMI), waist and hip circumference, systolic blood pressure (SBP), and diastolic blood pressure (DBP) were determined by a senior physician. Obesity is characterized by a BMI of ≥25.0 kg/m² according to the Asian guidelines [41]. Serum levels of alanine aminotransferase (ALT), FPG, glycated hemoglobin (HbA1c), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), very low-density lipoprotein cholesterol (VLDL-C) in fasting condition, and 2h PG were measured using an automatic biochemical analyzer (Hitachi 7180, Tokyo, Japan).

Sample Preparation of Urine for GC/MS.
A 200 μL sample of urine from each group was blended with 20 μL of ammonia and 1 μL of 0.5 mol/L NaBH4/dimethyl sulfoxide (DMSO). Acetic acid (100 μL) was added dropwise to reduce the abundance of NaBH4 after the reduction reaction (120 min at 40 °C). Acetylation (10 min at 40 °C) was performed after adding 200 μL of 1-methylimidazole and 1 mL of acetic anhydride. Subsequently, 2 mL of water was mixed with the extracts for 10 min at 40 °C, and the mixtures were extracted with 2 mL of chloroform. The samples were centrifuged (4000 × g for 10 min), and the supernatant was discarded. The samples were washed with 5 mL of water to remove the chloroform layer. Sodium sulfate (1 g) was added to the remaining layer, which was then taken for GC/MS. Allose (20 μL) was added to each 200 μL sample as an internal standard.

GC/MS Conditions.
GC/MS was performed using a Finnigan gas chromatograph (ThermoFinnigan, USA) coupled with a mass spectrometer (TRACE DSQ). A TR-5ms capillary column (60 m × 0.25 mm × 0.25 μm, Thermo) was used in the gas chromatographic system. The inlet temperature was maintained at 250 °C. Column temperature was increased from an initial 140 °C to 198 °C (2 °C per min for 4 min). It was then programmed from 198 °C to 214 °C (4 °C per min), 214 °C to 217 °C (1 °C per min for 4 min), and 217 °C to 250 °C (3 °C per min for 5 min). Helium was used as a carrier gas at a flow rate of 1.0 mL/min. Aliquots of 1 μL were injected into the GC/MS. The mass spectrometer was operated in electron impact and full-scan monitoring modes (m/z 40-450) with a scan velocity of 0.2 s/scan. Source temperature, electron energy, and solvent delay were set at 250 °C, 70 eV, and 10 min, respectively.

Data Analysis and Software.
All data were processed by the Xcalibur software (ThermoFinnigan, USA), and the detected peaks were aligned using manual integration. The ion peak area for each detected peak was normalized by NIST 05 Standard mass spectral databases in the NIST MS Search 2.0 (NIST, Gaithersburg, MD, USA) software. Semiquantitative concentrations of urinary monosaccharides were obtained through the ratio of the peak area to the standard. The K-OPLS package (available at http://kopls.sourceforge.net/download.shtml) and the Statistics Toolbox of MATLAB (version 7.1, MathWorks Inc.) were used in the statistical treatment of the data and application of various multivariate methods. Parts of the source code used to implement SPA in MATLAB are freely available at http://code.google.com/p/spa2010/downloads/list.
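The semiquantitative step can be illustrated with a minimal sketch. It assumes that "the standard" refers to the internal standard (allose) added during sample preparation; the function name and the peak areas in the usage note are hypothetical.

```python
def semiquantitative_levels(peak_areas, internal_standard_area):
    """Ratio of each metabolite peak area to the internal-standard peak area.

    Assumption: the reference peak is the allose internal standard.
    """
    if internal_standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {name: area / internal_standard_area for name, area in peak_areas.items()}
```

With hypothetical areas, `semiquantitative_levels({"glucose": 500.0, "xylose": 125.0}, 250.0)` gives a relative level of 2.0 for glucose and 0.5 for xylose.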
Data are shown as mean ± standard deviations (SD). In addition, significance was expressed through independent t-tests for continuous variables and Pearson Chi-square tests for categorical variables using the SPSS 17.0 software (SPSS, Chicago, Ill, USA). Fisher's exact tests were calculated when the expected frequencies were less than 5 in any cell. A P value of <0.05 was considered to indicate statistical significance.
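The rule for choosing between the Chi-square and Fisher's exact tests can be sketched for a 2 × 2 table as follows. This is a standard-library illustration, not the SPSS procedure used in the study; the function names are hypothetical, and the two-sided Fisher P value uses the common convention of summing all tables no more probable than the observed one.

```python
from math import comb


def expected_counts(table):
    """Expected cell frequencies of a 2x2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * col / n for col in cols] for r in rows]


def needs_fisher(table, threshold=5.0):
    """True if any expected frequency is below the threshold (5 in the text)."""
    return any(e < threshold for row in expected_counts(table) for e in row)


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact P value for a 2x2 table (hypergeometric sum)."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

A table would be passed to `fisher_exact_two_sided` only when `needs_fisher` returns `True`; otherwise the Pearson Chi-square test applies.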

K-OPLS Models for Classification.
Based on our previous work [42] and related literature [34,43], the K-OPLS model was employed in the present study to build a classifier, with σ as the parameter for the Gaussian kernel function. The kernel matrix K was centered prior to model estimation. The K-OPLS algorithm modeled the kernel matrix K through a set of predictive and Y-orthogonal components. Thus, the predictive score matrix and the Y-orthogonal score vector were estimated. After the estimation step of each Y-orthogonal component, K was deflated using the Y-orthogonal variation, followed by a subsequent updating of the predictive score matrix and further estimation of Y-orthogonal components. The kernel function parameter (σ) and the number of Y-orthogonal components (Ao) of the K-OPLS model were optimized using 10-fold cross-validation. All the samples were randomly partitioned into 10 equally sized folds according to their categories. Subsequently, 10 iterations of calibration and validation were performed. In each iteration, one fold of the data was held out for validation, whereas the remaining nine folds were used for calibration. Details on the model are provided in the previous work.
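Two of the preparatory steps described above, building and centering the Gaussian kernel matrix and partitioning samples into category-balanced folds, can be sketched in plain Python. This is not the K-OPLS package itself: the function names are hypothetical, and the kernel form exp(−‖x − y‖² / (2σ²)) is an assumed parametrization. The component estimation and deflation steps are not shown.

```python
import math
import random


def gaussian_kernel_matrix(X, sigma):
    """K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)); assumed kernel form."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = K[j][i] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return K


def center_kernel(K):
    """Center a symmetric training kernel matrix in feature space."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds, dealing each class out round-robin."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds
```

In each of the 10 iterations, one fold would serve as the validation set and the other nine as the calibration set, repeated for every candidate pair of σ and Ao.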

Revealing Informative Metabolites through Statistical Assessment of Variable Importance.
Previous studies [37,44] indicated that the SPA method used for uncovering informative metabolites is constructed based on the prediction error distribution of the K-OPLS models, which are based on the subdatasets obtained through Monte Carlo sampling in both sample and variable space.
In the equation $\mathrm{DMEAN}_j = \mathrm{MEAN}_{j,B} - \mathrm{MEAN}_{j,A}$, $\mathrm{MEAN}_{j,A}$ and $\mathrm{MEAN}_{j,B}$ denote the mean prediction errors of the normal K-OPLS models and of the K-OPLS models in which the jth metabolite is permuted, respectively. If $\mathrm{DMEAN}_j > 0$, the inclusion of the jth metabolite in the K-OPLS model may improve the predictive performance. This type of metabolite is deemed a candidate informative metabolite in the present study. By contrast, if $\mathrm{DMEAN}_j < 0$, the inclusion of this metabolite in a model will most probably reduce the predictive performance. Therefore, this type of metabolite is considered uninformative/interfering.
With these preparations, the informative metabolites were identified in the following successive steps. (1) All the metabolites with $\mathrm{DMEAN}_j < 0$ were removed. (2) The Mann-Whitney U test was applied to the remaining metabolites to check the significance of the difference between the two error distributions. (3) The metabolites were ranked using the P value. The metabolites with P values smaller than the predefined threshold (e.g., 0.01) were considered informative metabolites, whereas those with P values larger than the threshold were considered uninformative metabolites. The P values calculated in this manner are conditional on all other metabolites because both normal prediction errors and permuted prediction errors depend on all other metabolites included in the subwindows [37,44]. To assign higher scores to more important metabolites, a so-called Conditional Synergetic Score (COSS) is defined as the negative base-10 logarithm of the P value: $\mathrm{COSS}_j = -\log_{10}(P_j)$. Clearly, the more significant a metabolite is, the higher the score it will get. In particular, a metabolite with P < 0.01 will have a COSS > 2. Thus, the informative metabolites revealed via SPA may be considered the most probable biomarker candidates.
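The three steps and the COSS definition can be sketched as follows. This is a standard-library illustration rather than the authors' MATLAB code: the function names are hypothetical, and the Mann-Whitney P value uses the large-sample normal approximation with tie correction and no continuity correction, which is an assumption.

```python
import math


def dmean(normal_errors, permuted_errors):
    """DMEAN_j: mean permuted prediction error minus mean normal prediction error."""
    return (sum(permuted_errors) / len(permuted_errors)
            - sum(normal_errors) / len(normal_errors))


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U test P value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for a tied group
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2.0))


def coss(p_value):
    """Conditional Synergetic Score: minus log10 of the P value."""
    return -math.log10(p_value)


def is_informative(normal_errors, permuted_errors, alpha=0.01):
    """Step 1: require DMEAN > 0. Steps 2-3: require Mann-Whitney P below alpha."""
    if dmean(normal_errors, permuted_errors) <= 0:
        return False
    return mann_whitney_p(normal_errors, permuted_errors) < alpha
```

Here `normal_errors` and `permuted_errors` stand for the two prediction-error distributions collected for one metabolite over all Monte Carlo submodels.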

Clinical Characteristics of Excess and Deficiency Syndromes in Patients with DM.
Clinical characteristics of the 366 subjects are summarized in Table 1. Among the 366 subjects, 308 (84.1%) were diagnosed with DM, 67 (21.8%) of whom had excess syndrome. In the DM group, patients with deficiency syndrome were significantly older than those with excess syndrome (P < 0.01). No other statistically significant differences were found between the two syndromes. The systolic blood pressure, serum fasting and post-load glucose levels, and glycated hemoglobin were significantly higher in subjects with DM than in those without DM (P < 0.001). However, the opposite was found for concomitant hyperlipidemia (P < 0.001).

GC/MS Profiles of Urine Samples.
Based on the previously developed method and related literature [45], the GC/MS parameters were optimized for the Thermo GC/MS system used in the present study. This system allowed the detection of several peaks from the GC/MS chromatogram within a 50 min analysis cycle. The typical total-ion chromatograms from the GC/MS of urine samples from DM patients are shown in Figure 1. Ten urinary carbohydrate metabolites were identified in patients with and without DM using standards, and their peak areas were integrated for further multivariate analysis.

Classification of the K-OPLS Models.
All the samples were used to build models. In the present study, K-OPLS was performed using the Gaussian kernel function. σ and Ao were optimized using 10-fold cross-validation. The accuracy of classification of cross-validation (ACCV) was calculated for each combination of σ and Ao. These parameters were optimized by generating models with σ values of 0.1 to 10 and Ao values of 1 to 10. Figure 2 shows the results after cross-validation. ACCV was the largest at σ = 0.5 and Ao = 1 both for the DM versus non-DM comparison and for the excess versus deficiency syndrome comparison. These optimal parameters were selected to build the models for the two comparisons (Figures 2(a) and 2(b)).
Tenfold cross-validation was applied to evaluate the predictive abilities of the constructed K-OPLS models. The primary data were divided into 10 sets. In each round, one set was the "test set" and the others were the "training sets"; the calculation was repeated 10 times to obtain the components. Table 2 shows the Q2Y, R2Y, and R2X values used in evaluating the calibration models of the two comparisons. R2X and R2Y are defined as the explained variation of the input (metabolic data) and output variables (disease category data), respectively. Q2Y denotes the prediction statistic over cross-validation for the classification task [46]. Values of these parameters approaching 1.0 indicate a stable model with predictive reliability [47]. High values of R2Y and Q2Y represent good prediction [48]. As displayed by the score plots of K-OPLS (Figure 3(a)), the DM and non-DM urine samples were separated into distinct clusters, indicating changes in the metabolic response. The samples in the excess and deficiency groups were also clearly separated (Figure 3(b)). The R2X, R2Y, and Q2Y of the former model were 0.591, 1, and 0.853, respectively, whereas those of the latter model were 0.543, 1, and 0.783, respectively (Table 2). These results indicated that the models had a good ability to explain and predict the variations in the X and Y matrices.
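For orientation, the fit and prediction statistics are conventionally written as below. These are the usual textbook forms and are given here as an assumption; the exact definitions used in the study are those of [46]. Here $y_i$ is the class value of sample $i$, $\bar{y}$ the mean, $\hat{y}_i$ the fitted value from the calibration model, and $\hat{y}_{(i)}$ the value predicted when sample $i$ is in the held-out fold.

```latex
R^2Y = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2},
\qquad
Q^2Y = 1 - \frac{\sum_i \left(y_i - \hat{y}_{(i)}\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

Under these definitions, R2Y = 1 means the calibration model reproduces the class labels exactly, whereas Q2Y below 1 reflects the error made on held-out samples.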

Differential Metabolites from SPA Based on the K-OPLS Models.
For these data, the number of Monte Carlo simulations (N), the ratio of calibration samples to total samples (R), and the number of variables sampled in each Monte Carlo simulation (Q) of SPA were set to 1000, 0.8, and 8, respectively. Each metabolite was first standardized to zero mean and unit variance before further analysis. With this setup, SPA was applied to the data, and the P value of each metabolite was computed through the Mann-Whitney U test (Figures 4(a) and 4(b)). The corresponding COSS for each metabolite is shown in Figures 4(c) and 4(d).
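The sampling setup can be sketched as follows, using the settings quoted above (N = 1000, R = 0.8, Q = 8). The function names are hypothetical, the use of the sample (n − 1) standard deviation for standardization is an assumption, and the K-OPLS fitting and permutation performed on each subwindow are not shown.

```python
import random
import statistics


def standardize(values):
    """Scale one metabolite to zero mean and unit variance (sample SD assumed)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def spa_subwindows(n_samples, n_variables, n_iter=1000, ratio=0.8, q=8, seed=0):
    """Yield (calibration indices, validation indices, variable indices) per draw."""
    rng = random.Random(seed)
    n_cal = int(round(ratio * n_samples))
    for _ in range(n_iter):
        calibration = sorted(rng.sample(range(n_samples), n_cal))
        chosen = set(calibration)
        validation = [i for i in range(n_samples) if i not in chosen]
        variables = sorted(rng.sample(range(n_variables), q))
        yield calibration, validation, variables
```

Each draw defines one submodel: it is fitted on the calibration samples using only the sampled variables, and its prediction error on the validation samples is recorded with and without permuting the metabolite of interest.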
The two plots for the DM and non-DM data indicate that six metabolites, namely C4 sugar 1, inositol C, mannose, inositol D, glucose, and C4 sugar 2, had small P values (smaller than 0.01) and COSS > 2. These six metabolites may be informative metabolites or biomarkers and should be included in further analysis. The remaining four metabolites had high P values and COSS < 2. The six significant metabolites were selected as the best metabolite pattern, which collectively showed high predictive ability for the clinical outcome. After combination with the t-test results (P < 0.05), four metabolites remained: C4 sugar 1, inositol D, glucose, and C4 sugar 2. Similarly, the variables C4 sugar 1, C4 sugar 2, inositol C, talose, and xylose were found to have P < 0.01 and COSS > 2 in the excess and deficiency group data. However, based on the t-test results, only xylose and C4 sugar 2 differed significantly between the two groups.

Discussion
TCM is a medical system with at least 3000 years of uninterrupted clinical practice. It has the advantage of collecting macroscopic information of a patient for diagnosis, with syndrome as the core of diagnosis and therapy in TCM [49]. Nowadays, the diagnosis of syndromes in TCM mainly relies on four examinations (inspection, listening and smelling examinations, inquiry, and palpation). Outcomes of TCM diagnoses may lack consistency among TCM doctors [50,51]. Thus, the accuracy is relatively low. The use of objective indices in syndrome diagnosis in TCM may significantly improve accuracy. Until now, syndromes in TCM have always been studied in a specific disease or biomedical condition. In addition, several studies have demonstrated that syndromes are significantly associated with diseases [49,52,53]. However, the biological basis of a syndrome in the context of a disease is rarely studied. This issue is critical because addressing it not only establishes a diagnostic avenue at a microcosmic level but also divides the disease into several subtypes and provides a basis for individualized therapy. The establishment of a diagnostic method at the microcosmic level is an urgent and major problem in TCM [54].
DM is characterized by two major defects: a dysregulation in pancreatic hormone secretion and a decrease in insulin action on target tissues (insulin resistance). These abnormalities are related to several defects in insulin-signaling mechanisms and several steps in regulating glucose metabolism (transport and key enzymes of glycogen synthesis or mitochondrial oxidation) [55]. The development of strategies to diagnose, prevent, or delay the progression of DM has gained increasing interest because of its high morbidity and mortality rates. TCM has played an important role in lowering blood glucose and controlling the development of DM. Many studies have shown that TCM herbs, such as Radix Astragali, Radix Rehmanniae, and Radix Trichosanthis, have hypoglycemic effects [56]. Thus, the present study was designed to determine whether metabolomics is useful and powerful enough to differentiate between the deficiency and excess syndromes of TCM using DM as a model.
The systolic blood pressure, serum concentrations of fasting and post-load glucose, and glycated hemoglobin were significantly higher in subjects with DM than in those without DM. This result is in accordance with the characteristics of diabetes. By contrast, no clear differences were found between the two groups in age, sex, waist circumference, hip circumference, waist-to-hip ratio (WHR), diastolic blood pressure, TG, ALT, VLDL, or HDL levels. This result indicates that the two subject groups had comparable backgrounds, except for the incidence of concomitant hyperlipidemia.
The patients with deficiency syndrome were older than those with excess syndrome. This finding is in agreement with the TCM theory that Qi, Xue, Yin, and Yang are more insufficient in older than in younger people. However, no other differences, including in biochemical values, were found between the two groups. This result implies that the TCM syndromes are difficult to differentiate using clinical biochemical indicators. Therefore, TCM syndromes should be distinguished using other methods.
Considering the intrinsic relationship between TCM theory and systems biology, some researchers began to discuss the prospective application of metabolomics to TCM theory. Metabolic profiling has been recently exploited in the pathophysiological studies of diseases [57][58][59][60]. However, only a few reports concerning the metabolomics approach in TCM research have been found in the current literature [61,62]. In the present study, a GC/MS-based metabolomic approach was used for determining the biochemical profiles of different TCM syndrome types in DM. Moreover, the method was also used in testing whether the metabolomics approach is powerful enough to differentiate TCM syndrome types.
With the development of metabolomics, data-mining techniques have become increasingly mature. Their advantages are well suited to studying the complex correlations between TCM syndromes and metabolites. However, the relationship between disease and metabolic data displayed nonlinear characteristics in the present study. Consequently, the linear PLS-LDA and OPLS-DA methods did not yield good models (e.g., R2X < 0.3 or Q2Y < 0.1). The nonlinear classification model K-OPLS showed stronger classification ability than the PLS-LDA and OPLS-DA linear classifiers.
In the present study, we first observed that the comprehensive differences in metabolic intermediates between subjects with and without DM were mainly in those involved in glucose metabolism. The study identified ten carbohydrates: C4 sugar 1, inositol C, talose, mannose, inositol D, glucose, inositol A, arabinose, xylose, and C4 sugar 2. Based on the results of K-OPLS/SPA, six possible markers with P < 0.01 and COSS > 2 were found for the DM versus non-DM comparison and five for the excess versus deficiency comparison. The t-test was also used to compute the P value for each metabolite. The t-test results did not fully agree with those of SPA: two or three of the SPA-selected metabolites showed no significant difference between groups based on the t-test (P > 0.05), suggesting that the conditional P value calculated via SPA was more informative. The main reason may be that the variable importance computed using SPA can reflect the synergetic effect to some extent [44]. That is, a metabolite may not act alone in a disease status but may interact with other metabolites.
Consequently, four intermediates including inositol D, C4 sugar 2, glucose, and C4 sugar 1 produced during glycolysis were elevated in the DM group samples. The high prediction performance of the four metabolites indicates that they are possible biomarker candidates for DM. Furthermore, two potential biomarkers, xylose and C4 sugar 2, were discovered in the two syndromes using K-OPLS/SPA and the t-test. These potential biomarkers can be identified by the MS database and corresponding standards.
Metabolites are endogenous and exogenous molecules that play a role in cellular regulatory and biological systems. Glucose is the major source of energy production and macromolecule biosynthesis in maintaining the normal state of the body. Highly active glycolysis and an impaired Krebs cycle guarantee enough metabolic intermediates by avoiding the complete oxidation of glucose. This phenomenon is essential for the synthesis of macromolecules, such as lipids, proteins, and nucleic acids, during cell division [63][64][65]. Circulating glucose is filtered by the glomerulus and reabsorbed by the renal tubules. Therefore, healthy human urine should not contain any sugar. Hyperglycemia, other metabolic disorders, and chronic complications due to an absolute lack of insulin and/or a reduction of the biological effects of insulin may cause the appearance of corresponding sugars in urinary metabolites. For example, 4-carbon sugars are intermediate products of glucose metabolism. Inositol, a water-soluble vitamin, can play insulin-like roles on a metabolic enzyme. Mannose is a sugar monomer of the aldohexose series of carbohydrates and a C-2 epimer of glucose. It cannot be metabolized well in vivo. Hence, 90% of mannose will be discharged through the urine within 30 min to 60 min, and 99% of mannose in residual urine will be excreted in the next 8 h. Arabinose is a monosaccharide containing five carbon atoms and is decomposed into glucose and fructose by intestinal sucrose. Sucrose is involved in amino and nucleotide sugar metabolisms. Xylose is the connection unit between the sugar chain and serine or threonine as a combined form in vivo. The significance of talose, also called hydrolysis of lactose, is unknown so far. The presence of the above components in the urine of DM patients indicates significant glucose metabolism disorders in diabetes.
Metabolic profiling can sensitively reflect physiological and pathological changes. Moreover, it can help elucidate the "syndrome" concept of TCM in complex physiological systems. Using all metabolites in the evaluation of the human health status is more accurate and comprehensive than using a single index [66,67]. The present study indicated that xylose and C4 sugar 2 were higher in the excess than in the deficiency group. Therefore, the holistic application of metabolic profiling in studying the essence of TCM syndromes is reasonable. In summary, these potential biomarkers reflected the dysregulation of glucose metabolism in diabetic individuals, which might help in DM diagnosis and TCM syndrome differentiation.

Conclusions
This research supports the view that metabolic profiling analysis combined with K-OPLS and SPA is a powerful tool for revealing metabolic differences between groups, obtaining valuable information to probe molecular mechanisms, and exploring the scientific basis of TCM theory. Larger randomized trials with an appropriate methodology, including the study of diabetic patients with different TCM syndromes, are required to confirm the results of the present study.

Authors' Contribution
T. Wu and M. Yang contributed equally to this paper.