Study on Closely Related Citrus CMMs based on Chemometrics and Prediction of Components-Targets-Diseases Network by Ingenuity Pathway Analysis

As representatives of closely related Chinese medicinal materials (CMMs) originating from the Rutaceae family, Aurantii fructus immaturus (AFI), Aurantii fructus (AF), Citri reticulatae pericarpium viride (CRPV), and Citri reticulatae pericarpium (CRP) are used to regulate qi and promote gastrointestinal motility. However, the differences in quality among closely related Citrus CMMs have not yet been revealed. Herein, this study focused on the systematic differentiation and in-depth understanding of closely related Citrus CMMs by a strategy integrating chemometrics and network pharmacology. Determined by ultra-performance liquid chromatography, the contents of nine flavonoids fluctuated markedly among the decoction pieces from different species (Citrus aurantium Linnaeus and Citrus reticulata Blanco), with lower levels in the samples of ripe fruits. Decoction pieces from the different species and ripening stages were well distinguished by orthogonal projections to latent structures-discriminant analysis (OPLS-DA) and cluster analysis. As a result, four active components, namely narirutin, naringenin, hesperidin, and 3,5,6,7,8,3′,4′-heptamethoxyflavone, were filtered out by the variable importance for the projection (VIP) value (VIP > 1.0) and regarded as chemotaxonomic markers. Furthermore, a components-targets-diseases network was constructed via ingenuity pathway analysis (IPA), and correlations were predicted between the four chemotaxonomic markers, 223 targets, and three diseases: colitis, breast cancer, and colorectal cancer. The results will be of great significance for identifying closely related Citrus CMMs and will help improve the resource utilization of CMMs.


Introduction
Citrus fruits, known as medicinal and edible homologous plants, have played a vital role in traditional Chinese medicine (TCM) practices and have been extensively applied for thousands of years owing to their notable biological activities, abundant resources, and low toxicity [1]. The Chinese Pharmacopoeia stipulates that the fruit of Citrus aurantium Linnaeus or its cultivars harvested in May and June should be used as Aurantii fructus immaturus (AFI), and that harvested in July as Aurantii fructus (AF); in addition, the pericarp of Citrus reticulata Blanco or its cultivars harvested from August to December should be treated as Citri reticulatae pericarpium (CRP), and that harvested in July as Citri reticulatae pericarpium viride (CRPV) [2]. Owing to their specific therapeutic effects, the four decoction pieces are used for different purposes. AFI is often used to eliminate phlegm and dissipate stagnant qi, while AF relieves gastrointestinal indigestion in a gentle yet effective manner [3,4]. CRPV is mostly used to promote the flow of liver qi, while CRP is commonly used to strengthen the spleen and dispel phlegm [5,6]. However, the different chemical features of the four decoction pieces have not been systematically reported, which hinders the understanding of their different clinical applications.
Understanding the correlation between components and their efficacy contributes to the rational clinical application of CMM. Network pharmacology is an ideal method that uses bioinformatics to predict the action mechanism of herbal ingredients through strategies of multicomponents and multitargets [7]. It is usually applied to clarify the pharmacodynamic basis and mechanisms of CMM, such as Ginseng [8], Licorice [9], and Huangqi [10]. Network pharmacology can predict the relationships between components and diseases, which will help to better understand the differences in quality of the four closely related Citrus CMMs.
At present, related reports mainly focus on the analysis of the chemical composition and content of closely related Citrus CMMs by high-performance liquid chromatography (HPLC) [11][12][13], liquid chromatography-mass spectrometry (LC-MS) [14][15][16][17], and gas chromatography-mass spectrometry (GC-MS) [18,19]. In addition, researchers have established multiple strategies for distinguishing closely related Citrus CMMs based on qualitative or quantitative data with chemometric methods. Yi et al. revealed the volatile chemical features of CRP and CRPV by combining GC-MS metabolomics analysis with chemometrics [20]. Li et al. established a comprehensive strategy for the comparison of AFI and AF by integrating HPLC/GC-MS analysis and chemometric methods [21]. Using UPLC-Q-TOF-MS-based metabolomics, Zhao et al. described the variable chemotaxonomic markers and metabolic mechanisms of four closely related Citrus CMMs [22]. These methods can successfully discriminate the closely related Citrus CMMs with high sensitivity and excellent specificity, facilitating a better understanding of their different medicinal values. Nevertheless, the reported methods suffer from some limitations, such as the limited number of compounds targeted for quantitative analysis and the incomplete integration of quantitative analysis, chemometrics, and network pharmacology for discriminating the four closely related Citrus CMMs.
Herein, a comprehensive strategy integrating the quantification of nine flavonoids and chemometric analysis is proposed for the first time to identify the four closely related Citrus CMMs. This study focused on the analysis of major flavonoids in AFI, AF, CRP, and CRPV by a UPLC-variable wavelength detection method. Different species (C. aurantium L. and C. reticulata Blanco) and development stages (AFI and AF, CRPV and CRP) were clearly discriminated via orthogonal signal correction partial least squares-discriminant analysis (OPLS-DA) and cluster analysis. Based on the variable importance for the projection (VIP) value, narirutin (Nar), hesperidin (Hed), naringenin (Nag), and 3,5,6,7,8,3′,4′-heptamethoxyflavone (Hep) were identified as potential chemotaxonomic markers. Furthermore, ingenuity pathway analysis (IPA) was applied to predict the correlation between the four chemical markers and three diseases: colitis, breast cancer, and colorectal cancer. Overall, this research aims to offer a simple and convenient strategy for better understanding the different characteristics of four closely related Citrus CMMs.
Eighty batches of samples, including 20 batches of AFI (numbered AFI1-AFI20), 20 batches of AF (numbered AF1-AF20), 20 batches of CRP (numbered CRP1-CRP20), and 20 batches of CRPV (numbered CRPV1-CRPV20), were purchased from medicinal materials markets; their detailed information is shown in Table S1. All samples were deposited in the State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine (Tianjin, China).

Preparation of Standard Solution.
Nine standard components were accurately weighed and dissolved directly in methanol to obtain nine stock solutions with concentrations of 4.148 mg/mL for Nar, 11.998 mg/mL for Nan, 7.002 mg/mL for Hed, 10.006 mg/mL for Ned, 2.416 mg/mL for Por, 1.505 mg/mL for Nag, 1.511 mg/mL for Not, 0.051 mg/mL for Hep, and 0.559 mg/mL for Tar, respectively. Afterward, a mixed solution was prepared with final concentrations of 0.581 mg/mL for Nar, 1.320 mg/mL for Nan, 0.572 mg/mL for Hed, 1.301 mg/mL for Ned, 0.338 mg/mL for Por, 0.211 mg/mL for Nag, 0.060 mg/mL for Not, 0.003 mg/mL for Hep, and 0.050 mg/mL for Tar, respectively. Subsequently, this working solution was diluted into a series of standard solutions with different concentrations to build calibration curves.

Preparation of Sample Solution.
Accurately weighed sample powder (0.2 g) was transferred into a conical flask and sonicated with 30 mL of methanol at 60°C for 30 min. After centrifugation (18,213 × g, 10 min), the supernatant was injected into the UPLC system for the determination of Nar, Nan, Hed, Ned, Por, Nag, Not, Hep, and Tar. For AFI and AF only, the supernatant was diluted fivefold with methanol for the analysis of Nan and Ned.
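The conversion from a measured solution concentration to content per gram of powder follows from the preparation above. The sketch below is illustrative only: it assumes that the extract volume stays at 30 mL (no solvent loss during sonication), and the function name and the example concentration are hypothetical rather than taken from the study.

```python
def content_mg_per_g(conc_mg_per_ml, volume_ml=30.0, mass_g=0.2, dilution=1.0):
    """Convert a measured concentration (mg/mL) into content (mg/g of powder).

    dilution is the fold dilution applied to the supernatant before injection
    (5.0 for Nan and Ned in AFI and AF, 1.0 otherwise).
    """
    return conc_mg_per_ml * dilution * volume_ml / mass_g


# Hypothetical measured concentration of 0.10 mg/mL, for illustration only.
undiluted = content_mg_per_g(0.10)                # injected without dilution
diluted = content_mg_per_g(0.10, dilution=5.0)    # injected after fivefold dilution
print(round(undiluted, 2), round(diluted, 2))
```

With 0.2 g of powder in 30 mL of solvent, each 0.10 mg/mL measured in the undiluted supernatant corresponds to 15 mg/g, and to 75 mg/g when the supernatant was diluted fivefold.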

Methodological Validation.
To test the feasibility of the analytical method for quantitative analysis of the tested compounds, precision (intraday and interday), stability, linearity, limit of detection (LOD), limit of quantification (LOQ), reproducibility, and recovery were systematically validated. Intraday and interday precision were evaluated by performing six replicate injections on the same day and on three consecutive days, respectively. The stability of the sample solutions prepared at room temperature was assessed by repeated injections at 0, 2, 4, 6, 8, 10, and 12 h. Two replicate calibration curves were established based on the peak area (y) and the corresponding concentration (x) of the tested compounds. LOD and LOQ were determined using standard solutions at signal-to-noise ratios (S/N) of about 3 and 10, respectively. Six sample solutions extracted from the same batch were analyzed to confirm repeatability. The recovery was investigated by analyzing six sample solutions prepared by adding appropriate amounts of standard solution to 0.1 g of sample powder.
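A calibration curve of peak area (y) against concentration (x) can be fitted by ordinary least squares, as in the minimal sketch below. It is not the authors' software workflow: the function names are hypothetical, an unweighted linear fit is assumed, and the data points are invented purely to make the example runnable.

```python
def fit_calibration(conc, area):
    """Ordinary least-squares fit of area = slope * conc + intercept.

    Returns (slope, intercept, r2), where r2 is the determination coefficient.
    """
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(area) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, area))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, area))
    ss_tot = sum((y - mean_y) ** 2 for y in area)
    return slope, intercept, 1.0 - ss_res / ss_tot


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to quantify an unknown sample."""
    return (area - intercept) / slope


# Invented, perfectly linear standards (arbitrary units), for illustration only.
conc = [1.0, 2.0, 5.0, 10.0, 20.0]
area = [2.1 * c + 0.5 for c in conc]
slope, intercept, r2 = fit_calibration(conc, area)
unknown = concentration_from_area(2.1 * 4.0 + 0.5, slope, intercept)
```

The determination coefficient returned here is the quantity compared against a linearity acceptance criterion for each compound.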

Ingenuity Pathway Analysis.
The structures of the four tested compounds (Hep, Nar, Hed, and Nag) were downloaded from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) [23]. Potential human targets of the four components were obtained from the Swiss Target Prediction database (http://www.swisstargetprediction.ch/) [24]. All targets were imported into Microsoft Excel to remove duplicate values. IPA was employed to predict canonical pathways and networks of compounds, targets, and diseases based on known interactions between genes and proteins.
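The duplicate-removal step was carried out in Microsoft Excel; an equivalent scripted version is sketched below as an alternative, not as the authors' procedure. The target names are placeholders rather than real predictions, and normalizing case and whitespace before comparison is an added assumption.

```python
def merge_targets(predictions):
    """Merge per-compound target lists into one sorted list of unique targets.

    predictions maps a compound abbreviation to its list of predicted targets.
    """
    unique = set()
    for targets in predictions.values():
        # Assumption: names differing only in case or padding are the same target.
        unique.update(t.strip().upper() for t in targets if t.strip())
    return sorted(unique)


# Placeholder target names, for illustration only.
predictions = {
    "Hep": ["TARGET_A", "TARGET_B"],
    "Nar": ["TARGET_B", "TARGET_C"],
    "Hed": ["target_c ", "TARGET_D"],
    "Nag": ["TARGET_A", "TARGET_E"],
}
merged = merge_targets(predictions)
print(len(merged), merged)
```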

Optimization of Extraction Conditions and Methodology Validation.
Flavonoids in Citrus CMMs have been shown to be associated with the treatment of liver damage, regulation of gastrointestinal motility, and other biological activities [25]. With UPLC, nine flavonoids (Nar, Nan, Hed, Ned, Por, Nag, Not, Hep, and Tar) were well separated within 15 min (Figure 1).
To address the key challenge of maximizing the extraction efficiency of tested compounds with divergent polarities under the same extraction condition, a "spider-web" mode was employed for optimizing the extraction conditions according to the multivariable evaluation method presented by our group [26][27][28][29][30]. Briefly, the peak area of each tested compound per gram was denoted $A_{m-k}$ and divided by the largest peak area of that compound, $A_{k(\max)}$, to give $E_{m-k} = A_{m-k}/A_{k(\max)}$. As shown in formula (1), k denotes the different compounds, and m represents the different extraction conditions (different solvents, different solid-liquid ratios, and different extraction times). $E_{m-k}$ was used to construct the different dimensions of the "spider-web" mode ($p_i$) in formula (2). The larger the shaded area, the closer the condition was to the optimal extraction condition. The angle between two adjacent dimensions was denoted α (α = 360°/n, n = 9).
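The normalization and area comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the shaded area is the area of the radar-chart polygon, $S_m = \frac{1}{2}\sin\alpha \sum_{i=1}^{n} p_i p_{i+1}$ with $p_{n+1} = p_1$, which is one common way to compute it. The function names, condition labels, and peak areas are invented.

```python
import math


def normalize(peak_areas):
    """Divide each compound's peak area by its maximum across all conditions.

    peak_areas maps a condition label m to a list of per-gram peak areas,
    one entry per compound k. Returns the same mapping with E_{m-k} values.
    """
    n = len(next(iter(peak_areas.values())))
    max_k = [max(areas[k] for areas in peak_areas.values()) for k in range(n)]
    return {m: [areas[k] / max_k[k] for k in range(n)]
            for m, areas in peak_areas.items()}


def spider_web_area(p):
    """Area of the polygon whose n spokes have lengths p, spaced 360/n degrees apart."""
    n = len(p)
    alpha = 2.0 * math.pi / n
    return 0.5 * math.sin(alpha) * sum(p[i] * p[(i + 1) % n] for i in range(n))


# Invented peak areas for nine compounds under three hypothetical conditions.
peak_areas = {
    "cond_A": [10, 9, 8, 10, 7, 6, 9, 10, 8],
    "cond_B": [9, 9, 7, 8, 7, 5, 8, 9, 7],
    "cond_C": [7, 8, 6, 7, 5, 4, 6, 7, 6],
}
areas = {m: spider_web_area(e) for m, e in normalize(peak_areas).items()}
best = max(areas, key=areas.get)
print(best, {m: round(a, 3) for m, a in areas.items()})
```

Because every spoke is scaled to at most 1, the area summarizes all nine compounds in a single score, so a condition that favors one compound at the expense of the others is penalized.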
The extraction solvents for the four decoction pieces were investigated, including 50% and 75% aqueous methanol and methanol (60°C) (Figure 2). Under ultrasonication with methanol at 60°C, the shaded area of the "spider-web" of the four decoction pieces was the largest. AFI, as a typical sample, was used to optimize the solid-liquid ratio (1:100, 1:125, 1:150, and 1:200) and extraction time (10 min, 20 min, 30 min, and 40 min). Overall, the best extraction was achieved by ultrasonic extraction of the sample powder with methanol (solvent/material ratio of 150:1) at 60°C for 30 min. The optimized extraction method paved the way for methodological validation, the detailed results of which are presented in Table 1. Intraday and interday precisions were good, with relative standard deviations (RSDs) less than 3.0%.
The RSDs of the nine tested compounds were lower than 2.8% over 12 h in the stability study at room temperature. The calibration curves of the nine tested components were constructed with determination coefficients (r²) exceeding 0.999, showing good linearity over the tested range. The LOD and LOQ values were below 0.1120 μg/mL and 0.5872 μg/mL, respectively. The result of the reproducibility study was satisfactory, with RSDs less than 3.0% for the tested compounds. Furthermore, the average recoveries ranged from 92.65% to 106.98% with RSDs less than 2.8%. Consequently, the established method can be successfully employed in subsequent studies.

Quantitative Analysis of Nine Tested Compounds in Four Closely Related Citrus CMMs.
Determined by the validated UPLC-variable wavelength method, the contents of nine flavonoids in 80 batches of samples are shown in Table S2. To display the distribution of the flavonoids of interest in the decoction pieces intuitively and clearly, we normalized the data and expressed them as relative content, calculated as $C_k/C_{k(\max)}$ (the content of a compound divided by the maximum content of that compound, with k denoting the different compounds). As shown in Figure 3, the relative contents of the nine flavonoids fluctuated greatly in samples from the two species (C. aurantium L. and C. reticulata Blanco). The contents ranged over 2.02-46.96 mg/g for Nar (RSD, 83.99%), 1.53-186.51 mg/g for Hed (RSD, 109.68%), 0.22-6.91 mg/g for Not (RSD, 100.46%), 0.03-0.91 mg/g for Hep (RSD, 89.94%), and 0.10-4.21 mg/g for Tar (RSD, 110.81%). Nan, Ned, and Por were not detected in the pericarp of C. reticulata Blanco, whereas their contents in the fruit of C. aurantium L. were in the range of 0.52-117.24 mg/g (RSD, 40.16%), 0.57-195.73 mg/g (RSD, 65.07%), and 0.00-5.16 mg/g (RSD, 68.45%), respectively. The content of Nag in samples other than CRP was between 0.02 and 3.17 mg/g (RSD, 106.86%). The characteristics of component accumulation in the two species at different development stages were also studied. In the fruit of C. aurantium L., the contents of the tested components were higher in AFI than in AF except for Hep, and the total content in AFI (average value, 219.36 mg/g) was significantly higher than that in AF. The result suggests that as fruits ripen, the total content of flavonoids decreases correspondingly. Similarly, in the pericarp of C. reticulata Blanco, the total content of the target analytes reached up to 189.39 mg/g in CRPV, while 73.76 mg/g was measured in CRP. This may be due to the different phenotypes of secondary metabolites in samples at different maturation stages [21,22]. Generally, the content of flavonoids was higher in unripe samples (AFI or CRPV) than in ripe samples (AF or CRP).
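The relative content $C_k/C_{k(\max)}$ and the RSD reported for each compound can be computed as in the sketch below. The sample standard deviation (n − 1 denominator) is assumed because the text does not state which form was used, and the three content values are invented for illustration.

```python
import statistics


def relative_content(contents):
    """Scale the contents of one compound by the maximum observed across batches."""
    c_max = max(contents)
    return [c / c_max for c in contents]


def rsd_percent(values):
    """Relative standard deviation in percent (sample standard deviation assumed)."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Invented contents (mg/g) of one compound in three batches.
contents = [2.0, 4.0, 6.0]
rel = relative_content(contents)
rsd = rsd_percent(contents)
print([round(r, 3) for r in rel], round(rsd, 2))
```

Scaling each compound to its own maximum puts abundant and trace flavonoids on the same 0 to 1 scale, which is what allows them to share one figure.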

Discrimination of Closely Related Citrus CMMs by Chemometrics.
As one of the main tools of multivariate statistics, principal component analysis (PCA) is usually used to set up a low-dimensional plane or space that can visualize the classification trends among samples [31,32]. It has also been used to remove outliers from data [33]. Orthogonal projections to latent structures-discriminant analysis (OPLS-DA) introduces an orthogonal signal correction (OSC) filter that removes the variation in X (descriptor variables) that is uncorrelated with Y (property variables), which reduces model complexity and achieves good discrimination of samples [34,35]. In addition, it has been widely and successfully applied in studies of metabolomics and CMMs [36][37][38][39].
To ensure the accuracy of the model, outliers identified by PCA (AFI8, AF4, CRPV2, CRPV6, CRP11, and CRP13) were removed before the discrimination model was constructed (Figure S1). To assess the predictive capability, cross-validation of the OPLS-DA models was performed (Figure S2), and the results confirmed that the model was valid. As shown in Figure 4(b), Hep, Nar, Hed, and Nag were filtered out by the variable importance for the projection (VIP) value (VIP > 1.0) and were regarded as chemotaxonomic markers for discriminating the four closely related Citrus CMMs.
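The VIP filter can be illustrated with the standard PLS definition, $\mathrm{VIP}_j = \sqrt{p \sum_a SS_a \,(w_{aj}^2/\lVert w_a \rVert^2) \,/\, \sum_a SS_a}$, where p is the number of variables, $w_{aj}$ is the weight of variable j on component a, and $SS_a$ is the Y-variance explained by component a. The sketch below assumes this definition; the OPLS-DA software used in the study may compute VIP slightly differently, and the weights, explained variances, and variable names are invented.

```python
import math


def vip_scores(weights, ssy):
    """VIP of each variable from PLS weights and explained Y-variance.

    weights[a][j] is the weight of variable j on component a;
    ssy[a] is the Y sum of squares explained by component a.
    """
    p = len(weights[0])
    total = sum(ssy)
    scores = []
    for j in range(p):
        acc = 0.0
        for w_a, ss_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)
            acc += ss_a * w_a[j] ** 2 / norm_sq
        scores.append(math.sqrt(p * acc / total))
    return scores


def select_markers(names, scores, threshold=1.0):
    """Keep the variables whose VIP exceeds the threshold."""
    return [name for name, s in zip(names, scores) if s > threshold]


# Invented two-component model with four placeholder variables.
names = ["v1", "v2", "v3", "v4"]
weights = [[0.8, 0.5, 0.3, 0.1], [0.1, 0.2, 0.9, 0.4]]
ssy = [0.7, 0.2]
scores = vip_scores(weights, ssy)
markers = select_markers(names, scores)
print([round(s, 3) for s in scores], markers)
```

Under this definition the squared VIP values average to 1 across all variables, which is why 1.0 is the conventional cut-off for an above-average contribution.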
Furthermore, a clustering heatmap was applied to show the trend in the relative content of flavonoids and the relationship among the classified samples (Figure 5). A red box indicates that the content is higher than the mean level in a sample, while a blue box means that the content is lower. The classified clusters were consistent with the results of OPLS-DA, and the four closely related Citrus CMMs were well distinguished. Collectively, chemometrics proved to be a feasible strategy for discriminating CMMs from closely related species of the same genus.

Analysis of Chemotaxonomic Markers by Network Pharmacology.
As a popular strategy, ingenuity pathway analysis (IPA) can be used to predict correlations among components, targets, and diseases through its own algorithms and the reported literature [40]. Focusing on the screened chemotaxonomic markers (Hep, Nar, Hed, and Nag), a components-targets-diseases network was constructed to understand the quality differences of closely related Citrus CMMs. The structures of Hep, Nar, Hed, and Nag were obtained from the PubChem database. In the Swiss Target Prediction database, 232 targets (Table S4) corresponding to Hep, Nar, Hed, and Nag were identified after removing redundant targets, with the species restricted to Homo sapiens. Then, 223 targets were subjected to core analysis by IPA. Furthermore, 465 canonical pathways and 21 networks were obtained, revealing close associations between the targets and inflammation, cancer, and neurological disorders. As shown in Figure 6(a), the yellow threshold line indicates p = 0.05. The ranking of the relationships between targets and pathways suggested that regulation of these targets by the tested compounds may exert anti-inflammatory, anticancer, and neuroprotective activities through the IL-8 signaling pathway, the colorectal cancer metastasis signaling pathway, and CREB signaling in neurons, respectively. The related diseases, namely breast cancer, colorectal cancer, and colitis, were selected for the construction of the components-targets-diseases network. Four compounds, 223 targets, and three diseases were combined using the "pathway designer" function. As shown in Figure 6(b), 126 targets were related to breast cancer, 148 targets were associated with colorectal cancer, and 54 targets were involved in colitis. It is speculated that Hep, Nar, Hed, and Nag may contribute to the treatment of colitis, breast cancer, and colorectal cancer, which is consistent with some reported literature [41][42][43].

Conclusions
In the present work, the established UPLC-variable wavelength detection method showed acceptable linearity, precision, repeatability, and accuracy and was successfully used for quantitative analysis of nine major flavonoids in four closely related Citrus CMMs. The total content of flavonoids differed markedly between species (C. aurantium L. and C. reticulata Blanco) and decreased in both species at maturity. Meanwhile, OPLS-DA and cluster analysis achieved excellent discrimination of AFI, AF, CRPV, and CRP. By OPLS-DA, four compounds with VIP values greater than 1.0 (Hep, Nar, Hed, and Nag) were suggested as chemotaxonomic markers. Furthermore, a components-targets-diseases network was developed to display the correlation between the chemotaxonomic markers and colitis, breast cancer, and colorectal cancer via the focused targets. This study presents, for the first time, a comprehensive strategy for simultaneous comparison of four closely related Citrus CMMs based on flavonoid content, which may offer a simple and convenient method for studying CMMs from closely related species of the same genus.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Qixuan Mu and Yaping Zhang conceived the concept and designed the study methods. Ying Cui, Xin Chai, and Junlian Liu provided advice on the methodology and contributed to target searching. Yongzhi Li, Huijuan Yu, and Yuefei Wang polished and revised the manuscript. All authors approved the final manuscript. Qixuan Mu and Yaping Zhang wrote the manuscript. Qixuan Mu and Yaping Zhang contributed equally to this article.
Table S1: sample information of four closely related Citrus CMMs. Table S2: the content of nine flavonoids in the four closely related Citrus CMMs samples (mg/g, n = 3); "-" indicates a signal below the LOD. Table S3: all the targets corresponding to four chemotaxonomic markers. Figure S1: outliers identified by PCA score from AFI (a), AF (b), CRPV (c), and CRP (d), respectively. Red circles indicate abnormal samples. Figure S2