Phytochemical Analysis Using UPLC-MSn Combined with Network Pharmacology Approaches to Explore the Biomarkers for the Quality Control of the Anticancer Tannin Fraction of Phyllanthus emblica L. Habitat in Nepal

Phyllanthus emblica L. is widely used in traditional Tibetan medicine to treat liver, kidney, and bladder problems. We have reported that the tannin fraction has a good anti-hepatocellular carcinoma effect, but its active ingredients are not clear. This study aimed to find the active ingredients of the tannin fraction using UPLC-MSn and network pharmacology. First, the UPLC-MSn method was employed to obtain high-resolution mass spectra of the different components, and 110 compounds were tentatively identified. Then a network pharmacology method was used to find biomarkers for quality control. Network pharmacology results showed that gallic acid, punicalagin A, punicalagin B, methyl gallate, geraniin, corilagin, chebulinic acid, chebulagic acid, and ellagic acid could serve as the biomarkers of the tannin fraction. Furthermore, these 9 components were detected in the serum, which also supports their use as biomarkers, because we generally believe that the ingredients that are absorbed into the blood are effective. Finally, a simple method for simultaneously determining the contents of the 9 compounds was constructed by HPLC-DAD. This research established a new method to find biomarkers of traditional Chinese medicine. This is of great significance to improving the quality standards of Tibetan medicine.


Introduction
Traditional Tibetan medicine has evolved over some 2,300 years and still plays an important role in protecting human health. It is a vital part of traditional Chinese medicine and has drawn extensive attention for its mysterious nature and good effectiveness. Phyllanthus emblica L. is widely used in traditional Tibetan medicine due to its numerous pharmacological applications in chronic diseases (for example, hypertension, hepatitis, blood stasis, and pharyngitis) [1-4]. It is an edible fruit indigenous to Southeast Asia and has been considered a potent functional food. It is increasingly recognized that food and diet can maintain health and reduce the risk of chronic diseases.
As part of our phytochemical investigation of medicinal plants for the discovery of new bioactive natural products, we have already reported the chemical constituents [17,18] isolated from Phyllanthus emblica L., and the tannin fraction has good antitumor activity [19,20]. We also established stable preparation processes for the tannin fraction of Phyllanthus emblica L. However, most of the chemicals in the tannin fraction remain unknown, making it difficult to rationalize its bioactivity or evaluate the safety of this material as a therapeutic agent. Therefore, there is an urgent need to develop an analytical method capable of determining the chemical composition of the tannin fraction. The therapeutic effects of traditional Chinese medicines (TCM) are based on the complex interactions of complicated chemical constituents as a whole system. It is obviously unreasonable to use only a few ingredients for quality control. It is also necessary to associate ingredients with activity. Thus, choosing the right ingredients to reflect the quality of traditional Chinese medicine is the key issue. We reviewed the relevant literature on the quality control of the tannin fraction of Phyllanthus emblica L. Some scholars used HPLC to determine the content of a few compounds in Phyllanthus emblica L. [21,22], but there was no correlation between ingredients and efficacy.
This research established a new method to find biomarkers for the quality control of traditional Chinese medicine. We first used the UPLC-MSn method to obtain high-resolution mass spectra of the different components. A total of 110 compounds including 45 hydrolysable tannins, 22 mucic acids, 15 phenolic acids, 15 flavonoids, 11 organic acids, and 2 other compounds were tentatively identified by comparing their retention times and mass spectrometry data with those of the reference compounds and reviewing the literature. Then, a network pharmacology method was used to find biomarkers for quality control based on the 110 identified compounds and the anti-hepatocellular carcinoma effect. Network pharmacology results showed that gallic acid, punicalagin A, punicalagin B, methyl gallate, geraniin, corilagin, chebulinic acid, chebulagic acid, and ellagic acid might be the biomarkers of the tannin fraction, and these 9 components were detected in the serum, which also supports their use as biomarkers, because we generally believe that the ingredients that are absorbed into the blood are effective. Finally, a simple method for simultaneously determining the contents of the 9 compounds was constructed using HPLC-DAD. To the best of our knowledge, this is the first report using UPLC-MSn and network pharmacology approaches to find the biomarkers for the quality control of the tannin fraction of Phyllanthus emblica L. The method developed in our study also provides a scientific foundation for the study of anticancer effective substances of the tannin fraction of Phyllanthus emblica L.

The crude drug was extracted with ethanol and separated by HPD-400 macroporous resin column chromatography. The sample was dried and powdered, before being sieved through a 40 mesh sieve. A sample of the powder (approximately 25 mg) was suspended in 50 mL of methanol, and the resulting mixture was filtered through a 0.22 μm PTFE syringe filter. The filtrate was collected and subjected to centrifugation (13,000 rpm, 10 min). The supernatant was then transferred to an autosampler vial for analysis by UPLC-MS/MS and HPLC-DAD.

Optimization of Analytical Conditions.
To obtain better chromatographic separation and mass spectrometric detection, we evaluated three different mobile phase systems, including aqueous methanol, aqueous acetonitrile, and aqueous acetonitrile-formic acid solutions. The aqueous methanol solution resulted in the best separation of the major components of the tannin fraction of Phyllanthus emblica L. Furthermore, the addition of 0.2% acetic acid to this mobile phase considerably improved the symmetry of most chromatographic peaks. During method development, we also varied the flow rate (0.8, 1.0, and 1.2 mL/min for HPLC analysis; 0.25, 0.3, and 0.35 mL/min for UPLC analysis), the column temperature (25, 30, and 35°C for both HPLC and UPLC analysis), and the injection volume (3, 5, and 10 μL for UPLC analysis). The results of these optimization experiments established the following conditions for the chromatographic separation of the different components of the tannin fraction of Phyllanthus emblica L.
2.5. Structure Analysis Procedure.
In the negative scan mode, elemental compositions were calculated from the high-accuracy precursor ions and product ions obtained from UPLC-MS/MS, with the maximum mass error tolerance for both precursor and product ions set at 1.5 ppm, which can satisfy the requirements for positive identification. Based on the elemental compositions of the precursors, the most rational molecular formula was sought in different chemical databases such as the Spectral Database for Organic Compounds (SDBS), mzCloud, and ChemSpider. Meanwhile, by searching literature sources, such as PubMed of the U.S. National Library of Medicine and the National Institutes of Health, SciFinder Scholar of the American Chemical Society, ScienceDirect of Elsevier, and Chinese National Knowledge Infrastructure (CNKI) of Tsinghua University, all components reported in the literature on Phyllanthus emblica L. and plants from the same family were summarized in a Microsoft Office Excel table to establish an in-house library [5, 7-13, 23] for searching the most rational molecular formula. When several matching compounds with the same formula were found, the fragmentation patterns and pathways of the compounds were analyzed and then validated by Mass Frontier 7.0 (Thermo Scientific) for positive identification.
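The 1.5 ppm tolerance described above can be illustrated with a short calculation. The following is a minimal sketch, not the software used in this study: it computes the theoretical m/z of the [M-H]- ion of gallic acid (C7H6O5) from standard monoisotopic masses and checks whether a measured value falls within the tolerance. The function names and the measured values are hypothetical.

```python
# Standard monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON_MASS = 0.00054857991


def deprotonated_mz(formula):
    """Theoretical m/z of the singly charged [M-H]- ion for an elemental composition."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    # Losing H+ is the same as losing one hydrogen atom while keeping its electron.
    return neutral - MONOISOTOPIC["H"] + ELECTRON_MASS


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=1.5):
    """True if the absolute mass error does not exceed the tolerance."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


gallic_acid = {"C": 7, "H": 6, "O": 5}
theoretical = deprotonated_mz(gallic_acid)

# Hypothetical measured values, for illustration only.
for measured in (169.0144, 169.0150):
    error = ppm_error(measured, theoretical)
    print(f"{measured:.4f}: {error:+.2f} ppm, accepted={within_tolerance(measured, theoretical)}")
```

With a tolerance this tight, a deviation of less than 1 mDa at m/z 169 is already enough to reject a candidate formula, which is why only a few elemental compositions usually remain for database searching.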

Biomarkers Selected by Network Pharmacology and Ingredients Absorbed into the Blood.
We followed the methods of Luo et al. 2020 [24]. First, a network pharmacology method was used to find biomarkers for quality control based on the compounds identified by UPLC-MSn and the anti-hepatocellular carcinoma effect. Then, to confirm that these compounds were proper quality control markers, animal experiments were conducted, with rats as test animals. We checked whether these active ingredients were absorbed into the blood, because we generally believe that the ingredients that are absorbed into the blood are effective. The use of animals in the present study was permitted by the Ethics Committee of Beijing University of Chinese Medicine, and all animal studies were carried out according to the Guide for Care and Use of Laboratory Animals.

3.1. Identification of the Compounds Present.
The UPLC-MS/MS method was employed to identify the components in the tannin fraction of Phyllanthus emblica L. The total ion chromatogram profile of the tannin fraction of Phyllanthus emblica L. in negative mode is shown in Figure 1(a). Molecular weights and fragmentation information (Table 1) were obtained. The possible structures of all peaks were deduced as shown in Figure 2. Under the optimized MS conditions, the negative mode was used to identify the peaks. A total of 110 compounds including 45 hydrolysable tannins, 22 mucic acids, 15 phenolic acids, 15 flavonoids, 11 organic acids, and 2 other compounds have been tentatively identified by comparing their retention times and mass spectrometry data with those of reference compounds and reviewing the literature. Data for all of these compounds are summarized in Table 1. The nine marker compounds are numbered as follows: gallic acid (1), punicalagin A (2), punicalagin B (3), methyl gallate (4), geraniin (5), corilagin (6), chebulinic acid (7), chebulagic acid (8), and ellagic acid (9).

Biomarkers Selected by Network Pharmacology and Ingredients Absorbed into the Blood.
A total of 228 potential targets related to the 110 compounds were obtained by using the Swiss Target Prediction and TCMSP databases, and 7392 potential targets related to hepatocellular carcinoma were obtained from the OncoDB.HCC and Liverome databases. Through protein-protein interaction analysis, 120 targets with higher correlation were obtained, as shown in Figure 3. The DAVID database was used to conduct GO enrichment analysis on the 120 targets with a p-value less than 0.01, as shown in Figure 4. Finally, the Cytoscape 3.7.1 software was used to visualize the "component-target-function" network, as shown in Figure 5. Nine compounds, 72 proteins, and 20 pathways were obtained. Of the 20 pathways, the PI3K-Akt, HIF-1, Ras, ErbB, FoxO, and VEGF signaling pathways are related to anticancer effects [25-31]; these pathways may therefore be related to the anticancer effect of the tannin fraction of Phyllanthus emblica L. The 9 compounds, namely gallic acid, punicalagin A, punicalagin B, methyl gallate, geraniin, corilagin, chebulinic acid, chebulagic acid, and ellagic acid, were all detected in the rat serum by using UPLC-MS/MS, which further verified that these compounds were proper biomarkers. Detailed information about the analysis of chemical components in rat serum can be found in the supplementary materials.
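The first step of this workflow, matching compound targets against disease targets, amounts to a set intersection. The sketch below is a simplified illustration under stated assumptions: the compound and target identifiers are placeholders rather than data from this study, and ranking compounds by the number of shared targets stands in for the protein-protein interaction and enrichment analyses that were actually performed.

```python
def shared_targets(compound_targets, disease_targets):
    """Map each compound to its predicted targets that are also disease-related targets."""
    return {name: targets & disease_targets for name, targets in compound_targets.items()}


def rank_by_degree(shared):
    """Order compounds by the number of disease-related targets they hit (ties by name)."""
    return sorted(shared, key=lambda name: (-len(shared[name]), name))


# Placeholder identifiers, for illustration only.
compound_targets = {
    "compound_1": {"T1", "T2", "T3"},
    "compound_2": {"T2", "T9"},
    "compound_3": {"T7", "T8"},
}
disease_targets = {"T1", "T2", "T3", "T4"}

shared = shared_targets(compound_targets, disease_targets)
ranking = rank_by_degree(shared)
# Pool of targets that would be passed on to interaction and enrichment analysis.
candidate_pool = set().union(*shared.values())

for name in ranking:
    print(name, sorted(shared[name]))
```

Compounds that share no target with the disease set drop to the bottom of the ranking, which mirrors how a large list of identified compounds can be narrowed to a handful of candidate markers.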

Validation of the HPLC Method.
The method was validated in terms of linearity, precision, stability, repeatability, and recovery. The concentrations of gallic acid, punicalagin A, punicalagin B, methyl gallate, geraniin, corilagin, chebulinic acid, chebulagic acid, and ellagic acid in the stock solution were [...] Table 2. All calibration curves showed good linear regression within the test ranges. The precision was determined by injecting the same sample solution six consecutive times. The RSDs of the peak areas of gallic acid, punicalagin A, punicalagin B, methyl gallate, geraniin, corilagin, chebulinic acid, chebulagic acid, and ellagic acid were all below 3.05%, which showed high precision.
Stability testing was performed with one sample over 24 h. The RSDs of the peak areas of the 9 constituents were all below 2.71%, which indicated that the samples remained stable during the testing period and the conditions for the analysis were satisfactory. The repeatability was evaluated by the analysis of six prepared samples. The RSDs for the contents of the 9 constituents were all below 3.61%, which showed high repeatability. The recovery was determined by the standard addition method. Certain amounts of the 9 constituents were spiked into the known sample and then processed and quantified in accordance with the established procedures as shown in Sections 2.2 and 2.3. The average recoveries were between 98.11% and 103.16%, with RSD values of less than 3.01% for the 9 compounds. Therefore, the developed method was precise and sensitive enough for the simultaneous quantitative analysis of the 9 compounds in the tannin fraction of Phyllanthus emblica L.
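The validation figures above rest on two simple formulas: the relative standard deviation, RSD = SD / mean × 100, and the standard-addition recovery, (amount found − amount originally present) / amount spiked × 100. The following minimal sketch shows both; the peak areas and amounts are made-up values for illustration, not measurements from this study.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return stdev(values) / mean(values) * 100


def recovery_percent(found, original, spiked):
    """Standard-addition recovery (%): the share of the spiked amount that was measured."""
    return (found - original) / spiked * 100


# Made-up peak areas from six replicate injections.
peak_areas = [1000, 1010, 990, 1005, 995, 1000]
precision_rsd = rsd_percent(peak_areas)

# Made-up amounts (mg): 1.00 already in the sample, 1.00 spiked, 2.02 found.
recovery = recovery_percent(found=2.02, original=1.00, spiked=1.00)

print(f"RSD = {precision_rsd:.2f}%")
print(f"Recovery = {recovery:.1f}%")
```

The same `rsd_percent` function applies to the precision, stability, and repeatability tests; only the source of the six values changes.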

Quality Evaluation of the 9 Compounds.
The developed quantitative analysis method was subsequently applied to 6 batches of the tannin fraction of Phyllanthus emblica L. from Nepal. The results demonstrated a successful application of this HPLC-DAD assay for the quantification of 9 major constituents in different samples. The 9 compounds eluted within 62 min, giving good separation and acceptable tailing factors. Representative HPLC-DAD chromatograms of standard solutions and sample solutions for quantitative analysis are shown in Figure 1. The contents, summarized in Table 3, were calculated with the external standard method.
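External standard quantification can be sketched as follows: a calibration line is fitted to the peak areas of standard solutions, the sample peak area is converted to a concentration, and the concentration is converted to a content percentage. This is an illustrative sketch only. The standard concentrations and peak areas are made up, and the sample mass (25 mg) and volume (50 mL) are taken from the sample preparation described earlier, assuming no further dilution.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def content_percent(peak_area, slope, intercept, volume_ml, sample_mg):
    """Content (% w/w) of an analyte from its peak area and a calibration line."""
    conc_ug_per_ml = (peak_area - intercept) / slope
    analyte_mg = conc_ug_per_ml * volume_ml / 1000  # convert ug to mg
    return analyte_mg / sample_mg * 100


# Made-up calibration data: concentration (ug/mL) against peak area.
standard_conc = [5.0, 10.0, 20.0, 40.0]
standard_area = [50.0, 100.0, 200.0, 400.0]
slope, intercept = fit_line(standard_conc, standard_area)

# Made-up sample peak area; 25 mg of sample dissolved in 50 mL.
content = content_percent(150.0, slope, intercept, volume_ml=50.0, sample_mg=25.0)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, content={content:.2f}%")
```

One calibration line is needed per analyte, so a nine-compound assay repeats this calculation nine times per sample.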
In this experiment, UPLC-MSn was employed to analyze the tannin fraction of Phyllanthus emblica L. The total ion chromatograms under both positive and negative modes were investigated at first; the response intensity in the negative mode was significantly higher, and significantly more chromatographic peaks were detected. Therefore, the negative mode was selected for the detection of Phyllanthus emblica L. We tentatively identified a total of 110 compounds including 45 hydrolysable tannins, 22 mucic acids, 15 phenolic acids, 15 flavonoids, 11 organic acids, and 2 other compounds. It can be seen from this result that hydrolysable tannins are the largest class of compounds in the tannin fraction, accounting for about 41% of the identified compounds (45/110). The total tannin content of the fraction was determined previously and exceeded 60%, which is consistent with the results detected by UPLC-MSn. There are also some mucic acids, phenolic acids, and flavonoids. We will examine these components in future work; the total content of flavonoids, mucic acids, phenolic acids, and organic acids accounts for about 40%, and these ingredients may work synergistically with the hydrolysable tannins.
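The class breakdown reported above can be checked with a few lines of arithmetic. This sketch uses only the compound counts stated in the text. Note that these are shares by number of identified compounds, not by mass, so they are not directly comparable with the measured total tannin content.

```python
# Number of tentatively identified compounds per class, as reported in the text.
counts = {
    "hydrolysable tannins": 45,
    "mucic acids": 22,
    "phenolic acids": 15,
    "flavonoids": 15,
    "organic acids": 11,
    "others": 2,
}

total = sum(counts.values())
# Share of each class among all identified compounds, in percent.
shares = {name: round(n / total * 100, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]}/{total} = {share}%")
```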
From the results of content determination by HPLC-DAD, the contents of gallic acid (3.42%) and ellagic acid (3.21%) are significantly higher than those of some hydrolysable tannins (punicalagin A: 0.26%, punicalagin B: 0.42%, chebulinic acid: 0.44%). We speculate that gallic acid and ellagic acid may be produced by the decomposition of other hydrolysable tannins. Gallic acid and ellagic acid are the basic structural units of hydrolysable tannins. Hydrolysable tannins are unstable; they are easily decomposed by acids, alkalis, enzymes, and high temperatures to produce gallic acid, ellagic acid, and polyols. In the process of preparing the tannin fraction, the extraction temperature is 60°C, so some hydrolysable tannins may have decomposed; this needs to be further confirmed.

Conclusions
This research established a new method to find biomarkers for quality control of the tannin fraction of Phyllanthus emblica L. by using the UPLC-MSn and network pharmacology methods. A total of 110 compounds were tentatively identified by UPLC-MSn, and their characteristic fragmentations were summarized. We found that hydrolysable tannins were the main components of the tannin fraction of Phyllanthus emblica L. Then, a network pharmacology method was used to explore the biomarkers for quality control of the tannin fraction of Phyllanthus emblica L.; gallic acid, punicalagin A, punicalagin B, methyl gallate, geraniin, corilagin, chebulinic acid, chebulagic acid, and ellagic acid were selected as the biomarkers. Animal experiments proved that these 9 compounds were proper biomarkers, because we generally believe that the ingredients that are absorbed into the blood are effective. Finally, a simple method for simultaneously measuring the contents of the 9 biomarkers was established using HPLC-DAD. This method does not require sophisticated equipment, and it is suitable for promotion. The method developed in our study also provides a scientific foundation for the study of anticancer effective substances of the tannin fraction of Phyllanthus emblica L.

Figure 5: "Component-target-function" network. Note: yellow triangle: compounds; pink ellipse: target; blue triangle: biological process.

Table 3: Contents of 9 compounds (n = 6). Columns: Analytes; Contents (%) for S1, S2, S3, S4, S5, and S6; Mean ± SD.

Data Availability
The data used to support the findings of this study are included within the Supplementary Materials.

Ethical Approval
The use of animals in the present study was permitted by the Ethics Committee of Beijing University of Chinese Medicine, and all animal studies were carried out according to the Guide for Care and Use of Laboratory Animals.

Conflicts of Interest
The authors declare no conflicts of interest.