Identification of Bioactive Components of Stephania epigaea Lo and Their Potential Therapeutic Targets by UPLC-MS/MS and Network Pharmacology

Stephania epigaea is an important traditional folk medicinal plant. Elucidating its bioactive compound profile and the molecular mechanisms by which these compounds act on human health would improve understanding of its traditional therapeutic uses and guide their application in preclinical and clinical settings. This study aims to detect the critical therapeutic compounds, predict their targets, and explore potential therapeutic molecular mechanisms. Metabolites from the roots, stems, and flowering twigs of S. epigaea were first determined by widely targeted metabolomic analysis. The drug-likeness and pharmacokinetic profiles of the compounds were then screened with the ADMETlab server. The target proteins of the active compounds were further analyzed by PPI combined with GO and KEGG enrichment analysis. Finally, the interaction networks between essential compounds, targets, and disease-associated pathways were constructed, and the binding of essential compounds to their possible target proteins was verified by molecular docking. Five key target proteins (EGFR, HSP90AA1, SRC, TNF, and CASP3) and twelve correlated metabolites, including aknadinine, cephakicine, homostephanoline, and N-methylliriodendronine, associated with the medicinal applications of S. epigaea were identified, and the compound-protein interactions were verified. The key active ingredients accumulate mainly in the root, which indicates that the root is the main medicinal tissue. This study suggests that S. epigaea might exert its therapeutic efficacy mainly through twelve components interacting with five essential target proteins. EGFR is the most critical of these and deserves further verification by biological studies.


Introduction
Stephania epigaea Lo, belonging to the family Menispermaceae, is an herbaceous liana that grows mainly in limestone hills and is found in Guangdong, Guangxi, Hainan, Yunnan, and Sichuan provinces of China, where it is called "dì bù róng," "jīn bù huàn," or "shān wū guī" [1,2]. Its root tuber has been used by local people as a traditional folk medicine with anti-inflammatory, pain-relieving, and sedative effects to treat cancer, fever, cough, malaria, diarrhea, bellyache, stomachache, and injuries from falls and fractures [3][4][5]. A total of 40 alkaloids have been identified from the plant since its chemical constituents were first reported in 1975 [6]. They are divided into seven categories (protoberberine-, aporphine-, morphine-, hasubanan-, benzylisoquinoline-, bisbenzylisoquinoline-, and azafluoranthene-type alkaloids) and have been evaluated for biological activities such as acetylcholinesterase (AChE) inhibitory, cytotoxic, anti-inflammatory, and antitumor activities [7][8][9][10]. However, these confirmed biological activities do not fully explain the traditional folk medicine applications of S. epigaea. Therefore, to better understand the molecular mechanisms underlying these applications, the roots, stems, and flowering twigs (referred to as flowers) of S. epigaea were collected, and widely targeted metabolomic analysis was carried out by UPLC-ESI-MS/MS. Furthermore, a network pharmacology study on multiple compounds, multiple targets, and multiple pathways was performed to gain insight into the mechanisms of the medicinal activity of S. epigaea.

Plant Materials.
The plants were collected in Dali city and taxonomically identified as S. epigaea Lo by Professor Guo Fenggen of Yunnan Agricultural University. The specimens (201703066) were kept in the herbarium of the School of Agronomy and Biotechnology, Yunnan Agricultural University.

Metabolite Extraction.
The fresh roots, stems, and flowers of five-year-old plants were each freeze-dried in a lyophilizer (Scientz-100F) and crushed using a mixer mill (MM 400, Retsch) with a zirconia bead for 1.5 min at 30 Hz. A 100 mg portion of the powder was extracted overnight with 0.6 mL of 70% methanol at 4°C. Following centrifugation at 10,000 × g for 10 min, the supernatant was filtered through a microporous membrane (0.22 μm pore size) and used for UPLC-MS/MS analysis.

UPLC-ESI-MS/MS Analytical Conditions.
All samples were analyzed using a UPLC-ESI-MS/MS system (UPLC, Shim-pack UFLC SHIMADZU CBM30A system, https://www.shimadzu.com.cn/; tandem mass spectrometry (MS/MS), Applied Biosystems 4500 Q TRAP, https://www.appliedbiosystems.com.cn/). The analytical conditions were as follows: UPLC: column, Agilent SB-C18 (1.8 µm, 2.1 mm × 100 mm); the mobile phase consisted of solvent A (0.1% formic acid in HPLC grade water) and solvent B (acetonitrile). Sample measurements were performed with a gradient program starting at 95% A and 5% B. Within 9 min, a linear gradient to 5% A and 95% B was programmed, and a composition of 5% A and 95% B was kept for 1 min. Subsequently, the composition was returned to 95% A and 5.0% B within 1 min and then held to 10.00 min before the following analysis. The column oven was set to 40°C, the injection volume was 4 μL, and the flow rate was 0.35 mL/min. The effluent was connected to an ESI-triple quadrupole linear ion trap (QTRAP)-MS.
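The gradient program above can be sketched as a small lookup, which makes it easy to check the solvent composition at any time point. This is only an illustration: the breakpoint table and the function names are assumptions, and only the three unambiguous segments (9 min ramp, 1 min hold, 1 min return) are modeled, not the re-equilibration hold.

```python
from bisect import bisect_right

# (time in min, percent solvent B) breakpoints read from the gradient description.
# Assumption: the return to starting conditions runs from 10 to 11 min.
GRADIENT = [(0.0, 5.0), (9.0, 95.0), (10.0, 95.0), (11.0, 5.0)]


def percent_b(t_min: float) -> float:
    """Linearly interpolate %B (acetonitrile) at t_min; clamp outside the table."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return GRADIENT[0][1]
    if t_min >= times[-1]:
        return GRADIENT[-1][1]
    i = bisect_right(times, t_min) - 1  # index of the segment containing t_min
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min: float) -> float:
    """Solvent A (0.1% formic acid in water) is the remainder of the binary mix."""
    return 100.0 - percent_b(t_min)
```

For instance, `percent_b(4.5)` gives the composition halfway through the ramp.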
Linear ion trap (LIT) and triple quadrupole (QQQ) scans were acquired on a triple quadrupole-linear ion trap mass spectrometer (Q TRAP), an API 4500 Q TRAP UPLC/MS/MS system equipped with an ESI Turbo Ion-Spray interface, operating in positive and negative ion modes and controlled by Analyst 1.6.3 software (AB Sciex). The ESI source operation parameters were as follows: ion source, turbo spray; source temperature, 550°C; ion spray voltage (IS), 5500 V (positive ion mode)/-4500 V (negative ion mode); ion source gas I (GSI), gas II (GSII), and curtain gas (CUR) were set at 50, 60, and 30 psi, respectively; and the collision gas (CAD) was high. Instrument tuning and mass calibration were performed with 10 and 100 μmol/L polypropylene glycol solutions in QQQ and LIT modes. QQQ scans were acquired as MRM experiments with collision gas (nitrogen) set to 5 psi. The declustering potential (DP) and collision energy (CE) for individual MRM transitions were set after further DP and CE optimization [11]. A specific set of MRM transitions was monitored for each period according to the metabolites eluted within this period.

Metabolite Analysis.
To compare the differences in the metabolites, the mass spectral peaks of each metabolite detected in different samples were corrected to ensure the accuracy of qualitative and quantitative analyses. The mass spectrometry file of each sample was opened with MultiQuant version 3.0.3 software (Sciex, Darmstadt, Germany), and the chromatographic peaks were integrated and corrected. The area of each chromatographic peak represents the relative level of the corresponding substance. Based on the MetWare self-built plant-specific metabolite database (MWDB), substances were identified from their secondary mass spectrometry information after removing isotopic signals, repeated signals containing K+, Na+, and NH4+ ions, and signals of fragment ions derived from other, larger molecules. The metabolites were quantified in the multiple reaction monitoring (MRM) mode of triple quadrupole mass spectrometry [12]. The UPLC-MS/MS data were processed using Analyst 1.6.3 software (AB Sciex) with default parameters. Quality control (QC) samples were prepared by mixing sample extracts and were treated in the same way as the samples to assess repeatability. During instrumental analysis, a QC sample was analyzed every ten samples to monitor the repeatability of the UPLC-MS/MS system over the entire detection process.
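The QC scheme described above (one pooled QC injection every ten samples) can be sketched as an injection-sequence builder. The function name and labels are hypothetical, and whether QC injections also open or close the run is not stated in the text, so this sketch adds none.

```python
def injection_sequence(samples, qc_label="QC", every=10):
    """Return the run order with a pooled QC injection after each block of `every` samples."""
    sequence = []
    for position, sample in enumerate(samples, start=1):
        sequence.append(sample)
        if position % every == 0:  # a full block of samples has just been injected
            sequence.append(qc_label)
    return sequence


# Hypothetical sample names, for illustration only.
run_order = injection_sequence([f"S{i:02d}" for i in range(1, 26)])
```

With 25 samples, this places a QC injection after the 10th and the 20th sample.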

Acquisition and Processing for Bioinformatics of Metabolites.
Absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of the metabolites of S. epigaea were predicted using the ADMETlab 2.0 server (https://admetmesh.scbdd.com/) [13], a free online platform for predicting the ADMET and drug-likeness properties of a compound. The two-dimensional (2-D) and three-dimensional (3-D) structures of the metabolites were obtained from PubChem (https://pubchem.ncbi.nlm.nih.gov/). The putative protein targets of the metabolites were retrieved from SwissTargetPrediction (https://www.swisstargetprediction.ch/) [14]. These targets were used to construct the protein-protein interaction (PPI) network with the online tool STRING v.11.0 (https://string-db.org/), from which the GO [15] and KEGG [16] enrichment results were obtained. Finally, the enriched pathways were used to search for the relevant disease pathways in the KEGG database (https://www.kegg.jp/).

Network Construction and Statistical Analysis.
Evidence-Based Complementary and Alternative Medicine
The networks of protein-protein interaction (PPI), compound-target (C-T), target-pathway (T-P), and pathway-disease (P-D) relationships, together with the comprehensive network (C-T-P-D), were constructed and visualized using Cytoscape v.3.9.0 (https://cytoscape.org/) [17]. For functional module identification, the two-mode T-P relationships were first transformed into one-mode target-target (T-T) relationships using Excel2Pajek 5.14 (https://mrvar.fdv.uni-lj.si/pajek/) [18]. Then, the target-pathway-disease (T-P-D) and T-T networks were constructed using Gephi v.0.9.2 (https://gephi.org/), in which the modularity classes were analyzed and identified using the Louvain algorithm with a resolution of 1.0.32 [19]. In addition, each functional module was evaluated by its contribution score (CS) to a specific disease category, calculated according to references [20,21], and the module with the highest contribution value was obtained. In this module, the most critical target was evaluated by the protein's integrated centrality (IC), calculated by the following equation.
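A form consistent with the symbol definitions given here, and with the stated range of 0 to 1, is an equally weighted average of the four min-max-normalized centralities. The equal weighting is an assumption, not something the text states explicitly:

```latex
IC_i = \frac{1}{4}\left(
    \frac{DC_i - DC_{\min}}{DC_{\max} - DC_{\min}}
  + \frac{BC_i - BC_{\min}}{BC_{\max} - BC_{\min}}
  + \frac{CC_i - CC_{\min}}{CC_{\max} - CC_{\min}}
  + \frac{EC_i - EC_{\min}}{EC_{\max} - EC_{\min}}
\right)
```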
where IC_i refers to the integrated centrality of target i; DC_i, BC_i, CC_i, and EC_i refer to the degree, betweenness, closeness, and eigenvector centrality of target i; DC_min, BC_min, CC_min, and EC_min refer to the minimum degree, betweenness, closeness, and eigenvector centralities of the functional module; and DC_max, BC_max, CC_max, and EC_max refer to the maximum degree, betweenness, closeness, and eigenvector centralities of the functional module. The value of IC ranges from 0 to 1. The higher the IC value of a target, the more important it is in its functional module from the topological perspective.
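The integrated centrality described above can be sketched in a few lines. This assumes the four centralities are min-max normalized within the module and averaged with equal weight, which matches the stated 0 to 1 range but is an assumption. The function names and the example values are hypothetical.

```python
def min_max(values):
    """Min-max normalize a {target: centrality} mapping to the range 0..1."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    # If all targets share the same value, the measure carries no information.
    return {t: ((v - lo) / span if span else 0.0) for t, v in values.items()}


def integrated_centrality(dc, bc, cc, ec):
    """Average the normalized degree, betweenness, closeness, and eigenvector centralities."""
    normalized = [min_max(measure) for measure in (dc, bc, cc, ec)]
    return {t: sum(n[t] for n in normalized) / 4 for t in dc}


# Hypothetical centrality values for three targets in one module.
dc = {"A": 2, "B": 5, "C": 8}
bc = {"A": 0.0, "B": 0.1, "C": 0.4}
cc = {"A": 0.3, "B": 0.5, "C": 0.7}
ec = {"A": 0.1, "B": 0.4, "C": 0.9}
ic = integrated_centrality(dc, bc, cc, ec)
top_target = max(ic, key=ic.get)
```

A target that is maximal on all four measures scores 1, and one that is minimal on all four scores 0.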

Molecular Docking Simulation.
To confirm the binding affinity of an essential protein target to the metabolites of S. epigaea, molecular docking simulation was performed using AutoDock Vina v.1.2.0 (https://vina.scripps.edu/) [22]. The 3D protein structure was downloaded as a pdb file from the PDB database (https://www.rcsb.org/) and loaded into PyMOL v.2.5.2 (https://pymol.org/2/) to remove water molecules and other ligands from the structure before it was saved as a pdb file [23]. Polar hydrogens and charges were added to the protein structure using MGLTools (https://mgltools.scripps.edu/), and the structure was saved as a pdbqt file. The protein grid box was set in MGLTools to cover the entire protein molecule with a spacing of 1 angstrom (Å), and the grid box coordinates were saved as a text file [24]. The 3D metabolite structure was downloaded as an sdf file from PubChem (https://pubchem.ncbi.nlm.nih.gov/) and converted to a pdb file using Open Babel (https://openbabel.org/wiki/Main_Page) [25]. Charges were added, and the torsion tree was constructed using MGLTools before the structure was saved as a pdbqt file.
Blind docking was performed with AutoDock Vina [22]: the protein structure in pdbqt format was set as the receptor, the metabolite structure in pdbqt format was set as the ligand, and the grid box coordinates were copied from the txt file of the protein grid box. Once docking was complete, the ligand configurations in the protein structure were generated and saved as a pdbqt file. The corresponding binding free energy changes (ΔE) of these configurations were calculated and saved as a log.txt file [22]. The docking structures were visualized in PyMOL [26] by loading both the protein structure and the ligand configurations in pdbqt format. The images of molecular docking were exported from PyMOL as png files.
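As a rough illustration of the blind-docking setup, the sketch below derives a box enclosing the whole protein from the ATOM coordinates of a PDB file and writes the matching AutoDock Vina configuration text. It is not the authors' MGLTools procedure: the padded bounding box, the function names, and the file names are assumptions. The config keys (`receptor`, `ligand`, `center_x`, `size_x`, `out`, and so on) are standard Vina options.

```python
def bounding_box(pdb_text, padding=5.0):
    """Return (center, size) in angstroms of a box enclosing all ATOM/HETATM records."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # PDB fixed-width format: x, y, z occupy columns 31-38, 39-46, 47-54.
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("no ATOM/HETATM records found")
    mins = [min(c[axis] for c in coords) for axis in range(3)]
    maxs = [max(c[axis] for c in coords) for axis in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def vina_config(receptor, ligand, center, size, out):
    """Build the text of a Vina config file for one receptor-ligand pair."""
    keys = ["center_x", "center_y", "center_z", "size_x", "size_y", "size_z"]
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    lines += [f"{key} = {value:.3f}" for key, value in zip(keys, (*center, *size))]
    lines.append(f"out = {out}")
    return "\n".join(lines) + "\n"
```

The resulting text can be saved to a file and passed to Vina with `vina --config <file>`. The padding value is a free choice and would need tuning for real structures.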

Widely Targeted Metabolite Profiling in Different Tissues.
Through widely targeted metabolomic analysis, 518 metabolites were detected (Table S1) and categorized into 8 classes of natural compounds (alkaloids, amino acids, flavonoids, lignans, lipids, nucleotides, organic acids, and phenolic acids) plus others. Principal component analysis (PCA) with the R package PCAtools (https://github.com/kevinblighe/PCAtools) showed significant chemical differences among the tested samples, with an obvious separation trend among the metabolite profiles of the three tissues (Figure 1(a)). As shown in Figure 1(b), the roots, stems, and flowers contained 445, 482, and 472 metabolites, respectively. Lipids accounted for about 20% of the metabolites in each tissue. Alkaloids were the second most abundant class, and their share differed among tissues: 84 alkaloids in roots (18.88%), 79 in stems (16.39%), and 70 in flowers (14.83%). Apart from flavonoids (4.72% in roots, 9.34% in stems, and 8.47% in flowers), the proportions of the other metabolite classes were similar in each tissue. However, Venn diagram visualization [27] of the differential metabolites in each comparison showed 268 differential metabolites between root and stem, 297 between flower and stem, and 326 between root and flower, of which 125 were common to all three comparisons (Figure 1(c)). Moreover, hierarchical cluster analysis (HCA) with a heatmap [28] of metabolite accumulation patterns among the samples suggested the same trend: the heterogeneity between stems and flowers, and between roots and stems, was lower than that between roots and flowers (Figure 1(d)). Therefore, the metabolite characteristics of the roots, stems, and flowers of S. epigaea were different.

Prediction of ADMET and Drug-Likeness Properties.
To gain insight into the pharmacokinetic profiles of the 518 metabolites of S. epigaea and their potential to become drugs, we used ADMETlab 2.0 to predict their ADMET and drug-likeness properties. The predicted results are presented in Table S2 (see Supplementary Data). Two hundred metabolites were suggested as putative active compounds because they have relatively good drug-likeness and lower toxicity overall, according to the excellent selection criteria of ADMETlab 2.0 [13], e.g., QED score ≥ 0.67, Lipinski

Target Identification and Bioinformatic Mining of Metabolites.
The 200 putative active compounds were submitted to target prediction with the SwissTargetPrediction online tool [14]. A total of 393 proteins were retrieved as putative targets, with the species limited to "Homo sapiens" and a probability greater than 0.1. The results are shown in Table S3 (see Supplementary Data). The 393 putative protein targets were imported into the STRING database for bioinformatic mining, with the organism set to "Homo sapiens" and the minimum required interaction score set to a confidence score >0.4. Then, KEGG enrichment data were downloaded for further network pharmacology research, in which a total of 165 pathways were obtained with the criterion of false discovery rate (FDR) <0.05 [29]. The results are shown in Table S4 (see Supplementary Data). A total of 46 KEGG disease entries were associated with 62 human disease pathways and one environmental information processing (signal transduction) pathway that links nephrotic syndrome to urinary system disease (Table S5). A total of 167 of the 393 putative targets were enriched in these disease pathways. These diseases are categorized as infectious, cancer, neurodegenerative, urinary system, metabolic, mental, immune system, cardiovascular, drug resistance, and substance dependence.
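The FDR <0.05 criterion above filters pathways on multiple-testing-adjusted p-values. The sketch below shows the Benjamini-Hochberg adjustment, assuming that this is the correction behind the reported FDR values. STRING returns the adjusted values directly, so this is only to illustrate what the threshold means. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw enrichment p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw)
kept = [i for i, q in enumerate(fdr) if q < 0.05]
```

Only pathways whose adjusted value falls below 0.05 would be retained for the network analysis.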

Network Construction and Analysis.
A network of interrelated targets, pathways, and disease categories (T-P-D) was constructed (Figure 2(a)), which consisted of 240 nodes (167 targets, 63 pathways, and 10 disease categories) and 796 edges. As shown in Figure 2(a), a target protein in the constructed network was connected to one or more pathways, which were related to one or more disease types. The T-T network was constructed from the corresponding relationships between targets and pathways to extract the relationships among the targets. The edges of the network represent the common pathways between every two targets. The functional modules of the targets were determined through sets of shared pathways. Targets related to the same pathways have similar biological functions (Figure 2(b)). Five functional modules (modules 1-5) were identified, where module 1 consisted of 48 targets (28.74% of total targets), module 2 consisted of 30 targets (17.96%), module 3 consisted of 41 targets (24.55%), module 4 consisted of 40 targets (23.95%), and module 5 consisted of 8 targets (4.79%).
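The step from the two-mode target-pathway relationships to the one-mode T-T network, where an edge stands for the pathways two targets share, can be sketched as follows. The study used Excel2Pajek for this. The function name, the target and pathway labels, and the use of the shared-pathway count as edge weight are assumptions for illustration.

```python
from itertools import combinations


def project_targets(target_pathways):
    """Project {target: set_of_pathways} onto target-target edges.

    Returns {(target_a, target_b): number_of_shared_pathways} for every pair
    of targets that share at least one pathway.
    """
    edges = {}
    for a, b in combinations(sorted(target_pathways), 2):
        shared = target_pathways[a] & target_pathways[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical memberships; T* and P* are placeholders, not real targets or pathways.
membership = {
    "T1": {"P1", "P2"},
    "T2": {"P2", "P3"},
    "T3": {"P4"},
}
tt_edges = project_targets(membership)
```

Targets with no shared pathway, like the placeholder T3 here, stay unconnected in the projected network.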
The contribution of each functional module to a particular disease category can be evaluated using the contribution score (CS) of each module. The CSs of the five functional modules to the 10 disease categories are shown in Figure 3. The CSs of the five modules to a disease category sum to 1, and the larger the CS value of a module, the greater the module's contribution to that disease. The targets in modules 1-3 were extracted to calculate the integrated centrality (IC) and identify the most important targets. The IC values were determined and are shown in [...] TNF [32], CASP3 [33], and HSP90AA1 [34], which are all closely related to oncogenesis, cell growth, and immunomodulation.

Molecular Docking Verification.
Twelve compounds correlated with the five most important targets were extracted, and their interactions were further verified by molecular docking simulation. The binding affinity of a ligand-target complex was evaluated by the binding energy change (ΔE), where a more negative binding energy indicates a stronger binding affinity, that is, a greater binding constant for the formation of the ligand-target complex. Table 1 shows the binding energies of the 12 compounds to the five essential targets, which ranged from −9.1 kcal/mol to −4.9 kcal/mol. A binding energy of ≤−5.0 kcal/mol indicates strong binding between a ligand and its target [35][36][37]. The strongest binding was observed between N-methylliriodendronine and SRC, with a binding energy of −9.1 kcal/mol. Because stronger binding of a ligand to its protein target generally implies higher potency of the ligand, the binding affinity data can guide the selection of ligand-target pairs from each functional module for experimental validation of the efficacy of the compounds against illnesses and of their therapeutic outcomes.

The compound structures were shown as colored sticks. Carbon atoms and carbon-carbon bonds were colored green, oxygen atoms red, hydrogen atoms grey, and nitrogen atoms blue.

Figure 6: Accumulation of twelve compounds in the three tissues, where StR represents root, StS represents stem, and StF represents flower.
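The link between binding energy and binding constant mentioned in the docking paragraph above can be made concrete with the relation ΔG = RT ln K_d. The sketch below treats the docking score as if it were a standard binding free energy at 298.15 K. Both are assumptions, and docking scores are approximate, so the results are order-of-magnitude illustrations only. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_e_kcal, temperature_k=298.15):
    """Estimate K_d (mol/L) from a binding energy via delta_G = R*T*ln(K_d)."""
    return math.exp(delta_e_kcal / (R_KCAL * temperature_k))


# The two ends of the scale discussed in the text.
kd_strongest = dissociation_constant(-9.1)  # N-methylliriodendronine with SRC
kd_threshold = dissociation_constant(-5.0)  # the "strong binding" cutoff
```

Under these assumptions, −9.1 kcal/mol corresponds to a K_d on the order of 10^-7 mol/L and the −5.0 kcal/mol cutoff to one on the order of 10^-4 mol/L, a difference of about three orders of magnitude.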

Discussion and Conclusions
As mentioned above, the medicinal plant Stephania epigaea Lo has traditionally been used to treat fever, cough, malaria, diarrhea, bellyache, and injuries from falls and fractures [3,4], and antiproliferative/anticancer, immunomodulating, and apoptosis-related activities have been newly discovered [7,38]. In this study, 518 metabolites were detected in the roots, stems, and flowers of S. epigaea through widely targeted metabolomic analysis. Among them, 200 were selected as putative active compounds because they have relatively good drug-likeness and lower toxicity overall, and 393 putative proteins were predicted as targets of the selected compounds. Through PPI analysis, GO and KEGG enrichment analysis, correlation analysis of the diseases associated with the enriched pathways, and the construction and analysis of various biological networks, five essential protein targets and twelve key metabolites were finally identified and verified by molecular docking simulation to be associated with the traditional folk medicine applications of S. epigaea (Table 1). Figure 5 shows that S. epigaea may exert its pharmacological effects on humans through multiple components acting on five essential protein targets. It is noteworthy that EGFR is the common target of most of the alkaloid and phenolic compounds. Furthermore, molecular docking showed that these compounds can bind well to EGFR and form hydrogen bonds with the critical amino acid residue M769, like the original ligand (erlotinib) [39]. EGFR is present on the surface of cells and is involved in cell growth. By modulating EGFR kinase activity, the binding of these compounds may prevent cancer cells from growing or promote normal cell growth, helping damaged tissues or organs recover from injuries such as falls or fractures. Accordingly, some inhibitors of EGFR kinase, such as erlotinib, afatinib, and osimertinib, are used for cancer treatment [40][41][42].
Both cephakicine and N-methylliriodendronine can bind to EGFR, while cephakicine can also bind to CASP3 and N-methylliriodendronine can dock with SRC, indicating that these compounds may simultaneously affect the regulation of multiple pathways to suppress cancer or tumor initiation and development [43,44]. We also found that both homostephanoline and aknadinine can bind to EGFR and HSP90AA1. The latter is a molecular chaperone that, in cancer cells, supports the folding and activation of oncoproteins, including many kinases and transcription factors involved in cell growth and proliferation, so compounds that modulate Hsp90 activity may act as a buffer on the activity of these regulators. Therefore, this study provides a basis for drug discovery based on these four compounds and their analogs for cancer/tumor treatment, subject to further biological and pharmaceutical studies.
The PCA showed significant chemical differences among the tested samples, indicating an obvious separation trend between the metabolites of the three tissues (Figure 1), and the qualitative and quantitative profiles of the twelve key active components differ among the three tissues. As shown in Figure 6, the root contains all twelve active ingredients. Except for caffeic acid (Figure 6(d)), tyrosine (Figure 6(f)), and aknadinine (Figure 6(k)), the content of each compound is highest in the root among the three tissues, while the stem has a lower content of caffeic acid (Figure 6(d)). Cephakicine (Figure 6(a)), 3,4-dihydroxy-DL-phenylalanine (Figure 6(h)), N-methylliriodendronine (Figure 6(j)), and 1-O-β-D-glucopyranosyl sinapate (Figure 6(l)) are absent from the flowers. These results are consistent with the traditional use of the root tuber by local people as a folk medicine for some diseases.

Data Availability
The datasets generated and analyzed during the current study were uploaded with the manuscript as Supplementary Materials.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.