A New Strategy to Uncover the Anticancer Mechanism of Chinese Compound Formula by Integrating Systems Pharmacology and Bioinformatics

Cancer is one of the major refractory diseases threatening human health. Complementary and alternative medicine (CAM) has gradually become an option for patients because of the high cost of the leading cancer treatments (surgery, radiotherapy, and chemotherapy) and their severe adverse effects. As a critical component of CAM, traditional Chinese medicine (TCM) has been increasingly applied in preventing and treating cancer over the past few decades. Huanglian Jiedu Decoction (HJD), a classical Chinese compound formula, has been recognized to exert a beneficial effect on cancer treatment, with few adverse effects reported. Nevertheless, its precise molecular mechanism remains unclear. In this study, we integrated systems pharmacology and bioinformatics to explore the major active anticancer ingredients of HJD, its targets for cancer treatment, and the related mechanisms of action. These targets were analyzed using the web-based Gene Set Analysis Toolkit (WebGestalt), and 10 KEGG pathways were identified by enrichment analysis. Refined analysis of the KEGG pathways indicated that the anticancer effect of HJD was functionally correlated with the p53 signaling pathway; moreover, HJD had a potential therapeutic effect on prostate cancer (PCa) and small cell lung cancer (SCLC). Genetic alterations and survival associated with the key targets for cancer treatment were then examined in both PCa and SCLC. Our results suggest that this integrated research strategy may serve as a new paradigm to guide future research on Chinese compound formulas. Importantly, the strategy contributes to studying the anticancer effects and mechanisms of action of Chinese compound formulas and lays a foundation for their clinical application.


Introduction
According to a WHO report, cancer has become a leading cause of death and is associated with high recurrence and mortality rates. In 2012 alone, there were about 14 million new cancer cases and 8.2 million cancer-related deaths. The number of new cases per year is estimated to increase from 14 million to 22 million over the coming 20 years [1]. The existing anticancer treatments mainly include surgery, radiotherapy, and chemotherapy. However, patients may eventually choose to discontinue treatment because of the high cost of radiotherapy and chemotherapy and their serious adverse effects [2]. With the development of medicine, cancer is now managed with comprehensive and diversified treatment, and complementary and alternative medicine (CAM) has become an option for patients under such circumstances. Traditional Chinese medicine (TCM), a critical component of CAM, has been increasingly applied in preventing and treating cancer over the past few decades [3,4]. As an adjuvant therapy, Chinese medicine shows a beneficial effect on cancer treatment, with few adverse effects reported [5].
Huanglian Jiedu Decoction (HJD), first recorded in the Prescriptions for Emergent Reference (Zhouhou Beiji Fang) written by Ge Hong, consists of four herbs: Coptidis Rhizoma (Huanglian), Scutellariae Radix (Huangqin), Phellodendri Chinensis Cortex (Huangbo), and Gardeniae Fructus (Zhizi). HJD is a representative formula for cancer treatment and is frequently used in clinical practice to treat pancreatic cancer, breast cancer, liver cancer, and colorectal cancer (CRC) [6]. For instance, pharmacological experiments suggest that HJD has an anticancer effect on human liver cancer cells both in vitro and in vivo and can markedly extend the survival time of liver cancer-bearing mice [7,8]. However, the precise mechanism of its anticancer effect remains unclear.
Evidence-Based Complementary and Alternative Medicine
A Chinese compound formula is characterized by the synergistic effects of multiple components acting on multiple targets. A method suited to these characteristics is therefore needed to reveal the underlying mechanism of action. Systems pharmacology is a new discipline that studies the regularity and mechanism of drug-organism interactions at the system level [9]. It examines the changes in body function caused by drug treatment across molecules, cells, tissues, and organs. Moreover, it establishes the interrelationships between drug efficacy and the organism at both the microscopic level (molecular and biochemical networks) and the macroscopic level (tissues, organs, and the whole body). In addition, abundant cancer data have been produced in recent years with the rapid development of bioinformatics technologies, including microarrays, proteomics, and other high-throughput screening assays. By integrating systems pharmacology and bioinformatics, this study aimed to explore the relationships of HJD with its cancer-related targets and interacting genes and to reveal the underlying molecular mechanisms of action. Such a strategy should be helpful for investigating the anticancer effect and mechanism of action of Chinese compound formulas and could also provide a basis for clinical application. A flowchart of the research approach is presented in Figure 1. A Chinese herbal compound formula can be regarded as a weak inhibitor with multiple components and multiple targets, among which synergistic effects exist. We aimed to explore how such a formula actually works in the treatment of cancer; however, its composition is complex, and not every component contributes to the effect. Therefore, we screened the main active components using multiple parameters and predicted the targets of the active ingredients, so as to infer the therapeutic effect.

Construction of Cancerous Target Network and Chemical Component Database.
All targets for cancer treatment were retrieved from the DrugBank database (http://www.drugbank.ca/), and the cancerous target network was constructed with Cytoscape [12]. HJD comprises four herbs: Coptidis Rhizoma (Huanglian), Scutellariae Radix (Huangqin), Phellodendri Chinensis Cortex (Huangbo), and Gardeniae Fructus (Zhizi). All chemical components of these herbs were collected from the TcmSP [13], TcmID [14], TCM Database@Taiwan [15], and NCBI PubChem databases and were standardized into a constituent dataset that supplemented the TcmSP database. Finally, the number of chemical compounds in HJD was obtained, as shown in the Appendix.

Screening the Active Ingredients by OB Prediction.
Oral bioavailability (OB, %F), the fraction of an orally administered dose that reaches systemic circulation unchanged, is one of the most commonly used pharmacokinetic parameters in drug screening cascades. Because it is difficult to assess the bioavailability of complex TCM formulas by "wet" experiments, a robust computational system, OBioavail 1.1 [16], was employed to predict the OB of the compounds. This system combines metabolism (cytochrome P450 3A4) and transporter (P-glycoprotein) information. Using this system, compounds with low OB could be discarded, so that the original compound set was reduced to a smaller set suitable for analyzing Chinese compound formulas. Compounds with an OB of ≥30% were selected as candidate active ingredients in this study. This threshold was selected based on (1) the use of a minimum number of components to maximally extract HJD information and (2) the fact that the obtained model could be reasonably explained by the reported pharmacological data.

Screening the Active Ingredients by Drug-Likeness Prediction.
Before target prediction, compounds considered chemically unsuitable as drugs were removed using a drug-likeness index. Drug-likeness reflects a delicate balance among the molecular properties affecting pharmacodynamics and pharmacokinetics, which ultimately influence the absorption, distribution, metabolism, and excretion (ADME) of a compound in the human body. In this study, the drug-likeness (DL) index of a new compound was calculated according to the Tanimoto similarity [17]:
$$T(A, B) = \frac{A \cdot B}{\lVert A \rVert^{2} + \lVert B \rVert^{2} - A \cdot B}$$
where A represents the molecular descriptors of the new compound and B represents the average molecular descriptors of all 6511 molecules in the DrugBank database, both based on Dragon software descriptors. Molecules with a drug-likeness of <0.18 were removed. Finally, compounds with both an OB of ≥30% and a DL of ≥0.18 were considered the active ingredients.
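The two-step screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Tanimoto coefficient is written in its standard vector form, T(A, B) = A·B / (|A|² + |B|² − A·B), the compound names and OB/DL values are hypothetical, and the function names `tanimoto` and `screen` are not part of OBioavail or Dragon.

```python
from math import fsum


def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    dot = fsum(x * y for x, y in zip(a, b))
    denom = fsum(x * x for x in a) + fsum(y * y for y in b) - dot
    return dot / denom if denom else 0.0


def screen(compounds, ob_min=30.0, dl_min=0.18):
    """Keep compounds meeting both thresholds (OB in percent, DL unitless)."""
    return [c["name"] for c in compounds
            if c["ob"] >= ob_min and c["dl"] >= dl_min]


# Hypothetical values for illustration only; not data from the study.
compounds = [
    {"name": "cpd_A", "ob": 36.9, "dl": 0.78},
    {"name": "cpd_B", "ob": 33.5, "dl": 0.21},
    {"name": "cpd_C", "ob": 45.0, "dl": 0.10},  # fails the DL threshold
    {"name": "cpd_D", "ob": 12.0, "dl": 0.40},  # fails the OB threshold
]
active = screen(compounds)
print(active)
```

Applying both thresholds in a single pass mirrors the final criterion in the text (OB ≥30% and DL ≥0.18); the order of the two filters does not change the result.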

Prediction of the Targets of Active Ingredients.
SysDT [18], a drug-target prediction model, was adopted to predict the targets of the active ingredients. Briefly, SysDT is based on the 6511 drugs and 3987 targets in the DrugBank database and the known associations between them. It was established using the random forest algorithm and the support vector machine (SVM) algorithm, respectively. The model constructed by SVM proved superior, with a concordance of 82.83%, a sensitivity of 81.33%, and a specificity of 93.62%. Using this model, targets with an SVM score of >0.7 were predicted as the putative targets of the active ingredients. In addition, target information was integrated from the SEA [19], STITCH [20], TTD [21], and HIT [22] databases to supplement the predictive model. Information on the physiological functions of all targets was obtained from the TTD and UniProt databases.
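As a rough sketch of the target-assignment step, the code below keeps compound-target pairs whose score exceeds 0.7 and then merges them with pairs taken from curated databases. The score table, the identifiers, and the function names are hypothetical; SysDT itself is not called here, and its actual output format is not described in the text.

```python
def putative_targets(scores, threshold=0.7):
    """Return {compound: sorted targets} for pairs scoring strictly above threshold."""
    result = {}
    for (compound, target), score in scores.items():
        if score > threshold:
            result.setdefault(compound, set()).add(target)
    return {c: sorted(t) for c, t in result.items()}


def merge_targets(predicted, curated):
    """Union of model-predicted and database-derived targets per compound."""
    merged = {c: set(t) for c, t in predicted.items()}
    for compound, targets in curated.items():
        merged.setdefault(compound, set()).update(targets)
    return {c: sorted(t) for c, t in merged.items()}


# Hypothetical scores and database hits, for illustration only.
scores = {
    ("cpd_A", "T1"): 0.91,
    ("cpd_A", "T2"): 0.70,  # not strictly above 0.7, so excluded
    ("cpd_B", "T2"): 0.85,
}
curated = {"cpd_A": ["T3"], "cpd_C": ["T1"]}

predicted = putative_targets(scores)
combined = merge_targets(predicted, curated)
print(combined)
```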

Construction of the Network and Topological Analysis.
Associations between the active ingredients and their putative targets were assembled into the compound-target network of HJD using Cytoscape [12], which was then mapped onto the cancerous target network to obtain the compound-cancer target network of HJD, containing all HJD-related targets for cancer treatment. Afterwards, the protein-protein interaction (PPI) network of the HJD-related targets for cancer treatment was constructed with STRING [23]. Subsequently, topological analysis was performed using the Network Analyzer plug-in to output the main topological parameters of this network [24].
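Conceptually, mapping the compound-target network onto the cancerous target network is a set intersection per compound. The following minimal sketch assumes each network is available as a plain Python mapping or list; the identifiers are placeholders, and the function name is hypothetical rather than a Cytoscape operation.

```python
def compound_cancer_network(compound_targets, cancer_targets):
    """Keep, for every compound, only the targets that are also cancer targets."""
    cancer = set(cancer_targets)
    network = {}
    for compound, targets in compound_targets.items():
        hits = sorted(set(targets) & cancer)
        if hits:  # compounds without any cancer target are dropped
            network[compound] = hits
    return network


# Placeholder identifiers, for illustration only.
compound_targets = {
    "cpd_A": ["T1", "T2"],
    "cpd_B": ["T4"],
    "cpd_C": ["T2", "T3"],
}
cancer_targets = ["T2", "T3", "T9"]

network = compound_cancer_network(compound_targets, cancer_targets)
related_targets = sorted({t for hits in network.values() for t in hits})
print(network, related_targets)
```

The union of the retained targets (`related_targets`) corresponds to the target list that would then be submitted to a PPI resource such as STRING.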

Screening Key Targets and KEGG Pathway Enrichment Analysis.
Centrality is a key measure of the importance of a node in a network: the larger the value, the more important the node and the greater its influence on the structure and function of the whole network. In this study, degree centrality was adopted as the major measure, supplemented by closeness centrality and betweenness centrality, to select and evaluate the key anticancer targets of HJD. The biological information embedded in the anticancer targets was then analyzed using a web-based integrated data mining system, WebGestalt [25]. Biochemical pathways and functions linked to the anticancer targets of HJD were queried with the KEGG pathway enrichment analysis tool in WebGestalt. Finally, the top 10 pathways with an adjusted P value of <0.01 were selected.

Exploration of the Cancer Genomics Data Linked to HJD by cBio Cancer Genomics Portal.
The cBio Cancer Genomics Portal (http://cbioportal.org), an open platform for exploring multidimensional cancer genomics data, condenses the molecular profiling data obtained from cancer tissues and cell lines into readily understandable genetic, epigenetic, gene expression, and proteomic events [26]. Complex cancer genomics profiles can be accessed through the query interface of the Portal, which enables researchers to explore and compare genetic alterations across samples. Furthermore, the underlying data can be linked to clinical outcomes, which facilitates novel discoveries in biological systems. In this study, the cBio Portal was used to examine the connectivity of the HJD-related targets for cancer treatment across all PCa and SCLC studies available in its databases. These targets were classified as altered or nonaltered in all samples of the PCa and SCLC studies using the Portal search function. The genomics datasets were then presented as a heatmap using OncoPrint, a compact visual display of alterations in the queried genes across cancer samples [27]. The Portal can also generate multiple visualizations by grouping PCa- and SCLC-associated alterations using the key HJD-related targets for cancer treatment as input [27-31]. In addition, survival associated with alterations in these targets in PCa and SCLC was analyzed using the survival option embedded in the Portal, a tool that combines the Kaplan-Meier estimate with the survival data in the TCGA database.
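For readers unfamiliar with the Kaplan-Meier estimate mentioned above, the following sketch implements the product-limit estimator in plain Python. It is not the Portal's implementation; the follow-up times and event flags are made up, and `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: return [(t, S(t))] at each distinct event time.

    times  : follow-up durations
    events : 1 if the death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored cases both leave the risk set
    return curve


# Made-up follow-up data (months), for illustration only.
times = [5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Comparing the altered and nonaltered groups would require computing one such curve per group and a log-rank test, which is not shown here.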

Screening the Active Ingredients and Visualization of the Compound-Cancer Target Network.
The compounds contained in the 4 herbs constituting HJD were collected from several databases: Huanglian (48), Huangqin (143), Huangbo (140), and Zhizi (98). A total of 85 compounds with an OB of ≥30% and a DL of ≥0.18 were identified, of which 59 active ingredients acted on anticancer targets (the Appendix). The associations between the active ingredients and their anticancer targets were visualized with Cytoscape, and the resulting compound-cancer target network was used for subsequent analysis (Figure 2).

Construction of the PPI Network of HJD-Related Targets for Cancer Treatment as well as Topological Analysis.
The HJD-related targets for cancer treatment were obtained from the compound-cancer target network. The "protein-protein interaction (PPI) option" embedded in STRING was then used for further analysis, and a PPI network containing 98 interacting targets was identified (Figure 3). The topological features of this network were calculated with the Network Analyzer plug-in (Table 1). The network covered the interactions among the anticancer targets, with an average number of direct neighbors of 20.959. The degree of some nodes was much higher than this average, and the degree distribution across nodes was uneven. Under degree centrality, a higher degree indicates a greater impact of the node on the whole network. Nodes whose degree was at least twice the average number of direct neighbors were therefore defined as hub nodes in this study, indicating their importance in the network for subsequent investigation.
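The hub definition used here (degree at least twice the average number of direct neighbors) is easy to express in code. The sketch below works on a plain edge list; the toy edges are invented, the function names are hypothetical, and isolated nodes are assumed to be absent from the network.

```python
def degrees(edges):
    """Degree of every node in an undirected edge list (self-loops and duplicates ignored)."""
    deg = {}
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            continue
        seen.add(key)
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def hub_nodes(edges, factor=2.0):
    """Return (hubs, mean degree); a hub has degree >= factor * mean degree."""
    deg = degrees(edges)
    mean = sum(deg.values()) / len(deg)
    hubs = sorted(node for node, d in deg.items() if d >= factor * mean)
    return hubs, mean


# Toy network, for illustration only.
edges = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"), ("a", "b")]
hubs, mean_degree = hub_nodes(edges)
print(hubs, mean_degree)
```

With the reported average of 20.959 direct neighbors, the same rule would place the hub cutoff at a degree of about 42.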

Searching and Analysis of the Key Targets.
Three centrality measures were employed for key target screening: degree centrality, closeness centrality, and betweenness centrality. Closeness centrality is based on the average shortest path length between a node and all other nodes. In contrast, betweenness centrality measures how often a node lies on the shortest paths between other pairs of nodes. In other words, if the shortest paths between other nodes often pass through a given node, that node is highly important because it acts as a link that can modulate information transmission between the other nodes. These 3 measures were calculated for the whole network, and the top 30 targets were summarized based on the results, as shown in Table 2. As above, nodes whose degree was at least twice the average number of direct neighbors were defined as hub nodes; these included TP53, AKT1, EGF, PCNA, JUN, VEGFA, ESR1, and IL6. Notably, TP53 ranked first in all three centrality measures, suggesting that the primary target pathway controlled or mediated by HJD is associated with TP53. AKT1 ranked second, behind only TP53. As a critical component of the PI3K-AKT signaling pathway, AKT1 is closely correlated with the occurrence and development of human cancers. Baicalin and baicalein, the main active ingredients of Huangqin, have been reported to exert anticancer effects through downregulation of the PI3K-AKT pathway [23,24]. Consequently, the AKT1-related signaling pathway might also have an important link with the anticancer effect of HJD.
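To make the closeness and betweenness definitions concrete, the sketch below computes both on a small unweighted, undirected graph using breadth-first search and Brandes' algorithm. It is an independent illustration, not the Network Analyzer code: betweenness is left unnormalised, closeness is taken as the reciprocal of the average shortest path length to reachable nodes, and the four-node path graph is invented.

```python
from collections import deque


def build_adjacency(edges):
    """Adjacency sets for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def closeness(adj):
    """Closeness = (number of reachable nodes) / (sum of shortest-path lengths)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(adj):
    """Brandes' algorithm, unnormalised, for an undirected unweighted graph."""
    bc = dict.fromkeys(adj, 0.0)
    for source in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies in order of decreasing distance
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                bc[w] += delta[w]
    # Every unordered pair was counted from both endpoints.
    return {v: value / 2.0 for v, value in bc.items()}


# Toy path graph a-b-c-d, for illustration only.
adj = build_adjacency([("a", "b"), ("b", "c"), ("c", "d")])
cl = closeness(adj)
bc = betweenness(adj)
print(cl, bc)
```

On the path graph, the two inner nodes carry all shortest paths between the others, so they receive the highest betweenness, which matches the intuition given in the text.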

Analysis of the KEGG Pathway.
To explore the biological mechanism underlying the anticancer effect of HJD, the KEGG pathway enrichment analysis embedded in WebGestalt was performed. The top 10 KEGG pathways linked to the targets in the PPI network were cell cycle (24), pathways in cancer (31), the p53 signaling pathway (15), the AGE-RAGE signaling pathway in diabetic complications (17), prostate cancer (16), endocrine resistance (16), hepatitis B (18), the PI3K-Akt signaling pathway (25), small cell lung cancer (14), and the FoxO signaling pathway (15), where the numbers in parentheses are the numbers of targets in each pathway (Table 3). Broad grouping of the KEGG pathways suggested that the anticancer effect of HJD was closely correlated with the following cancer-related signaling pathways and potential mechanisms: (1) control of cancer cell proliferation and survival by p53-mediated cell cycle control, (2) regulation of the growth, proliferation, invasion, and metastasis of cancer cells by the PI3K-Akt signaling pathway through the FoxO signaling pathway, and (3) potential treatment of breast cancer through regulation of endocrine resistance. Because TP53 ranked first in all 3 centrality measures (Table 2), emphasis was placed on the p53 signaling pathway. The KEGG analysis results suggested that the anticancer effect of HJD was functionally correlated with TP53. In addition, the KEGG pathway enrichment analysis suggested that 16 and 14 targets were associated with PCa and SCLC, respectively (Table 3).
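The enrichment statistics behind such a pathway table can be sketched as an over-representation test. The code below is an assumption-laden illustration rather than WebGestalt's code: it uses a hypergeometric upper-tail test for the raw P value and assumes the Benjamini-Hochberg procedure for the adjusted P value (the text only says "adjusted"). The column names C, O, E, and R denote category size, observed overlap, expected overlap, and enrichment ratio; the pathway names and gene identifiers are placeholders.

```python
from math import comb


def hypergeom_sf(o, n_ref, c, n_set):
    """P(X >= o) when n_set genes are drawn from n_ref, of which c belong to the category."""
    upper = min(c, n_set)
    total = comb(n_ref, n_set)
    hits = sum(comb(c, k) * comb(n_ref - c, n_set - k) for k in range(o, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running = min(running, pvalues[i] * m / rank)
        adjusted[i] = running
    return adjusted


def enrich(pathways, gene_set, n_ref):
    """Return one row per pathway with C, O, E, R, rawP, and adjP, sorted by adjP."""
    n_set = len(gene_set)
    rows = []
    for name, members in pathways.items():
        c = len(members)
        o = len(gene_set & members)
        e = n_set * c / n_ref
        rows.append({"pathway": name, "C": c, "O": o, "E": e,
                     "R": o / e if e else 0.0,
                     "rawP": hypergeom_sf(o, n_ref, c, n_set)})
    for row, adj in zip(rows, benjamini_hochberg([r["rawP"] for r in rows])):
        row["adjP"] = adj
    return sorted(rows, key=lambda r: r["adjP"])


# Placeholder pathways and genes in a reference set of 20 genes, for illustration only.
pathways = {
    "P1": {"g1", "g2", "g3", "g4", "g5"},
    "P2": {"g6", "g7", "g8", "g9", "g10"},
}
gene_set = {"g1", "g2", "g3", "g4", "g6"}
rows = enrich(pathways, gene_set, n_ref=20)
for row in rows:
    print(row)
```

Selecting the reported pathways would then amount to keeping rows with adjP below 0.01 and taking the first 10.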

Mining the Genetic Alterations and Survival Analysis.
HJD has been reported to display therapeutic effects on different cancers; however, its specific biological mechanisms remain unclear. KEGG enrichment analysis revealed that HJD was correlated with cancer-related pathways (Table 3). To further explore the validity of this correlation, the cBio Portal, a web-based integrated data mining system, was used to examine the genetic alterations and survival associated with HJD-related targets in PCa and SCLC. Because the p53 signaling pathway was the main target of HJD, the targets shared between the p53 signaling pathway and the PCa and SCLC pathways were studied. The KEGG analysis in WebGestalt identified 8 overlapping targets, including 7 in PCa (CDK2, CDKN1A, MDM2, CCND1, TP53, CCNE1, and CCNE2) and 5 in SCLC (CDK2, CCND1, TP53, CCNE1, and CCNE2). The genomic and clinical characteristics of these targets were therefore examined in PCa and SCLC, respectively (Table 2). In the 13 PCa studies analyzed [10,32-40], alterations were found in 1.9% to 63.9% of cases for the gene sets/pathways submitted for analysis (Figure 4(a)). The multiple genetic alterations observed across the cancer samples of the Michigan study [10], which showed the most significant genomic changes, were summarized and presented using OncoPrint. The results indicated that 37 cases (63%) had an alteration in at least one of the 7 targets; the alteration frequency of each selected target is presented in Figure 4(b). No genetic alterations were found in CDK2, CDKN1A, or CCNE1. For MDM2, CCND1, and CCNE2, most alterations were classified as amplifications. TP53-associated genetic alterations included deep deletions and missense/truncating mutations. The alterations in these targets showed a trend toward co-occurrence across samples; however, mutual exclusivity analysis revealed no statistical significance (p=0.183) (data not shown). Cases with genetic alterations tended to have poorer survival than those without alterations, although the difference was not statistically significant (P=0.443, Figure 4(c)).
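The altered/nonaltered classification summarized by OncoPrint reduces to two simple counts: the fraction of cases with an alteration in at least one queried gene, and the alteration frequency of each gene. The sketch below assumes the per-case alterations are available as a nested mapping; the case identifiers and alteration calls are invented, and the function name is hypothetical rather than a cBio Portal call.

```python
def alteration_summary(alterations, genes):
    """Return (fraction of cases altered in >= 1 queried gene, per-gene frequency).

    alterations : {case_id: {gene: alteration_type}}; unaltered genes are absent
    genes       : queried gene symbols
    """
    n_cases = len(alterations)
    altered = sum(1 for calls in alterations.values()
                  if any(gene in calls for gene in genes))
    per_gene = {gene: sum(1 for calls in alterations.values() if gene in calls) / n_cases
                for gene in genes}
    return altered / n_cases, per_gene


# Invented alteration calls, for illustration only; not data from any study.
cases = {
    "case1": {"TP53": "missense"},
    "case2": {},
    "case3": {"MDM2": "amplification", "CCND1": "amplification"},
    "case4": {},
}
genes = ["TP53", "MDM2", "CCND1", "CDK2"]
fraction, per_gene = alteration_summary(cases, genes)
print(fraction, per_gene)
```

The altered and nonaltered groups defined this way are the two groups whose survival is then compared.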
Among the 3 SCLC studies analyzed [11,41,42], alterations were found in 78.6% to 93.6% of cases for the gene sets/pathways submitted for analysis (Figure 5(a)). The multiple genetic alterations observed across the cancer samples of the U Cologne study [11], which showed the most significant genomic changes, were summarized and presented using OncoPrint. The results indicated that 103 cases (94%) had an alteration in at least one of the 5 targets; the alteration frequency of each selected target is shown in Figure 5(b). Unlike the PCa results, almost all genetic alterations occurred in TP53, whereas no genetic alterations were seen in CDK2 or CCND1. CCNE1-associated genetic alterations were classified as missense mutations, while CCNE2-associated ones were classified as truncating mutations. In comparison, TP53-associated genetic alterations included both missense mutations and truncating mutations. The mutual exclusivity analysis again displayed no statistical significance (p=0.876) (data not shown). Cases with genetic alterations also tended to have poorer survival than those without, although the difference was not statistically significant (P=0.166, Figure 5(c)).
The following statistics were listed in the row: C: the number of reference targets in the category; O: the number of targets in both the gene set and the category; E: the expected number in the category; R: ratio of enrichment; rawP: p value upon hypergeometric test; and adjP: p value adjusted by the multiple test adjustment.

Discussion
HJD was the object of study in this work. To elucidate its anticancer molecular mechanism, we integrated systems pharmacology and bioinformatics, using a number of public databases as the research basis and a set of tools to elucidate the molecular mechanisms and their relationship with the clinical outcomes of cancers. Our workflow comprised 3 steps. (i) The cancerous target network was constructed from the DrugBank database, and all chemical components contained in the 4 herbs were obtained from databases such as TcmSP, TcmID, TCM Database@Taiwan, and NCBI PubChem. The active ingredients were then screened based on the criteria of OB ≥30% and DL ≥0.18, and their targets were predicted using the SysDT model. Ultimately, 59 anticancer active ingredients and their anticancer targets were identified by mapping onto the cancerous target network (the Appendix). (ii) Based on these anticancer targets, a PPI network containing 98 targets was constructed with STRING (Figure 3), and topological analysis was performed. (iii) Taking TP53 as the main object of study, we compared the p53 signaling pathway with the PCa and SCLC pathways and obtained 8 overlapping targets. The genetic alterations and survival associated with the overlapping targets in PCa and SCLC were then analyzed to evaluate the relevance of the p53 signaling pathway to HJD in treating cancer.
HJD has been reported to inhibit angiogenesis by suppressing the expression of VEGFA and MMP-9, thus restraining cancer growth [43]. Consistently, our network analysis identified VEGFA as a key target in the anticancer activity of HJD (Table 2). In addition, one study shows that HJD markedly inhibits the proliferation of human SCLC NCI-H446 cells [44]. Our findings likewise support a therapeutic effect of HJD on SCLC, which is probably achieved through regulation of the p53 signaling pathway. However, no existing literature reports that HJD has a therapeutic effect on PCa; this may represent a future research direction, pending experimental validation. Interestingly, KEGG enrichment analysis also identified the AGE-RAGE signaling pathway in diabetic complications. The therapeutic effect of HJD on diabetes and its complications has been widely reported; nonetheless, no existing study indicates that HJD works through this pathway. Therefore, whether the AGE-RAGE signaling pathway is a potential mechanism of HJD in treating diabetes and its complications remains to be studied.
Compared with studies integrating systems pharmacology and network pharmacology, the current study has a stronger biological rationale, since it bridges HJD to its target genes and links them with biological effects. Moreover, this study illustrates the relationship between the molecular mechanism of HJD and the clinical outcome of cancer through a set of network-based tools. This approach differs greatly from the use of experimental techniques to prove a few relationships at a time, and it can reduce redundant experiments across laboratories. The use of such a research strategy may contribute to (i) understanding the molecular biological mechanisms of Chinese compound formulas, (ii) revealing the primary effects and targets of HJD on cancers, and (iii) promoting the clinical use of Chinese compound formulas and laying a foundation for it. This method can be applied not only to HJD but also to other Chinese compound formulas and to combination drug therapy.
However, this study has some limitations. The compounds contained in the herbal medicines were obtained from databases; therefore, the quality of the databases directly affects the final compound set. Moreover, the choice of screening parameters and thresholds also affects the number of active ingredients obtained. Both factors may influence the final analysis.
In conclusion, the targets of HJD are likely to be confirmed progressively by the growing number of studies on HJD carried out with traditional experimental techniques and methods. However, the relationship between these targets and the biological effects of HJD remains unclear. We believe that this method can help to reduce some of the uncertainty about the targets of HJD and their subsequent phenotypic expression. Furthermore, this approach helps to determine the feasibility of future experiments. In the future, molecular biology experiments on the key targets and pathways of HJD can be carried out on the basis of the current study. Apart from PCa and SCLC, many studies have also reported antitumor effects of HJD on other tumors, such as lung cancer, liver cancer, breast cancer, and colon cancer. Whether the connectivity between HJD and PCa as well as SCLC can be extended to these other cancers remains to be studied.