Traditional Chinese medicine (TCM), which has thousands of years of clinical application in China and other Asian countries, is the pioneer of “multicomponent-multitarget” and network pharmacology. Although its efficacy is not in doubt, it is difficult to elucidate a convincing underlying mechanism for TCM because of its complex composition and unclear pharmacology. Ligand-protein networks have become increasingly valuable in drug discovery, while their application to TCM is still at an early stage. This paper first surveys TCM databases for virtual screening, which have greatly expanded in size and data diversity in recent years. On that basis, different screening methods and strategies for identifying active ingredients and targets of TCM are outlined according to the amount of network information available, on the sides of both ligand bioactivity and protein structure. Furthermore, applications of successful
Drug discovery was once an empirical process in which the effect of a medicine was judged purely by phenotypic readout, while the mode of action of the drug molecules remained unknown. Later, reductionists began to investigate the molecular mechanisms of drug-target interactions, believing that a drug acts like a magic bullet aimed at its functional targets [
Traditional Chinese medicine (TCM), which has long been widely used in China as well as in other Asian countries, is considered to be the pioneer of “multicomponent-multitarget” pharmacology [
In recent years, great efforts have been made to modernize TCM, mostly by identifying the effective ingredients and ligands in TCM formulae and their functional targets [
On the contrary,
In this paper, we firstly investigate TCM databases for
Data availability is the first consideration before any virtual screening or data-mining task can be undertaken. TCM databases can be classified into several categories, namely, formulae, herbs, and compounds. A TCM formula is a combination of herbs for treating a disease, while compounds are the bioactive molecules within herbs. In this section, we summarize databases for TCM herbs, formulations, and compounds, as shown in Table
Basic information for main TCM databases.
Database | Description | URL or ref. |
---|---|---|
Traditional Chinese medicine database (TCMD) | 6760 herbs, 23,033 compounds | [ |
Chinese herb constituents database (CHCD) | 240 herbs, 8264 compounds | [ |
3D structural database of biochemical components | 2073 herbs, 10,564 compounds | [ |
TCM database@Taiwan | 453 herbs, 20,000 compounds | [ |
Traditional Chinese medicine information database (TCM-ID) | 1197 formulae, 1313 herbs, ~9000 compounds | [ |
TCM drugs information system | 1712 formulae, 2738 herbs, 16,500 compounds, 868 dietotherapy prescription | [ |
Comprehensive herbal medicine information system for cancer (CHMIS-C) | 203 formulae, 900 herbs, 8500 compounds | [ |
China natural products database (CNPD) | 45,055 compounds | [ |
Marine natural products database (MNPD) | 8078 compounds, 3200 with bioactivity data | [ |
Bioactive plant compounds database (BPCD) | 2794 compounds | [ |
acupuncture.com.au | TCM formulations | |
Dictionary of Chinese herbs | TCM formulae, toxicity, and side effects | |
Plants for a future | Herb medical usage and potential side effects | |
The elementary units of TCM databases are compounds, the bioactive components that exert efficacy by binding to therapeutic targets. Most compounds in TCM databases are stored as two-dimensional structures, while some also have three-dimensional structures derived from force-field calculations. Most TCM databases collect information on both herbs and compounds, and some include formulae as well.
The traditional Chinese medicine database (TCMD) contains 23,033 chemical constituents and over 6760 herbs that mainly come from Yan et al. [
In addition to herbs and compounds, traditional Chinese medicine information database (TCM-ID) [
The China natural products database (CNPD) [
Other databases on the internet focus only on the clinical efficacy or side effects of formulae and herbs, without details of compounds. acupuncture.com.au collects TCM formulae according to their clinical action and efficacy. Both the English and Chinese names of TCM herbs are recorded to facilitate studies using both traditional and modern methods. The dictionary of Chinese herbs contains information on both the clinical usage and the side effects of TCM herbs. It also includes sample TCM formulae for treating diseases such as cancer, dengue fever, diabetes, and hepatitis B. In addition, the compatibility of TCM herbs with certain drugs is listed to provide a biochemical explanation for drug designers. The plants for a future database allows querying of herbs by medicinal usage and also lists their potential side effects, medical usage, and physical characteristics.
The computational methods for drug discovery based on ligand-protein networks have been increasingly developed and applied in the area of TCM and other drugs in recent years [
The ligand-based approach, also known as the chemical approach, organizes pharmacological characteristics and protein associations by means of ligand similarity rather than genomic information such as sequence, structure, or pathway. Its basic assumption is that, although similar chemical structures may interact with proteins in different ways, similar ligands tend to bind to similar targets more often than not [
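The similarity assumption behind ligand-based screening can be illustrated with a minimal sketch. It assumes that each compound is represented by the set of "on" bits of a binary fingerprint and uses the Tanimoto coefficient as the similarity measure; the compound names and bit sets below are hypothetical and are not taken from any TCM database.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def rank_by_similarity(query_fp, library):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: a query ligand and a three-compound library.
query = {1, 4, 7, 9}
library = {
    "cmpd_A": {1, 4, 7, 8},
    "cmpd_B": {2, 3},
    "cmpd_C": {1, 4, 7, 9},
}
ranking = rank_by_similarity(query, library)
```

Compounds at the top of such a ranking would be the first candidates to inherit the known targets of the query ligand; in practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.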
In the area of ligand-based virtual screening, researchers have tried to evaluate whether novel ligand-target pairs can be identified from chemical knowledge of ligands and ligand-target interactions. G protein-coupled receptors (GPCRs) are a family of effective drug targets with significant therapeutic value. Many researchers have used support vector machine (SVM) models as well as substructural analysis to describe GPCRs from the perspective of ligand chemogenomics [
A powerful ligand-based prediction method is the similarity ensemble approach (SEA), which was originally used to relate proteins by the chemical similarity between their ligand sets, on the premise that similar ligands tend to share the same targets [
The pharmacophore model is perhaps the most widely used method that makes use of 3D structural representations of molecules [
Pharmacophore screening considers only those compounds that are direct mimics of the ligand from which the pharmacophore was generated and may therefore miss other productive binding modes. In fact, the pharmacophore model is limited to only one mode of action for small molecules [
Quantitative structure-activity relationships (QSARs) were first established in the early 1960s, when computational means were used to quantitatively relate pharmacodynamic and pharmacokinetic effects in biological systems to the chemical structures of compounds [
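The idea of comparing two targets through their ligand sets can be sketched as follows. This is only an illustration of a set-versus-set raw score: it sums all pairwise Tanimoto coefficients above a cutoff, and it omits the statistical step in which such a raw score is compared against a random background. The cutoff of 0.5, the function names, and the fingerprints are assumptions made for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def raw_set_score(ligands_x, ligands_y, threshold=0.5):
    """Sum pairwise ligand similarities between two targets' ligand sets.

    Only pairs whose similarity reaches `threshold` contribute, so that
    weak, noisy similarities do not accumulate into a large score.
    """
    total = 0.0
    for fx in ligands_x:
        for fy in ligands_y:
            s = tanimoto(fx, fy)
            if s >= threshold:
                total += s
    return total


# Hypothetical ligand sets of two targets, each ligand a set of fingerprint bits.
target_x = [{1, 2, 3}, {2, 3, 4}]
target_y = [{1, 2, 3, 4}, {7, 8}]
score_xy = raw_set_score(target_x, target_y)
```

A high score between two ligand sets suggests that the corresponding targets may be pharmacologically related even if their sequences are not.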
The target-based approach predicts ligand-target interactions by the structural information of protein targets as well as ligands. The target-based approach depends highly on the availability of the structural information of targets, either from wet experiments or numerical simulations [
Despite more than 20 years of research, docking and scoring ligands with proteins are still challenging processes and the performance is highly dependent on targets [
To reduce the dependence of docking on the nature of the target, multiple active sites have been used to compensate for ligand-dependent biases, and consensus scoring has also been suggested to reduce false positives in virtual screening [
Despite these limitations, virtual screening based on docking and inverse docking has been successfully used to identify and predict novel bioactive compounds over the past 10 years. Using a combinatorial small molecule growth algorithm, Grzybowski applied docking to the design of picomolar ligands for human carbonic anhydrase II [
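Consensus scoring can take several forms; the sketch below shows one simple variant, in which each scoring function ranks the compounds and the ranks are averaged. It assumes that lower scores mean better predicted binding and that every scoring function has scored the same compounds; the score values and compound names are hypothetical.

```python
def ranks(scores):
    """Map compound -> rank (1 = best), assuming a lower score is better."""
    ordered = sorted(scores, key=scores.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus_rank(score_tables):
    """Average each compound's rank over several scoring functions.

    `score_tables` is a list of dicts (compound -> score), one per scoring
    function. Returns (compound, mean_rank) pairs, best consensus first.
    """
    per_function = [ranks(table) for table in score_tables]
    compounds = per_function[0].keys()
    mean_rank = {
        c: sum(r[c] for r in per_function) / len(per_function) for c in compounds
    }
    return sorted(mean_rank.items(), key=lambda item: item[1])


# Hypothetical scores from three scoring functions for three compounds.
scoring_fn_1 = {"A": -9.1, "B": -7.0, "C": -8.2}
scoring_fn_2 = {"A": -6.5, "B": -5.0, "C": -7.9}
scoring_fn_3 = {"A": -8.0, "B": -8.5, "C": -7.0}
consensus = consensus_rank([scoring_fn_1, scoring_fn_2, scoring_fn_3])
```

Here compound B is ranked first by one function only, so it falls behind compounds that are ranked consistently well, which is the intended effect when a single scoring function produces a false positive.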
Docking is usually used as the second step to further validate the ligand-target binding features after the first round of virtual screening by other ligand-based approaches [
The ligand-based approach and target-based approach predict potential ligand-target bindings by means of chemical similarity and structural information. Machine learning is a high-throughput artificial intelligence method that enables computers to learn from known data, including ligand chemistry, structural information, and ligand-protein networks, and to predict unknowns, such as new drugs, targets, and drug-target pairs. It offers stability and credibility and is well suited to classifying large numbers of ligand-protein pairs that could not otherwise be connected on the basis of chemical similarity alone.
Machine learning extracts features from data automatically by computer [
Nidhi et al. trained a multiple-category Laplacian-modified naïve Bayesian model from 964 target classes in WOMBAT and predicted the top three potential targets for compounds in MDDR with or without known targets information [
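A Laplacian-corrected naïve Bayes score for a single target class can be sketched as below; a multiple-category model would train one such set of weights per target class and rank the classes by score. The weight formula used here, log((A_f + 1) / (N_f · P + 1)) with P the overall fraction of actives, is the commonly described form of the Laplacian correction and is stated as an assumption, not as the exact implementation of the cited study. The features and training samples are hypothetical.

```python
import math
from collections import Counter


def train(samples):
    """Compute per-feature log weights for one target class.

    `samples` is a list of (feature_set, is_active) pairs. For a feature
    seen in N_f samples, A_f of them active, the weight is
    log((A_f + 1) / (N_f * base_rate + 1)).
    """
    n_total = len(samples)
    a_total = sum(1 for _, active in samples if active)
    base_rate = a_total / n_total
    n_feat, a_feat = Counter(), Counter()
    for features, active in samples:
        for f in features:
            n_feat[f] += 1
            if active:
                a_feat[f] += 1
    return {
        f: math.log((a_feat[f] + 1) / (n_feat[f] * base_rate + 1)) for f in n_feat
    }


def score(weights, features):
    """Sum the weights of a compound's features; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)


# Hypothetical training data: substructure features and activity labels.
samples = [
    ({"x", "y"}, True),
    ({"x"}, True),
    ({"y", "z"}, False),
    ({"z"}, False),
]
weights = train(samples)
```

Features seen mostly in actives receive positive weights, features seen mostly in inactives receive negative weights, and rarely sampled features are pulled toward zero by the correction.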
Gaussian interaction profile kernels, which represent the drug-target interactions, were used in regularized least squares combined with chemical and genomic space to achieve predictions with an area under the precision-recall curve (AUPR) of up to 92.7 [
The support vector machine (SVM) is a powerful classification tool in which appropriate kernel functions are selected to map the data space into a higher-dimensional space without increasing the computational difficulty. The performance of SVM is usually stronger than that of probability-based models. Wale and Karypis [
Random forest, an ensemble of multiple decision trees, has recently been applied to screen a TCM database for potential inhibitors against several therapeutically important targets [
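A Gaussian interaction profile kernel can be sketched as follows, assuming that each drug is described by a binary vector recording which targets it is known to interact with. The bandwidth normalization (dividing by the mean squared norm of the profiles) and the default parameter value are assumptions made for this illustration, and the regularized least squares step is not shown.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian kernel matrix over interaction profiles.

    `profiles[i]` is the 0/1 interaction vector of drug i across all targets.
    K[i][j] = exp(-gamma * ||profile_i - profile_j||^2), where gamma is
    gamma_prime divided by the mean squared norm of the profiles.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one known interaction is required")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical interaction profiles of three drugs over three targets.
profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(profiles)
```

Drugs with identical known targets receive a kernel value of 1, and the value decays as their interaction profiles diverge, so the kernel encodes network topology rather than chemical structure.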
Linear regression models have also been applied to predict ligand-target pairs. Zhao and Li developed a computational framework, drugCIPHER, to infer drug-target interactions based on pharmacology and genomic space [
Although machine learning performs strongly in classifying protein-ligand interactions, it has an obvious shortcoming. Some machine learning methods operate implicitly, like a black box, and do not yield an intuitive biological or physical relationship between proteins and ligands. SVM maps the classification problem into a higher-dimensional space and achieves excellent performance with high computational efficiency. The tradeoff is that it can hardly establish an explicit relationship between a protein and a ligand. Therefore, even with a very strong prediction tool, it remains difficult to advance the theory of protein-ligand interactions.
Network-based pharmacology seeks a systematic and holistic understanding of the modes of action of multiple drugs by considering their multiple targets in the context of molecular networks. It has also been suggested that relatively weak patterns of inhibition of many targets may prove more satisfactory than the highly potent single-target inhibitors routinely developed in the course of a drug discovery program [
Summary of multi-target drugs/preparations with TCM pharmacology based on ligand-protein networks.
Disease | Methods and experiments | Formula, herbs, and components | TCM pharmacology | Reference |
---|---|---|---|---|
AIDS | Experiments | Tannin | Tannin suppresses the activity of HIV-1 reverse transcriptase, protease, and integrase and cuts off virus fusion and virus entry into the host cells. | [ |
AIDS | Experiments | Matrine from the root of | Matrine is effective in inducing T-cell anergy by targeting both the MAPKs pathway and the NFAT pathway. | [ |
Antitumor | Experiments | PHY906: | PHY906 reduces CPT-11-induced gastrointestinal toxicity in the treatment of colon or rectal cancer by several mechanisms. It both repairs the intestinal epithelium by facilitating the generation of intestinal progenitor or stem cells and several Wnt signaling components and suppresses inflammatory responses like nuclear factor κB, cyclooxygenase-2, and inducible nitric oxide synthase. | [ |
Anti-inflammatory and analgesic effects | Experiments | Qingfu Guanjieshu (QFGJS): paeonol and other components | The pharmacokinetic behavior and metabolites of paeonol are greatly promoted by other components in QFGJS. This may be the result of enhanced absorption of paeonol in the gastrointestinal tract by P-glycoprotein-mediated efflux change. | [ |
Inflammatory and arthritic diseases | Experiments | Paeoniflorin from the root of | Paeoniflorin is markedly enhanced when coadministered with sinomenine, which promotes intestinal transportation via the inhibition of P-glycoprotein and affects the hydrolysis of paeoniflorin via interaction with β-glycosidase. | [ |
Anti-inflammatory | Experiments | Huang-Lian-Jie-Du-Tang (HLJDT): | Baicalein derived from Radix scutellariae showed significant inhibitory effect on 5-LO and 15-LO while coptisine from Rhizoma coptidis showed medium inhibitory effects on LTA(4)H. | [ |
Acute promyelocytic leukemia (APL) | Experiments | Realgar-Indigo naturalis: tetraarsenic tetrasulfide (A), indirubin (I), and tanshinone IIA (T) | ATI leads to ubiquitination/degradation of promyelocytic leukemia (PML) retinoic acid receptor oncoprotein, reprogramming of myeloid differentiation regulators, and G1/G0 arrest in APL cells by mediating multiple targets. A acts as the principal component of the formula, whereas T and I serve as adjuvant ingredients. | [ |
Chronic myeloid leukemia | Experiments | Imatinib (IM) and arsenic sulfide [As(4)S(4) (AS)] | AS targets BCR/ABL through the ubiquitination of key lysine residues, leading to its proteasomal degradation, whereas IM inhibits the PI3K/AKT/mTOR pathway. | [ |
Inflammation | Pharmacophore-assisted docking | Twelve examples of compounds from CHCD | The screened compounds target cyclooxygenases 1 and 2 (COX), p38 MAP kinase (p38), c-Jun NH(2)-terminal kinase (JNK), and type 4 cAMP-specific phosphodiesterase (PDE4). | [ |
Type II diabetes mellitus (T2DM) | Molecular docking (LigandFit), clustering, and drug-target network analysis | 676 compounds in eleven herbs from Tang-min-ling Pills | Multiple active components in Tangminling Pills interact with multiple targets. The 37 targets were classified into 3 clusters, and proteins in each cluster were highly relevant to each other. Ten known compounds were selected according to their network attribute ranking in drug-target and drug-drug network. | [ |
Cardiovascular disease | Similarity search and alignment, docking (LigandFit) | Xuefu Zhuyu decoction (XFZYD): 501 compounds, 489 drug/drug-like compounds | Active components in XFZYD mainly target renin, ACE, and ACE2 in renin-angiotensin system (RAS), which modulates the cardiovascular physiological function. | [ |
9 types of cancer, 5 diseases with dysfunction, and 2 cardiovascular disorders | Distance-based mutual information model (DMIM) | Liu-wei-di-huang formula (LWDH), | The interactions between TCM drugs and disease genes in cancer pathways and neuro-endocrine-immune pathways were inferred to contribute to the action of LWDH formula. | [ |
Cardiovascular diseases | Quantitative composition-activity relationship model (QCAR) (SVM and linear regression) | | The proportion of active components of | [ |
Anticoagulant | Network-based computational scheme utilizing multi-target docking score (LigandFit and AutoDock) | Six argatroban intermediates and a series of components from 24 TCMs widely used for cardiac system diseases | A ligand can have impact on multiple targets based on the docking scores, and those with the highest-target network efficiency are regarded as potential anticoagulant agents. Factor Xa and thrombin are two critical targets for anticoagulant compounds and the catalytic reactions they mediate were recognized as the most fragile biological matters in the human clotting cascade system. | [ |
Alzheimer’s disease | Systematical target network analysis framework | | AD-symptoms-associated pathways, inflammation-associated pathways, cancer-associated pathways, diabetes-mellitus-associated pathways, Ca2 | [ |
Depression | Literature search and network analysis | Hyperforin (HP), hypericin (HY), pseudohypericin (PH), amentoflavone (AF), and several flavonoids (FL) from St. John’s Wort (SJW) | Active components in SJW mainly intervene with neuroactive ligand-receptor interaction, the calcium-signaling pathway, and the gap-junction related pathway. | [ |
Rheumatoid arthritis (RA) | Integrative platform of TCM network pharmacology including drugCIPHER | Qing-Luo-Yin (QLY), including four herbs, Ku-Shen ( | The target network of QLY is involved in RA-related key processes including angiogenesis, inflammatory response, and immune response. The four herbs in QLY work in concert to promote efficiency and reduce toxicity. Specifically, the synergetic effect of Ku-Shen ( | [ |
Many bioactive compounds in TCM herbs may have synergistic effects with non-TCM drugs on the market. Tannin, a component derived from a TCM, can be combined with HIV triple cocktail therapy to yield lasting effects in preventing HIV propagation. The underlying mechanism is that tannin suppresses the activity of HIV-1 reverse transcriptase, protease, and integrase and cuts off virus fusion and virus entry into the host cells [
Lam et al. recently showed in a murine colon 38 allograft model that a formula containing 4 herbs (PHY906) acts synergistically with CPT-11, a powerful anticancer agent with strong toxicity, reducing its side effects and enhancing its efficacy. The reason is that PHY906 can repair the intestinal epithelium by facilitating the generation of intestinal progenitor or stem cells and several Wnt signaling components and by suppressing a range of inflammatory responses like nuclear factor κB, cyclooxygenase-2, and inducible nitric oxide synthase [
Multicomponent and multitarget interactions are the main mode of action of TCM formulae, which exert synergistic effects as whole preparations rather than through the primary active compound alone. Xie et al. demonstrated that other components in “Qingfu Guanjieshu” (QFGJS) could effectively influence the pharmacokinetic behavior and metabolic profile of paeonol in rats, indicating the synergy of herbal components. This synergy may result from enhanced absorption of paeonol in the gastrointestinal tract induced by changes in P-glycoprotein-mediated efflux [
Huang-Lian-Jie-Du-Tang (HLJDT) is a TCM formula with anti-inflammatory efficacy, but its mechanism of action is still not very clear. Zeng et al. investigated the effects of its component herbs and pure components on eicosanoid generation and identified the active components and their precise targets in the arachidonic acid (AA) cascade. Results showed that
A TCM formula, Realgar-Indigo naturalis formula (RIF), was applied to treat acute promyelocytic leukemia (APL) and showed a high complete remission (CR) rate [
To target complex, multifactorial diseases more effectively, network biology incorporating ligand-protein networks has been applied in multitarget drug development as well as in the modernization of traditional Chinese medicine in a systematic and holistic way. Zhao et al. reviewed the available disease-associated and drug-associated networks that can be used to assist drug discovery and elaborated on network-based TCM pharmacology [
Barlow et al. screened Chinese herbs for compounds that may be active against 4 targets in inflammation by means of pharmacophore-assisted docking. The results showed that the twelve examples of compounds from CHCD inhibit multiple targets including cyclooxygenases 1 and 2 (COX), p38 MAP kinase (p38), c-Jun NH(2)-terminal kinase (JNK), and type 4 cAMP-specific phosphodiesterase (PDE4). The distribution of herbs containing the predicted active inhibitors was studied with regard to 192 Chinese formulae, and it was found that these herbs appeared in formulae traditionally used to treat fever, headache, and so on [
Many traditional Chinese medicines (TCMs) are effective in relieving complicated diseases such as type II diabetes mellitus (T2DM). Gu et al. employed molecular docking and network analysis to elucidate the mechanism of action of a medicinal composition, Tangminling Pills, which has clinical efficacy for T2DM. It was found that multiple active components in Tangminling Pills interact with multiple targets in the biological network of T2DM. The 37 targets were classified into 3 clusters, and proteins in each cluster were highly relevant to each other. Ten known compounds were selected according to their network attribute ranking in drug-target and drug-drug network [
XFZYD, a recipe derived from Wang Q. R. in the Qing dynasty, has been widely used in cardiac system disease. Similarity search and alignment showed that the chemical space of compounds in XFZYD shares many similarities with that of a drug/drug-like ligand set collected from cardiovascular pharmacology, while the chemical patterns in XFZYD are more diverse than those of the drug/drug-like ligands. Docking of compounds in XFZYD against targets related to cardiac system disease using LigandFit shows that many molecules have good binding affinity for the target enzymes and that most interact with more than one target. The active components in XFZYD mainly target renin, ACE, and ACE2 in the renin-angiotensin system (RAS), which modulates cardiovascular physiological function. These results suggest that promiscuous drugs in TCM might be more effective for treating cardiac system diseases, which tend to result from multitarget abnormalities rather than from a single defect [
Many integrative computational tools and models have been developed and widely used to optimize the combination regimens of multicomponent drugs and to elucidate the interactive mechanisms within ligand-target networks.
Li et al. built a method called distance-based mutual information model (DMIM) to identify useful relationships among herbs in numerous herbal formulae. DMIM combines mutual information entropy and distance between herbs to score herb interactions and construct herb network. Novel antiangiogenic herbs, Vitexicarpin and Timosaponin A-III, were discovered to have synergistic effects. Based on herb network constructed by DMIM from 3865 collateral-related herbs, the interactions between TCM drugs and disease genes in cancer pathways and neuro-endocrine-immune pathways were inferred to contribute to the action of Liu-wei-di-huang formula, one of the most well-known TCM formulae as potential treatment for a variety of diseases including cancer, dysfunction of the neuro-endocrine-immune-metabolism system, and cardiovascular system [
Wang et al. adopted a new method based upon lattice experimental design and multivariate regression to model the quantitative composition-activity relationship (QCAR) of
A network-based multitarget computational scheme for assessing the whole efficacy of a compound in a complex disease was developed for screening the anticoagulant activities of a series of argatroban intermediates and eight natural products, respectively. Aimed at the phenotypic data of drugs, this scheme predicts bioactive compounds by integrating biological network efficiency analysis with multitarget docking scores, and it evolved from the traditional virtual screening approach, which usually predicts the binding affinity between a single drug molecule and a single target. A ligand can have an impact on multiple targets based on the docking scores, and those with the highest target-network efficiency are regarded as potential anticoagulant agents. Factor Xa and thrombin are two critical targets for anticoagulant compounds, and the catalytic reactions they mediate were recognized as the most fragile points in the human clotting cascade system [
Sun et al. presented a systematic target network analysis framework to explore the mode of action of anti-Alzheimer’s disease (AD) herb ingredients based on applicable bioinformatics resources and methodologies on clinical anti-AD herbs and their corresponding target proteins [
Based on the available experimental results, Zhao et al. analyzed the molecular mechanism with the aid of pathways and networks and provided theoretical support for the multitarget effect of St. John’s Wort [
Zhang et al. established an integrative platform of TCM network pharmacology to discover herbal formulae on the basis of systematic networks. This platform incorporates a set of state-of-the-art network-based methods to explore the mechanism of action, identify active ingredients, and create new synergistic combinations of components. Qing-Luo-Yin (QLY), an anti-rheumatoid arthritis (RA) formula, was studied comprehensively using the new platform. It was found that the target network of QLY is involved in RA-related key processes including angiogenesis, inflammatory response, and immune response. The four herbs in QLY work in concert to promote efficiency and reduce toxicity, as the
In recent years, the bottleneck in western medicine has brought unprecedented opportunities for TCM research and development. For decades, fundamental research has achieved great success and laid the foundation of modern western medicine, and the philosophical idea of “reductionism” has been given the credit.
The counterpart of “reductionism” in Chinese medicine is the philosophical idea of holism, which has thousands of years of history of practice in China as well as in other Asian countries. Under this methodology, the effectiveness of TCM can only be verified through a large number of clinical trials, given the unclear composition and the unknown relationships among the various components. This implicit effect, without clarification at the molecular level, has been hindering the modernization of TCM. How to learn from the accumulated knowledge of western medicine in order to identify the effective components and explore the molecular mechanism of efficacy is an urgent problem that needs to be solved in TCM.
The hypothesis of “multidrug, multitarget, multigene” in fact bridges the gap between TCM and western medicine and is also a manifestation of unity of opposites on “reductionism” and “holism.” TCM uses the holistic method to investigate the effects of multicomponent formula across the whole organism, such as the use of a variety of “ZHENG” in TCM theory [
Unity of opposites on holism in traditional Chinese medicine and reductionism in western medicine. Emergentism constructs the framework of the understanding of holism in TCM via accumulative practice of reductionism in WM.
So far, ligand-protein network or “multidrug, multitarget, multigene” is one of the few basic modules that can clearly reveal the pharmacology of TCM and is expected to be the future direction of the modernization of TCM. But just relying on experimental scientists to build ligand-protein interactions nonexhaustively will slow down both the modernization of TCM and the development of its industry. Therefore, the use of cross-platform database (TCM compounds and recipe database; see Section
The increasing availability of ligand-protein networks offers a unique chance to boost the modernization of TCM, building on the accumulated knowledge of TCM formulae and practices and on the assumption that TCM exerts its pharmacological efficacy in a multidrug, multitarget way. Although preliminary research has been initiated in this area, there is still a long way to go to further leverage these networks and modeling techniques. Virtual screening and informatics in drug discovery have already proven quite useful either for predicting potential new drug and target candidates for experimentalists or for exploring functional mechanisms at the molecular level. A large number of drug-target interactions have thus been obtained, and the resulting drug-target networks will also be quite beneficial for investigating the underlying mechanisms of multicomponent drugs such as TCM. With further applications of these methods in the TCM area, we expect to reveal the mode of action underlying the polypharmacology of TCM. This grants us the possibility to discover novel effective drug leads, understand the synergistic mechanisms of drug combinations, and, more importantly, develop drug portfolios against epidemics, chronic diseases, cancer, and other complex diseases that remain difficult to cure with western medicine.
The authors declare that they have no conflict of interests.
This work is supported by grants from the National High-Tech R&D Program (863 Program, Contract no. 2012AA020307), the National Basic Research Program of China (973 Program, Contract no. 2012CB721000), the Key Project of Shanghai Science and Technology Commission (Contract no. 11JC1406400), and the Ph.D. Programs Foundation of Ministry of Education of China (Contract no. 20120073110057), which were awarded to D. Q. Wei.