A Network Pharmacology-Based Approach to Investigate the Novel TCM Formula against Huntington's Disease and Validated by Support Vector Machine Model

Several pathways are crucial in Huntington's disease (HD). Based on the concept of multitargets, network pharmacology-based analysis was employed to find out related proteins in disease network. The network target method aims to find out related mechanism of efficacy substances in rational design way. Traditional Chinese medicine prescriptions would be used for research and development against HD. Virtual screening was performed to obtain drug molecules with high binding capacity from traditional Chinese medicine (TCM) database@Taiwan. Quantitative structure-activity relationship (QSAR) models were conducted by MLR, SVM, CoMFA, and CoMSIA, constructed to predict the bioactivities of candidates. The compounds with high-dock score were further analyzed compared with control. Traditional Chinese medicine reported in the literature could be the training set provided for constructing novel formula by SVM model. We tried to find a novel formula that can bind well with these targets at the same time, which indicates our design could be highly related to the HD. Additionally, the candidates would validate by a long-term molecular dynamics (MD) simulation, 5 microseconds. Thus, we suggested the herbs Brucea javanica, Holarrhena antidysenterica, Dichroa febrifuga, Erythrophleum guineense, etc. which contained active compounds might be a novel medicine formula toward Huntington's disease.


Introduction
Huntington disease (HD) is a kind of involuntary movement, mental disorders, and progressive dementia as the main clinical features of dominant hereditary neurological degeneration disease. The pathology is identified as the protein named Huntingtin [1], which is produced by the site 63 of fourth chromosome. Pathological changes are characterized by the loss of nerve cells in the striatum and cerebral cortex. The novel research points out that the link between peripheral biology and neurodegeneration are shown in other chronic neurodegenerative diseases, suggesting that the modulation of peripheral targets may provide novel ways of therapeutic development [2]. The information here also indicated a multitargets way against the disease.
Neurodegenerative diseases were studied together frequently [3] because of the common features such as misfolding [4]. Studies had constructed a single cell model that greatly promotes the development of neurodegenerative diseases [5]. Many genes are involved in HD [6]. Correspondingly, the treatment of HD is related to multiple ideas, such as antiapoptotic effect [7], antioxidant [8,9], improving memory by enhancing hippocampal synaptic plasticity [10], and immunotherapies [11].
A novel method of merging traditional Chinese medicine (TCM) with network pharmacology-based could be a reliable way to overcome disease. In the past decades, the singletarget drugs sometimes did not achieve favorable therapeutic effects. It would prefer to use the network pharmacologybased approach to develop a treatment in a multitargeted, multidrug manner [12]. Recently, there is a concept about treating the whole body in HD [2]; brain cells are under attack, causing a stress response in cells far from the brain [13]. Most diseases are associated with multiple target proteins, and it is difficult to achieve proper therapeutic results with a single target. Similar to evidence-based medicine, proteinprotein interaction (PPI) network is in line with both western medicine and traditional Chinese medicine. The theory of traditional Chinese medicine holds that the body is an organic whole, and its components are inseparable in structure. This association includes the PPI in modern medicine, which is interdependent in physiological conditions and influences each other in pathology [14].
TCM is the quintessence of China. After thousands years of accumulation, there are enough clinical examples to confirm its rationality [15]. Therefore, the return to traditional Chinese medicine is imperative. Developing a novel type of traditional Chinese medicine prescription could guide the treatment of diseases. Combined with the modern biotechnology computer simulation technology, it could filter out the leading small molecules from the TCM database within antidisease activity. As traditional Chinese medicine has the advantages of relative low toxicity [16], we propose to return to traditional Chinese medicine to screen out the appropriate formula to treat some diseases and even critical diseases. TCM had precedents for the treatment of diseases [17], even some of which modern medicine cannot solve.
The energy metabolism of medium-sized thorny neurons in the striatum of HD patients is affected by the abnormal energy metabolism that rapidly dissipates neurons and is susceptible to excite-toxicity and ultimately damage the cortical neurons, so abnormal energy metabolism may also be related to the pathogenesis of HD [18]. Knockdown of the mitochondrial chaperone mtHSP70 causes a unique fatmediated stress response that corrects the cellular protein folding. Drugs that activate this mitochondrial-to-cytosolic stress response could play a protective role [13,19]. In another word, mtHSP70 inhibitors could reduce protein misfolding, which is a key pathogenesis in lots of neurological diseases.
Another idea to cure HD is to protect neurons. The neuronal stress-protective transcription factor HSF1 will be abnormal degradation [20], which replaces the damage to protection factors. Therefore, we should protect the heat shock transcription factor 1 (HSF1) through a way of cutting the hurt of HSF1. Casein kinase II subunit alpha' (CK2 alpha' , CSNK2A2, CK2A2) inhibitors play a pivoted role to protect HSF1 [20].
The biological significance of YAP binding to TEAD has been more and more clear, but the three-dimensional structure of both and the molecular mechanism of their action remain unclear. In recent year, this pathway is regarded as a potential target toward HD [21]. Huntington's disease (HD) gene product Huntingtin (Htt) selectively induces new forms of necrotic cell death, in which the endoplasmic reticulum (ER) expands and the cell body asymmetric balloon is ultimately cracked. The special necrotic cell death is mediated by functional deficits in TEAD/YAP-dependent transcription instead of RIP1/3 pathway-dependent necroptosis [22]. The special necroptosis way depends on the way of transferring the YAP from TEAD to p73 through YAP phosphorylation [23]. It could be a therapeutic target against HD. The experimental protocol was aimed at developing a novel anti-HD TCM formula ( Figure 1).

Network Pharmacology-Based Analysis.
Based on the preliminary work, Kyoto Encyclopedia of Genes and Genomes (KEGG) database was utilized to generate the related proteins to construct a protein-protein interaction network. The hub protein should be identified in the pathway network. It was clear that the inhibition of Mitochondrial 70 kDa heat shock protein (mtHSP70, HAPA9, mortalin) and CK2A2 benefitted from HD resistance. Based on the multitarget idea, the related targets would be identified after network pharmacology-based analysis about YAP1 pathway. The YAP level would decrease in HD brain model [22], which indicates the downshift of YAP could be one of pathogenic factors. And the signal pathway could provide information of up-/downadjust which can guide us to find the vital protein in the pathway network. Another 2 protein targets, Serine/threonineprotein kinase 3 (STK3) and Serine/threonine-protein kinase LATS1, were studied further.

3D Structure
Modeling for LATS1 Protein. The protein LATS1 has no enough structure information for structurebased drug design. The origin structure of LATS1 in PDB (5BRK:B) was further modeled to obtain a complete structure [24]. The binding site is set as a natural structure, which indicates that it was credible. The structure was verified in Ramachandran plot [25] and Verify profile-3D.

Docking Study of Target
Proteins. The sequence of human CK2alpha' protein (residues 1-350) was acquired from the Uniprot Knowledgebase (P19784). The 3D crystal structure of human CK2alpha' protein, 5M56 (residues 1-350) was obtained from Protein Database Bank (PDB) website [26]. 5M56 was the latest CK2alpha' prototype structural which has the complete 3D structure used for ligand-protein docking. All compounds from TCM Database@Taiwan, the world largest TCM database [27], were applied to dock with 5M56 by using Accelrys Discovery Studio (DS) software. LigandFit program in DS was performed to conduct molecular docking procedure. The ATP binding area was set as a binding sites referring to the ATP-competitive inhibitors, and one of the known inhibitor was set as a control in our study [28]. HARvard Molecular Mechanics (CHARMm) force field was performed to minimize energy. Affinity would be considered in Dock score, where the ligands with high score in docking would be the candidates to interact with target. Multiple poses would be calculated, 11 scoring patterns would be considered, and the consensus score would be set as a comprehensive consideration.
The same way was descripted above; another two targets screened small molecules in the TCM database as well. The information of HSPA9 (mtHSP70) protein was gained from UniprotKB (P38646). 3D structure of HSPA9 was acquired in PDB (4KBO, 52-431). The binding pocket referred to the nucleotide-binding domain (NBD), which is an active  binding region [29]. MST2 (STK3) protein in the PDB (4LGD, 4-491) was selected for studying similarly. A peptide could inhibit the homodimerization of MST2, which could further prevent the activity of MST2 by transautophosphorylation [30]. And the binding site could be referred from it. The LATS1 was docking employing our modeling structure mentioned above.
The screen result of four targets would be analyzed in a network to discover which compounds have effects on multiple targets.

Quantitative Structure-Activity Relationship (QSAR)
Models and Predictive Work. The bioactivity was predicted through QSAR model [31]. The Genetic Function Approximation (GFA) algorithm [32], a method of searching for the optimal solution by simulating the natural evolutionary process, was utilized to look for the proper molecular descriptors and then used Calculate Molecular Property protocol to acquire relative properties in DS software. 2D-QSAR models were conducted through multiple linear regression (MLR) with Matlab and support vector machine (SVM) with libSVM. The way to evaluate these models was the value of square correlation coefficient (R 2 ) which is calculated in the regression. Several molecules with known activity were used for external verification as well. The predictive models were chosen from the models which have high value of square correlation coefficient. And then the predictive models were performed to forecast each candidate drugs after docking.
The docking scores of candidates with CK2A2 were provided, as well as the docking score with other related proteins, mtHSP70, STK3, and LATS1. And the predicted activities or scores through several known compounds or traditional Chinese medicines about HD could also take a reference to the final drugs' selection. Some proteins (CK2A2 and STK3) also provided a control for comparing.
Thirty-one known ligands [33] for CK2 protein with IC50 information constructed the predicted model by SVM and MLR algorithm. Thirty-five ligands [34] for meHSP70 protein with IC50 information conducted the predicted model as well. The STK3 and LATS1 protein constructed the predicted models through the reported TCM formula, and the active ingredients would be gathered in our TCM database. The aim was to build prediction module through SVM and MLR algorithm to predict better medicines from the clinical medicines models.
2.5. 3D-QSAR Analysis. 3D-QSAR was constructed by known ligands [33] for CK2 protein with Sybyl-X 1.1. These active ligands were superimposed. The Comparative Molecular Field Analysis (CoMFA) [35] and Comparative molecular similarity index analysis (CoMSIA) [36] models were constructed according to their activity and structural characteristics. The models were evaluated through cross-validation (CV). Residual between the measured and predicted values was calculated and the square sum of explained (SSE); F-test, coefficient of determination (R 2 ), and q 2 for cross-validation were referred as evaluation indexes of 3D-QSAR models.
CoMFA analyzes the effects of stereoscopic fields and electric fields, while CoMSIA evaluates stereoscopic fields, electric fields, hydrophobic fields, and hydrogen bonds for acceptors and donors. Various field combinations had been studied separately and the best model would be further studied.

Pharmacophore Analysis and Cross Validation.
Pharmacophore models with 38 compounds [33,37,38] were created by hypoGen protocol, which was to analyze spatial influence factors such as hydrogen bonds acceptors, hydrophobic interactions, and -conjugated effect. The models were verified through leave-one-out cross validation (LOOCV) [39]. Fisher's randomization test was used to ensure that reliable hypotheses were built. To obtain a 95% confidence level, a total of 19 randomizations were required.

Molecular Dynamics Simulation (MDs).
In order to verify the results of docking screening, molecular dynamics simulations were performed on all candidate compounds screened above using gromacs5.0.4 program software with world's top computing resources, Tianhe No. 2 supercomputer for long enough (5000 ns) simulation. The 5000 ns is basically the longest simulation in the current stage (2018). The energy minimization was produced by the steepest descent algorithm with the maximum number of 5000 steps minimization. The NVT equilibration was set in 20 ns with each step of 2fs and constrained with Lincs algorithm. The NPT equilibration was running after NVT in total of 20 ns. All bonds (even heavy atom-H bonds) were constrained with Lincs algorithm at 300K. The long time molecular dynasty was set in 5000 ns (for LATS1 protein nearly 3500 ns because of the computing resources). With the analysis of root-meansquare deviation (RMSD), root mean square fluctuation (RMSF), mean square displacement (MSD), secondary structure of protein, solvent accessible surface area (SASA), radius of Gyration (gyrate), the residue contact map (mdmat), Hydrogen bond distance (H-bond), energy analysis, and torsion angle, even the cluster research, the candidates will be validated to acquire a reasonable conclusion. After all, please be aware of short time MD simulation. Several candidates were "flying away" from protein during 5000 ns, even though they interacted well with targets at the first period.

Protein-Protein Interactions Network Analysis.
From the interaction information about the known targets, some highrelated targets could be further confirmed in the research. The decreased expression level of YAP in human HD brains neurons was known [22]. Up-adjust YAP level can be the treatment method. The hub node in the pathway was important and probably got more interactions with other related proteins. The proteins in the pathway were gathered; it showed the related proteins which could adjust ( Figure 2). It could be certain clearly which agonists or inhibitors could be designed from it. The key proteins could be found in the pathway. Here four proteins, CK2A2, mtHSP70, mst2 (STK3), and LATS1 were identified as the targets for the further research. The 3D-profile validation of modeling structure. The Verify Score, higher than 0, means the trusted simulation of amino acids. (c) Ramachandran plot validation of our modeling structure. The area was divided into three parts; the points stayed at the favoured and allowed implying the ration of modeling structure. (d) All the plots mean the amino acids of binding sites, to find if these important amino acids stood in best or allowed fold region. 0, especially the binding site region, which indicates the rationality of the modeling structure. The Ramachandran plot analysis showed whether each amino acid was in a reasonable angle, especially the amino acids of binding site displayed alone (Figure 3(d)).

QSAR Predicted
Model. Appropriate descriptors were selected using the Genetic Function Approximation (GFA) algorithm (Table 1). And the descriptors would further be applied to create 2D QSAR model with support vector machines (SVM) and multiple linear regression (MLR) algorithm. The MLR model is built as follows.
Protein CK2A2: pIC50 = 6.8787 − 0.3456ES (Count ssCH2) Evidence-Based Complementary and Alternative Medicine The model was regressed by the compounds from treatment drugs. Blue spheres represented the training set, and the red inverted triangle indicated the test set ( Figure 4). External verification and R 2 were reasonable.

3D-QSAR Analysis.
The CoMFA and CoMSIA models of 3D-QSAR were constructed ( Figure 5). The value of CoMFA and CoMSIA was calculated further to obtain the predicted activity. And the residuals between observed IC50 and predicted IC50 would be one of the evaluation values of predictive ability ( Table 2). Partial least squares (PLS) analysis and validation of CoMFA and CoMSIA model provided the value of correlation coefficient (R 2 ), standard error of estimate (SEE), and F test (F ratio) for evaluating, as well as the cross validation (q 2 cv ) for assessing (Table 3). Large enough R 2 and relatively small SEE could explain the rationality of the model. And q 2 cv greater than 0.5 were worth being considered. The information of several 3D-QSAR model CoMFA analysis reminded us where bulky substituent groups were needed (green area) and where no substituent groups were demand (yellow area) ( Figure 5(a)). The CoMSIA model suggested what the hydrophobic groups (yellow area), hydrophilic groups (gray area), hydrogen bonds donors (cyan area), and hydrogen bonds acceptors (violet area) needed. Areas that were inappropriate for introducing hydrogen bond acceptors would be warned correspondingly (red area) ( Figure 5(b)). External validations of the models were provided to identify its accuracy (Figures 5(c) and 5(d)) 3.5. Pharmacophore Analysis and Cross-Validation. The activity predicts model would be verified with cross-validation. The leave one out cross-validation (LOOCV) was used to illustrate the reliability of the model. The information of 10 hypoGen models was shared (Table 4). Cat-Scramble suggested the accuracy of pharmacophore model. The total cost values of 19 random hypotheses were all higher than the initial spreadsheet ( Figure 6). The difference between null cost and total cost was vital to assess the confidence of the model. Configuration cost value was 15.59 lower than 17 which at a reasonable interval. Pharmacophore Analysis further evaluated the candidates; compounds 1a and 1b nearly matched all spatial factors (Figure 7).

Dock Screening Result and Candidate Determination.
Top 50 compounds in each target were created in a network to suggest the multitarget effect ( Figure 8). The screening rules the vote score based on the docking scores, SVM and MLR predicted activity, and consensus scores (calculated from 11 scores). For example, top 20 percent of each project was set as efficient and scored 1 point, otherwise 0 point. The total score was set as final evaluation ( Table 5). The screening result for CK2A2 and mtHSP70 was displayed in Table 6, while the STK3 and LATS1 were shown in Table 7.
2D structure of candidate compounds was displayed in Figure 9. The hydrogen bonds, hydrophobic bonds, etc. were displayed to suggest potential poses of ligands and receptors (Table 8); especially several residues binding with all ligands would be focused on. The hydrogen bonding was a significant reference for these binding analyses of combining ability. 2D diagram for different targets was shown in the interaction  The total cost of the initial spreadsheet and 19 random spreadsheets was displayed. The total cost values of all random result were higher than initial, which indicated the reliability of the pharmacophore model.  bonds between key residues and ligands. Van der Waals force, hydrogen bonds, salt bridge, attractive charges, piinteraction, etc. displayed the binding potentials ( Figure 10). The bond length (Å) was considered as the index of stability. Unfavorable bumps prompted space conflict that is not recommended for introduction (red callout). Probably it was a result that the ligand 3a-STK3 was not stable during MD period. state ( Figure 11). During the MD period, the CK2A2-Flazine (1a), CK2A2-Typhic Acid (1b), mtHSP70-Febrifugine (2a), mtHSP70-Holantosine C (2b), mtHSP70-Cassaine (2d), STK3-3a, STK3-ANP (Control), and LATS1-(+)-Taraxafolin B (4d) were tested whether interactions well for a long time.
Most of the complex possessed nonfluctuating total energy, even if the protein interacted with different ligands. Total energy of CK2A2 protein-ligand complex stand at nearly -860000∼-850000 kJ/mol, while mtHSP70 was -1175000∼-1170000kJ/mol, STK3 was -1550000∼-1545000 kJ/mol, and LATS1 was -2600000∼-2590000kJ/mol. It was noteworthy that the energy of STK3-control was lower compared to other ligands, which indicates the control complex stayed at a more stable state. Changes in RMSD suggested that the interactions changed in configuration. It was more concerned whether the RMSD is stable at the end period of MD. Ligand RMSD  of STK3-control and LATS1-4d fluctuated obviously during MD; nevertheless, both of them did not infect the total energy or protein RMSD; the former probably changed posture at 3500 ns ( Figure 11). The mean square displacement (MSD), solvent accessible surface area (SASA), and Radius of Gyration (gyrate) analysis provided the information about both protein and ligand during MD (Figures 12 and 13). Gyrate shows the more bulky the molecule, the smaller the value and the closer the molecules. In general, gyrate presents a downward trend. And for the STK3 protein STK3-control and STK-3a (red line) all displayed high gyrate, which indicate that the Evidence-Based Complementary and Alternative Medicine 13   ligand induced fit to the protein and the protein maintain a bigger open binding site (Figure 12(c)). For mtHSP70 protein (Figure 12(b)), the opposite is true; when the protein was in a bulky state, the ligands easily "fly away"; when the protein was tighter, the ligand could get better stay inside. MSD analysis told us the protein change between the original state and simulation state. It was clearly which ligand "fly away". The ( Figure 12(c)) displayed higher hydrophilic when complex with ligand (red line and orange line); it was related to its bulky structure predictably. Mean square displacement (MSD), solvent accessible surface area (SASA), and Radius of Gyration (gyrate) analysis of each ligand showed in Figure 13. The ligands' staying with protein complexes during MD would be focused. Ligand gyrate was stable in general. The ligand MSD revealed the ligand change during MD; as for STK3 protein, the ligand 3a (red line) increased at 4000 ns; it was consistent with gyrate drop, which indicates structure contraction at that time. SASA of ligands change little at MD period. One interesting condition is for STK3 protein (Figure 13(c)), where the SASA value of the ligands complex with protein (red line and orange line) was higher than other "fly away" ligands (blue line and green line); probably the bulk protein structure provides more chance for inside ligands to contact with water.
Nonbond relationship provided the binding information between the ligands and receptors; it was an important reference for drug design (Table 8). For example, it told us the Residue R48 of CK2A2 kept the H-bond, electrostatic interaction, and hydrophobic interaction. As the same, E222 of mtHSP70, D164 of STK3, and R618 of LATS1 can be significant for maintaining interaction effect.  Hydrogen bond distance analysis was vital to explain the interaction between targets and proteins. The variation of main H-bond was observed to interpret the stability of receptor-ligand complexes (Figure 14). And the H-bond distance and the occupancy were showed to explain whether these H-bonds were in a proper distances ( Table 9). The Hbond occupancy of STK3-control and mtHSP70-Febrifugine was kept high which can predict a good interaction for these complexes. As for LATS1 protein, unfortunately low H-bond occupancy indicated the ligand may leave even if we have not yet observed (Figure 14(e)). The H-bond information informed the significant residues. ARG48 and ASN119 of CK2A2 were great for maintaining complex interactions. ASP59, Gly247, Gly248, and Gly387 of mtHSP70 protein Table 9: Main H-bond distance occupancy information for complexes. Root mean square fluctuation (RMSF) can be used to test fluctuation of each residue, which can notify the key residues change during MD. The key residues contributed nonbond interaction which would mainly be focused on. CK2A2 complexes had similar residue fluctuations (Figure 15(a)), which correspondingly showed similar average protein structure ( Figure 16). Residues fluctuation of mtHSP70 complexes displayed similar trend, which have the same fluctuation peaks and valleys. The difference between these complexes was the different degrees of change (Figure 15(b)). The only one ligand "fly away" (pink line) had different types of fluctuations, which further verified the role of ligands. As for STK3 protein (Figure 15(c)), ligand 3a (pink line) and control (orange line) displayed different changes mainly due to the loop area outside, which showed in different average structure ( Figure 16). The change of LATS1 complexes was more complex due to a lot of residues, but it could be clearly that the fluctuation of key residues like P481, Y597, K608, and R618 influenced the binding interaction.

H-bond
The average protein structures would be superimposed to observe whether the binding structures are alike, which can predict these structures related to their inhibition. CK2A2 and mtHSP70 were nearly superimposed on one structure, respectively. STK3 can be superimposed except the outside loop. The two LATS1 protein structures cannot merge into one because one ligand of structure "flies away". It manifested the structure which acting with ligand was completely different from the origin protein ( Figure 16).
The protein residues contact map provided 3D information about the residues distance of proteins in 2D matrix. Average amino acid distance was shown in different color, which indicates the looseness of protein. The distance between key residues was focused on. It demonstrated a tight conformation for SK2A2 protein.
The key amino acids R48 and N119 of CK2A2 are all close to other residue that can construct a "hotspot" cavity. Similarly, key residues D59, G247, etc. of mtHSP70 constitute an active cavity site. The amino acids S37, K56, E70, and D164 of STK3 are also in a reasonable range. Different ligands for CK2A2 or mtHSP70 protein would get similar 3D structure shown in residues distance matrix, respectively, which further validate our inference above.
Secondary structure of protein provided important information on structural stability. Maintaining A-helix and -sheet during MD period can keep structural stability. Basically secondary structure was stable, including the key residues. A change was observed like residue 176-181 of mtHSP70-Febrifugine since 3000 ns, where A-helix changed to -turn. It is possible that complex structure changes to a more stable configuration since this moment influenced by the ligand (Figure 17).
Cluster analysis was taken for each complex at the last period. The dock poses, cluster poses during the last 500 ns period, and the last poses were provided ( Figure 18). Cluster poses were got from the big class of cluster map. The poses between dock and the last were changed in some complex like LATS-ligand; in the large cavity of the action site, the ligand is completely steered, showing another way of acting. The CK2A2 and mtHSP70 were relatively fixed. And the Hbond relationship was shown which could clearly know the interaction modes of key residues like Asp59, Glu222, Asp 244 in the mtHSP70, etc. For the same receptor, different ligand employed different residual contributions; however, the common force mode between different complexes could help us find the key residues like Asp48 and Asn119 in the CK2A2 protein.
The change in the torsion angle provides important clues to the stability of hydrogen bonds. If a single key that can be rotated originally is fixed in a relatively small range of variation, it may be fixed by space or electrically fixed. Relatively speaking, finding its related hydrogen bonds also shows that the hydrogen bonds are relatively stable ( Figure 19). Torsion 1 showed 180 degrees of twist, but in fact two O elements on the carboxyl were equivalent when carboxyl dissociates; hydrogen bonds which could form after 180-degree rotation were still equivalent to origin hydrogen bond; what is certain was that this was a weak hydrogen bond. Torsion 2 had a great influence on the overall conformation of the ligand molecule connecting two planes, which showed rotation of furan ring plane; its orientation can bring hydrogen bonding or hydrophobic interaction; approximately 30 degrees of change is relatively stable to conform a H-bond or hydrophobic interaction. Residue Index Residue Index Figure 17: Residue distance matrix can show the distance between each residue. The main observation is the yellow-green region. It can be seen which residues were closed in space, and the residues around the binding site would be particularly concerned by us. The first three maps were mainly for all residues of the corresponding docking proteins, while the last one specifically extracts the residue information from the key region. Protein (a) CK2A2; (b) mtHSP70; (c) STK3; (d) LATS1.
group on the phenyl ring failed to form a fixed hydrogen bond. The span of torsion 32 was nearly 90 degrees that this single would keep moving obviously. Pathway analysis could provide various pathways that the ligands could entry using Caver 3.0 [40]. The possibility of ligands entering the binding region was diverse. More entry paths mean greater possibilities for integration of ligands and proteins. Almost all the targets provide lots of access routes for ligands, which is advantageous for binding ( Figure 20).

Discussions
Traditional Chinese medicine provides another treatment idea with no solution strategy. Compound information of TCM was provided in our TCM database (the world's largest database of traditional Chinese medicine). Considering the docking scores comprehensively, SVM prediction activity and MLR prediction activity could explain the effect well. The verification of the simulation was provided in the experiment, such as the validation of the homology modeling and the activity prediction module.
Traditional Chinese medicine combined with network pharmacological analysis could get more effective information. It could consider multiple targets synthetically, which is quite beneficial for treatment, especially for stubborn diseases. It is impossible to employ only one target or one drug against disease nowadays. Considering the comprehensive consideration of the disease, it is exactly the same as the concept of traditional Chinese medicine. Small molecules in traditional Chinese medicine formula can be linked to multiple targets through a network, just as western medicine has been studied. Traditional Chinese medicine modernization, or integration of TCM and western medicine (WM). Network analysis can find the intercommunication between WM and TCM, bridge the current gap, and promote integrated treatment [41]. Network pharmacology-based approach integrates information into disease networks and pharmacological networks. With computational methodology, western medicine (WM) and TCM adopt networks analysis as the standard for evaluation in disease and pharmacology.
Long-term MD simulation would tell us more reliable information. In addition, the short-term MD is actually debatable; however, it was often performed at past several years. From this experiment, we want to tell the scholars who use MD to be careful about molecular dynamics simulations, especially the short-term MD. The MD simulation used in our experiment ran for 5000 ns (part of the data ran for 3000 ns due to computational resources), which is quite long for the current computing resource (2018). It could be found   Figure 19: Torsion changes during MD simulation. Some single bonds that involve space or hydrogen bonds had been specifically selected for study, and it was possible to guess which hydrogen bonds can exist more stably and which structures will be bound in space to achieve a more stable combination. A single bond with a small twist angle will be noticed.
that some complexes that can be stably combined in tens or hundreds of nanoseconds may "fly away" in subsequent simulations. This shows that short-term MD is likely to have false positive results. Of course, our MD is not long enough but using the computational resources that can now be achieved. We want to tell the researcher there are some problems in MD, so be aware of MD.

Conclusions
Dock screening result is validated by QSAR model and molecular dynasty simulation; we got the more reliable candidate of small molecule compounds from TCM database. Several molecular dynamics were connected to the targets through TCM formula candidates. The novel formula can achieve anti-HD effects through the multicomponent and multitarget strategy. Prescriptions and details of their small molecule effects were provided; they could form a drugmultitargets and multidrug synergistic effect against HD (Table 10). It was provided the novel TCM formula drugs with compounds in it, as well as the reacted target proteins. The novel TCM prescription proposes could be a developing method, not only for Huntington's disease, but also for other chronic illnesses. The network concept deal with disease and our method of developing Chinese medicine could be a direction for development.

Abbreviations
HD: Huntington's disease TCM: Traditional Chinese medicine QSAR: Quantitative structure-activity relationship MLR: Multiple linear regression SVM: Support vector machine Genetic Function Approximation LOOCV: Leave one out cross validation RMSD: Root-mean-square deviation RMSF: Root mean square fluctuation MSD: Mean square displacement SASA: Solvent accessible surface area gyrate: Radius of gyration.

Data Availability
All data generated or analysed for this study are included in this publish article. Raw data are available from the corresponding author on reasonable request.