Identification of Potent Chloride Intracellular Channel Protein 1 Inhibitors from Traditional Chinese Medicine through Structure-Based Virtual Screening and Molecular Dynamics Analysis

Chloride intracellular channel 1 (CLIC1) is involved in the development of most aggressive human tumors, including gastric, colon, lung, liver, and glioblastoma cancers. It has become an attractive new therapeutic target for several types of cancer. In this work, we aim to identify natural products as potent CLIC1 inhibitors from Traditional Chinese Medicine (TCM) database using structure-based virtual screening and molecular dynamics (MD) simulation. First, structure-based docking was employed to screen the refined TCM database and the top 500 TCM compounds were obtained and reranked by X-Score. Then, 30 potent hits were achieved from the top 500 TCM compounds using cluster and ligand-protein interaction analysis. Finally, MD simulation was employed to validate the stability of interactions between each hit and CLIC1 protein from docking simulation, and Molecular Mechanics/Generalized Born Surface Area (MM-GBSA) analysis was used to refine the virtual hits. Six TCM compounds with top MM-GBSA scores and ideal-binding models were confirmed as the final hits. Our study provides information about the interaction between TCM compounds and CLIC1 protein, which may be helpful for further experimental investigations. In addition, the top 6 natural products structural scaffolds could serve as building blocks in designing drug-like molecules for CLIC1 inhibition.


Introduction
Chloride intracellular channel 1 (CLIC1), a newly discovered member of the highly evolutionarily conserved CLIC family of chloride ion channel proteins, was first cloned because of its increased expression in activated macrophages [1]. CLIC1 located within the plasma membrane and other internal cell membranes are involved in diverse physiological processes [2,3]. The CLIC family have seven members: CLIC1 (NCC27), CLIC2, CLIC3, CLIC4, and CLIC5A whose sequences are highly conserved across species and two larger variants, CLIC5B and CLIC6 [4,5]. They are known to participate in many physiological processes, the control of absorption and secretion of salt, acidification of organelles, and the regulation of cell volume and membrane potentials [6]. Malfunction of any of these channel proteins can lead to severe disease states [7].

BioMed Research International
Previous studies suggested CLIC1 appears to have a broad tissue distribution; it has been most intensely studied in various tumor tissues like gastric cancer [8], colon cancer [9], lung cancer [10], liver cancers [11], and glioblastoma. All of these studies showed that the high expression of CLIC1 has important association with tumor invasion, metastasis, and prognosis. For example, CLIC1 is involved in the development of most aggressive human tumors, including glioblastoma (GBM) [12] and lung adenocarcinoma [10]. CLIC1 is mainly localized in cytosol in resting cells, while it is progressively oxidized and recruits transiently to the plasma membrane, where it functions as a chloride selective ion channel, during cell cycle progression [13]. In vivo and in vitro proliferation of GBM cancer stem cells depends on CLIC1 activity, and its inhibition reduces tumor development in animal models.
Recently, Gritti et al. [14] discovered that CLIC1 is a direct target of metformin in human GBM cells. They identified that CLIC1 is not only a modulator of cell cycle progression in human GBM stem cells but also the main target of metformin's antiproliferative activity, paving the way for novel and necessary pharmacological approaches to GBM treatment. Therefore, CLIC1 is a potential prognostic marker and drug therapy target for diverse malignant tumors.
Although CLIC1 represents an emerging therapeutic target and shows important significance in clinical diagnosis, only few CLIC1 inhibitors have been reported to date [6,15]. The most recognized CLIC1 inhibitor is IAA-94, known as a glutathione transferase (GST) binding molecule [16], and based on ethacrynic acid. Similarly, it is not entirely surprising that the CLICs are members of the GST superfamily. As well as predictions based on sequence similarity, p64, the first identified CLIC, was purified and characterized by its ability to bind the chloride channel inhibitor. In fact, the affinity purification experiments first isolated p64 and GST concurrently.
Herein, we focus on identification of potent CLIC1 inhibitors. To achieve this goal, the integrated in silico protocol, including docking-based high throughput screening, MD simulations, and Molecular Mechanics/Generalized Born Surface Area (MM-GBSA) analysis, was designed to discover new potent CLIC1 inhibitors from the world's largest TCM database. Based on our strategy, six TCM compounds were predicted as promising CLIC1 inhibitors, which may become the new lead compounds or drug candidates in the future.

TCM Database and Receptor Preparation.
A total of 57423 ligand molecules were obtained from the TCM database @Taiwan (http://tcm.cmu.edu.tw) [17][18][19] and refined as the following protocol. First, we did the pretreatment for each molecular structure, including removing the counterions, solvent moieties and salts, adding hydrogen atoms, and optimizing the structures based on the MMFF94 force field using MOE (version 2010.10, Chemical Computing Group, Inc., Canada) [20][21][22][23]. Second, the refined database was filtered using drug-like analysis including Lipinski rules of five and PAINS assay http://cbligand.org/PAINS) [24,25].
Then, all molecules were automatically converted to PDBQT format. Open Babel software (http://openbabel.org/wiki/) and in-house python script were used for manipulating the various chemical formats of ligand molecules [26].
For the docking simulations, the cocrystallized structure of human CLIC1 with glutathione (PDB code: 1K0N) [7] with a resolution factor of 1.80Å was retrieved from RCSB protein data bank and prepared in three steps [7]. First, all native ligands, ions, and crystalline water were removed from the cocrystallized structure of CLIC1. Second, the missing hydrogen atoms were added [27]. Finally, the protein file was automatically prepared in PDBQT format. Molecular Graphics Laboratory (MGL Tool) software was applied in preparing all of the structure parameters of CLIC1 protein.

Docking-Based Virtual Screening. AutoDock
Vina was employed to screen the refined 9033 TCM library against CLIC1 [28]. The docking site was defined on glutathione active binding site (−1.343, −6.385, 32.927Å) and the grid box was set as 25 × 25 × 25Å in , , and directions. During docking process, the semiflexible docking simulations were performed employing Lamarckian genetic algorithm, and the receptor was kept rigid, while the ligands were flexible to rotate and explore the most probable binding conformations. After docking-based virtual screening, the top 5000 TCM compounds with docking score were obtained [29,30].
The top 5,000 TCM compounds were resorted by -Score. Again, the top 500 compounds were selected from the 5,000 reranked compounds for clustering. Clustering and visual analyses were carried out to remove redundancies resulting from similar structures and check the docking poses and interactions between the ligands and CLIC1 [20]. Finally, 30 compounds were selected for further analysis.

Molecular Dynamics
Simulation. Molecular docking method only reflects a possibly instantaneous binding mode which may not be reasonable/stable between a ligand and a receptor. Therefore, MD simulations were applied to perform further evaluation of the binding stabilities between all the top 30 TCM and their receptor CLIC1. MD simulations were performed to investigate the binding patterns of the virtual screening top 30 hits using the PMEMD module in AMBER 12 software accelerated by running on a GPU system, the NVIDIA CUDA processor [31]. The CLIC1 protein complexes with docked structures of top hits were used as the initial coordinates for MD simulations. The protein and top hits were applied with ff99SB and Generalized Amber Force Field (GAFF), respectively. The partial charges of ligands were computed using the HF/6-31 G * basis set from GAUSSIAN09 and refined by RESP calculation using the antechamber module of the AMBER 12 package [20]. Each system was solvated in a truncated octahedron box of TIP3P water molecules with a margin distance of 10Å. Periodic boundary conditions were applied. Neutralizing counterions were added to the simulation system.
To remove possible steric stresses, each system was minimized for 2,000 steps with the steepest descent method, followed by application of conjugate gradients for another 2,000 steps. Each system was linearly heated from 0 to 310 K using a Langevin thermostat, with a collision frequency of 5.0 ps −1 and harmonic restraints of 4 kcal/mol/Å2 on the backbone atoms over 50 ps and then equilibrated for 50 ps at 310 K using the NVT ensemble. A production simulation run for 5 ns was performed using the NPT ensemble. Coordinate trajectories were saved every 1 ps for the whole MD runs. The temperature was kept at 310 K by means of a weak coupling algorithm [23]. Covalent bonds involving hydrogen were constrained using the SHAKE algorithm.

Binding Free Energy Analysis.
To provide insight into the interaction energies and energetic stabilities of the CLIC1 and TCM compounds, the MM/GBSA method [32] in the AMBER 12 was used to calculate the binding free energies for 30 hits. Detailed calculations and analyses can be found in the previous studies [33][34][35][36]. The final top 6 hits were selected as potent CLIC1 inhibitor according to the ranked binding free energy results.

Results and Discussion
3.1. Binding Domain Analysis. The electrostatic potential representation structure of glutathione-CLIC1 complex is shown in Figure 1(a). The green molecule is glutathione (GSH) surrounded by the basic lobes of the N and C domains at the edge of a slot at the top of the molecule (Figure 1(a)).
According to the previous study [7], the N-domain of CLIC1 has a well-conserved glutaredoxin-like site for covalently interacting with GSH. The thiol of Cys24 in CLIC1 is likely to be a highly reactive thiolate with a low pKa due to its position at the amino terminus of helix h1 (Figure 1(b)) [37]. The interactions between GST and ethacrynic acid inhibitor compared with CLIC1 and IAA-94 inhibitor were shown in Figure 2 [16]. The structure of the soluble form of CLIC1 indicates that it belongs to the GST superfamily [7]. Hence, the mechanisms of IAA-94, a well-characterized CLIC1 inhibitor, and GSH in CLIC1 are likely to be related  in ethacrynic acid and GSH in GST [7,38]. Ethacrynic acid binds to GST at the electrophilic substrate site ("H-site"), surrounded by TYR-9, ARG-13, GLY-14, LYS-15, LEU-107, and PHE-222, which is adjacent to the GSH binding site (Figure 2(a)) [39]. In GSTs, the H-site is formed by the loop connecting -strand s1 to helix h1 and helix h4 plus the carboxyl terminus (the "walls") and helix h9 (the "lid"). This corresponds to the more open and elongated slot in CLIC1 (Figures 1(a) and 2(b)). Due to its structural homology to ethacrynic acid, IAA-94 is bound to CLIC1 protein in the slot adjacent to the GSH binding site, surrounded by ALA-14, ASN-23, GLU-228, ALA-232, and TYR-233 [40]. To ensure docking reliability, the validation of docking performed based on the available crystal structure. GSH was docked to the apo-protein and the RMSD value of XRD and GSH is 1.351, suggesting the docking method used in the present study is reliable. Therefore, the grid center was defined by glutathione active binding domain and grid box was set as 25 × 25 × 25 points in , , and directions, which contains the slot of binding site of CLIC1 potential inhibitors.

Virtual Screening Result.
Virtual screening is gaining increasingly important influence in modern drug discovery. It can be used to screen large compound databases and reduce large numbers of compounds to smaller subsets that are more likely to contain biologically active compounds. In this work, we designed a systematic strategy for identifying natural products CLIC1 inhibitors using structure-based VS and MD simulation. The detailed flowchart is shown in Figure 3. Among the MOL2 files in TCM database, 9,033 natural products were obtained from the mother TCM database containing 57,423 using the Lipinski rules and PAINS assay filtering. The Lipinski rule states that "drug-like" molecules must satisfy the conditions below at the same time: log ≤ 5, 150 ≤ molecular weight ≤ 500, number of hydrogen bond donors ≤ 5, number of hydrogen bond acceptors ≤ 10, and number of rotatable bonds ≤ 10 [41]. Also, PAINS-Remover is used to remove the Pan Assay Interference Compounds (PAINS) from screening libraries and for their exclusion in bioassays. This server will facilitate data-sharing and information exchange among UPCMLD scientific research communities with online structure search functions and data analysis tools implemented for removal of PAINS [25]. Then, the 9033 TCM compounds were docked into the binding site of CLIC1 by AutoDock Vina, followed by ranking according to their binding energy. The top 5000 molecules were selected for further X-Score analysis (Figure 3).
X-Score software computes a binding score for a given protein-ligand complex structure, and this binding score correlates with experimental binding constants well. Three individual empirical scoring functions have been implemented in X-Score software, namely, HPScore, HMScore, and HSScore [29]. According to the results ranked by X-Score, the top 500 hits were kept. Then, the 500 TCM with top X-Scores were stored separately for clustering and visual analyses. These compounds were inspected to check whether they had interactions with the GSH binding pocket of CLIC1 protein. This step makes sure that selected candidates have not only a higher docking score but also a rational binding mode. An ECFP_4 fingerprint based clustering algorithm implemented in discovery studio 3.5 software (Accelrys, Inc., San Diego, CA) was applied for structure diversity analysis to ensure that the hits selected from the virtual screening were unique and unrepeated. Finally, 30 compounds were chosen for further MD analysis and their corresponding zinc codes were listed in Table 1.

Molecular Dynamics Results.
The RMSD values of the complexes of the 30 hits and CLIC1 during the MD simulations were monitored in Figure 4. Meanwhile, the RMSD values of the known inhibitor IAA-94 and CLIC1 complex during the MD simulations were also drawn in Figure 4(a). In order to provide the explanation, 5 ns of MD simulations is really capable of representing the equilibrated ligand-proteincomplex. The RMSD (Å) trajectories of IAA-94 and top 30 TCM compounds in binding site (residues within 6.5Å to the ligands) of CLIC1 complexes during 5 ns MD simulation were drawn in Figure 5. From Figure 5, we can see all the RMSD curves in the selected part (residues within 6.5Å to the ligands) were stable after 1 ns. Here, we employed the MM-GBSA method encoded in Amber 12 to calculate the ligand binding free energies and rescore the docking hits. The detailed MM-GBSA scores results are listed in Table 1 [42]. As shown in Table 1, we can clearly achieve the fact that the top 6 TCM hits have relative bigger binding free energy, indicating they are potential CLIC1 inhibitors. The detailed binding energy profiles of three TCM hits (16, 22, and 20) are shown in Table 2 The binding free energy of positive control IAA-94 and CLIC1 complex were calculated using the same method MM/GBSA (Table 2). Here, we can conclude the van der Waals contribution (−18.47 kcal/mol) of compound 22 is approximately equal to van der Waals contribution (−18.83 kcal/mol) of IAA-94. Also, the van der Waals contributions (−25.28 kcal/mol) of compound 16 and compound 20 (−28.67 kcal/mol) were greater than our positive control. In conclusion, the binding energy of top 3 hits is greater than the known inhibitor.     bonds and van der Waals play a key role in protein and small molecules interaction. All of these binding modes analysis results are consistent with the binding energy results. Hence, we propose the six TCM compounds (Figure 7) as potential candidates for further study in drug development process with the CLIC1 protein against several types of cancer [43].

Conclusion
This study aims to investigate the potent lead TCM candidates for CLIC1 protein inhibitors against cancer. We introduced a systematic structure-based VS and MD analysis study about potential inhibitors against CLIC1 from the world's largest   natural products TCM database. 30 TCM hits were selected through molecular docking-based virtual screening, X-Score, cluster, and visualizing analysis. 6 of 30 TCM hits are refined through 5 ns MD simulations and MMGB-SA binding energy analysis. The detailed binding modes of 6 TCM candidates were illustrated and discussed. We hope our results may inspire medicinal chemists to further develop these potential CLIC1 inhibitors into lead compounds for the treatment of solid cancers in the near future.