Discovery of Specific Inhibitors for Intestinal E. coli β-Glucuronidase through In Silico Virtual Screening

Glucuronidation is a major metabolic process for the detoxification of reactive oxygen species (ROS)-based carcinogens such as 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) and 1,2-dimethylhydrazine (DMH). However, intestinal E. coli β-glucuronidase (eβG) has been considered pivotal to colorectal carcinogenesis. Specific inhibition of eβG may prevent reactivation of the glucuronide-conjugated carcinogens and protect the intestine from ROS-mediated carcinogenesis. To develop specific eβG inhibitors, we performed virtual screening and found that the 59 candidate compounds obtained from the initial screen selectively inhibited eβG but not human βG (hβG). In particular, compounds 7145 and 4041, which contain naphthalenylidene-benzenesulfonamide (NYBS), inhibit eβG activity effectively and selectively. Compound 4041 (IC50 = 2.8 μM) is a more potent eβG inhibitor than compound 7145 (IC50 = 31.6 μM). Furthermore, molecular docking analysis indicates that compound 4041 makes two hydrophobic contacts, with residues L361 and I363 in the bacterial loop, whereas compound 7145 makes one contact, with L361. Only compound 4041 binds to the key residue E413 at the active site of eβG through hydrogen-bonding interactions. These novel NYBS-based eβG-specific inhibitors may serve as candidate compounds that specifically inhibit eβG to reduce eβG-based carcinogenesis and intestinal injury.

The crystal structures of hβG and eβG have been reported [16,17]. In addition, eβG has a unique "bacterial loop" (LGIGFEAGNKPKELYSE) [16] that is absent in hβG. Some known drugs, such as Amoxapine [23,24] and Loxapine [23], and eβG inhibitors [16] have also been shown to interact with residues of the bacterial loop and the active site of eβG [16,24]. These reports indicate that the area around the unique loop and the active site is an important target for eβG inhibitor selection.
For the development of specific eβG inhibitors, we examined a crystal structure of recombinant eβG (provided by Steve R. Roffler, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan) in complex with D-glucaro-1,5-lactone, which revealed that the inhibitor was bound at residues E413 and E504 of the active site. We then compared the active centers of eβG and hβG by superimposition. We obtained candidate compounds that selectively inhibit eβG by computational screening with the DOCK 4.0 program [25,26] and the X-ray crystal structure of eβG. A chemical database (SPECS) containing ∼300,000 commercially available compounds was computationally screened against a grid box enclosing the unique bacterial loop at the eβG active site. To test whether the candidate compounds can effectively inhibit eβG without affecting hβG activity, the compounds were examined for their specific inhibition of eβG versus hβG by in vitro βG activity-based assays. The binding modes of the eβG-specific inhibitors were predicted by molecular docking studies. The novel eβG-specific inhibitors may provide highly effective and selective agents to prevent eβG-based carcinogenesis and chemotherapy-induced diarrhea (CID).

Expression and Purification of βG Protein.
Plasmid pRESTB containing the βG gene and a histidine tag at the N-terminus was constructed as described [27]. Recombinant βG (human and E. coli) was produced by isopropyl β-D-thiogalactopyranoside (IPTG) induction of BL21 (DE3) bacteria. βG was purified from bacterial supernatants by affinity chromatography on Ni Sepharose 6 Fast Flow (GE Healthcare). The column was washed with phosphate-buffered saline (PBS) containing 50 mM imidazole, and βG was eluted with PBS containing 250 mM imidazole. The purified βG was desalted on a Sephadex G-25 column equilibrated with PBS and stored at −80 °C.

Virtual Screening of eβG Specific Inhibitors.
The virtual screening was performed using the DOCK 4.0 program and the X-ray crystal structure of eβG (provided by Steve R. Roffler). The B-chain structure of the protein, water molecules, and the cocrystallized inhibitor D-glucaro-1,5-lactone were removed. The remaining A-chain protein structure was used to prepare the target site for docking simulations. The active-site region of eβG was specified as the target site for ligand docking in virtual screening. Briefly, a molecular surface around the target site was generated with the MS program using a 1.4 Å probe radius, and this surface was used with the SPHGEN program to generate 43 overlapping spheres that fill the target site. A grid box of 19.3 × 22.4 × 15.6 Å enclosing the target site was created for grid calculations. The force field scoring grids were calculated with the GRID program using a distance-dependent dielectric constant of 4r, an energy cutoff distance of 10 Å, and a grid spacing of 0.3 Å. The database for virtual screening was the SPECS compound collection, which included ∼300,000 commercially available compounds (downloaded from the ZINC database web site). The DOCK 4.0 program performs docking simulations using a distance-matching algorithm. The matching parameters used to run virtual screening were set as follows: distance tolerance = 0.5, distance minimum = 2.5, nodes maximum = 10, and nodes minimum = 4. The SPECS database was computationally screened against the active site of eβG using the force field scoring function based on interaction energy. Virtual screening was performed on a Silicon Graphics Octane workstation with dual 270 MHz MIPS R12000 processors.
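The grid parameters quoted above can be illustrated with a short Python sketch. It is not the DOCK or GRID source code: the function names are hypothetical, the point count assumes grid points at both ends of each box edge, and the electrostatic term is the textbook Coulomb expression with the dielectric set to 4r and a hard 10 Å cutoff.

```python
import math

# Values quoted in the text.
BOX_DIMENSIONS_A = (19.3, 22.4, 15.6)  # grid box edge lengths (angstroms)
GRID_SPACING_A = 0.3                   # grid spacing (angstroms)
CUTOFF_A = 10.0                        # energy cutoff distance (angstroms)
# Standard conversion factor for Coulomb energy in kcal/mol when charges are
# in electron units and distances in angstroms.
COULOMB_CONSTANT = 332.06


def grid_points_per_axis(length, spacing=GRID_SPACING_A):
    """Grid points spanning one box edge, counting both ends (assumed convention)."""
    return math.floor(length / spacing + 1e-9) + 1


def total_grid_points(dimensions=BOX_DIMENSIONS_A, spacing=GRID_SPACING_A):
    """Total number of grid points in the box under the same convention."""
    total = 1
    for length in dimensions:
        total *= grid_points_per_axis(length, spacing)
    return total


def coulomb_energy(q1, q2, r, cutoff=CUTOFF_A):
    """Pairwise electrostatic energy (kcal/mol) with dielectric = 4r."""
    if r <= 0:
        raise ValueError("distance must be positive")
    if r > cutoff:
        return 0.0  # pairs beyond the cutoff are ignored
    dielectric = 4.0 * r
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


if __name__ == "__main__":
    print([grid_points_per_axis(d) for d in BOX_DIMENSIONS_A])
    print(total_grid_points())
    print(coulomb_energy(1.0, -1.0, 2.0))
```

With a dielectric of 4r the electrostatic term falls off as 1/r², so distant charges contribute much less than they would with a constant dielectric; this is a common approximation for screening by solvent when water is not modeled explicitly.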
For compound selection, the docking models of the 4724 top-ranked compounds (energy score values ≤ −42.00 kcal/mol) were visually inspected using the software PyMOL. Compound selection took chemical diversity into account and was guided by analysis of the docking models with respect to shape fitting, hydrogen bonding, and hydrophobic interactions. Finally, we selected 59 compounds for enzyme inhibition assays against E. coli and human βGs. The compounds for testing were purchased from the SPECS Company. The SPECS ID numbers and docking energy scores of the compounds are listed in the supporting information.
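The energy-score cutoff described above amounts to a simple filter over the docking results. The following Python sketch shows one way such a filter could look; the compound identifiers and scores in `example_scores` are invented for illustration, and only the cutoff (−42.00 kcal/mol) and the counts (4724 of ∼300,000) come from the text.

```python
ENERGY_CUTOFF = -42.00  # kcal/mol, cutoff quoted in the text


def select_top_ranked(scores, cutoff=ENERGY_CUTOFF):
    """Return (compound_id, score) pairs with score <= cutoff, best score first."""
    kept = [(cid, score) for cid, score in scores.items() if score <= cutoff]
    return sorted(kept, key=lambda pair: pair[1])


# Hypothetical docking scores (kcal/mol); not data from the study.
example_scores = {
    "cmpd_A": -55.0,
    "cmpd_B": -41.9,
    "cmpd_C": -43.2,
    "cmpd_D": -42.0,
}

# Fraction of the library passed on to visual inspection, using the
# approximate library size given in the text.
fraction_inspected = 4724 / 300_000

if __name__ == "__main__":
    print(select_top_ranked(example_scores))
    print(f"{fraction_inspected:.2%} of the library was inspected visually")
```

More negative scores indicate more favorable predicted interaction energies, so the list is sorted in ascending order.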

The Scientific World Journal
In Vitro βG-Activity Assay of eβG Specific Inhibitors.
The candidate compounds were purchased from SPECS (The Netherlands). Each candidate was provided as a solid powder and dissolved in 100% DMSO (Sigma-Aldrich) to 10 mM as a stock solution. Candidates were screened in triplicate for their inhibition specificity against eβG versus hβG at pH 7.3 and pH 5.4, respectively. Purified βG (40 μL) was treated with 10 μL of compound solution at 37 °C for 30 min and subsequently incubated with 50 μL of p-nitrophenyl β-D-glucuronide (pNPG; Sigma-Aldrich) at 37 °C for 30 min. Reactions were quenched with 5 μL of 2 N sodium hydroxide (Sigma-Aldrich). Each reaction consisted of 3.75 ng purified βG, 50 μM compound, and 5 mM pNPG in PBS containing 10% DMSO and 0.05% BSA (Sigma-Aldrich). βG activities were measured by color development of p-nitrophenol (pNP) detected on a microplate reader at OD 405 nm. Results are displayed as percent of βG activity compared with the untreated control. For IC50 determination, compounds at various concentrations (100 μM to 0.001 μM) were added.
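The arithmetic behind the assay can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' analysis: the final concentrations are taken to refer to the 100 μL reaction before quenching, the blank correction is an added assumption, and the IC50 is estimated by log-linear interpolation rather than by whatever curve fitting was actually used. All function names are hypothetical.

```python
import math

# Volumes quoted in the text (microliters): enzyme, compound, substrate.
REACTION_VOLUME_UL = 40 + 10 + 50


def final_concentration(added_conc, added_volume_ul, total_volume_ul=REACTION_VOLUME_UL):
    """Concentration after dilution into the reaction (same unit as added_conc)."""
    return added_conc * added_volume_ul / total_volume_ul


def percent_activity(od_treated, od_untreated, od_blank=0.0):
    """Activity relative to the untreated control, after blank subtraction."""
    return 100.0 * (od_treated - od_blank) / (od_untreated - od_blank)


def interpolate_ic50(concentrations, activities):
    """Concentration giving 50% activity, by log-linear interpolation.

    Returns None when the data never cross 50% activity.
    """
    points = sorted(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            fraction = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    # A 500 uM working solution added as 10 uL gives 50 uM in the reaction.
    print(final_concentration(500.0, 10))
    # Hypothetical readings, not data from the study.
    print(percent_activity(0.25, 1.0))
    print(interpolate_ic50([1.0, 100.0], [75.0, 25.0]))
```

Under these assumptions, 10 μL of compound in 100% DMSO accounts for the 10% DMSO in the reaction, and a 50 μM final concentration corresponds to a 500 μM working solution, that is, a 20-fold dilution of the 10 mM stock.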

Molecular Docking Studies of eβG Specific Inhibitors.
The crystal structure of eβG used for the virtual screening was also used in the docking studies of compounds 7145 and 4041. Hydrogen atoms were added to the A-chain protein structure, and the resulting structure was used in the docking simulations. The 3D structures of the compounds were built and optimized by energy minimization using the MM2 force field and a minimum RMS gradient of 0.05 in the software Chem3D 6.0 (CambridgeSoft Corp., Cambridge, MA). Docking simulations were performed using the GOLD 5.0 program [28] on an HP xw6600 workstation with quad-core Intel Xeon E5450 (3.0 GHz) processors. The GOLD program uses a genetic algorithm (GA) to perform flexible ligand docking simulations. In the present study, for each of the 30 independent GA runs, a maximum of 100,000 GA operations was performed on a single population of 100 individuals. Operator weights for crossover, mutation, and migration were set to 95, 95, and 10, respectively. The GoldScore fitness function was applied for scoring the docking poses of the compounds. The docking region was defined to encompass the active site of eβG. The best docking solution (with the highest GOLD fitness score) for each compound was chosen to represent the most favorable predicted binding mode to eβG.

Comparison of the Active Site Structures of eβG and hβG.
To identify the active site of eβG, recombinant full-length eβG was purified and shown to hydrolyze pNPG to pNP, which was used to detect βG activity. The enzyme was crystallized in complex with an established inhibitor, D-glucaro-1,5-lactone. The crystal structure of eβG (provided by Steve R. Roffler) in complex with D-glucaro-1,5-lactone revealed that the inhibitor was bound at residues E413 and E504 of the active site (Figure 1(a)). To compare the structures of the active centers of eβG and hβG (PDB ID 1BHG), the two βGs were superimposed computationally. After superimposition, the crystal structure of eβG is 45% similar to that of hβG. Moreover, there is a "bacterial loop" within eβG that is absent in hβG (Figure 1(b)). Similar results have also been reported elsewhere [16]. This loop, unique to the active center of eβG, is an ideal target site for screening compounds that can selectively inhibit eβG activity.

In Silico Virtual Screening of eβG Inhibitor Candidates.
To identify potential inhibitors that selectively block eβG activity, but not hβG activity, the virtual screening was based on the structural differences between the active centers of eβG and hβG. The SPECS database (∼300,000 commercially available compounds) was computationally screened with the DOCK program (version 4.0) against the "grid box" containing the bacterial loop and the active site of eβG. Fifty-nine candidate compounds were acquired from the initial virtual screening, which was designed to target the bacterial loop of eβG and its active site. The docking energy scores of the 59 candidate compounds calculated by the DOCK program range from −43 to −55 kcal/mol (

Screening of eβG Inhibitor Candidates by In Vitro βG Activity Assay.
To test whether these 59 candidate compounds can effectively inhibit eβG without affecting hβG activity, the compounds were examined at 50 μM for their specific inhibition of eβG versus hβG by in vitro βG activity-based assays, in which the conversion of pNPG to pNP was detected by measuring the increase in pNP absorbance at OD 405 nm. All 59 candidate compounds displayed selective inhibition of eβG activity. In particular, 7 candidates inhibited eβG activity by >95% (Table S1). Based on these results, we concluded that the pocket formed by the unique loop and the active site of eβG is an ideal site for identifying eβG-specific inhibitors through virtual screening. We found that compound 7145 (4-tert-butyl-N-(4-oxo-1(4H)-naphthalenylidene)benzenesulfonamide) inhibits >95% of eβG activity and does not hamper hβG activity at 50 μM (Table S1). This result indicated that derivatives of naphthalenylidene-benzenesulfonamide (NYBS) might effectively and specifically inhibit eβG activity. Based on the NYBS structure, we performed a substructure search and found compound 4041 (4-methyl-N-(4-oxo-3-(1H-1,2,4-triazol-5-ylsulfanyl)-1(4H)-naphthalenylidene)benzenesulfonamide). Compound 4041 proved to be a more potent eβG inhibitor than compound 7145. Figure 2 and Table 1 show that, at 10 μM, compound 4041 (IC50 = 2.8 μM) selectively inhibits >80% of eβG activity, whereas compound 7145 (IC50 = 31.6 μM) inhibits 55%. Compared to D-saccharic acid 1,4-lactone (saccharolactone), which showed higher inhibition of hβG, our candidate compounds displayed specificity for eβG (Figure 2). Based on these results, we concluded that NYBS derivatives may provide novel specific inhibitors to reduce eβG-based intestinal injury and CID.
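The potency difference between the two compounds can be expressed as a fold difference in IC50 or on the logarithmic pIC50 scale. The Python sketch below uses only the two IC50 values reported above; the function names are hypothetical, and pIC50 is the conventional −log10 of the IC50 expressed in mol/L.

```python
import math

# IC50 values against eβG reported in the text (micromolar).
IC50_UM = {"4041": 2.8, "7145": 31.6}


def pic50(ic50_um):
    """Convert an IC50 in micromolar to pIC50 (-log10 of the molar IC50)."""
    return -math.log10(ic50_um * 1e-6)


def fold_difference(weaker_ic50, stronger_ic50):
    """How many times lower the stronger compound's IC50 is."""
    return weaker_ic50 / stronger_ic50


if __name__ == "__main__":
    for name, value in IC50_UM.items():
        print(f"compound {name}: pIC50 = {pic50(value):.2f}")
    fold = fold_difference(IC50_UM["7145"], IC50_UM["4041"])
    print(f"compound 4041 is about {fold:.1f}-fold more potent than 7145")
```

On this scale compound 4041 is roughly one log unit (about 11-fold) more potent than compound 7145.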

Molecular Docking Studies of eβG Specific Inhibitors.
To predict the binding modes of compounds 7145 and 4041 in the active site of eβG, we performed molecular docking studies using the GOLD 5.0 program. As depicted in Figure 3 and R562 through the SO2 group and to E413 through the 1,2,4-triazole moiety. Compound 4041 makes hydrophobic interactions with the surrounding residues, including V446, M447, Y472, and L561. The residues L361 and I363 in the bacterial loop make hydrophobic contact with compound 4041 (Figure 4). Compound 4041 has a GOLD fitness score of 64.91, higher than that of compound 7145. Figure 5 shows

Discussion
In this study, we obtained potent and selective eβG inhibitors by in silico virtual screening and confirmed their inhibition specificity by an in vitro βG activity-based assay. All 59 candidate compounds from the initial screening showed highly effective and selective inhibition of eβG. We identified the two most promising compounds, compound 7145 and its derivative compound 4041, with IC50 values of 31.6 μM and 2.8 μM, respectively. Importantly, compound 4041, which contains naphthalenylidene-benzenesulfonamide, displayed inhibition selectivity for eβG by binding to the active site at E413 and to the unique loop of eβG at L361 and I363.

High-throughput screening (HTS) allows researchers to screen millions of compounds for lead identification in drug discovery. However, this method is limited by the size of the compound library: a large library is costly, and the screening process is time-consuming. Hence, virtual screening has become an important tool for lead identification in the search for novel drugs [29]. The hit rate of virtual screening can reach 2-24%, which is much higher than the 0.001-0.01% of HTS [30]. In our study, we obtained 59 potential eβG inhibitors via virtual screening of a library of ∼300,000 compounds. All candidate compounds showed specific inhibition of eβG, but not hβG, and thus met the criteria of the virtual screening. Structure-based virtual screening can select compounds with no range limitation and narrow down the candidates for further evaluation, which saves both money and time.
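To put the quoted hit rates in perspective, the short Python sketch below converts them into expected numbers of hits. It is illustrative arithmetic only: the function name is hypothetical, the library size is the approximate figure from the text, and the observed rate counts all 59 tested compounds as hits, as stated above.

```python
LIBRARY_SIZE = 300_000   # approximate size of the SPECS library
COMPOUNDS_TESTED = 59    # compounds assayed in vitro after virtual screening


def expected_hits(n_tested, hit_rate_percent):
    """Expected number of hits for a given number of tested compounds."""
    return n_tested * hit_rate_percent / 100.0


if __name__ == "__main__":
    # Assaying the whole library at the quoted HTS hit rates (0.001-0.01%).
    print(expected_hits(LIBRARY_SIZE, 0.001), expected_hits(LIBRARY_SIZE, 0.01))
    # Assaying 59 compounds at typical virtual-screening hit rates (2-24%).
    print(expected_hits(COMPOUNDS_TESTED, 2), expected_hits(COMPOUNDS_TESTED, 24))
    # Rate reported here: every tested compound counted as a hit.
    print(100.0 * COMPOUNDS_TESTED / COMPOUNDS_TESTED)
```

At the quoted HTS rates, assaying all ∼300,000 compounds would be expected to yield on the order of 3 to 30 hits, fewer than the 59 found here after assaying only 59 compounds.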
hβG is a lysosomal enzyme of normal tissues, and quite low levels of hβG are found in serum [31,32]. In contrast, eβG is mainly found in the intestine. Both hβG and eβG catalyze hydrolysis of β-D-glucuronic acid residues from the nonreducing end of glycosaminoglycans [33,34], but the two enzymes differ in their optimum pH. Whereas hβG displays maximal catalytic activity at pH 4-4.5 [32,35], eβG exhibits optimal activity at neutral pH. Inhibiting hβG may cause MPS [21,22], a lysosomal storage disease that can affect appearance, physical abilities, organ and system functioning, and, in most cases, mental development. It is therefore crucial to screen for compounds that block eβG activity without affecting hβG.
Superimposition of the two βGs revealed a unique loop structure in eβG that is absent in hβG, which provides a target site for screening compounds that can distinguish the two βGs [16]. We found 59 candidate compounds that selectively inhibit eβG activity through molecular docking against the grid box enclosing the bacterial loop and active site of eβG. Some known drugs, such as Amoxapine and Loxapine, have been demonstrated to interact with residues of the bacterial loop and active site of eβG and to inhibit the activity of various bacterial βGs [24]. Wallace and colleagues indicated that the key residues of the bacterial loop are L361 and F365 [16], suggesting that eβG-specific inhibitors can be developed by targeting the unique loop of eβG. In this report, compound 4041 binds to E413 (a key residue in the active site of eβG) through its 1,2,4-triazole moiety, an interaction not seen with compound 7145. Furthermore, compound 4041 makes two hydrophobic contacts, with residues L361 and I363 in the bacterial loop, whereas compound 7145 makes one hydrophobic contact, with residue L361. In the eβG activity assay, compound 4041 (IC50 = 2.8 μM) also shows a higher inhibiting ability than compound 7145 (IC50 = 31.6 μM). We concluded that inhibitory potency against eβG correlates positively with the number of interactions with the active site and the unique loop of eβG.
eβG inhibitors can be developed as chemotherapy adjuvants to reduce CID [16,24]. CID is a major side effect that occurs in 50-80% of patients, depending on the chemotherapy regimen [36]. Several studies indicate that inhibiting βG activity can reduce CID and intestinal injury [37]. Inhibition of intestinal βG by antibiotics could reduce CPT-11-induced diarrhea in vivo. However, antibiotics kill the native gut flora, including probiotics within the digestive tract, and are therefore not recommended for chemotherapy patients. Moreover, eβG has also been considered to play a pivotal role in the development of colon carcinogenesis. For example, DMH and NNK (ROS-based carcinogens) have been reported to be reactivated from their glucuronide metabolites by eβG and to induce intestinal damage and colon tumors in vivo [6-8,12-14]. eβG-specific inhibitors may act as colon cancer chemoprevention agents by reducing the generation of xenobiotics from glucuronide metabolites. Thus, specific eβG inhibitors could be applied in nutritional supplements for cancer prevention.

Conclusions
In conclusion, we have identified two compounds, compound 7145 and compound 4041, that selectively inhibit eβG activity without disrupting hβG activity by binding to the active site and the unique loop within eβG. Because of their high specificity and efficacy against eβG, they have great potential to be developed as chemotherapy adjuvants for antidiarrheal treatment and as cancer chemoprevention agents. Moreover, we showed that inhibitors of a desired enzyme can be selected with a high hit rate by structure-based virtual screening, which may provide a fast and inexpensive approach to new drug discovery.