Chemical Feature-Based Molecular Modeling of Urotensin-II Receptor Antagonists: Generation of Predictive Pharmacophore Model for Early Drug Discovery

For a series of 35 piperazino-phthalimide and piperazino-isoindolinone based urotensin-II receptor (UT) antagonists, a thoroughly validated 3D pharmacophore model has been developed, consisting of four chemical features: one hydrogen bond acceptor lipid (HBA L), one hydrophobe (HY), and two ring aromatic (RA). Multiple validation techniques, such as CatScramble, test set prediction, and mapping analysis of advanced known antagonists, have been employed to check the predictive power and robustness of the developed model. The results demonstrate that the best model, Hypo 1, shows a correlation (r) of 0.902, a root mean square deviation (RMSD) of 0.886, and a cost difference of 39.69 bits. The model obtained is highly predictive, with good correlation values for both internal (r = 0.707) and external (r = 0.614) test set compounds. Moreover, the pharmacophore model has been used as a 3D query for virtual screening, which served to detect prospective new lead compounds that can be further optimized as UT antagonists with potential for the treatment of cardiovascular diseases.


Introduction
The role of urotensin-II (U-II) cyclic peptides and their cell-surface receptor in cardiovascular regulation is very well established. Human U-II is considered to be the most potent vasoconstrictor known, approximately 10 times more potent than endothelin-1 [1]. It is an 11 amino acid cyclic peptide, expressed mainly in the blood vessels, heart, liver, kidney, skeletal muscle, and lung [2]. Urotensin II receptor (UT) is a Gq protein-coupled receptor originally identified as the orphan GPR14 receptor by Ames et al. in 1999 [1]. The utility of UT antagonists as potential therapeutic agents in treating atherosclerosis, hypertension, and metabolic syndrome has been suggested by a wide range of studies in animal models [3]. Therefore, a UT antagonist may have enormous therapeutic potential in the treatment of hypertension and cardiac and renal failure.
Palosuran (ACT-058362), the only UT antagonist drug candidate, when administered intravenously, protected against renal ischemia in a rat model [4]. However, clinical studies of palosuran were discontinued in May 2005 due to lack of efficacy in humans [5]. New UT antagonists are therefore needed for the treatment of cardiovascular and renal diseases.
Pharmacophore modeling has proven extremely successful in the drug design process by demonstrating structure-activity relationships [6-9]. A good pharmacophore model collects important common chemical features of molecules distributed in 3D space and provides a rational hypothetical conformation responsible for activity. Thus, it provides the essential structural requirements for the ligands to have good interactions at the target, which may also prove helpful in the identification of new active and specific inhibitors [10, 11].
Earlier, Lescot et al. in 2007 performed 3D pharmacophore-based studies on a set of diverse UT antagonists including benzazepine, biphenylcarboxamide, quinoline, sulfonamide, indole, and quinolone derivatives. They depicted two distinctively different models based on a common feature alignment approach comprising two aromatic rings, one hydrophobic group, a basic amine centre, and a hydrogen bond acceptor, each with a different spatial distribution, thus suggesting a lack of convergence between the considered antagonist ligands [12, 13]. In view of this, we undertook QSAR-based pharmacophore modeling of UT antagonists, taking into account both the chemical structures and the activity data of a series of compounds. In the present study, 3D QSAR-based pharmacophore modeling was employed to understand the structural features that are common in the piperazino-phthalimide and piperazino-isoindolinone based UT antagonists [14]. Our pharmacophore model consists of one hydrogen bond acceptor lipid (HBA L), one hydrophobic (HY), and two ring aromatic (RA) features. The model has been thoroughly validated using Fisher's cross validation [15] and activity prediction of an internal test set as well as an external data set of compounds. Also, the pharmacophore model has been used as a 3D query for virtual screening, which has detected potential new compounds that can be further optimized as UT antagonists.

Data Set.
A series of UT antagonists based on piperazino-phthalimide and piperazino-isoindolinone groups were taken from the literature [14] (Table 1). The basic requirements of training set selection were followed; that is, a minimum of 16 structurally diverse compounds should be selected to avoid chance correlation, and the activity data should span 3-5 orders of magnitude. These compounds covered a wide range of UT inhibition activity, from 1 nM to 65000 nM, represented as rat FLIPR IC50 values. Thirty-five compounds were selected to construct the training set. The remaining compounds were used as a test set to evaluate the efficiency of the pharmacophore model.
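The two selection guidelines quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and default thresholds are assumptions made for this example, and the only data taken from the text are the two reported activity extremes (1 nM and 65000 nM).

```python
import math

def activity_span_orders(ic50_values_nm):
    """Return the activity range in orders of magnitude (log10 units)."""
    return math.log10(max(ic50_values_nm) / min(ic50_values_nm))

def meets_training_set_guidelines(ic50_values_nm, min_compounds=16, min_orders=3.0):
    """Check the two guidelines: enough compounds and a wide enough activity spread."""
    enough_compounds = len(ic50_values_nm) >= min_compounds
    wide_enough = activity_span_orders(ic50_values_nm) >= min_orders
    return enough_compounds and wide_enough

# Extremes reported for this data set (rat FLIPR IC50, nM).
span = activity_span_orders([1.0, 65000.0])
print(f"Activity span: {span:.2f} orders of magnitude")
```

With the reported extremes, the span falls inside the recommended window of 3-5 orders of magnitude.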
All structures were built using ChemDraw Ultra 8.0 (CambridgeSoft Corp., Cambridge, MA) and imported into Accelrys Discovery Studio 2.0 (DS 2.0, Accelrys Inc., San Diego, CA). Their energies were minimized to the closest local minimum using the generalized CHARMM force field as implemented in the DS 2.0 program.

Generation of Pharmacophores.
As a prerequisite to 3D pharmacophore development, conformational models for the compounds were generated using the "best" conformer generation method [16]. The poling algorithm was used, which seeks to provide a broad coverage of conformational space with a maximum conformational energy of 20 kcal/mol above the lowest energy conformation [17]. The number of conformers generated for each compound was limited to a maximum of 250. The Catalyst model treated the molecular structures as templates comprising chemical functions localized in space that bind effectively with complementary functions on the respective binding proteins. The feature mapping protocol in Catalyst generated three chemical feature types, hydrogen bond acceptor lipid (HBA L), hydrophobic (HY), and ring aromatic (RA), effectively mapping all the critical chemical features of all molecules in the data set.
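The two limits applied to the conformational models (an energy window of 20 kcal/mol and at most 250 conformers per compound) can be illustrated as a simple filter. This is a minimal sketch under stated assumptions: it does not implement the poling algorithm or any Discovery Studio call, the function name is hypothetical, and the energies are made-up values.

```python
def select_conformers(energies_kcal, window=20.0, max_conformers=250):
    """Keep conformers within `window` kcal/mol of the lowest energy, capped in number."""
    lowest = min(energies_kcal)
    # Retain only conformers inside the energy window, lowest energy first.
    kept = sorted(e for e in energies_kcal if e - lowest <= window)
    return kept[:max_conformers]

# Hypothetical conformer energies (kcal/mol) for one compound.
example = select_conformers([5.0, 0.0, 19.9, 20.1, 30.0])
print(example)
```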
These three features were used to generate the pharmacophore hypotheses. The value for uncertainty was defined as 3, which represents the ratio range of uncertainty in the activity value based on the expected statistical straggling of biological data collection. Minimum points and minimum subset points were kept at the default values of 4. Based on the conformations of each compound, the HypoGen module of Catalyst was used to generate three-dimensional pharmacophore models.
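One common reading of the uncertainty parameter is that a measured activity is taken to lie within a factor of the uncertainty value above or below the reported number. The sketch below illustrates that reading only; the function name and the example IC50 are assumptions, not part of the original protocol.

```python
def activity_bounds(ic50_nm, uncertainty=3.0):
    """Return the (lower, upper) activity interval implied by a ratio uncertainty."""
    return ic50_nm / uncertainty, ic50_nm * uncertainty

# Hypothetical measured IC50 of 30 nM with the uncertainty value of 3 used here.
lower, upper = activity_bounds(30.0)
print(f"Assumed true activity between {lower:.0f} and {upper:.0f} nM")
```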

Statistical Assessment of the Generated Hypotheses.
The HypoGen module performs two important cost calculations (represented in bit units) that determine the success of any pharmacophore hypothesis [18]. The first is the fixed cost, which represents a simple model that fits all data perfectly, while the second is the null cost, which presumes that there is no relationship in the data and that the experimental activities are normally distributed around their average value. A meaningful pharmacophore hypothesis may result when the difference between these two values is large; for instance, a cost difference of 40-60 bits implies a 75-90% probability of correlation between experimental and predicted activities [18]. Further, the total cost, which sums over error cost, weight cost, and configuration cost, should be close to the fixed cost, and there should be a significant difference between null and total cost. Two other parameters that also determine the quality of any pharmacophore hypothesis are the configuration cost or entropy cost, which depends on the complexity of the pharmacophore hypothesis space and should have a value <17, and the error cost, which is dependent on the root mean square differences between the estimated and the actual activities of the training set molecules. The root mean square deviation (RMSD) and the correlation coefficient represent the quality of the correlation between the estimated and the actual activity data.
The hypotheses generated by the HypoGen module have been analyzed for their statistical significance in terms of cost function analysis, correlation coefficient, and root mean square deviation.

CatScramble Validation Test.
To evaluate the statistical relevance of the models, CatScramble validation has been applied. The CatScramble validation procedure is a cross validation based on Fisher's randomization test, where the biological activity data are randomized within a fixed chemical data set and the HypoGen process is initiated to explore possibilities of other hypotheses of good predictive values. For a statistically significant pharmacophoric model, the hypothesis generated prior to scrambling should be better than the rest, having lower cost values and higher correlation. The statistical significance is given by the equation significance = [1 − (1 + x)/y], where x is the total number of hypotheses having a total cost lower than the most significant hypothesis and y is the number of initial HypoGen runs plus random runs.
The biological activities of the molecules in the training set were randomized and the resulting training sets were used for the HypoGen runs, keeping all parameters as per the initial HypoGen calculation. In our validation test, we selected the 95% confidence level, and 19 spreadsheets were generated by CatScramble.
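The link between the 95% confidence level and the 19 scrambled spreadsheets follows directly from the significance formula: with one initial run plus 19 random runs (20 in total) and no scrambled hypothesis scoring a lower total cost than the original, the significance is 1 − 1/20 = 0.95. The sketch below works this out; the function and parameter names are illustrative assumptions.

```python
import math

def scramble_significance(n_better_random, n_random_runs):
    """Significance = 1 - (1 + n_better_random) / (initial run + random runs)."""
    total_runs = n_random_runs + 1  # random runs plus the single initial HypoGen run
    return 1.0 - (1 + n_better_random) / total_runs

def random_runs_for_confidence(confidence):
    """Smallest number of random runs that can reach `confidence` when no random run wins."""
    # Rounding guards against floating-point noise in 1 / (1 - confidence).
    total_runs = math.ceil(round(1.0 / (1.0 - confidence), 9))
    return total_runs - 1

print(scramble_significance(0, 19))      # no scrambled hypothesis beats the original
print(random_runs_for_confidence(0.95))  # random runs needed for 95% confidence
```

The same arithmetic gives 99 random runs for a 99% confidence level, which is why higher confidence levels are considerably more expensive to compute.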

Activity Prediction of Internal and External Test Sets.
Test set validation was performed to evaluate the prediction performance of the generated pharmacophore model. An internal test set of 9 compounds from the same chemical series, not involved in the training set, was selected to validate the best pharmacophore model. Moreover, the selected model was also validated using an external test set of 15 compounds taken from a different chemical series having diverse structures [19]. The reason behind this was to test the prediction ability of the model in a wide molecular domain where the external molecules are not similar to the training set molecules. The test set molecules were built and energy minimized, and their conformational analysis was carried out in the same way as for the training set. The pharmacophore mapping protocol was applied, which uses Catalyst to identify ligands that map to a pharmacophore and aligns the ligands to the query. Only the best mapping for each ligand was allowed. The squared correlation coefficient (r²) was obtained by plotting the actual versus estimated activities of the test set compounds.
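The squared correlation coefficient between actual and estimated activities can be computed without any plotting software. The sketch below uses the standard Pearson formula on activities expressed as −log IC50; the helper names and the example values are hypothetical and are not taken from the test sets of this study.

```python
import math

def to_pic50(ic50_nm):
    """Convert an IC50 in nM to -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)

def squared_correlation(actual, estimated):
    """Squared Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov * cov / (var_a * var_e)

# Hypothetical actual and estimated IC50 values (nM) for a small test set.
actual_pic50 = [to_pic50(v) for v in (5.0, 40.0, 300.0, 2500.0)]
estimated_pic50 = [to_pic50(v) for v in (9.0, 25.0, 700.0, 1800.0)]
r_squared = squared_correlation(actual_pic50, estimated_pic50)
print(f"r^2 = {r_squared:.3f}")
```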

External Validation with Advanced UT Antagonists.
As an additional approach to external validation, some advanced UT antagonists (Figure 1) obtained from the literature [4, 20-23] were mapped on the obtained pharmacophore model. Mapping of these compounds with Hypo 1 was studied for the number of features mapped and fit values.

Virtual Screening with Sequential Filters.
The validated pharmacophore model was applied as a 3D query tool to screen the Maybridge chemical database, consisting of over 56,000 organic compounds, to retrieve new chemical entities as potent UT antagonists. The virtual screening was performed using the Fast/Flexible search option of DS Catalyst to retrieve putative compounds having their chemical moieties spatially mapped with corresponding features in the pharmacophoric query. Sequential computational filters were applied to find the best compounds from such a large pool of compounds.
The compounds were first filtered by Lipinski's "rule of five", which sets the criteria for drug-like properties. According to this rule, poor absorption is expected if molecular weight > 500, log P > 5, hydrogen bond donors > 5, and hydrogen bond acceptors > 10 [24]. Compounds violating more than one of these rules may not have appropriate bioavailability [25].
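The rule-of-five filter amounts to counting threshold violations per compound. The sketch below is a minimal illustration working on precomputed descriptors; the `Descriptors` class, the function names, and the example values are assumptions, and tolerating a single violation follows the statement above rather than a documented setting of the original screening.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    mol_weight: float       # molecular weight
    log_p: float            # octanol-water partition coefficient (log P)
    h_bond_donors: int
    h_bond_acceptors: int

def lipinski_violations(d):
    """Count how many of the four rule-of-five thresholds are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])

def passes_rule_of_five(d, max_violations=1):
    """A compound passes when it violates no more than `max_violations` rules."""
    return lipinski_violations(d) <= max_violations

# Hypothetical descriptor sets for two database compounds.
drug_like = Descriptors(350.4, 3.2, 2, 5)
not_drug_like = Descriptors(620.0, 6.1, 2, 5)
print(passes_rule_of_five(drug_like), passes_rule_of_five(not_drug_like))
```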
Secondly, the molecules that satisfied all the features of the pharmacophore model used as the 3D query in database searching were retained as hits. This was done on the basis of fit values. Conformational analysis using the "Best" conformation generation method was performed. The "ligand pharmacophore mapping" protocol was applied to map the hits onto the pharmacophores. The fit values were calculated on the basis of the chemical substructures matching the location constraints of the pharmacophoric features and their distance deviation from the feature centers. High fit values signify good matches. In addition, the database compounds were also selected on the basis of their estimated activity values. The flowchart in Figure 2 is a schematic representation of the sequential filters for virtual screening.

Statistical Assessment of Pharmacophore Model.
The HypoGen algorithm [26] implemented in Discovery Studio generated ten pharmacophore hypotheses using all the information related to conformational models and chemical features of 35 training set compounds. The fixed cost of the 10 top-scored hypotheses was 136.081 bits, and the null hypothesis cost was 190.604 bits. A difference of 54 bits between null and fixed costs signified the predictive nature of the hypotheses. The first hypothesis (Hypo 1), consisting of four features, one HBA L, one HY, and two RA, was considered the best pharmacophore model on the basis of high correlation coefficient, high cost difference, and low RMSD (Table 2).
The total cost of Hypo 1 (150.914 bits) and the large difference between null and total cost (Δcost = 39.69 bits), coupled with a high correlation coefficient (r = 0.902) and a low RMSD (0.886), indicated that a true correlation had been established in the model. Moreover, the difference between total and fixed costs for the best hypothesis was only 14.833 bits, indicating a high probability of true correlation of the data. The configuration cost (17.286 bits) of the hypothesis slightly exceeded the limit of 17 bits but can be accepted as the model meets the other criteria of validation [27].
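The cost differences quoted for Hypo 1 can be reproduced from the three reported costs. The sketch below simply repeats that arithmetic; the function name and dictionary keys are illustrative assumptions, while the numbers are those given above.

```python
def cost_summary(null_cost, fixed_cost, total_cost, config_cost, config_limit=17.0):
    """Summarise the HypoGen cost differences (in bits) used to judge a hypothesis."""
    return {
        "null_minus_fixed": round(null_cost - fixed_cost, 3),
        "null_minus_total": round(null_cost - total_cost, 3),
        "total_minus_fixed": round(total_cost - fixed_cost, 3),
        "config_within_limit": config_cost < config_limit,
    }

# Values reported for Hypo 1.
summary = cost_summary(null_cost=190.604, fixed_cost=136.081,
                       total_cost=150.914, config_cost=17.286)
for name, value in summary.items():
    print(name, value)
```

The configuration cost check returns False here, which matches the remark that 17.286 bits slightly exceeds the 17-bit limit.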
The validated pharmacophore model (Hypo 1) was used to predict the activity of the training set compounds. All the compounds were classified on an activity scale as highly active (<10 nM, +++), moderately active (10-1000 nM, ++), or inactive (>1000 nM, +). Among the 35 training set compounds, four highly active compounds were predicted as moderately active and one moderately active compound was predicted as highly active. Consequently, for 30 out of 35 training set compounds, the predicted activity values were within the same activity class as the experimental values.
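The three-level activity scale and the count of compounds predicted in the correct class can be expressed compactly. In the sketch below, the handling of values that fall exactly on 10 nM or 1000 nM is an assumption (both are treated as moderately active), and the activity values are hypothetical rather than taken from Table 3.

```python
def activity_class(ic50_nm):
    """Map an IC50 (nM) onto the scale used here: +++, ++, or +."""
    if ic50_nm < 10:
        return "+++"  # highly active
    if ic50_nm <= 1000:
        return "++"   # moderately active
    return "+"        # inactive

def count_same_class(actual_nm, predicted_nm):
    """Count compounds whose predicted class matches the experimental class."""
    return sum(activity_class(a) == activity_class(p)
               for a, p in zip(actual_nm, predicted_nm))

# Hypothetical experimental and predicted IC50 values (nM).
actual = [4.0, 8.0, 250.0, 5000.0]
predicted = [6.0, 15.0, 300.0, 2000.0]
matches = count_same_class(actual, predicted)
print(f"{matches} of {len(actual)} compounds predicted in the correct class")
```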
Table 3 lists the actual and predicted activities of the training set based on the best pharmacophore hypothesis. Figure 3 shows the graph plotted between the actual and estimated activities of the training set compounds. [...] confidence that Hypo 1 was not generated by chance, as indicated by the better statistical values (lowest total cost and highest correlation coefficient) of Hypo 1 (Table 4).

Activity Prediction of Internal and External Test Sets.
The model was subjected to test set validation to find out how correctly it predicts the activity of the test set molecules. Out of the 9 internal test set compounds, one compound showed a high error ratio of 96.2, suggesting that this compound might be a potential outlier (Table 5). Its removal significantly improved the r² (0.707) between the actual and estimated activities of the test set (Figure 4). This demonstrated that the pharmacophore model can predict the activity of new compounds from the same chemical series.
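The error ratio used to flag the outlier is not defined in the text. A convention often used when reporting HypoGen results, assumed here, is the ratio between estimated and experimental activity, reported as a positive number when the estimate is higher and as a negative number when it is lower. The sketch below follows that assumed convention with hypothetical activities chosen to give a ratio of the same size as the one reported.

```python
def error_ratio(actual_nm, estimated_nm):
    """Signed ratio between estimated and actual activity (assumed convention)."""
    if estimated_nm >= actual_nm:
        return estimated_nm / actual_nm   # activity overestimated in nM terms
    return -(actual_nm / estimated_nm)    # activity underestimated in nM terms

# Hypothetical compound: actual 10 nM, estimated 962 nM.
ratio = error_ratio(10.0, 962.0)
print(f"error ratio = {ratio:.1f}")
```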
The selected pharmacophore was further validated using an external set of 15 phenylpiperidine-benzoxazinone-based molecules with UT antagonist activity [19]. The activities of all the external test set compounds were estimated using Hypo 1 (Table 5). The squared correlation coefficient value of 0.614 supported the predictive ability of the developed pharmacophore model (Figure 5). This validation gave added confidence in the usability of the selected pharmacophore.

Pharmacophore Mapping Analysis.
The pharmacophore features (one HBA L, two RA, and one HY) and their interfeature distances are given in Figure 6(a) and Table 6. The two most active compounds, 7a and 7n, map very well onto all four pharmacophore features (Figures 6(b) and 6(c)). The pharmacophore mapping reveals that the basic piperazine nitrogen bearing the ethyl group (R2) maps to the HBA L feature, the benzene ring of the phthalimide or isoindolinone maps to one RA feature, and the dimethoxybenzene ring maps to the other RA feature. The HY feature maps to one of the methoxy groups of the dimethoxybenzene. The least active compound, 5l, maps to only two of the four features (one RA and one HY), missing one RA and the HBA L (Figure 6(d)). The pharmacophore mapping analysis discloses the importance of the main scaffold for antagonist activity. In addition, the N-alkyl groups at the R2 position are important for the basicity of the piperazine nitrogen. Ethyl was the most favourable N-alkyl group for antagonist activity, whereas bulkier N-alkyl substitutions that create steric hindrance were not favourable. The pharmacophore mapping shows that R1 substitutions do not play an important role in the drug-receptor interaction, which is evident by the presence of some [...] Moreover, the conformationally constrained scaffold having less conformational freedom is important for activity.
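Interfeature distances such as those reported in Table 6 are plain Euclidean distances between the feature centres. The sketch below shows the calculation for all six feature pairs of a four-feature model; the coordinates are invented for illustration and are not the Hypo 1 coordinates, and the feature labels are hypothetical keys.

```python
import math
from itertools import combinations

def interfeature_distances(centers):
    """Return the Euclidean distance (in Å) for every pair of feature centres."""
    return {
        (a, b): math.dist(centers[a], centers[b])
        for a, b in combinations(centers, 2)
    }

# Hypothetical feature centres (x, y, z in Å) for a four-feature pharmacophore.
centers = {
    "HBA_L": (0.0, 0.0, 0.0),
    "HY": (3.0, 4.0, 0.0),
    "RA_1": (0.0, 0.0, 6.0),
    "RA_2": (3.0, 4.0, 12.0),
}
distances = interfeature_distances(centers)
for pair, value in distances.items():
    print(pair, f"{value:.2f}")
```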

Mapping with Advanced UT Antagonists.
As an additional validation step, some of the chemically diverse advanced UT antagonist drug candidates, namely palosuran, SB-611812, SB-657510, SB-706375, and SB-436811 [4, 20-23], were mapped on the developed pharmacophore. All five compounds showed three-feature mapping with good fit values (Figure 7). SB-611812 and SB-657510 mapped with two RA and one HY, missing the HBA L feature, whereas SB-706375, SB-436811, and palosuran mapped well with one RA, one HY, and one HBA L, thus missing one RA feature. SB-611812 showed the maximum fit value of 6.811.

Virtual Screening.
A total of 298 compounds were retrieved as hits from virtual screening of the Maybridge database. These compounds were first screened for drug-like properties using Lipinski's rule of five as a filter. The 230 compounds that passed the screening were overlaid with the best 3D pharmacophore model (Hypo 1) by using the "Best Fit" selection. Out of these, 20 compounds had an estimated activity within the range of 1-1000 nM and a fit value >5.5. The best 5 compounds, having estimated activity (nM) in single and double digits, are reported in Table 7.
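The last screening step, keeping hits with a fit value above 5.5 and an estimated activity of 1-1000 nM, can be sketched as a simple filter over the mapped compounds. The record layout, identifiers, and values below are hypothetical; only the two thresholds come from the text.

```python
def select_hits(hits, min_fit=5.5, activity_range_nm=(1.0, 1000.0)):
    """Keep hits above the fit threshold and inside the estimated-activity range."""
    low, high = activity_range_nm
    kept = [h for h in hits
            if h["fit"] > min_fit and low <= h["estimated_nm"] <= high]
    # Most potent (lowest estimated IC50) first.
    return sorted(kept, key=lambda h: h["estimated_nm"])

# Hypothetical mapped database compounds.
hits = [
    {"id": "hit_A", "fit": 6.2, "estimated_nm": 8.0},
    {"id": "hit_B", "fit": 5.1, "estimated_nm": 40.0},    # fit value too low
    {"id": "hit_C", "fit": 5.9, "estimated_nm": 1500.0},  # estimated activity too weak
    {"id": "hit_D", "fit": 5.7, "estimated_nm": 75.0},
]
selected = select_hits(hits)
print([h["id"] for h in selected])
```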

Conclusions
In view of the potential of UT antagonists as novel cardiovascular agents, we developed a 3D pharmacophore model to evaluate the structural requirements for potent UT antagonists. The developed pharmacophore model was validated with multiple methods. The results demonstrate that one hydrogen bond acceptor lipid (HBA L), two ring aromatic (RA), and one hydrophobic (HY) features contribute significantly towards the antagonist activity. The pharmacophore model was used for screening the Maybridge chemical database to identify new compounds as UT antagonists. This pharmacophore model provides stringent structural requirements along with interfeature distances and can be further used for the design and development of novel UT antagonists.

Figure 1: Advanced UT antagonists used for external validation.

Figure 3: Graph plotted between the actual and estimated activity (−log IC50) of the training set compounds.

Figure 4: Graph plotted between the actual and estimated activity of internal test set of compounds.

Figure 6: The obtained pharmacophore model (orange: RA, green: HBA L, blue: HY, and grey: excluded volumes) showing (a) interfeature distances, (b) mapping of one of the most potent compounds, 7a, (c) mapping of the other most potent compound, 7n, and (d) mapping of the least active compound 5l.

Table 2: Statistical values and features for the top 10 hypotheses.

Table 3: Actual and predicted activity of the training set compounds.

Table 4: Validation of Hypo 1 using the CatScramble program.

Table 5: Actual and predicted activities of internal and external test sets based on the best pharmacophore hypothesis, Hypo 1.

Table 6: Interfeature distances (in Å) between the obtained pharmacophore features.

Table 7: Hits obtained from pharmacophore-based Maybridge chemical compound database screening.