3D QSAR Studies of DAMNI Analogs as Possible Non-nucleoside Reverse Transcriptase Inhibitors

The non-nucleoside inhibitors of HIV-1 reverse transcriptase (NNRTIs) are an important class of drugs employed in antiviral therapy. Recently, a novel family of NNRTIs, the 1-[2-(diarylmethoxy)ethyl]-2-methyl-5-nitroimidazoles (DAMNIs), has been discovered. 3D-QSAR studies on DAMNI derivatives as NNRTIs were performed by comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) to determine the factors required for the activity of these compounds. The global minimum energy conformer of the template molecule 15, the most active molecule of the series, was obtained by simulated annealing and used to build the structures of the other molecules in the dataset. The combination of steric and electrostatic fields in CoMSIA gave the best results, with cross-validated and conventional correlation coefficients of 0.654 and 0.928, respectively. The predictive ability of CoMFA and CoMSIA was determined using a test set of ten DAMNI derivatives, giving predictive correlation coefficients of 0.92 and 0.98, respectively, which indicates good predictive power. Further, the robustness of the models was verified by bootstrapping analysis. The information obtained from the CoMFA and CoMSIA 3D contour maps may be useful in the design of more potent DAMNI analogs as NNRTIs.


Introduction
The reverse transcriptase (RT) of the human immunodeficiency virus (HIV) is a key target in the treatment of acquired immune deficiency syndrome (AIDS), for which no completely successful chemotherapy is yet available [1]. There are two classes of HIV-1 RT inhibitors, distinguished by their mechanism of action: the nucleoside inhibitors (NRTIs; e.g., AZT, 3TC, ddI, ddC) and the non-nucleoside inhibitors (NNRTIs). NRTIs act on the catalytic site of the reverse transcriptase enzyme, preventing DNA synthesis, whereas NNRTIs bind noncompetitively to a hydrophobic site close to the catalytic site, forcing the enzyme to adopt an inactive conformation [2]. The advantage of NNRTIs over NRTIs is that they are less cytotoxic and more selective. However, this advantage is offset by the broad cross-resistance shown by most NNRTIs. To date, only three non-nucleoside RT inhibitors have been approved for clinical use, namely nevirapine, delavirdine and efavirenz. Several NNRTIs (MKC442, Trovirdine, S-1153/AG1549, PNU142721, ACT, and HBY1293/GW420867X) are currently undergoing clinical trials. Other examples of NNRTIs include TIBO compounds [3], HEPT derivatives [4], BHAP analogs [5], 2-pyridinones [6] and PETT compounds [7]. Recently, the 1-[2-(diarylmethoxy)ethyl]-2-methyl-5-nitroimidazoles (DAMNIs), a novel family of NNRTIs active at submicromolar concentrations, have been discovered [8][9][10].
Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) are powerful and versatile tools for building quantitative structure-activity relationship (QSAR) models for a given set of molecules in rational drug design and related applications [11]. The CoMFA methodology is based on the assumption that changes in biological activity correlate with changes in the steric and electrostatic fields of the molecules. The CoMSIA method [12,13] differs in the way the molecular fields are calculated and in including additional molecular fields, such as lipophilic and hydrogen bond potentials. The additional fields in CoMSIA allow better visualization and interpretation of the obtained correlation in terms of the field contributions to the activity of the compounds. On the basis of CoMFA and CoMSIA models for DAMNI derivatives, we attempted to elucidate a structure-activity relationship that provides useful information for the design and synthesis of more potent DAMNI analogs and related derivatives with predetermined affinities.

Dataset for analysis
Reported data on a series of 40 DAMNI analogs [8-10] were used (Table 1). The EC50 data were used as the dependent parameter in the QSAR analysis after conversion to the logarithm of the reciprocal of EC50 (pEC50 = -log EC50). EC50 is the micromolar concentration of the compound required to achieve 50% protection of MT-4 cells from HIV-1 induced cytopathogenicity, as determined by the MTT method. The total set of DAMNI analogs (40 compounds) was divided into a training set (30 compounds) and a test set (10 compounds) (Figure 1), so that the ratio of training set molecules to test set molecules was 3:1. Test and training set compounds were chosen manually such that low, moderate, and high activity compounds were present in approximately equal proportions in both sets.
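The conversion of EC50 to pEC50 described above can be sketched as follows. This is a minimal illustration that assumes the micromolar EC50 values are first converted to molar units before the negative base-10 logarithm is taken; the text does not state which unit convention was used, and the function name and sample values are hypothetical (they are not taken from Table 1).

```python
import math


def pec50_from_micromolar(ec50_um: float) -> float:
    """Return pEC50 = -log10(EC50 in mol/L) for an EC50 given in micromolar.

    Assumption: the molar convention is used; the paper does not specify it.
    """
    if ec50_um <= 0:
        raise ValueError("EC50 must be positive")
    ec50_molar = ec50_um * 1e-6  # micromolar -> molar
    return -math.log10(ec50_molar)


if __name__ == "__main__":
    # Hypothetical EC50 values in micromolar, for illustration only.
    for ec50 in (0.1, 1.0, 10.0):
        print(f"EC50 = {ec50:5.1f} uM -> pEC50 = {pec50_from_micromolar(ec50):.2f}")
```

With this convention, a more potent compound (smaller EC50) receives a larger pEC50, which is why pEC50 rather than EC50 is used as the dependent variable.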

Molecular modelling
The 3D-QSAR analyses were performed [14] using SYBYL 7.1 installed on a Dell computer running Red Hat Enterprise Linux version 3.0. The initial conformation of the most active analog, 15, was obtained by simulated annealing, as this method enables rapid identification of the global minimum energy conformer [15]. The system was subjected to simulated annealing by heating at 1000 K for 1 ps and then cooling to 200 K over 1 ps. The exponential annealing function was used and 10 such cycles were run. The lowest energy conformer obtained by this method was subjected to further minimization. The minimized conformer thus obtained was taken as the template, and the rest of the molecules were built from it. A constrained minimization followed by full minimization was carried out on these molecules in order to prevent the conformations from moving to a false region. The Tripos force field and partial atomic charges calculated by the Gasteiger-Hückel method were used. Powell's conjugate gradient method was used for minimization. A gradient of 0.05 kcal mol⁻¹ Å⁻¹ was set as the convergence criterion.
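To make the annealing protocol concrete, the temperature profile of one cycle can be sketched as below. This is only an illustration: the exact functional form of the exponential annealing function in SYBYL is not given in the text, so a geometric interpolation between the hot and cold temperatures is assumed, and the function name and parameters are hypothetical.

```python
def exponential_anneal(t_ps, t_hot=1000.0, t_cold=200.0, hold_ps=1.0, cool_ps=1.0):
    """Temperature (K) at time t_ps within one annealing cycle.

    Assumed schedule: hold at t_hot for hold_ps, then decay exponentially
    (geometric interpolation) to t_cold over cool_ps.
    """
    if t_ps <= hold_ps:
        return t_hot
    frac = min((t_ps - hold_ps) / cool_ps, 1.0)  # 0 at start of cooling, 1 at end
    return t_hot * (t_cold / t_hot) ** frac


if __name__ == "__main__":
    # Print the temperature every 0.25 ps over one 2 ps cycle.
    for step in range(9):
        t = 0.25 * step
        print(f"t = {t:4.2f} ps  T = {exponential_anneal(t):7.1f} K")
```

Ten such cycles would be run in sequence, with the lowest energy structure across all cycles taken forward for minimization.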

Alignment
The most crucial input for CoMFA is the alignment of the molecules. The template molecule 15 was taken and the rest of the molecules were aligned to it using the DATABASE ALIGNMENT method in SYBYL. The molecules were aligned to the template molecule using the common substructure labeled with * in 1 (Figure 1). The aligned molecules are shown in Figure 2.

CoMFA interaction energy calculation [12]
The steric and electrostatic CoMFA fields were calculated at each lattice intersection of a regularly spaced grid of 2.0 Å in all three dimensions within the defined region. The van der Waals potential and coulombic terms, representing the steric and electrostatic fields respectively, were calculated using the standard Tripos force field. A distance-dependent dielectric constant of 1.00 was used. An sp3 carbon atom with +1.00 charge was used as the probe atom. The steric and electrostatic fields were truncated at ±30.00 kcal mol⁻¹, and the electrostatic fields were ignored at the lattice points with maximal steric interactions.
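The field calculation described above can be sketched in a few lines. This is an illustrative sketch, not the SYBYL implementation: the steric term uses a generic Lennard-Jones 6-12 form with hypothetical probe-atom pair parameters (`rmin`, `eps`) rather than the actual Tripos force field parameters, and the function names are invented for the example. The 2.0 Å spacing, +1.00 probe charge, distance-dependent dielectric of 1.00, and 30 kcal/mol cutoff follow the text.

```python
import math

COULOMB_CONST = 332.0636  # converts e^2 / Angstrom to kcal/mol
CUTOFF = 30.0             # kcal/mol, truncation value quoted in the text


def clamp(value, limit=CUTOFF):
    """Truncate an energy to the interval [-limit, +limit]."""
    return max(-limit, min(limit, value))


def lattice(lo, hi, spacing=2.0):
    """Yield the points of a regular grid spanning the box lo..hi."""
    counts = [int(math.floor((h - l) / spacing)) + 1 for l, h in zip(lo, hi)]
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                yield (lo[0] + i * spacing, lo[1] + j * spacing, lo[2] + k * spacing)


def probe_fields(point, atoms, probe_charge=1.0, dielectric=1.0):
    """Return (steric, electrostatic) energies in kcal/mol at one lattice point.

    atoms: iterable of (x, y, z, charge, rmin, eps), where rmin and eps are
    hypothetical probe-atom Lennard-Jones pair parameters.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge, rmin, eps in atoms:
        r = math.dist(point, (x, y, z))
        if r < 1e-6:
            return CUTOFF, 0.0  # probe sits on an atom: maximal steric clash
        ratio = rmin / r
        steric += eps * (ratio ** 12 - 2.0 * ratio ** 6)
        # Distance-dependent dielectric: epsilon(r) = dielectric * r
        electrostatic += COULOMB_CONST * probe_charge * charge / (dielectric * r * r)
    steric = clamp(steric)
    if steric >= CUTOFF:
        electrostatic = 0.0  # ignore electrostatics where the steric term is maximal
    return steric, clamp(electrostatic)


if __name__ == "__main__":
    # One hypothetical atom at the origin with partial charge +0.5.
    atoms = [(0.0, 0.0, 0.0, 0.5, 3.4, 0.1)]
    for p in lattice((-2.0, -2.0, -2.0), (2.0, 2.0, 2.0)):
        print(p, probe_fields(p, atoms))
```

Each molecule in the aligned set yields one row of such steric and electrostatic values, and these rows form the descriptor matrix passed to the PLS analysis.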

CoMSIA interaction energy calculation [12]
The steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor potential fields were calculated at each lattice intersection of a regularly spaced grid of 2.0 Å. A probe atom with a radius of 1.0 Å, a charge of +1.0, a hydrophobicity of +1.0, and hydrogen bond donor and hydrogen bond acceptor properties of +1.0 was used to calculate the steric, electrostatic, hydrophobic, donor and acceptor fields. The contribution from these descriptors was truncated at 0.3 kcal mol⁻¹.
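For comparison with the CoMFA potentials, the Gaussian-type similarity index used by CoMSIA can be sketched as below. The sketch follows the published form of the index, in which each atom contributes its property value weighted by a Gaussian of its distance to the grid point. The attenuation factor `alpha` is a parameter of the method whose value is not stated in the text; the default of 0.3 used here is an assumption, and the function name and sample atoms are hypothetical.

```python
import math


def comsia_index(point, atoms, probe_value=1.0, alpha=0.3):
    """Similarity index at one grid point for a single physicochemical property.

    atoms: iterable of (x, y, z, w), where w is the atom's value of the
    property (e.g. partial charge for the electrostatic field).
    Index = -sum_i(probe_value * w_i * exp(-alpha * r_i^2)).
    """
    total = 0.0
    for x, y, z, w in atoms:
        r2 = (point[0] - x) ** 2 + (point[1] - y) ** 2 + (point[2] - z) ** 2
        total += probe_value * w * math.exp(-alpha * r2)
    return -total


if __name__ == "__main__":
    # Two hypothetical atoms with property values +1.0 and -0.5.
    atoms = [(0.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, -0.5)]
    for x in (0.0, 1.0, 2.0, 4.0):
        print(f"x = {x:3.1f}  index = {comsia_index((x, 0.0, 0.0), atoms):+.4f}")
```

Because the Gaussian stays finite at zero distance, the index has no singularity at atomic positions, which is one reason CoMSIA contour maps tend to be smoother than CoMFA maps.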

Partial least squares (PLS) analysis [12]
The PLS method was used to linearly correlate the CoMFA fields with the inhibitory activity values. The cross-validation analysis [16,17] was performed using the leave-one-out (LOO) method, in which one compound is removed from the dataset and its activity is predicted using the model derived from the rest of the dataset. The cross-validated r² that gave the optimum number of components and the lowest standard error of prediction was considered for further analysis. Equal weights were assigned to the steric and electrostatic fields using the COMFA_STD scaling option. To speed up the analysis and reduce noise, a minimum filter value σ of 2.00 kcal mol⁻¹ was used. The final analysis was performed to calculate the conventional r² using the optimum number of components. To further assess the robustness and statistical confidence of the derived models, bootstrapping analysis for 100 runs was performed.
Bootstrapping involves the generation of many new data sets from the original data set, each obtained by randomly choosing samples from the original data set. The statistical calculation is performed on each of these bootstrap samplings. The difference between the parameters calculated from the original data set and the average of the parameters calculated from the many bootstrap samplings is a measure of the bias of the original calculations. All cross-validated results were analysed considering the fact that a value of r²cv above 0.3 indicates that the probability of chance correlation [12] is less than 5%.
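The two validation procedures described above, leave-one-out cross-validation and bootstrapping, can be sketched as follows. To keep the example short and dependency-free, a one-descriptor least-squares line stands in for the PLS model; this substitution is an assumption made purely for illustration, and the function names and data are hypothetical. The cross-validated r² is computed here as 1 - PRESS/SS, with SS taken about the mean of all observed activities.

```python
import random
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def r_squared(ys, preds):
    """Conventional r^2 = 1 - SS_residual / SS_total."""
    my = mean(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return 1.0 - ss_res / ss_tot


def loo_q2(xs, ys):
    """Leave-one-out cross-validated r^2: each compound is predicted by a
    model fitted to all the other compounds."""
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    my = mean(ys)
    return 1.0 - press / sum((y - my) ** 2 for y in ys)


def bootstrap_r2(xs, ys, runs=100, seed=0):
    """Mean and standard deviation of r^2 over bootstrap resamples
    (sampling compounds with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    values = []
    while len(values) < runs:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # degenerate resample, no variance to explain
        a, b = fit_line(bx, by)
        values.append(r_squared(by, [a + b * x for x in bx]))
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical descriptor values and activities.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
    a, b = fit_line(xs, ys)
    print("conventional r2:", r_squared(ys, [a + b * x for x in xs]))
    print("LOO q2:", loo_q2(xs, ys))
    print("bootstrap r2 (mean, sd):", bootstrap_r2(xs, ys))
```

The gap between the conventional r² and the bootstrap mean gives a rough indication of the bias discussed in the text, while the bootstrap standard deviation indicates how sensitive the fit is to the composition of the training set.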

Predictive correlation coefficient [12]
The predictive ability of each 3D-QSAR model was determined from a set of ten compounds that were not included in the training set. These molecules were aligned, and their activities were predicted. The predictive correlation coefficient (r²pred), based on the molecules of the test set, is defined as r²pred = (SD - PRESS)/SD, where SD is the sum of the squared deviations between the biological activities of the test set molecules and the mean activity of the training set molecules, and PRESS is the sum of the squared deviations between the predicted and actual activity values for every molecule in the test set.
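The definition above translates directly into code. The following minimal sketch computes r²pred from lists of activities; the function name and the sample values are hypothetical and are not taken from the tables of this study.

```python
def r2_pred(train_actual, test_actual, test_predicted):
    """Predictive r^2 = (SD - PRESS) / SD.

    SD:    sum of squared deviations of test-set activities from the
           mean activity of the training set.
    PRESS: sum of squared differences between actual and predicted
           test-set activities.
    """
    if len(test_actual) != len(test_predicted):
        raise ValueError("actual and predicted lists must have equal length")
    train_mean = sum(train_actual) / len(train_actual)
    sd = sum((y - train_mean) ** 2 for y in test_actual)
    press = sum((y - p) ** 2 for y, p in zip(test_actual, test_predicted))
    return (sd - press) / sd


if __name__ == "__main__":
    # Hypothetical pEC50 values for illustration only.
    train = [5.0, 6.0, 7.0]
    actual = [5.0, 7.0]
    predicted = [5.5, 6.5]
    print("r2_pred =", r2_pred(train, actual, predicted))
```

Note that the reference point is the training set mean, not the test set mean, so r²pred can become negative when the model predicts the test set worse than simply using the average training activity.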

Results and Discussion
The CoMFA model obtained with the 30 DAMNI derivatives in the training set was a six-component model with a cross-validated correlation coefficient of 0.697 and minimum standard error. This analysis was used for the final non-cross-validated run, which gave a correlation coefficient of 0.925, indicating a good linear correlation between the observed and predicted activities of the molecules in the training set. To test the predictive ability of the resulting model, a test set of ten molecules excluded from model creation was used. A predictive correlation coefficient of 0.924 was obtained for the CoMFA model. A high r² value of 0.739 during 100 runs of bootstrapping analysis [18,19] further supports the statistical validity of the model. The results of the PLS analysis for CoMFA and CoMSIA are shown in Table 2. The alignment of the training set molecules is shown in Figure 2. The relative contributions of the steric and electrostatic fields for CoMFA are in the ratio 6:3. Steric interactions of the molecule with the active site of the enzyme could therefore be an important factor for NNRT inhibitory activity. A plot of predicted (CoMFA) versus actual activity for the training set molecules is shown in Figure 3.
Table 2 abbreviations: r²bs = correlation coefficient after 100 runs of bootstrapping analysis, SD = standard deviation from 100 bootstrapping runs, S = steric field, E = electrostatic field, H = hydrophobic field, D = hydrogen bond donor field, A = hydrogen bond acceptor field.
Figure 4 represents the plot of predicted (CoMSIA) versus actual activity values, while the test set residuals of the CoMFA and CoMSIA analyses are shown in Figure 5. The actual, predicted and residual values of the training and test sets for CoMFA and CoMSIA are given in Tables 3 and 4, respectively. Contour maps were generated as the scalar product of the coefficients and the standard deviation associated with each CoMFA column. The 3D-QSAR contour maps revealing the contributions of the CoMFA and CoMSIA fields are shown in Figures 6 and 7.
The red regions near the third position of the imidazole ring of the template molecule indicate that biological activity can be enhanced by introducing more electronegative groups at this position for strong electrostatic field interactions. The blue region near the fifth position of the imidazole ring suggests that biological activity will be decreased by an electronegative group at this position. Two red contours near the thiophene group suggest that biological activity will be diminished by the introduction of electropositive groups at this site. A green contour near the fourth and fifth positions of the thiophene ring indicates that bulky groups at these positions will increase activity. A yellow contour near the sixth position of the phenyl ring indicates that bulky substituents at this position decrease anti-HIV activity.
The CoMSIA results were obtained using the same structural alignment and the same training and test sets as defined for CoMFA. The combination of steric and electrostatic fields in CoMSIA gave the best results (Model 1), with a cross-validated correlation coefficient of 0.654, a conventional correlation coefficient of 0.928 and a predictive correlation coefficient of 0.982. Two other combinations, (i) steric, electrostatic and hydrophobic fields (Model 2) and (ii) all fields (Model 3), also gave statistically significant models. The remaining combinations in CoMSIA gave statistically insignificant results (data not shown). Models 2 and 3 exhibit lower cross-validated, conventional and predictive correlation coefficients than Model 1, the best among the various field combinations in CoMSIA. Model 1 of CoMSIA was therefore used for the final analysis and predictions. The r² value of 0.640 during 100 runs of bootstrapping analysis [18,19] also shows that Model 1 is stable and statistically robust.
The contributions of the steric and electrostatic fields of CoMSIA are in the ratio 6:3 (Table 2). Comparison with the field contributions of the CoMFA analysis suggests that steric and electrostatic interactions could be important factors for NNRT inhibitory activity. The residuals of the CoMFA and CoMSIA models were compared to evaluate their predictive ability (Table 4). Molecule 33 shows a high residual value, which may be due to the absence of a methyl group at the second position of the imidazole ring in 33, in contrast to the corresponding substituents in the other test molecules. In the steric contours of CoMSIA (Model 1), green (G) contours indicate favourable regions and yellow (Y) contours indicate unfavourable regions for bulkier substituents. In the electrostatic contours, the introduction of electronegative substituents may increase the affinity in red (R) regions and decrease it in blue (B) regions. The steric contours produced by CoMFA and CoMSIA (Model 1) (Figures 6a and 7a, respectively) are quite different. The steric CoMSIA contours show a green contour near the third and fourth positions of the phenyl ring, a small yellow contour at the fifth position of the phenyl ring and a yellow contour enclosing the thiophene ring. The electrostatic contours produced by CoMFA and CoMSIA (Figures 6b and 7b) are slightly different. The electrostatic CoMSIA contours show a small red (R) contour near the fourth position of the phenyl ring of the template molecule, a blue (B) contour near the third and fourth positions of the phenyl ring and a blue contour near the thiophene ring. The best CoMSIA model (Model 1) indicates that the hydrophobic, hydrogen bond donor and hydrogen bond acceptor fields do not improve the model. The CoMFA and CoMSIA models described are predictive enough to guide the design of new molecules.

Conclusions
The 3D-QSAR analyses CoMFA and CoMSIA have been applied to a set of DAMNI analogs active against HIV-1 RT. Statistically significant models with good correlative and predictive power for the NNRT inhibitory activities of the DAMNI analogs were obtained. The initial geometry of the template molecule (15, the most active of the series) was obtained by the simulated annealing approach and was then used to derive the remaining structures. The robustness of the derived models was verified by the bootstrapping method. The comparison of the CoMFA and CoMSIA models reveals that the combination of steric and electrostatic fields in CoMSIA gave the best results. The results of this study may be used as an important basis for future drug design studies and the synthesis of more potent HIV-1 RT inhibitors.