CoMFA and CoMSIA Studies on Inhibitors of HIV-1 Integrase-Bicyclic Pyrimidinones

Abstract: To understand the structural requirements of HIV-1 integrase inhibitors and to design new ligands against HIV-1 integrase with enhanced inhibitory potency, a 3D QSAR (quantitative structure-activity relationship) study using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) was performed on a dataset of 35 bicyclic pyrimidinones that inhibit HIV-1 integrase. QSAR models were computed with Sybyl. The 3D QSAR models showed good statistical results: the q², r² and r²pred values were high for both CoMFA and CoMSIA. Based on the high values of q² and r², we are confident that the 3D QSAR models give good predictions that may be used to design better HIV-1 integrase inhibitors. The CoMFA and CoMSIA models reveal that steric and electrostatic fields contribute significantly to the biological activities of the studied compounds.


Introduction
Human immunodeficiency virus type 1 (HIV-1) is responsible for human acquired immunodeficiency syndrome (AIDS), one of the most urgent world health threats. The HIV-1 genome encodes, together with protease and reverse transcriptase, a third enzyme, integrase, which is necessary for the viral life cycle and is an appealing target for drug discovery because it is essential and has no direct counterpart in the host cells [1,2]. HIV-1 integrase catalyses the incorporation of the viral DNA into the host cell's genome through a multi-step process. In the initial assembly step, integrase binds to specific sequences of the double-stranded viral DNA newly formed by reverse transcription. An endonucleolytic cleavage of the last two nucleotides from the 3' terminus of both strands of the viral DNA generates a processed DNA/integrase pre-integration complex, which crosses the nuclear membrane to access the host genome. The cell DNA is then cleaved by integrase in a nonspecific way and the shortened strands of viral DNA are integrated into the host DNA sequence.
Selective inhibition of HIV-1 integrase activity interrupts the HIV-1 replication cycle and could represent an answer to the unmet medical need for new and improved treatments [3].
Bicyclic pyrimidinones [4][5][6] are reported to have evolved from N-methyl pyrimidinone. Introduction of a suitably substituted amino moiety modulated the physical and chemical properties of the molecules and conferred nanomolar activity in inhibiting the spread of HIV-1 infection in cell culture.
Quantitative structure-activity relationship (QSAR) models establish statistical relationships between the biological activity exerted by a series of compounds and a set of parameters determined from their structures (structural descriptors). The mathematical 3D QSAR equations can be computed with a large number of statistical methods, such as multi-linear regression, partial least squares (PLS) or artificial neural networks. An important component of all QSAR models is proper validation and evaluation of their predictive power.
CoMFA and CoMSIA are usually applied in combination with experimental techniques when molecules are highly flexible [7,8]. In the present study the molecules are less flexible, and theoretical calculations are sufficient for their alignment.
The CoMFA and CoMSIA methodologies assume that a suitable sampling of steric, electrostatic and hydrogen-bond donor fields around a set of aligned molecules provides all the information necessary for understanding their biological properties. The present study aims to gain insight into the steric, electrostatic and hydrogen-bond donor properties of bicyclic pyrimidinones and their influence on activity, and to derive predictive 3D QSAR models for the design and activity prediction of new derivatives of this class of inhibitors.

Experimental
The in vitro biological activity data, reported as IC50 values for inhibition of HIV-1 integrase by the bicyclic pyrimidinones, were taken from the published work of Muraglia et al. [9] and used for the current study. The reported IC50 values were converted into the corresponding pIC50 values. The dataset comprises 35 bicyclic pyrimidinones built on four core molecules with different substituents. Their structures, pIC50 values and the activities predicted by CoMFA and CoMSIA are given in Table 1.
Three-dimensional structure building and all modeling were performed using the Sybyl 6.9 molecular modeling program package [10]. Energy minimization was performed using the Tripos force field [11] with Gasteiger-Hückel charges, a distance-dependent dielectric and the conjugate gradient algorithm, with a convergence criterion of 0.001 kcal/mol.
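The IC50-to-pIC50 conversion can be illustrated with a minimal sketch. It assumes the usual definition pIC50 = -log10(IC50 in mol/L) and assumes the IC50 values are given in nanomolar; the text does not state the units, and the function name and example values are hypothetical (they are not taken from Table 1).

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L (unit assumption, see lead-in)
    return -math.log10(ic50_nm * 1e-9)


# Hypothetical IC50 values in nM, for illustration only
example_ic50_nm = [10.0, 100.0, 1000.0]
example_pic50 = [round(pic50_from_ic50_nm(v), 2) for v in example_ic50_nm]
```

On this scale a tenfold drop in IC50 raises pIC50 by one unit, which is why the logarithmic value is preferred as the dependent variable in regression.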

Alignment
In the present study the optimized structures were aligned on the template BP30, which is the most active molecule in the set (Figure 1). All the molecules were aligned with the Align Database command available in Sybyl using maximum substructure. This adjusts the geometry of the molecules such that their steric and electrostatic fields match the fields of the template molecule.

CoMFA analysis
The steric and electrostatic CoMFA [12] potential fields were calculated at each lattice intersection of a regularly spaced grid of 2.0 Å. The grid box dimensions were determined automatically so that the region boundaries extended 4.0 Å beyond the coordinates of each molecule in every direction. The van der Waals potentials and Coulombic terms were calculated using the Tripos force field. An sp3-hybridized carbon atom with a +1 charge served as the probe atom to calculate the steric and electrostatic fields. The steric and electrostatic contributions were truncated at +30.0 kcal/mol.
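The lattice construction and probe-field evaluation described above can be sketched as follows. This is not the Sybyl implementation: the Lennard-Jones parameters `eps` and `r_min` are placeholder values rather than Tripos force field parameters, a constant dielectric of 1 is assumed for the Coulomb term, the electrostatic term is clamped on both sides, and the one-atom "molecule" is hypothetical.

```python
import itertools
import math

GRID_SPACING = 2.0  # Å, lattice spacing stated in the text
MARGIN = 4.0        # Å, box extension stated in the text
CUTOFF = 30.0       # kcal/mol, truncation value stated in the text
COULOMB_K = 332.06  # kcal Å / (mol e^2), Coulomb constant in these units


def lattice(coords, spacing=GRID_SPACING, margin=MARGIN):
    """Grid points of a box enclosing all coordinates plus the margin."""
    axes = []
    for dim in range(3):
        lo = min(c[dim] for c in coords) - margin
        hi = max(c[dim] for c in coords) + margin
        n_points = int(math.ceil((hi - lo) / spacing)) + 1
        axes.append([lo + i * spacing for i in range(n_points)])
    return list(itertools.product(*axes))


def probe_fields(point, atoms, eps=0.1, r_min=3.4, probe_charge=1.0):
    """Steric (6-12 Lennard-Jones) and electrostatic (Coulomb) energy at a point.

    eps and r_min are placeholders, NOT Tripos parameters (assumption).
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), 0.1)  # avoid division by zero
        ratio = r_min / r
        steric += eps * (ratio ** 12 - 2.0 * ratio ** 6)
        electrostatic += COULOMB_K * probe_charge * charge / r
    # Truncate large energies; symmetric clamping of the electrostatic
    # term is an assumption of this sketch.
    steric = min(steric, CUTOFF)
    electrostatic = max(-CUTOFF, min(CUTOFF, electrostatic))
    return steric, electrostatic


# Hypothetical one-atom "molecule": (x, y, z, partial charge)
atoms = [(0.0, 0.0, 0.0, -0.1)]
grid = lattice([a[:3] for a in atoms])
fields = [probe_fields(p, atoms) for p in grid]
```

Each molecule in an aligned set would yield one such vector of lattice energies, and these vectors form the descriptor columns used in the regression.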

CoMSIA analysis
In the CoMSIA [13] model, three physicochemical properties, namely the steric, electrostatic and hydrogen-bond donor fields, were calculated using an sp3-hybridized carbon probe atom with a +1 charge and a radius of 1.0 Å, placed on a regular grid with 2.0 Å spacing in a lattice box similar to that used in the CoMFA calculations.
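A CoMSIA similarity index at one lattice point can be sketched as a Gaussian-weighted sum over atoms. The sketch assumes the commonly used form A = -Σ w_probe · w_i · exp(-α r²) and an attenuation factor α = 0.3; neither the functional form nor α is stated in the text. The atom records and property names below are hypothetical.

```python
import math

ALPHA = 0.3  # attenuation factor (assumed; not stated in the text)


def comsia_index(point, atoms, prop, probe_value=1.0, alpha=ALPHA):
    """Gaussian-type similarity index of one property at one lattice point."""
    total = 0.0
    for atom in atoms:
        # Squared distance between the probe position and the atom
        r2 = sum((p - c) ** 2 for p, c in zip(point, atom["xyz"]))
        total += probe_value * atom[prop] * math.exp(-alpha * r2)
    return -total


# Hypothetical atom: partial charge 0.5, hydrogen-bond donor flag 1
atoms = [{"xyz": (0.0, 0.0, 0.0), "charge": 0.5, "donor": 1.0}]
electrostatic_index = comsia_index((1.0, 0.0, 0.0), atoms, "charge")
donor_index = comsia_index((1.0, 0.0, 0.0), atoms, "donor")
```

Because the Gaussian stays finite at zero distance, this form needs no energy cutoff, in contrast to the Lennard-Jones and Coulomb potentials used in CoMFA.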

Partial least squares (PLS) analysis
The CoMFA or CoMSIA descriptors were used as independent variables and the pIC50 values as dependent variables in partial least squares regression analysis [14][15][16][17]. The minimum sigma (column filtering) was set to 2.0 kcal/mol to improve the signal-to-noise ratio. The predictive correlation coefficient (r²pred), based on the test set molecules, is computed as r²pred = (SD - PRESS) / SD, where SD is the sum of the squared deviations between the biological activities of the test set molecules and the mean activity of the training set molecules, and PRESS is the sum of the squared deviations between the predicted and actual activities of the test set molecules. r²cv is also calculated, as r²cv = (SD' - PRESS) / SD', where SD' is the sum of the squared deviations between the biological activities of the test set molecules and the mean activity of the test set molecules.
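The two predictive statistics defined above differ only in the reference mean used for SD. A minimal sketch follows; the function name and all activity values are hypothetical and do not come from the study.

```python
def predictive_r2(observed, predicted, reference_mean):
    """(SD - PRESS) / SD for a test set, given the reference mean for SD."""
    sd = sum((y - reference_mean) ** 2 for y in observed)
    press = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return (sd - press) / sd


# Hypothetical test-set pIC50 values and model predictions
observed = [6.0, 7.0, 8.0]
predicted = [6.5, 7.0, 7.5]
training_mean = 6.5                        # hypothetical training-set mean
test_mean = sum(observed) / len(observed)  # mean of the test set itself

r2_pred = predictive_r2(observed, predicted, training_mean)  # uses SD
r2_cv = predictive_r2(observed, predicted, test_mean)        # uses SD'
```

Since the test-set mean minimizes the sum of squared deviations, SD' can never exceed SD, so the statistic based on SD' is never larger than r²pred for the same predictions.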

Validation of the 3D QSAR models
To validate the stability and predictive ability of the obtained models, 7 compounds not included in the construction of the CoMFA and CoMSIA models were selected as the test set. Thus the dataset of 35 molecules consisted of 28 molecules in the training set and 7 in the test set. The plots of experimental activity vs. predicted activity (Figure 2) show that the predicted pIC50 values of the test set compounds agree with the experimental data within a tolerable error range, with r²pred values of 0.549 and 0.546 for the CoMFA and CoMSIA models, respectively. These results indicate that the CoMFA and CoMSIA models are stable and can be reliably used to design novel inhibitors with the desired activity against HIV-1 integrase.

Significance of statistical parameters
The statistical parameters are given in Table 2.

CoMFA and CoMSIA contour maps
To visualize the information content of the derived 3D QSAR models, CoMFA and CoMSIA contour maps were generated [18]. The field energies at each lattice point were calculated as the scalar product of the coefficient and the standard deviation associated with a particular column of the data table ("stdev * coeff"), which was always plotted as the percentage of the contribution to the CoMFA or CoMSIA equation.
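The "stdev * coeff" quantity can be sketched per lattice column as follows. The sketch assumes the sample standard deviation is used, which the text does not specify, and the field matrix and regression coefficients are hypothetical.

```python
import statistics


def stdev_times_coeff(field_matrix, coefficients):
    """Per-column product of the field standard deviation and the coefficient.

    field_matrix: rows are molecules, columns are lattice points.
    """
    columns = list(zip(*field_matrix))
    return [statistics.stdev(col) * b for col, b in zip(columns, coefficients)]


# Hypothetical field values for 3 molecules at 2 lattice points
field_matrix = [
    [0.0, 1.0],
    [2.0, 1.0],
    [4.0, 1.0],
]
coefficients = [0.5, 3.0]  # hypothetical PLS regression coefficients
contour_values = stdev_times_coeff(field_matrix, coefficients)
```

A lattice point where the field does not vary across the aligned molecules gives a value of zero regardless of its coefficient, so it never appears in a contour map.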
The CoMFA contour maps of the steric and electrostatic fields are shown in Figures 3 and 4. The contours of the steric map are shown in yellow and green and those of the electrostatic map in red and blue. Greater values of biological activity are correlated with more bulk near green, less bulk near yellow, more positive charge near blue and more negative charge near red.
Steric contour maps of BP30 (high pIC50), BP17 (medium pIC50) and BP2 (lowest pIC50) are displayed in Figure 3. The low activity of BP2 can be explained by the presence of bulky groups, such as N-methyl groups, in the disfavored region, and the moderate activity of BP17 by bulky groups that extend from the disfavored into the favored region. The higher activity of BP30 can be explained by its orientation away from the disfavored region and towards the favored region.
Electrostatic contour maps of BP30, BP17 and BP2 are displayed in Figure 4. In BP17, electronegative atoms such as nitrogen and oxygen are oriented towards both favored and disfavored regions, which explains its moderate activity. In BP2, the electronegative groups are oriented towards the disfavored region, so it shows the lowest activity. In BP30, the electronegative groups are oriented away from the disfavored (blue) region, and the blue contour also encloses hydrogens attached to the electronegative groups. Both conditions are favorable; hence, BP30 shows the highest activity.
The CoMFA contour maps help to design molecules with increased activity. They suggest that introducing bulkier or hydrophobic groups on the pyrazole ring of BP30 would increase activity.
The CoMSIA contour map of the hydrogen-bond donor field is shown in Figure 7. Grey contours (contribution level of 80%) indicate regions where a hydrogen-bond donor group increases activity. In BP30 (a), the donor contours overlap the donor hydrogens (hydrogens on nitrogen and oxygen, which can be substituted easily), whereas in BP2 (b) the donor contours do not overlap the donor hydrogens because these lie away from the field. This also explains the high activity of BP30 and the low activity of BP2.
The available structure of HIV-1 integrase (PDB ID 1qs4) is not a complete protein: some amino acids are missing. The full structure has to be generated and loop modeling has to be carried out. Once a valid protein structure is obtained, inhibitors will be docked into the active site. Depending on their pharmacodynamic characteristics, new molecules can be designed and synthesized so that their activity against HIV-1 integrase can be analyzed.

Conclusion
In this study, we have established predictive CoMFA and CoMSIA 3D QSAR models for HIV-1 integrase inhibitors. Compounds such as BP30, which has the best activity, can be taken as hit molecules on which to base the design of new HIV-1 integrase inhibitors, and can be used in further studies such as docking analysis, synthesis and biological activity studies.

Figure 1. Atom-based alignment of bicyclic pyrimidinones on BP30.

Figure 4. Electrostatic contour maps of BP30 (a), BP17 (b) and BP2 (c) molecules from CoMFA.

The steric and electrostatic field distributions of CoMSIA are depicted in Figures 5 and 6 and are generally in accordance with the field distributions of the CoMFA maps (Figures 3 and 4).

Figure 7. Hydrogen-bond donor contour maps of BP30 (a) and BP2 (b) molecules from CoMSIA.

Table 1(a). Structures and biological activities of molecules used in the present study.

Table 1(b). Structures and biological activities of molecules used in the present study. * represents the test set; 'a' stands for the outlier molecule.

As reported in Table 2, the r² value explains 98.1% and 94.7% of the variance in biological activity (CoMFA and CoMSIA, respectively) for the training set, and the r²pred value explains 54.9% and 54.6% of the variance in biological activity (CoMFA and CoMSIA, respectively) for the test set. The steric and electrostatic parameters contribute 48.2% and 51.8%, respectively, towards biological activity in CoMFA. The steric, electrostatic and hydrogen-bond donor parameters contribute 34.0%, 50.5% and 15.5%, respectively, towards biological activity in CoMSIA.

Figure 2. Plots of experimental activity vs. predicted activity from CoMFA and CoMSIA (pink values represent the test set).

Table 2. The statistical parameters for the CoMFA and CoMSIA models.
a Cross-validated correlation coefficient; b non-cross-validated correlation coefficient; c optimum number of components; d standard error of estimate; e F-test value; f predictive r²; g predictive r² obtained by taking the mean of the test set activities.