3D QSAR of Pyrrolo Pyrimidine and Thieno Pyrimidines as Human Thymidylate Synthase Inhibitors

Thymidylate synthase (TS) is a crucial enzyme for DNA biosynthesis, and many nonclassical lipophilic antifolates targeting this enzyme are efficient and encouraging antitumor drugs. We report 3D-QSAR analyses of pyrrolo pyrimidine and thieno pyrimidine antifolates to examine the mechanism of action and structure-activity relationships of these molecules. Leave-one-out (LOO) cross-validation gave cross-validated q² values of 0.523 and 0.566 for the ligand-based (LB) and receptor-based (RB) CoMFA models, and 0.516 and 0.471 for the LB and RB CoMSIA models, respectively. The non-cross-validated r² values were 0.974 and 0.969 for CoMFA LB and RB, and 0.983 and 0.972 for CoMSIA LB and RB, respectively. The models were graphically interpreted using CoMFA and CoMSIA contour plots. The results obtained from this study were used for the rational design of potent inhibitors of thymidylate synthase.


Introduction
Folate metabolism has long been recognised as an attractive target for cancer chemotherapy because of its indispensable role in the biosynthesis of nucleic acid precursors 1,2. Thymidylate synthase (TS) 3 catalyses the reductive methylation of 2′-deoxyuridine-5′-monophosphate (dUMP) to 2′-deoxythymidine-5′-monophosphate (dTMP), utilising 5,10-methylenetetrahydrofolate as the source of the methyl group as well as the reductant 4. This represents the sole de novo source of dTMP; hence inhibition of TS is an attractive strategy in the development of antitumour agents.
Thymidylate synthase (TS) is not a new target. However, there is active interest in the development of improved TS-specific inhibitors. Several TS inhibitors have found clinical utility as antitumour agents. Usually, the 2-amino-4-oxopyrimidine ring is considered important for potent TS inhibitory activity. Examples of clinically used TS inhibitors are raltitrexed (ZD1694) 5, pemetrexed (Alimta, LY231514) 6 and PDDF 2,7. The antifolate molecules evaluated in this investigation are derivatives of pyrrolo pyrimidine and thieno pyrimidine, with structures similar to this class of TS antifolates. Because of the interest in new anticancer drugs, several substituted pyrimidine inhibitors were chosen for screening against human TS. A sound understanding of the structural requirements for anticancer activity in substituted pyrimidines is important in guiding and optimising drug design efforts.
In the present study, the mechanism of intermolecular interaction between the most potent molecule 1 and TS was studied with the help of comparative molecular field analysis (CoMFA) 8,9 and comparative molecular similarity indices analysis (CoMSIA) 10 methodologies using the partial least squares (PLS) method 11,12. CoMFA involves the alignment of molecules in a structurally and pharmacologically reasonable manner, on the assumption that each molecule acts via a common macromolecular target binding site. With this method, it is possible to predict the biological activity of molecules and to represent the relationships between molecular properties (steric and electrostatic) and biological activity in the form of contour maps. The CoMSIA approach calculates similarity indices in the space surrounding each of the aligned molecules in the data set. CoMSIA is believed to be less affected by changes in molecular alignment and provides smooth and interpretable contour maps, because the molecular similarity indices it uses have a Gaussian-type distance dependence. Furthermore, in addition to the steric and electrostatic fields of CoMFA, CoMSIA defines explicit hydrophobic and hydrogen bond donor and acceptor fields. Such 3D QSAR models would be of great help in a drug development program, since the activity of new analogues could be quantitatively predicted before attempting their synthesis and testing.
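The Gaussian-type distance dependence of the CoMSIA similarity index can be illustrated with a short Python sketch. It assumes the commonly cited form of the index, in which each atom contributes the product of the probe property and the atom property, damped by exp(-alpha * r²). The function name, argument layout and example values are hypothetical and are not taken from SYBYL.

```python
import math


def comsia_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Similarity index of one molecule at one lattice point for one property.

    grid_point  -- (x, y, z) coordinates of the lattice point, in angstroms
    atoms       -- iterable of ((x, y, z), w) pairs; w is the atom's value of
                   the property (for example its partial charge)
    probe_value -- property value carried by the probe atom (assumed +1)
    alpha       -- attenuation factor of the Gaussian distance dependence
    """
    total = 0.0
    for position, w in atoms:
        # Squared distance between the lattice point and the atom.
        r2 = sum((g - a) ** 2 for g, a in zip(grid_point, position))
        # The Gaussian term stays finite even when r2 is zero.
        total += probe_value * w * math.exp(-alpha * r2)
    return -total


# Hypothetical two-atom fragment, for illustration only.
example_atoms = [((0.0, 0.0, 0.0), 0.25), ((1.5, 0.0, 0.0), -0.40)]
index_at_origin = comsia_index((0.0, 0.0, 0.0), example_atoms)
```

Because the exponential never diverges, no energy cutoff is needed near atomic positions, which is one reason the resulting contour maps tend to be smoother than those built from Lennard-Jones and Coulomb potentials.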

Experimental Section
All pyrrolo pyrimidine and thieno pyrimidine derivatives and their in vitro biological activities (IC50) were taken from the literature 13-18. The IC50 values in µM units were converted to the corresponding pIC50 values using the formula pIC50 = -log IC50 (Table 1). The pIC50 values of the training set span approximately 3.39 log units. A total of 34 pyrrolo pyrimidine and thieno pyrimidine molecules were used as the data set, of which 7 molecules were chosen as the test set, while the remaining 27 molecules were treated as the training set. The structures of all of the molecules are shown in Fig. 1 and Table 1. The selected test set covered a range of inhibitory activity similar to that of the training set and was used to evaluate the predictive power of the CoMFA and CoMSIA models.
All molecular modeling calculations were performed on a Linux operating system. Three-dimensional structure building and all modeling were performed using the SYBYL-X 1.2 molecular modeling program package 19. Gasteiger-Hückel charges were assigned, and each molecule was then energy-minimized using the conjugate gradient method and the standard Tripos force field with a distance-dependent dielectric function. The minimization was terminated when the energy gradient convergence criterion of 0.001 kcal mol⁻¹ Å⁻¹ was reached.
For ligand-based 3D QSAR, the low-energy conformations thus obtained were aligned on the most active molecule in the series, molecule 1, using the ALIGN DATABASE command in SYBYL-X 1.2 and taking the substructure common to all molecules. The resulting alignment model (Figure 2b) was then subjected to CoMFA and CoMSIA studies. For receptor-based alignment, the best docked pose of the most active molecule 1 was taken as the template. All minimized structures were aligned on this template to obtain the molecular alignment for receptor-based CoMFA and CoMSIA (Figure 2c). The accuracy of the predictions of the CoMFA and CoMSIA models and the reliability of the contour maps depend strongly on the structural alignment of the molecules.
The X-ray structure of TS was obtained from the Protein Data Bank (1JU6) and used as the receptor site. All water molecules were removed, and the protein was minimized with the Tripos force field and Gasteiger-Hückel charges using the conjugate gradient method until the energy gradient convergence criterion of 0.05 kcal mol⁻¹ Å⁻¹ was reached. The active site was defined by the amino acids Arg50, Glu87, Trp109, Asn112, Tyr135, Cys195, His196, Gln214, Arg215, Ser216, Asp218, Phe225, Asn226, His256 and Tyr258 20. The most active molecule 1 was docked into the monomer unit (A) of TS, and the docked pose was used as the template onto which the data set was aligned.
The standard Tripos force field was employed for the comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). A 3D cubic lattice with a regular grid spacing of 2.0 Å, enclosing all aligned molecules and extending at least 4 Å beyond them in each direction, was created. Steric and electrostatic parameters were calculated at each lattice point for the CoMFA fields, while H-bond donor parameters were calculated in addition to steric and electrostatic parameters for the CoMSIA fields. An sp³-hybridized carbon atom with a charge of +1 was used as the probe atom to generate the steric (Lennard-Jones potential) and electrostatic (Coulombic potential) field energies. A distance-dependent dielectric constant of 1.00 was used. The steric and electrostatic fields were truncated at +30.00 kcal/mol. The similarity index descriptors were calculated using the same lattice box employed for the CoMFA calculations, using an sp³ carbon probe atom with a +1 charge and a +1 H-bond donor property, and an attenuation factor of 0.3 for the Gaussian-type distance dependence.
Partial least squares regression was used to generate a linear relationship that correlates changes in the computed fields with changes in the corresponding experimental values of biological activity (pIC50) for the data set of ligands. The biological activity values of the ligands were used as dependent variables in the PLS statistical analysis. The column filtering value (σ) was set to 2.0 kcal/mol to improve the signal-to-noise ratio by omitting those lattice points whose energy variations were below this threshold. Cross-validation was performed by the leave-one-out (LOO) 9,21 procedure to determine the optimum number of components (ONC) and the coefficient q². The optimum number of components was then used to derive the final QSAR model from all of the training set molecules without cross-validation and to obtain the conventional correlation coefficient (r²). To validate the CoMFA- and CoMSIA-derived models, the predictive ability for the test set of molecules (expressed as r²pred) was determined using the following equation:
r²pred = (SD − PRESS)/SD
where SD is the sum of the squared deviations between the biological activities of the test set molecules and the mean activity of the training set molecules, and PRESS is the sum of the squared deviations between the observed and the predicted activities of the test set molecules.
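The definition of r²pred above translates directly into a few lines of Python. The following is a minimal sketch of that formula only; the function name and the numerical values are hypothetical and do not come from Table 1.

```python
def r2_pred(train_observed, test_observed, test_predicted):
    """Predictive r-squared for an external test set: (SD - PRESS) / SD."""
    # Mean activity of the TRAINING set, as the definition requires.
    train_mean = sum(train_observed) / len(train_observed)
    # SD: squared deviations of test activities from the training-set mean.
    sd = sum((y - train_mean) ** 2 for y in test_observed)
    # PRESS: squared deviations between observed and predicted test activities.
    press = sum((y - p) ** 2 for y, p in zip(test_observed, test_predicted))
    return (sd - press) / sd


# Hypothetical pIC50 values, for illustration only.
train_pic50 = [5.0, 6.0, 7.0]
test_pic50 = [5.5, 6.8]
predicted_pic50 = [5.7, 6.5]
score = r2_pred(train_pic50, test_pic50, predicted_pic50)
```

Note that SD is referenced to the training-set mean rather than the test-set mean, so a model that simply predicts the training mean for every test molecule scores exactly zero.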
Since the model derived with the LOO method gave the best statistical parameters, it was employed for further predictions of the designed molecules.
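The leave-one-out procedure used to obtain q² can be sketched as follows. This is an illustrative outline, not the SYBYL implementation: a univariate least-squares line stands in for the PLS model so that the example stays self-contained, q² is assumed to be 1 − PRESS/SS with SS taken about the mean of all observed activities, and all names and data are hypothetical.

```python
def loo_q2(X, y, fit, predict):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS."""
    press = 0.0
    for i in range(len(y)):
        # Refit the model with molecule i left out.
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        # Accumulate the squared error of predicting the left-out molecule.
        press += (y[i] - predict(model, X[i])) ** 2
    mean_y = sum(y) / len(y)
    ss = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss


def fit_line(xs, ys):
    """Stand-in model (NOT PLS): ordinary least squares on one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Hypothetical descriptor and activity values, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [5.1, 5.9, 7.2, 7.8, 9.1]
q2 = loo_q2(descriptor, activity, fit_line, predict_line)
```

In a full workflow the same loop would be repeated for each candidate number of PLS components, and the number giving the best cross-validated statistics would be taken as the ONC.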
To further evaluate the applicability of the models to new chemicals, the activities of the newly designed molecules were predicted using these QSAR models.

Results and Discussions
The 3D QSAR CoMFA and CoMSIA analyses were carried out using pyrrolo pyrimidine and thieno pyrimidine derivatives reported as potent human thymidylate synthase inhibitors by Aleem Gangjee et al. Molecules with precise IC50 values were selected; a set of 34 molecules was used for derivation of the models, divided into a training set of 27 molecules and a test set of 7 molecules.
The CoMFA and CoMSIA statistical analyses are summarised in Table 2. The statistical data show q²loo values of 0.523 and 0.566 for ligand-based (LB) and receptor-based (RB) CoMFA, and 0.516 and 0.471 for LB and RB CoMSIA, respectively. The r²ncv values of 0.974 and 0.969 for CoMFA LB and RB, and 0.983 and 0.969 for CoMSIA LB and RB, respectively, indicate good internal predictive ability of the models. To test the predictive ability of the models, a test set of seven molecules excluded from model generation was used. The predictive correlation coefficients r²pred of 0.505 and 0.50 for CoMFA LB and RB, and 0.866 and 0.810 for CoMSIA LB and RB, respectively, indicate good external predictive ability of the models. Plots of the actual versus predicted pIC50 values for the training and test sets of the CoMFA LB and RB and CoMSIA LB and RB studies are shown in Fig. 3(a-d). The CoMSIA models gave better results than the CoMFA models, which shows that the H-bond donor fields, which are not included in the CoMFA models, are important for explaining the potency of the molecules. The observed and predicted activities of the molecules are provided in Table 1.
To visualise the information content of the derived 3D QSAR models, CoMFA and CoMSIA contour maps were generated. The contour plots represent the lattice points at which differences in molecular field values are strongly associated with differences in receptor binding affinity. Molecular fields define the favourable and unfavourable interaction energies of the aligned molecules with a probe atom traversing the lattice points, suggesting the modifications required to design new molecules. The CoMFA contour maps denote the regions in space where the molecules would interact favourably or unfavourably with the receptor, while the CoMSIA contour maps denote areas within the specified region where the presence of a group with a particular physicochemical property favours or disfavours binding to the receptor. The CoMFA/CoMSIA results were graphically interpreted by field contribution maps using the 'STDEV*COEFF' field type.
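The field type named above is conventionally understood as the product, at each lattice point, of the standard deviation of the field values across the aligned molecules and the corresponding PLS regression coefficient. A minimal Python sketch under that assumption follows; the use of the sample standard deviation, the function name and the data are all hypothetical choices, not details confirmed by the source.

```python
import statistics


def stdev_coeff(field_matrix, coefficients):
    """Per-lattice-point contribution: stdev(field column) * PLS coefficient.

    field_matrix -- one row per molecule, one column per lattice point
    coefficients -- one PLS regression coefficient per lattice point
    """
    # Transpose so that each tuple holds one lattice point across molecules.
    columns = zip(*field_matrix)
    return [
        statistics.stdev(column) * coefficient  # sample stdev is an assumption
        for column, coefficient in zip(columns, coefficients)
    ]


# Hypothetical field energies for three molecules at three lattice points.
fields = [
    [1.0, 0.0, -2.0],
    [3.0, 0.0, -2.5],
    [5.0, 0.0, -4.5],
]
pls_coefficients = [0.02, 0.50, -0.01]
contributions = stdev_coeff(fields, pls_coefficients)
```

Weighting by the standard deviation means that a lattice point where the field never varies contributes nothing to the map, however large its coefficient.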

Design of New Inhibitors
The detailed contour map analysis of both the CoMFA and CoMSIA models enabled us to identify the structural requirements for the observed inhibitory activity. Sterically favoured and electronegative groups were substituted on the NH of the glutamic acid side chain, giving six different substituents that yielded good predicted inhibitory activity values for the CoMSIA model and comparable predicted values for the CoMFA model. Docking studies showed that the most active existing molecule formed four hydrogen bonds (3HB-ASN226, 1HB-ASP218), while the designed molecules formed ten hydrogen bonds (2HB-ARG50, 1HB-GLU87, 1HB-ASN226, 5HB-ARG215, 1HB-SER216), as shown in Figure 9. This analysis also predicts that the designed molecules have better interactions than the most active existing molecule. The structures and their predicted activity values, along with the calculated IC50 values for both the CoMFA and CoMSIA models, are given in Fig. 10 and Table 3.
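Calculated IC50 values follow from predicted pIC50 values by inverting the pIC50 = -log IC50 relation. The sketch below shows the conversion in both directions. It assumes a base-10 logarithm and, by default, that µM concentrations are converted to molar before the logarithm is taken; the source states only the formula, so the `unit_in_molar` parameter is a hypothetical addition that can be set to 1.0 to apply the formula to the µM values directly.

```python
import math


def pic50_from_ic50(ic50, unit_in_molar=1e-6):
    """pIC50 = -log10(IC50); unit_in_molar rescales IC50 (default: µM to M)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * unit_in_molar)


def ic50_from_pic50(pic50, unit_in_molar=1e-6):
    """Inverse conversion: IC50 in the chosen unit from a (predicted) pIC50."""
    return 10.0 ** (-pic50) / unit_in_molar


# Hypothetical predicted activity, for illustration only.
predicted_pic50 = 7.3
calculated_ic50_um = ic50_from_pic50(predicted_pic50)
```

Whichever unit convention is chosen, it has to be the same one used when the training-set pIC50 values were derived, otherwise the calculated IC50 values are shifted by a constant factor.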

Conclusions
Our present studies have established that the CoMFA and CoMSIA models are reliable enough to guide further modification of the molecules in order to obtain better drugs. They provided good statistical results in terms of q² and r² values for the pyrrolo pyrimidine and thieno pyrimidine antifolate derivatives. Both the CoMFA and CoMSIA models provided significant correlations of biological activity with the steric, electrostatic and H-bond donor fields, establishing the significance of these fields for the selectivity and activity of the molecules. The 3D-QSAR results revealed some important sites where steric and hydrogen bond donor modifications should significantly affect the bioactivities of the molecules. Using these clues, novel human thymidylate synthase inhibitors with high predicted affinity were designed that may be potent molecules for the treatment of cancer.

Figure 2. (a) Common substructure used for alignment. (b) Ligand-based alignment on the common substructure. (c) Receptor-based alignment on the common substructure.

Figure 3. Scatter plot of observed versus predicted pIC50 values for the training set (■) and the test set (▲) molecules based on (a) the ligand-based CoMFA model, (b) the ligand-based CoMSIA model, (c) the receptor-based CoMFA model and (d) the receptor-based CoMSIA model.

Figures 4(a,b) and 5(a,b) show the contour maps for LB and RB CoMFA derived from the CoMFA PLS models.

Figure 6. Change in the position of S12 to S11 from molecule 1 to molecule 34.

Figure 10. Structures of the newly designed molecules.

Table 1. Structures of the training and test set molecules, showing experimental and predicted activities.
* Test set molecules.

Table 2. Summary of CoMFA and CoMSIA results.
a Correlation coefficient. b Standard error of estimate. c Fisher test value. d Cross-validated correlation coefficient by the leave-one-out method. e Predicted correlation coefficient on the test set. f Optimum number of principal components.

Table 3. The LB and RB predicted activity values of the newly designed molecules.