Quantitative Structure-Activity Relationship Studies for Potential Rho-Associated Protein Kinase Inhibitors

A series of pyridylthiazole derivatives developed by Lawrence et al. as Rho-associated protein kinase inhibitors were subjected to four-dimensional quantitative structure-activity relationship (4D-QSAR) analysis. The models were generated applying genetic algorithm (GA) optimization combined with partial least squares (PLS) regression. The best model presented validation values of , , , , , , and . Furthermore, analyzing the descriptors it was possible to propose new compounds that predicted higher inhibitory concentration values than the most active compound of the series.


Introduction
Rho-associated protein kinases (ROCKs) belong to Ser/Thr kinases protein family that are initially activated by the Rho family of GTPases.Two isoforms, ROCK1 and ROCK2, were identified [1].They share significant homologous sequence in the kinase domain (90%) and in the C-terminal regulatory domains show a significant deviation [2,3].Both are ubiquitously expressed in several tissues of humans and rodents.
The ROCK enzymes have been implicated in a variety of therapeutic areas including cardiovascular diseases [4], central nervous system disorders [5], inflammation [6], and cancer [7].Abnormal activation of the Rho/Rho kinase has shown itself to be important in the activation of pathways that contribute to the pathophysiology of certain diseases, such as arterial and pulmonary hypertension, glaucoma, diabetes, erectile dysfunction, neurodegeneration, and cancer [8,9].Thus, the ROCK inhibition is a promising strategy to prevent the invasion of malignant cells, a central event in the metastatic process [10][11][12], besides enabling a future target for treatment of various diseases.
Y-27632 and Y-30142 (Figure 1) are known as highly selective Rho-kinase inhibitors, as compared to other kinases [13].They are permeable in the cell and act by competing with ATP for the kinase activation site having an inhibitory action in its two isoforms: ROCK1 and ROCK2 [14].these components inhibits smooth muscle contraction and shows effectiveness in normalizing the blood pressure in hypertension models.Compound HA-1077, also known as fasudil hydrochloride or AT877 (Figure 1), is reported as an inhibitor of several protein kinases including Rho kinase with a similar affinity for the Y-27632, but with lower selectivity [15], and is currently the only Rho kinase inhibitor available for clinical use.Since 1995 it has been used in Japan for vasospasm prevention in patients with subarachnoid hemorrhage [16].CID5056270 (Figure 1) has also been reported to potently inhibit ROCK enzymatic activity [17].
A useful tool to develop new prototype compounds for inhibition/activation of the Rho/ROCK is the study of quantitative relationships between chemical structure and biological activity or some physicochemical property (QSAR/QSPR).The QSAR aims to develop logical mathematical models with a response of the biological property by quantifying of the chemical information.Subtle changes in the structure can cause changes in its pharmacological/toxicological activity [18].It may be used to understand and explain the driving forces behind the drug action at the molecular level and it allows the design and development of new compounds with desirable biological properties [19].Using QSAR it is possible to reduce the time and financial cost for the development of new structures and improvement of their effectiveness and selectivity, in addition to reducing the number of experiments conducted on animals [18,20].Therefore, the main purpose of this study is to use the 4D-QSAR method to propose structural changes in pyridylthiazole derivatives [21] to make them future candidates for ROCK1 inhibitors.

Biological Data.
A series of 51 pyridylthiazole analogous compounds have been taken from the report of Lawrence et al. [21] (Figure 6).They were selected because of their ROCK1 isoform inhibition.The IC 50 (half maximal inhibitory concentration) values were converted into molar units and then expressed as − log IC 50 (pIC 50 ).Among the 51 compounds, 9 were chosen to compose the test set.Three randomly selected test groups were formed in order to obtain greater credibility for the model.[22], were obtained from the Protein Data Bank (PDB ID code: 3TV7) [23].The catalytic site was used for modeling the training and test compounds (Figure 6).After that, the complexes were optimized through molecular mechanics, using the CHARMM force field [24] available in the Discovery Studio software [25] used for all the steps above.The enzyme was cut into a distance of 10 > around the ligand to reduce computational requirements, and hydrogen atoms were added to the structure in order to maintain the geometric integrity of the receptor.

Molecular Dynamics Simulation.
The optimized complexes were subjected to the molecular dynamics simulation (MDS) process with the aim of generating a conformational ensemble profile (CEP) for each complex and thereby investigating the conformational flexibility [26,27].The temperature for dynamics was adjusted to 300 K in order to stay near the temperature used in biological assay; the simulation sampling time was 50 ps with 0.001 ps intervals.A distance dependent dielectric function was also applied in order to try to model the explicit solvent effect.In addition, all corresponding C atoms were fixed to prevent a large conformational change of the ligand, which could result in sterically prohibited conformations.

Alignment and Interaction Pharmacophore Elements Definition.
Ten alignments were performed using the ordinate belonging to amino acid residues in the protein catalytic site and the ligand:  2).Each conformation from the CEP of each complex was placed in a reference cubic cell space according to the trial alignment under consideration.The atoms of each compound were classified into six types of interaction pharmacophore element (IPE) corresponding to the kinds of atoms that can occupy each cell box in accordance with the 4D-QSAR methodology [27].IPEs are classified into the following: nonpolar (np); positively charged polar (p+); negatively charged polar (p−); hydrogen bond acceptors (ha); hydrogen bond donors (hd); and aromatic systems (ar).The grid cell occupancy profiles for each of the chosen IPEs were computed and used as the basis set of trial cell occupation descriptors (GCOD, Grid Cell Occupancy Descriptor) that will be used in the construction of QSAR models.The size of the grid cell was set to 2 Å which corresponds to the whole number nearest to twice the van der Waals radius of a hydrogen atom ( vdw = 1.2 Å).

2.5.
Obtaining the 4D-QSAR Models.In order to avoid noise or useless data, databases named DB1, DB2, and DB3 were generated.DB1 databases were built excluding variables where GCODs were equal to zero for all molecules.DB2 and DB3 databases were built excluding variables where variance values were less than 0.1 and 1.0, respectively.The GCODs were used to form the trial basis set for the Genetic Function Approximation (GFA) analysis [28] using Wolf 6.2 software [29].

Validation of the Models.
QSAR modeling produces predictive models derived from application of statistical tools correlating biological activity with descriptors representative of molecular structure or properties.For validation of QSAR models, the following strategies were adopted: (1) Internal validation using leave-one-out cross-validation ( 2 or  2 CV ) [30].
(4) Modified equation for  2 pred giving importance to the difference between  2 and  2 0 (observed and predicted values with the zero axis intersection).Values greater than 0.5 can be a good indicator of a good external predictability [32]: (5) Δ 2  : which are calculated based on the correlation of the observed (-axis) and predicted (-axis) response data with and without the intercept and also by interchanging the axes [33].Change of the and axes gives the value of  2 0 .Values lower than 0.2 can be a good indicator of a good external predictability [33] (6) -randomization: which consists of the exchange of the values of the independent variables randomly.For an acceptable QSAR model  2 -rand value must be less than the correlation coefficient of the nonrandomized design.

Results and Discussion
Data Set 1 was performed with Test Group I (molecules 1, 22, 30, 31, 35, 38, 39, 41, and 43) (Figure 6) and the statistical parameters are shown in Table 1.All the alignments have  2 and  2 value greater than 0.7 and 0.5, respectively, which are acceptable parameter values for a good QSAR model.Analyzing the  2 adj values, the alignment Ali3 0.01 1 did not submit a minimum acceptable value for this parameter, 0.5; thus it was excluded.Another parameter used in the validation of the models was  2 pred and according to the results obtained in this validation, only the alignment Ali1 0.01 1 features a value ≥0.5, and therefore the other alignments were not used.
All the alignments have  2 and  2 value greater than 0.7 and 0.5, respectively; however the  2 adj parameter value was less than 0.5 for Ali3 0.1 2, Ali4 0.  6) and the statistical parameters are shown in Table 3.  2 and  2  CV values are within the permitted, but  2 adj values ≥0.5 were not found for alignments Ali3 0.01 3, Ali4 0.1 3, Ali5 0.1 3, and Ali8 0.1 3, which caused their exclusion.The values of the  2 pred parameter show that only the alignments Ali1 0.1 3, Ali5 0.01 3, and Ali10 0.1 3 remained ≥0.5, which is the minimum acceptable Analyzing  2 -rand parameters, it is observed that all the alignments were satisfactory because all were lower than the respective  2 .As for the  2  parameter, only the values for alignments Ali5 0.01 3 and Ali9 0.1 2 are within the ideal range ≥0.5.Comparing the number of outlier compounds among the alignments it was found that Ali9 0.1 2 presented three compounds and Ali5 0.01 3 did not show any outlier compound.For this reason Ali5 0.01 3 was chosen for further analysis.model has a predictive capacity of 67%.LOO-CV correlation coefficient values over 0.5 reveal that the model is a useful tool for predicting affinities for new compounds in this set.The optimum number of latent variables (PLS components) used for further analysis was seven.
In order to better understand the behavior of the adjusted data for the model, the cross-correlation matrix between different GCODs in the model was calculated [34].There is no correlation ( > 0.7) between the GCODs pairs.
Figure 4 shows William's plot [35] for the model Ali5 0.01 3.As it can be observed, both the training and test set compounds lie in the applicability domain of the model, demonstrating the reliability of the model.
A graphic representation of the 3D pharmacophore embedded in 4D-QSAR model is shown in Figure 5 using compound 18 as reference.GCODs were labeled as (, , , IPE) which means the cartesian coordinate positions of the selected grid cell (, , ) and the respective interaction types (IPE).GCOD (1, 4, −1, ar) shows aromatic IPE type influencing the increased activity of compounds.This descriptor is located close to Phe87 and Lys105 favoring - interactions.GCOD (2, 3, 1, np) shows nonpolar IPE type and is situated between the hydrogen atom of the phenyl ring and the amino acid residue Val90 favoring van der Waals interactions.GCOD (3, 3, −3, np) is related to nonpolar IPE and is located close to amino acid residue Gly85.During the MDS only compounds 24, 30, and 38 had the hydrogen atom of the phenyl ring with high frequency of occupation in this descriptor.Therefore, the position of the GCOD (3, 3, −3, np) was not helpful in this study.GCOD (0, 1, −1, ha) is related to the hydrogen bond acceptor IPE type and shows some frequency of occupancy for all compounds because it is positioned close to the canbonyl group (urealinkage).This suggests the importance of the intermolecular hydrogen bond with Lys105.This observation also was made by Pireddu et al. [22].GCOD (0, −2, 1, np) is related to nonpolar IPE and also shows some frequency of occupancy for all compounds because it is positioned close to the 4-(4-pyridinyl)-2-thiazolyl group, being in all of them.This region is close to Met156, Leu205, and Ala215 favoring van der Waals interactions.In addition it is possible to observe hydrogen bond interaction between the NH backbone of the Met156 and the nitrogen atom of the pyridyl ring.
The GCOD (4, 2, −2, ar) is related to aromatic IPE and has a negative coefficient indicating that the occupation of the  grid cell by the aromatic ring results in a decrease in potency of the compounds.This descriptor is situated between Arg84, Lys200, and Asp202 with highest frequency of occupancy for compounds 17, 25, and 27.This descriptor can explain the difference in the activity of compounds 17-18 and 26-27, depending on the chiral center.When the center is in  position, the substituents occupied by GCOD (4, 2, −2, ar) generate a steric hindrance.This limitation also occurs when one of the compound 25 phenyl rings occupies this position.The GCOD (0, 4, −3, np) is related to nonpolar IPE and presents the most negative coefficient causing great influence on the activity of the compounds.This descriptor showed higher occupation frequency for compounds 28, 34, 35, and 36, respectively, and it is located close to the substituent attached to the benzene ring.This region is at the opening of the entrance of the ROCK1 active site near the part where there will be solvent.This indicates that nonpolar substituents should be avoided in this position and polar substituents should be explored.
Analyzing the results obtained from the 4D-QSAR study and experimental data, new structures were proposed in order to increase the activity of these compounds.The bioactivities of the proposals were made through a virtual test of activity in which we applied (4) of Model 1 from Ali5 0.01 3. The predicted activity values (Figure 7) were obtained.The structures were constructed using molecule 18 as a base, which is the most active of the series.
P1, P2, and P3 proposed structures showed a predicted biological activity greater than the experimental activity of compound 18, which is the most active compound of the studied series.Despite having a high predicted activity value P3 and P4 did not exceed the value of the compound 18.The compounds of Figure 7 were submitted to evaluation using Lipinski rule of five [36].It is a rule to evaluate if a chemical compound with a certain pharmacological or biological activity has properties that would make it a likely orally active drug in humans.The objective is to estimate the solubility and the permeability of drugs administered orally, predicting the influence of the chemical structure on the absorption of a compound.The Lipinski rule of five [36] states that most "drug-like" molecules have milog  ≤ 5, molecular weight (MW) ≤ 500, number of hydrogen bond acceptors ( ON ) ≤ 10, and number of hydrogen bond donors ( OHNH ) ≤ 5 and polar surface area (TPSA) no greater than 140 Å2 .Molecules violating more than one of these rules may have problems with bioavailability.to evaluate the criteria discussed above.The results (Table 4) are encouraging for theproposedcompounds because all proposed molecules are under the conditions stipulated by Lipinski, indicating no absorption problems.

Conclusion
In this study a series of pyridylthiazole inhibitors for ROCK1 were evaluated through RD-4D-QSAR.The best models were obtained from the evaluation of the training set of 44 compounds and 3 test groups of 9 compounds with ten alignments for each combination.The best model was obtained from Ali5 0.01 3, in which the grid cell size employed was 2 Å and validation values  2 = 0.773,  2 CV = 0.672, and  2 adj = 0.608 were obtained, which confirm the sturdiness of the model.Moreover, based on the evaluation of GCODs, 5 structures were proposed using the most active compound of the series, 3 of which (P1, P2, and P3) showed predicted biological activity values higher than the base compound.These compounds were evaluated by Lipinski's rule and did not commit any violation, indicating that they are good candidates for synthesis and pharmacological evaluation.

Figure 2 :
Figure 2: Representation of atoms used in different alignments.

Figure 3 :
Figure 3: Plot of experimental versus predicted potencies using Ali5 0.01 3 for the training (circle) and test set (triangle).

Figure 4 :Figure 5 :
Figure 4: Plot of sample leverage versus Student residuals for the model Ali5 0.01 3.

Figure 7 :
Figure 7: Structure of the proposals and the predicted pIC 50 values.

Table 1 :
Summary of the results of the best 4D-QSAR models obtained for the ten alignments with Test Group I and their respective cuts.
pred value is dependent on the average of the training group; to avoid overestimation, 2 test value calculations were

Table 2 :
Summary of results of the best 4D-QSAR models obtained for the ten alignments with Test Group II and their respective cuts.
performed, which is an additional validation for the test group.In this validation, only the alignments Ali1 0.1 2 and Ali9 0.1 2 showed satisfactory results (≥0.5) and therefore only these two alignments were considered.Data Set 3 was performed with Test Group III(9, 12,

Table 3 :
Summary of the results of the best 4D-QSAR models obtained for the ten alignments with Test Group III and their respective cut.

Table 4 :
Calculated parameters of the Lipinski rule of five for the proposed compounds and compound 18.