Molecular Modeling of Antimalarial Agents by 3D-QSAR Study and Molecular Docking of Two Hybrids 4-Aminoquinoline-1,3,5-triazine and 4-Aminoquinoline-oxalamide Derivatives with the Receptor Protein in Its Both Wild and Mutant Types

Modeling studies using 3D-QSAR and molecular docking methods were performed on a set of 34 hybrid 4-aminoquinoline derivatives previously studied as effective antimalarial agents against wild-type and quadruple-mutant Plasmodium falciparum dihydrofolate reductase (DHFR). Multiple linear regression (MLR) was used to build the QSAR model. The DFT-B3LYP method with the 6-31G basis set was used to calculate the quantum chemical descriptors, chosen to represent the electronic properties of the molecular structures, whereas the MM2 method was used to calculate the lipophilic, geometrical, physicochemical, and steric descriptors. The QSAR model tested with the artificial neural network (ANN) method shows high predictive performance. The model was confirmed by three validation methods: leave-one-out (LOO) cross validation, Y-randomization, and external validation. The molecular docking study of three compounds (9, 11, and 26) on both the wild-type and quadruple-mutant forms of the pf-DHFR-TS protein target helps to understand and predict the binding modes within the binding sites.


Introduction
Malaria is one of the world's greatest public health challenges. It is most prevalent in sub-Saharan African, Asian, and South American countries, and it mostly affects children under the age of five and pregnant women [1,2]. According to the World Health Organization (WHO) report for 2015, an estimated 3.2 billion people were at risk of malaria, and approximately 212 million cases and 429,000 deaths occurred worldwide in 2015. Of these estimated deaths, 90% occurred in sub-Saharan Africa [1]. Malaria is an infectious disease caused by protozoa of the genus Plasmodium [3].
There are five species that infect humans (P. falciparum, P. vivax, P. malariae, P. ovale, and P. knowlesi) [4]. Among these five species, Plasmodium falciparum is the most severe and lethal [5]. Many efforts have been made to find efficient inhibitors of this parasite by testing several molecular structures.
The quinoline moiety has attracted considerable attention from medicinal chemists, as it is one of the crucial pharmacophores responsible for antimalarial action [6,7]. In addition, the 1,3,5-triazine derivatives cycloguanil, chlorcycloguanil, clociguanil, and WR99210 are already recognized as effective, specific inhibitors of the P. falciparum dihydrofolate reductase (DHFR) domain, and they selectively inhibit biochemical processes that are vital for parasite growth [8]. To overcome drug resistance, the concept of hybrid molecules has been introduced, in which two or more pharmacophores are linked together (as in quinoline-triazine and quinoline-oxalamide), and it is believed that these compounds act by simultaneously inhibiting two conventional targets [9]. In this study, we worked on these pharmacophores as two types of hybrids: 4-aminoquinoline-triazine and 4-aminoquinoline-oxalamide [10]. The discovery of new antimalarial drugs is very challenging. The aim of developing a QSAR model is to construct, using statistical methods, a relationship between structural properties and activities from a training set, so that the model can predict the activity of compounds that were not used to build it. Here, the model is built by multiple linear regression (MLR) and artificial neural network (ANN) calculations.
The QSAR model has been validated by internal and external validation as well as Y-randomization. To explore the binding modes of this set of hybrids in the active sites, we docked three compounds: the most active compound of the triazine series (26), the most active compound of the oxalamide series (9), and the least active compound of the entire series (11), with Plasmodium falciparum dihydrofolate reductase-thymidylate synthase (pf-DHFR-TS) in its two forms, the wild type and the quadruple mutant [11].
This study allows the development of models that not only provide details of the binding modes and key molecular interactions but also allow the prediction of relative inhibition and binding affinities that could be reproduced in silico.

Experimental Data.
In this work, a data set of 34 hybrids of 4-aminoquinoline [10], constituting two groups, is explored. The first group (4-aminoquinoline-oxalamides) comprises 16 molecules numbered from 1 to 16, and the second group (4-aminoquinoline-triazines) contains 18 compounds numbered from 17 to 34 (Figure 1). The chemical structures of these hybrid derivatives with their antimalarial activities (IC50) are presented in Tables 1 and 2. The observations are converted to the logarithmic scale, log IC50.
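The conversion to the logarithmic scale can be illustrated with a minimal sketch. It uses only the three IC50 values quoted later in the docking section (compounds 9, 11, and 26); the use of the base-10 logarithm is an assumption, since the text only says "logarithm scale".

```python
import math

# IC50 values quoted in the docking section for compounds 9, 11, and 26.
ic50 = {9: 15.58, 11: 261.84, 26: 5.23}

# Assumption: the logarithmic scale is base 10 (the text does not state the base).
log_ic50 = {compound: math.log10(value) for compound, value in ic50.items()}

for compound, value in sorted(log_ic50.items()):
    print(f"compound {compound}: log IC50 = {value:.3f}")
```

On this scale a lower value corresponds to a more active compound, so compound 26 ranks first and compound 11 last.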

Molecular Descriptors Calculation.
In order to accurately model and predict inhibitor activities, 16 descriptors, listed in Table 3, were introduced. Eleven lipophilic, geometrical, physicochemical, and steric descriptors were calculated with the MM2 method with the aid of the ACD/ChemSketch program [12] and the ChemBioOffice software [13]. The 5 electronic descriptors were calculated with the DFT method [14], using the Gaussian03 quantum chemistry package [15]. The compounds were optimized with the DFT method using Becke's three-parameter hybrid functional (B3LYP) [16] with a 6-31G basis set for the calculation of the electronic descriptors, and with the MM2 method for the remaining descriptors.

Analysis Methods.
Multiple linear regression (MLR) [17] analysis with the descending (backward) selection method was used to select the most appropriate descriptors. It is a mathematical technique for studying the relation between one dependent variable and several independent variables. The regression method is based on three criteria: the coefficient of determination (R²), the Fisher ratio value (F), and the root mean square error (RMSE). The MLR model was generated using the software XLSTAT version 2013 [18]. Note that MLR also served to select the descriptors used as input parameters for the artificial neural network (ANN).
The ANN analysis was performed using the SAS JMP package (v8.0, SAS Institute Inc., Cary, NC, USA). The neurons of the network are arranged in three layers: the input layer contains six neurons representing the relevant descriptors obtained with the MLR technique, the output layer contains one neuron representing the calculated activity values log IC50, and the hidden layer is composed of 3 neurons, determined by ρ = (number of weights)/(number of connections). In this work, we used the ρ value interval 1 < ρ < 3 [19,20].
A high correlation coefficient indicates the quality of the fit of the equation to the data. In order to explore the stability of this equation, leave-one-out (LOO) cross validation was carried out using the ANN method. Based on this technique, one compound is removed from the data set and the model is rebuilt with the remaining compounds [21]; thereafter, the corresponding model serves to predict the activity of the removed compound. The LOO cross-validation coefficient R²cv was calculated as follows [22]:

$$R^{2}_{cv} = 1 - \frac{\sum \left(Y_{exp} - Y_{pred}\right)^{2}}{\sum \left(Y_{exp} - \bar{Y}\right)^{2}} \qquad (1)$$

where Yexp and Ypred are the observed and predicted values of the dependent variable, respectively, and Ȳ is the average observed value.
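The leave-one-out procedure and its R² coefficient can be sketched in a few lines of NumPy. This is an illustrative sketch only: it uses an ordinary least-squares MLR model rather than the ANN used in the study, and the descriptor matrix `X` and activities `y` are synthetic placeholders, not the data of Table 4. The function names are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict activities from the fitted intercept and slopes."""
    return coef[0] + X @ coef[1:]

def loo_r2(X, y):
    """Leave-one-out R2 = 1 - sum((Y_exp - Y_pred)^2) / sum((Y_exp - mean(Y))^2)."""
    n = len(y)
    y_pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                  # remove compound i
        coef = fit_mlr(X[keep], y[keep])          # rebuild the model without it
        y_pred[i] = predict_mlr(coef, X[i:i + 1])[0]  # predict the removed compound
    press = np.sum((y - y_pred) ** 2)
    total = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / total

# Synthetic placeholder data (NOT the paper's descriptors): 28 compounds, 6 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(28, 6))
y = X @ np.array([0.5, -0.3, 0.2, 0.1, -0.4, 0.25]) + 0.05 * rng.normal(size=28)

print(f"LOO R2 = {loo_r2(X, y):.3f}")
```

Each compound is predicted by a model that never saw it, which is why the LOO coefficient is normally somewhat lower than the R² of the fit on the full training set.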
In order to ensure the reliability of the QSAR model, the Y-randomization test was used. This approach consists of randomly shuffling the experimental activities of the training set while keeping the same descriptors; new QSAR models are then built on the shuffled data in order to exclude the possibility of chance correlation in the obtained model [23].
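The Y-randomization idea can be sketched as follows. This is a minimal illustration under stated assumptions: the model is an ordinary least-squares MLR, the data are synthetic placeholders rather than the study's descriptors, and the number of shuffles (100) and the function names are arbitrary choices made here.

```python
import numpy as np

def r2_fit(X, y):
    """R2 of an ordinary least-squares fit (with intercept) of y on X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

def y_randomization(X, y, n_trials=100, seed=0):
    """Refit the model on randomly permuted activities; returns one R2 per trial."""
    rng = np.random.default_rng(seed)
    return np.array([r2_fit(X, rng.permutation(y)) for _ in range(n_trials)])

# Synthetic placeholder data (NOT the paper's descriptors): 28 compounds, 6 descriptors.
rng = np.random.default_rng(1)
X = rng.normal(size=(28, 6))
y = X @ np.array([0.5, -0.3, 0.2, 0.1, -0.4, 0.25]) + 0.05 * rng.normal(size=28)

r2_true = r2_fit(X, y)
r2_rand = y_randomization(X, y)
print(f"R2 (original activities): {r2_true:.3f}")
print(f"R2 (shuffled activities): mean {r2_rand.mean():.3f}, max {r2_rand.max():.3f}")
```

If the shuffled models reached an R² comparable to that of the original model, the original correlation could be attributed to chance.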
Furthermore, external validation is necessary to ensure the predictive ability of the QSAR model. The data set in this work was randomly divided into a training set of 28 compounds, used to develop the model through MLR, and a prediction set of 6 compounds reserved for external validation. The ability of the built model on the external prediction set was evaluated by R²ext, which was calculated as follows [22]:

$$R^{2}_{ext} = 1 - \frac{\sum \left(Y_{pred(test)} - Y_{(test)}\right)^{2}}{\sum \left(Y_{(test)} - \bar{Y}_{tr}\right)^{2}} \qquad (2)$$

where Ypred(test) and Y(test) are the predicted and experimental values of the samples in the prediction set, respectively, and Ȳtr is the average value of the dependent variable for the training set. A value of R²ext ≥ 0.5 is considered an indicator of the reliability of the model. However, Golbraikh and Tropsha showed that R²ext alone is not a good parameter to estimate the reliability of a QSAR model. An external validation based on the Golbraikh and Tropsha criteria is therefore necessary [22].
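The external-validation quantities can be sketched as below. The R²ext function follows the definition given above; the slopes through the origin use the common least-squares form k = Σxy/Σx², which is an assumption about how the Golbraikh and Tropsha slopes were computed here. The six activity values and the training-set mean are hypothetical numbers chosen for illustration, not the values of Table 6.

```python
import numpy as np

def r2_ext(y_test, y_pred, y_train_mean):
    """External R2: 1 - sum((Y_pred(test) - Y_test)^2) / sum((Y_test - mean(Y_train))^2)."""
    return 1.0 - np.sum((y_pred - y_test) ** 2) / np.sum((y_test - y_train_mean) ** 2)

def slope_through_origin(x, y):
    """Least-squares slope k of y = k * x (regression forced through the origin)."""
    return np.sum(x * y) / np.sum(x * x)

def r2_through_origin(x, y):
    """Coefficient of determination of the zero-intercept regression of y on x."""
    k = slope_through_origin(x, y)
    return 1.0 - np.sum((y - k * x) ** 2) / np.sum((y - y.mean()) ** 2)

# Hypothetical log IC50 values for a 6-compound prediction set (NOT the paper's data).
y_test = np.array([1.20, 1.85, 0.72, 2.10, 1.45, 1.60])   # observed
y_pred = np.array([1.31, 1.70, 0.85, 2.02, 1.38, 1.75])   # predicted
y_train_mean = 1.50                                        # hypothetical training-set mean

k = slope_through_origin(y_test, y_pred)        # predicted versus observed
k_prime = slope_through_origin(y_pred, y_test)  # observed versus predicted

print(f"R2_ext = {r2_ext(y_test, y_pred, y_train_mean):.3f}")
print(f"k = {k:.3f}, k' = {k_prime:.3f}")
print(f"r0^2 = {r2_through_origin(y_test, y_pred):.3f}")
```

Both slopes are expected to lie close to 1 for a model whose predictions are neither systematically too high nor too low.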
In order to gain insight into the key structural requirements for antimalarial activity, molecular docking studies were carried out using the AutoDock 4.2 program [24]. X-ray crystal structures of the wild-type (PDB code 1J3I) and quadruple-mutant (PDB code 1J3K) Plasmodium falciparum pf-DHFR-TS were obtained from the Protein Data Bank [25]. The minimized protein structures were defined as receptors, and the first step in the preparation of the receptor was the removal of the ligands and the water molecules. The 3D grid was created with the AUTOGRID algorithm [26] to evaluate the interaction energy between the protein and the ligands. The grid maps were constructed using 60, 60, and 60 points in the x, y, and z directions, with a grid point spacing of 0.375 Å. The grid box was centered at 29.39 Å, 5.56 Å, and 52.49 Å, at the ligand location in the complex.
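The physical size of the grid box follows directly from the number of points and the spacing. The short sketch below works this out; it assumes the usual AutoGrid convention in which the box spans (number of points) × (spacing) along each axis, centered on the stated coordinates.

```python
# Grid parameters quoted in the text.
npts = (60, 60, 60)             # grid points in x, y, z
spacing = 0.375                 # grid point spacing in angstroms
center = (29.39, 5.56, 52.49)   # grid box center in angstroms

# Assumption: the box edge along each axis is npts * spacing, centered on `center`.
edges = [n * spacing for n in npts]
lower = [c - e / 2 for c, e in zip(center, edges)]
upper = [c + e / 2 for c, e in zip(center, edges)]

for axis, e, lo, hi in zip("xyz", edges, lower, upper):
    print(f"{axis}: edge {e:.2f} A, from {lo:.2f} to {hi:.2f} A")
```

Under this assumption the box is a cube with an edge of 22.5 Å.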

Results and Discussion
In this study, the compounds were randomly distributed into two sets, a training set and a test set. The training set included 28 compounds, and the corresponding test set included 6 compounds. The values of the selected descriptors and the activity values predicted for the training set by the MLR, ANN, and CV methods are summarized in Table 4.
For the MLR model, N is the number of compounds, R is the correlation coefficient, R² is the determination coefficient, RMSE is the root mean square error, and F is the Fisher test value. The relevant descriptors involved in the MLR model are HOMO energy, total energy, repulsion energy, torsion, critical temperature, and stretch-bend. The corresponding normalized coefficients are presented in Figure 2, and the correlation of the observed activities with the MLR calculated ones is illustrated in Figure 3.
As indicated by the statistical coefficient values, the correlation between the observed and calculated activities based on this model using the training set is quite significant, and the low RMSE indicates that the model is reliable and offers good prediction precision.

Artificial Neural Networks.
In order to improve the characterization of the studied compounds, artificial neural networks (ANN) were used as a nonlinear method to generate a predictive model between the observed antimalarial activity values and the set of molecular descriptors selected by MLR, with a 6-3-1 network architecture. The correlation of the observed activities with the ANN predicted ones is illustrated graphically in Figure 4.
As shown in Figure 4, a good correlation is obtained between the observed antimalarial activity values and the activities predicted by ANN: the correlation coefficient is R = 0.98, the determination coefficient is R² = 0.97, and the standard error of estimate is RMSE = 0.09. These results show that the descriptors selected by MLR are pertinent, that the ANN model is of significant statistical quality, and that the model proposed to predict antimalarial activity is relevant.
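The three statistics quoted for each model can be computed from a pair of observed and calculated activity vectors, as in the sketch below. The definitions used here (Pearson R, R² as one minus the ratio of residual to total sum of squares, and RMSE with the number of compounds in the denominator) are assumptions, since the text does not spell them out, and the five value pairs are hypothetical.

```python
import numpy as np

def fit_statistics(y_obs, y_calc):
    """Return (R, R2, RMSE) for observed versus calculated activities."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    r = np.corrcoef(y_obs, y_calc)[0, 1]                 # Pearson correlation
    residuals = y_obs - y_calc
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)
    rmse = np.sqrt(np.mean(residuals ** 2))              # assumption: divide by N
    return r, r2, rmse

# Hypothetical log IC50 values (NOT taken from Table 4).
y_obs = [1.19, 2.42, 0.72, 1.60, 1.95]
y_calc = [1.25, 2.30, 0.80, 1.55, 2.05]

r, r2, rmse = fit_statistics(y_obs, y_calc)
print(f"R = {r:.3f}, R2 = {r2:.3f}, RMSE = {rmse:.3f}")
```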

Cross Validation.
The QSAR model proposed to predict the activity of new compounds should be tested. To validate our results, we used the LOO procedure, which involves removing a single molecule from the set of 28 molecules and predicting its antimalarial activity.
This procedure is repeated 28 times in order to estimate the predictive ability of such models. The correlation of the observed activities with the ones calculated by cross validation is shown graphically in Figure 5. The obtained correlation (R = 0.90, R² = 0.81, and RMSE = 0.16) shows a high predictive power of the MLR model.
This result shows that our QSAR model is not sensitive to the operation of setting a molecule aside and putting it back into the training series. This is a first indication of the stability of the selected QSAR model.

Y-Randomization.
The Y-randomization test was performed to make sure that there is no chance correlation. In this way, we can test the validity of the established QSAR model and check that the selected descriptors are not random; consequently, the models built on the randomized data should have low statistical quality. The results of the Y-randomization method are given in Table 5 and Figure 6. The new QSAR model built using the Y-randomization method is represented by equation (4). The correlation coefficient value of the shuffled samples is close to that obtained by applying the model to the training set. This result indicates the absence of dependence between the descriptors included in the QSAR model.

External Validation.
In a study on efficient methods of validation for QSAR models, Golbraikh and Tropsha showed that LOO methods are necessary but not sufficient, argued that external validation is indispensable, and proposed criteria to help validate a QSAR model. This validation is done in two steps: prediction, with the MLR model, of the activities of new compounds that were not used in the development of the model on the training set (Table 6), and verification of the Golbraikh and Tropsha criteria (Table 7). The results show that the Golbraikh and Tropsha criteria are successfully validated. All validations indicate that the built QSAR model is robust and satisfactory. The model established in this study meets all of the principles for QSAR validation and can be used to predict the antimalarial activity.

Table 7: Verification of the Golbraikh and Tropsha criteria for the test set.
r²: coefficient of determination for the plot of predicted versus observed activity for the test set; threshold r² > 0.6; model value 0.77
r0²: r² at zero intercept; model value 0.71
r′0²: r² for the plot of observed versus predicted activity for the test set at zero intercept; model value 0.59
k: slope of the plot of predicted versus observed activity for the test set at zero intercept; threshold 0.85 < k < 1.15; model value 1.07
k′: slope of the plot of observed versus predicted activity at zero intercept; threshold 0.85 < k′ < 1.15; model value 0.92

Docking Studies.
In a pioneering study on the binding modes and the localization of the principal active sites in the wild-type and mutant protein, performed with a potent inhibitor, the preclinical 1,3,5-triazine derivative WR99210, the important sites were found to be located at Ile14, Ala16, Met55, Asp54, Ser108, Ile164, and Tyr170 in the case of the wild type and at Ala16, Cys50, Asn51, Cys59, Asn108, Leu164, and Tyr170 in the case of the mutant protein [28]. To give insight into the interaction modes and to identify the interaction types established with this protein (pf-DHFR-TS) in its two forms, wild and mutant, the molecular docking study in this work was applied to three compounds, 9 (IC50 = 15.58), 11 (IC50 = 261.84), and 26 (IC50 = 5.23), with the binding sites of both the wild type and the quadruple mutant. The docking results and docked conformations of the ligands in the active sites are represented in Figure 7.
In the case of the wild type, compound 26 forms hydrogen bonds with the carboxylate oxygen atoms of ILE164, SER108, and SER111 through the two NH groups bonded to the triazine group and one of the triazine nitrogen atoms, at distances of 2.49 Å, 2.77 Å, and 2.93 Å, respectively, and a nonbonded π-sigma interaction between the phenyl ring of the quinoline and MET55 at a distance of 3.52 Å. In the case of the quadruple mutant, three hydrogen bonds with ASN108, SER111, and ALA16 were observed, involving the two NH groups linked to the 1,3,5-triazine and the oxygen of the morpholino group, at distances of 2.85 Å, 1.84 Å, and 2.91 Å, respectively. For compound 9, in the case of the wild type, two hydrogen bonds are formed between an oxygen and a nitrogen of the oxalamide group and ILE164 and TYR170, at distances of 2.24 Å and 2.73 Å, respectively. In the case of the quadruple mutant, it forms two hydrogen bonds with ALA16 and LEU164 through two nitrogen atoms (the first is linked to the oxalamide group, and the second belongs to the diethylamine group), at distances of 2.77 Å and 2.80 Å, respectively. Compound 11, however, showed only one hydrogen bonding interaction, with LEU40, in both cases.
In analyzing these results, we first observed that the residues with which compounds 26 and 9 interact are among those reported as the most important binding sites for antimalarial activity [28], which is not the case for compound 11. Second, we observed that the number of hydrogen bonds differs between the most active compound, which belongs to the triazine family, and the less active compound, which belongs to the oxalamide family. This could explain the potent antimalarial activity of compound 26 and the importance of the triazine group, compared to the oxalamide group, in enhancing the antimalarial activity.

Conclusion
The present study on a series of 4-aminoquinoline-triazines and 4-aminoquinoline-oxalamides was carried out using 3D-QSAR and docking techniques with the aim of predicting the antimalarial activity. The group contribution method (for both training and test sets) was used to develop a reliable QSAR model for predicting antimalarial activity. The results of the MLR and ANN methods using the training set clearly show a strong relationship between the structural properties and the activity. Thus, the correlation coefficient for both methods shows good predictive ability of the model. The model is validated by internal and external validation methods, including leave-one-out cross validation and Y-randomization. The obtained model is robust in predicting the antimalarial activity. The observed activity was further corroborated by a molecular docking study, which explained the differences observed among the activities of the compounds, especially between the triazine family and the oxalamide one. The results of these studies provided details of the predicted binding modes and the key molecular interactions. These will provide opportunities for medicinal chemists to develop new antimalarial drugs by using new hybrid molecules.