Development of Dual Inhibitors against Alzheimer's Disease Using Fragment-Based QSAR and Molecular Docking

Alzheimer's (AD) is the leading cause of dementia among elderly people. Considering the complex heterogeneous etiology of AD, there is an urgent need to develop multitargeted drugs for its suppression. β-amyloid cleavage enzyme (BACE-1) and acetylcholinesterase (AChE), being important for AD progression, have been considered as promising drug targets. In this study, a robust and highly predictive group-based QSAR (GQSAR) model has been developed based on the descriptors calculated for the fragments of 20 1,4-dihydropyridine (DHP) derivatives. A large combinatorial library of DHP analogues was created, the activity of each compound was predicted, and the top compounds were analyzed using refined molecular docking. A detailed interaction analysis was carried out for the top two compounds (EDC and FDC) which showed significant binding affinity for BACE-1 and AChE. This study paves way for consideration of these lead molecules as prospective drugs for the effective dual inhibition of BACE-1 and AChE. The GQSAR model provides site-specific clues about the molecules where certain modifications can result in increased biological activity. This information could be of high value for design and development of multifunctional drugs for combating AD.


Introduction
Alzheimer's disease (AD) is an irreversible chronic brain disorder among elderly people [1][2][3]. AD is characterized by steady cognitive impairment, memory loss, and decline in language. It is one of the leading causes of death in the world. For instance, it was estimated that 5.2 million Americans of all ages were suffering from AD in 2013 making it the sixth leading cause of death in the United States (Alzheimer's association; http://www.alz.org/). The devastating pathological hallmarks of AD are extracellular accumulation of neurotoxic amyloid (A ) peptides [4], loss of the presynaptic markers of the cholinergic system in the brain, mitochondrial dysfunction, and formation of dense neurofibrillary tangles of hyperphosphorylated tau protein in the central nervous system [5][6][7].
Most of the U.S. Food and Drug Administration approved drugs are available for the symptomatic treatment of AD.
Among these drugs, donepezil, tacrine, rivastigmine, and galantamine are based on cholinergic hypothesis [8][9][10][11]. Furthermore, memantine is an antagonist drug of N-methyl-D-aspartate receptor [12][13][14]. However, the observable toxic issues such as hepatotoxicity, vomiting, diarrhea, and nausea forced these drugs to move out from the pharmaceutical market [15]. Moreover, medicational pharmacokinetic effects of these drugs are just for marginally alleviating the symptoms and not to have interruption in neurodegenerative cascade which is the root pathophysiology of AD [16][17][18]. Considering the complex heterogeneous etiology of AD, modulation of one enzyme might not be sufficient enough for the effective treatment of AD. Therefore, the present day research in AD drug development is shifting towards identification and design of multitargeted novel molecules instead of single targeted molecules for the long term suppression of AD. For instance, Piazzi et al. report AChE inhibitor purposely designed to bind at both the catalytic and the peripheral sites of the human enzyme [19].
Most of the experimental evidences suggest that deposition of amyloid plaques in the brain of Alzheimer's patients is the key factor of pathogenic cascade of the disease [16,20]. A , which is the core component of the amyloid plaques [15], is produced by subsequent cleavage of a large transmembrane protein-amyloid precursor protein (APP)-by two different proteolytic enzymes -and -secretase [21]. The complete biochemical mechanism of proteolytic cleavage depends on the protein-protein interactions between APP and -amyloid cleavage enzymes (BACE-1) [22]. Blocking the interface between these protein interactions has huge therapeutic potential for slowing down the long term progression of AD. It has been reported that acetylcholine esterase (AChE) also plays an important role in accumulation of A and acts as a promoter of A fibril production [23]. This activity of AChE is associated with its peripheral anionic site (PAS). Since BACE-1 plays a major role in the initiation of neuropathological cascade of plaque formation and AChE accelerates A deposition in brain, both of these enzymes hold considerable promise as therapeutic targets of AD. Thus, dual target directed strategy is more likely to show comprehensive obliteration of AD in synergistic manner. Multitarget drugs are more efficient as they prevent unwanted compensatory mechanisms, which might result in cellular redundancy, from developing [24].
Discovery of small molecules for targeting proteinprotein interfaces beholds enormous challenges and is accounted by various factors, namely, shape of typical protein-protein interface and flexibility of proteins among others. To speed up the drug discovery process, various fast and accurate computational methods have been illustrated which assist the development of novel therapeutic drugs to interrupt the interaction between proteins [25,26]. Usage of quantitative structure activity relationship-(QSAR) based approaches is worthwhile when knowledge of ligand molecules for a particular target is available. Group-based QSAR (GQSAR) is one of the most recent and effective ligand-based drug designing approaches which uses descriptors evaluated specifically for the substituent groups or fragments of the ligands. This approach identifies the specific sites where the groups need to be modified for designing optimized molecules with enhanced biological activity [27]. GQSAR model can be developed by applying statistical methods like partial least square (PLS), principle component regression, multiple regression, continuum regression, and k-Nearest Neighbour on a series of congeneric compounds in order to gain insights into the effects of descriptors on their biological activity [27,28].
Herein, our attempts are focused on the discovery of novel small molecules that can compete to bind with one of the interacting proteins with higher binding affinity in order to disrupt the interactions between APP and BACE-1 and simultaneously are able to bind to the PAS site of AChE. Present study describes a detailed GQSAR analysis on 1,4-dihydropyridine (DHP) derivatives, reported as potential inhibitors of BACE-1 [4], in order to elucidate the structural features of the molecular fragments of these molecules that lay significant contribution towards their biological activity. GQSAR model was further used to develop a combinatorial library of novel molecules followed by their activity prediction. Mechanistic analysis of binding modes of these identified leads within the active site of both targets was performed using docking studies. Thus, our study delineates identification of novel leads having dual inhibiting effects due to binding to both, BACE-1 and the PAS of AChE.  [21]. 2D chemical structures of DHP analogues along with their biological activities are presented in Table 1. Molecules were converted into 3D format and then energetically optimized using Vlife Engine module of Vlife Molecular Design Suite (Vlife MDS) [29]. The optimized molecules were generated using Merck Molecular force field, distance dependent function, and energy gradient of 0.01 kcal/mol.

Fragmentation and Descriptor
Calculation. All molecules considered here had a common DHP scaffold and 4 substitution sites where different R-groups were attached. On the basis of different R-groups, each molecule was divided into 4 fragments or groups in order to perform GQSAR analysis. Optimized dataset of all molecules was considered for GQSAR analysis on the basis of common DHP template. A total of 705 physicochemical descriptors were calculated for various groups present at each substitution site using Vlife MDS. These included 2D descriptors such as element count, extended topological indices, Merck molecular force field atom type count, and electrotopological and alignment independent descriptors among others [30]. Independent variable calculation was further followed by removal of invariable columns containing constant values for more than 90% molecules, which finally resulted in 311 independent variables from the large pool of descriptors.

Selection of Test Set and Training Set.
With an aim to develop a GQSAR model, the dataset was split into two optimal training and test sets using random selection method. The robustness of these sets was evaluated by generating unicolumn statistical parameters such as mean, standard deviation, maximum, and minimum for both test and training sets. The dataset division satisfied the criteria of an appropriate model; namely, the maximum of the test set was less than the maximum of the training set and the minimum of the training set was greater than the minimum of the test set. This analysis validated the selected training and test sets.

GQSAR Model Generation.
To select the optimal subset of variables (descriptors) that can significantly correlate with biological activity of molecules from the pool of descriptors, various variable selection methods such as step-wise search algorithm, genetic algorithm, and simulated annealing among others can be used. A number of statistical methods such as partial least square (PLS), multiple regression, and principle component regression can be used for model building. Herein, simulated annealing combined with PLS regression was used to generate the GQSAR model. Simulation of a physical process is known as simulated annealing, which involves heating the system to a high temperature and then gradually cooling it down to room temperature [31]. All the values of statistical parameters for simulated annealing were kept as default. The number of terms (number of descriptors) to be included in the final GQSAR model was kept as 3.

Model Evaluation and Validation
. The developed GQSAR model was evaluated using two types of validationinternal and external validations. Internal (cross) validation was carried out using leave-one-out method [32]. Crossvalidation coefficient q 2 was calculated as where and̂are the actual and the predicted activity of the th molecule in the training set, respectively, and mean is the average activity of all molecules in the training set. For external validation of the model, the pIC 50 values of the test set molecules were predicted and the pred r 2 value that provides the statistical correlation between predicted and actual activities of the test set compounds was calculated as follows: where and̂are the actual and the predicted activity of the th molecule in the test set, respectively, and mean is the average activity of all molecules in the training set. All these statistical parameters were used to evaluate the quality of the model. Correlation coefficient (r 2 ) described the fitness of training set data whereas predictive correlation coefficient (pred-r 2 ) was used to evaluate the fitness of test set. Cross-validation coefficient (q 2 ) and F-test (Fischer's value) showed the statistical significance of the regression model and the standard errors (pred r 2 se, q 2 se, and r 2 se) gave an idea of the quality and fitness of the model. Low standard error values indicated that the model is absolute and robust. The model is said to be robust and predictive if these statistical parameters satisfy the following conditions: r 2 > 0.6, pred r 2 > 0.5, and q 2 > 0.6 [33,34].

Combinatorial Library Generation and Activity Prediction.
A combinatorial library was generated using Leadgrow module of Vlife MDS. For library generation a number of substitutions were made using various atoms and groups like alkyl, alkene, acids, aromatic rings, rings, carbonyl, cyanate, -O-CH 3 , -O-C 2 H 5 , amide, benz, and hydrzo at all substitution sites (R1, R2, R3, and R4) of DHP template. The final GQSAR model generated was used for biological activity prediction of the compounds of the combinatorial library. The water molecules and all other heteroatoms were removed from the protein crystal structure. The protein was further prepared using Schrodinger's protein preparation wizard [36]. Conversion of all combinatorial structures to 3D form and further optimization were carried out using LigPrep module of the Schrodinger suite. All possible conformers for each molecule were generated using LigPrep. Docking studies were performed using Glide module of Schrodinger suite by creating a cubic grid (10 × 10 × 10Å) around the active site residues of BACE-1 that are involved in cleavage of APP. The molecules of combinatorial library with high predicted activity were subjected to high throughput virtual screening (HTVS) protocol followed by Glide's extra precision (XP) docking protocol for futher docking refinement.

Dual Inhibition Effect Studies.
Keeping in mind our aim to discover potent novel dual inhibitors of AChE and BACE-1, the above screened molecules were again subjected to docking at PAS site of AChE. This PAS site is involved in accumulation of A in the human brain. Crystal structure of human AChE (resolution: 2.0Å) was obtained from PDB (PDB ID: 4M0E) [35]. Protein preparation and optimization was done using Schrodinger suite. Selected molecules having high XP scores were then checked for their druglike properties using Lipinski filters. The two top scoring compounds showing dual inhibitory property were analyzed to observe the molecular mode of interaction between the target proteins and the ligands using ligplot program [37].

Results and Discussion
Here we have attempted to identify a novel GQSAR model depicting robust statistical correlation between structure and activity of DHP analogues which have been reported as potent suppressors of BACE-1. The adopted strategy initially identified a pool of 311 molecular descriptors to be used as independent variables. The pIC 50 value was used as the dependent variable. The dataset of 20 compounds was divided into two groups: test set including 5 molecules and training set including the rest of the molecules. The training set was  Table 1 where R1, R2, and R3 are the 2D descriptors along with their respective coefficient and the last numerical term in this equation is the regression constant. This equation explains that the descriptor DeltaEpsilonA shows positive contribution at substitution site R2 of DHP common moiety. However, the other two descriptors, NitrogensCount and K3alpha at R1 and R3 substitution sites, respectively, contribute negatively towards the biological activity of molecules. The contribution of these descriptors is illustrated in Figure 1. Below is the brief description of these molecular descriptors.   [39,40]. Among the various basic parameters of ETA, DeltaEp-silonA is a measure of contribution of unsaturation and electronegative atom count [41] which is extensively applied for modelling various toxicity end-points in the quantitative domain of structure-activity relationships [42]. Here, it was observed that DeltaEpsilonA showed 46.98% contribution in activity enhancement of molecule when present at R2 site. Originally, R2 site was occupied by three different groups, namely, methylbenzylamine [NH-( ) methylBn], benzyl ester (OBn), and acetyl group.

R1-NitrogensCount.
This physicochemical descriptor lies in the section of element count descriptors. As the name suggests, it indicates the number of nitrogen atoms present in a compound. This descriptor was observed to provide a 37.25% negative contribution at R1 substitution site which was originally engaged with different alkyl groups.
R3-k3alpha. The Kier and Hall Kappa molecular shape indices are intended to capture the overall aspects of molecular shape [43]. Third order Kappa Alpha (K3alpha) shape index is a subset of Kappa indices and the information encoded in it specifically refers to attributes of the shape of molecule. In present GQSAR model, K3alpha was found to have 15.74% negative participation at R3 substitution site for the enhancement of biological activity of molecules. This site was originally occupied by sulphonamide group, amide group, and ester group.  of predicted over actual values for training and test sets, respectively, and the linear scatter plot depicts the distance of training and test data points from the regression line which relatively gives an idea about the difference between actual and predicted activity values of both sets.

Combinatorial Library Preparation and Activity Prediction.
The common moiety (Figure 3(a) Gln 73 Gly 34 Thr 231 Thr 232 Trp 115 Ser 325 at R3 and R4 site, respectively. Surprisingly, approximately all the high activity molecules were found to bear F atom at R2 site suggesting that the presence of F atom at R2 site plays a crucial role in activity enhancement. Therefore, constant value of 0.557 for extended topochemical descriptor R2-DeltaEpsilonA was observed. The constant low values of 1 and 0 for negatively contributing descriptors R1-NitrogensCount and R3-K3alpha depicted their role in activity enhancement.

Docking Analysis.
Docking studies for 3405 molecules of combinatorial library were carried out against AChE and BACE-1. To filter out the chemically correct structures, molecules were converted into 3D format and then optimized using LigPrep module of Schrodinger suite which reduced the number of molecules for further analysis to 3238. Among these molecules, a total of 1310 and 1482 compounds having good binding affinity for BACE-1 and AChE, respectively, were identified using HTVS. After HTVS, the highest docking scores for both targets, BACE-1 and AChE, were found to be −10 kcal/mol and −12 kcal/mol, respectively. Compounds with Glide score above −8 kcal/mol for BACE-1 and −6 kcal/mol for AChE were then subjected to XP protocol for further refinement of Glide score. The two top scoring compounds showing dual inhibitory property against both targets were selected for further evaluation of their mechanistic molecular mode of interaction with the target proteins. boxamide (further referred to as EDC and FDC, resp.) were found possessing dual target inhibitory capability. 2D structures of these compounds along with the common moiety are shown in Figure 3(b). The docking results revealed that EDC had the highest XP score of −15.20 kcal/mol against BACE-1 and a significant XP score of −11.92 kcal/mol against AChE. On the other hand, FDC was found to interact with strong binding affinity of −14.39 kcal/mol with BACE-1 and of −11.85 kcal/mol with AChE. Rest of all the docking parameters for these two ligand molecules with respect to both the targets were also taken into consideration and are summarized in Table 3. The pIC 50 value of both these lead compounds was 6.10 as predicted by the generated GQSAR model. The drug-like properties of the chosen compounds were also taken into account and both of the leads were found to have satisfactory values for all the essential druglike properties such as logP value and molecular weight which are listed in Table 4.
EDC-AChE Complex. Since EDC was evaluated as a dual inhibitor of two different targets BACE-1 and AChE, the mechanistic mode of interaction was also analysed for EDC-AChE complex. In this complex, EDC was observed to form four hydrogen bonds and numerous hydrophobic contacts with PAS residues [23] along with some other surrounding amino acids. Two amino acids Tyr124 and Ser293 were involved in the formation of hydrogen bonds. The residues involved in hydrophobic contacts were Tyr72, Asp74, Trp286, His287, Leu289, Gln291, Glu292, Phe295, Arg296, Phe297, Tyr 337, Phe338, and Tyr341. Convincing docking score and high number of hydrogen bonds as well as hydrophobic interactions suggested EDC to be a significant inhibitor of AChE. Binding of EDC within the PAS of AChE is illustrated in Figure 6.
FDC-AChE Complex. Similar to EDC, the second lead molecule FDC was also evaluated for its dual inhibition property. Docking analysis for FDC-AChE complex showed that FDC was interacting with the PAS cavity of AChE. For this docked complex, three hydrogen bonds formed by two AChE residues (Glu292 and Tyr341) and FDC atoms were detected. A total of 13 hydrophobic contacts were identified with residues Tyr72, Asp74, Tyr124, Trp286, Leu289, Gln291, Ser293, Phe295, Arg296, Phe297, Tyr337, Phe338, and Gly342. The interaction mode of FDC-AChE complex showing hydrogen bonds with their respective bond length and hydrophobic interactions is illustrated in Figure 7.

Conclusion
This study is an attempt to identify novel dual inhibitors targeting BACE-1 and AChE enzymes. Structural characteristics of a set of dihydropyridine derivatives were studied using a novel group-based QSAR analysis. The GQSAR analysis revealed the importance of 2D descriptors and showed that the chemical group variations in the molecules substantially  influenced their biological activity. We also generated a large combinatorial library of 86400 compounds by carrying out substitutions at four different sites of DHP. GQSAR model was utilized further for activity prediction of prepared combinatorial library. The two compounds (EDC and FDC) having high predicted inhibitory activity and the highest docking scores against both of the targets were identified as possessing dual inhibitory properties. We have also provided mechanistic insights into the binding mode of action of these leads. The enhanced predicted activity, high binding score, and the presence of crucial drug like molecular properties provide substantial evidence for consideration of these compounds as potent dual inhibitors for future prospective of AD treatment. This information could be of high value for design and development of novel multitargeted drugs against AD possessing improved binding properties and low toxicity to human cells.