Matrix metalloproteinases (MMPs) have distinctive roles in various physiological and pathological processes such as inflammatory diseases and cancer. This study explored the performance of eleven scoring functions (D-Score, G-Score, ChemScore, F-Score, PMF-Score, PoseScore, RankScore, DSX, X-Score, and the scoring functions of AutoDock4.1 and AutoDock Vina). Their performance was judged by calculating the correlation of their scores with the experimental binding affinities of 3D ligand-enzyme complexes of the MMP family. They were also evaluated for their ability to rerank the results of a virtual screening study performed on one member of the MMP family (MMP-12). Enrichment factors at different levels and receiver operating characteristic (ROC) curves were used to assess their performance. Finally, we developed a PCA model from the best functions. Of the scoring functions evaluated, F-Score, DSX, and ChemScore were the best overall performers in predicting MMP-inhibitor binding affinities, while ChemScore, AutoDock, and DSX had the best discriminative power in virtual screening against the MMP-12 target. Consensus scoring did not show a statistically significant advantage over the other scoring methods in the correlation study, whereas the PCA model, which combines ChemScore, AutoDock, and DSX, improved overall enrichment. The outcome of this study could be useful for setting up a suitable scoring protocol that enriches MMP inhibitors.
Matrix metalloproteinases (MMPs) are zinc-dependent endopeptidases that play a central role in various physiological processes and pathological conditions, including cancer and inflammatory diseases. One of the main problems in developing a new class of drugs as MMP inhibitors is selectivity. The members of this family share a very similar active site, which makes the traditional chemical approach to developing selective inhibitors time-consuming. In this case, computational approaches including molecular docking can help medicinal chemistry [
As reliability of different scoring functions is very target-dependent [
Some proposed consensus docking [
The work reported here seeks to address two questions. (1) How well can different scoring functions predict the experimental binding affinities of MMP-inhibitor complexes? (2) Do the best-performing scoring functions also perform reasonably in an enrichment study on a member of the MMP family (MMP-12)?
The test set consisted of 100 MMP-ligand complex structures from 10 human MMP types. We excluded structures with conflicting reported binding affinities. The 3D structures were taken from the PDB (Protein Data Bank) and then refined. First, water and other cocrystallized molecules were removed from the retrieved PDB files. Then, the protein and the corresponding ligand (inhibitor) were extracted to separate PDB files. The files were converted to mol2 format, as this was required for some of the subsequent analyses. Hydrogens were added to both protein and ligand molecules. All of the selected PDB structures had an experimentally determined Ki, Kd, or IC50. The negative logarithm of Ki, Kd, or IC50 was employed as the experimental binding affinity in our study. The detailed structural information for each is presented in Supplementary Material available online at pAffinity = −Log (Ki, Kd or IC50) (
The catalytic zinc ion was retained as part of the macromolecule. Gasteiger partial charges were assigned to the ligands. All of the above procedures were done using PyMOL (
Various scoring functions have been evaluated in this study. 11 scoring functions including the five SYBYL built-in scoring functions (D-Score [
We employed previously defined consensus scoring (rank-by-number and rank-by-rank methods [
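The two consensus schemes named above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names, the `higher_is_better` flags, and the z-score standardization applied before averaging in `rank_by_number` are assumptions introduced here so that scores on different scales can be combined.

```python
from statistics import mean, pstdev


def ranks(values, higher_is_better=False):
    """Return 1-based ranks (1 = best); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def rank_by_rank(score_table, higher_is_better):
    """Consensus value per compound = mean of the ranks from each function."""
    per_function = [ranks(score_table[name], higher_is_better[name])
                    for name in score_table]
    return [mean(col) for col in zip(*per_function)]


def rank_by_number(score_table, higher_is_better):
    """Consensus value per compound = mean of standardized scores.

    Assumption: each score column is z-scored and sign-flipped where needed,
    so that a lower consensus value always means a better compound.
    """
    standardized = []
    for name, values in score_table.items():
        mu, sigma = mean(values), pstdev(values) or 1.0
        sign = -1.0 if higher_is_better[name] else 1.0
        standardized.append([sign * (v - mu) / sigma for v in values])
    return [mean(col) for col in zip(*standardized)]


# Hypothetical scores for three compounds from two made-up scoring functions.
example_scores = {"func_a": [-9.0, -7.0, -5.0], "func_b": [30.0, 10.0, 20.0]}
example_direction = {"func_a": False, "func_b": True}
consensus_ranks = rank_by_rank(example_scores, example_direction)
consensus_numbers = rank_by_number(example_scores, example_direction)
```

In both schemes the compound with the lowest consensus value is ranked first; the first compound wins in this toy example because both functions place it on top.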
Finally, principal component analysis (PCA) was applied to various sets of scores from the enrichment study to evaluate its discrimination power on our evaluated set of compounds. PCA is a powerful tool for different aspects of data evaluation, including classification and pattern recognition. It simplifies a multivariate data set by reducing its dimensionality while preserving as much of the relevant information as possible. The principal components (PCs) are linear combinations of the original variables. The first principal component (PC1) has the largest possible variance. The second principal component (PC2) is uncorrelated with the first and accounts for the largest share of the remaining variance. In our study, PCA was used to discriminate actives from decoys in the virtual screening results on the basis of the scores obtained from the various scoring functions: it generated a linear combination of the different scores and extracted the main variation in the data as PC1, and the virtual screening results were then rescored and reranked by PC1. The contribution of an individual score to a calculated PC is described by its loading value.
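As an illustration of rescoring by PC1, the sketch below standardizes each score column, builds the correlation matrix, and extracts the leading eigenvector by power iteration. It is a sketch under stated assumptions rather than the model built in the study: the standardization step, the power-iteration solver, and all names are choices made here for a dependency-free example.

```python
from statistics import mean, pstdev


def standardize(columns):
    """Z-score each score column so that no function dominates by scale."""
    out = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col) or 1.0
        out.append([(v - mu) / sigma for v in col])
    return out


def first_principal_component(columns, iterations=500):
    """Return (loadings, pc1_scores) for the leading principal component.

    `columns` holds one list of scores per scoring function, all of the
    same length (one entry per compound).
    """
    z = standardize(columns)
    p, n = len(z), len(z[0])
    # Correlation matrix of the standardized columns (p x p).
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(p)]
            for i in range(p)]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue (assumes that eigenvalue is unique).
    vec = [1.0] + [0.5] * (p - 1)
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # PC1 score of each compound = loading-weighted sum of its z-scores.
    pc1 = [sum(vec[j] * z[j][k] for j in range(p)) for k in range(n)]
    return vec, pc1


# Hypothetical scores of four compounds from two made-up scoring functions.
loadings, pc1_scores = first_principal_component([[1.0, 2.0, 3.0, 4.0],
                                                  [2.0, 4.0, 6.0, 8.0]])
```

The sign of a principal component is arbitrary, so before reranking by PC1 its orientation has to be checked against the loadings, for example by confirming that compounds with better individual scores receive better PC1 values.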
The inhibitors molecules of docking set were prepared basically from the MMP-12 inhibitors spreadsheet taken from ChEMBL database [
Glide (Glide, version 5.7, Schrödinger, LLC, New York, NY, 2011) was used for the docking studies. As mentioned above, a set of inhibitors and decoys was docked into the active site of MMP-12 (PDB code: 3F17). For receptor preparation, water molecules were removed, hydrogens were added, and the protein structure was minimized using the protein preparation wizard [
The scoring functions were evaluated via calculation of the linear correlation between predicted binding affinity scores and experimentally determined binding affinities. Pearson’s correlation coefficient (
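The correlation measures used for this evaluation can be sketched in a few lines. This is a minimal illustration with hypothetical function names, not the R code used in the study; Spearman's coefficient is computed here as Pearson's coefficient on average ranks.

```python
def pearson(x, y):
    """Pearson's linear correlation coefficient between two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """Ascending 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            result[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation = Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because scoring functions differ in their sign conventions, the sign of a coefficient has to be read against each function's own definition of a good score.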
To evaluate how well the scoring functions discriminate actives from decoys, their performance was tested on docked active and decoy compounds. The receiver operating characteristic (ROC) curve and enrichment factor (EF) were used to determine the performance of each scoring function. An increase in the area under the ROC curve (AUC) indicates improved discrimination between true ligands and decoys. AUC takes a value between 0 and 1: AUC = 0.5 means that the method of interest performed, on average, like a random selection, while AUC = 1 means complete discrimination between true and false cases (actives and decoys). EF is defined as the fraction of active compounds found divided by the fraction of the screened library: EF = (a / A) / (n / N), where a is the number of actives among the top n ranked compounds, A is the total number of actives, and N is the total number of compounds in the library.
EF1% and EF2% show the ability of a particular scoring method to rank true ligands highly among the virtual screening results. They can be even more informative than the AUC, as a scoring function with an AUC around 0.5 could still perform acceptably at the early stage of the curve, which EF1% or EF2% can detect.
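Both metrics can be computed directly from a ranked list of scores and activity labels. The sketch below is illustrative only: the function names, the `lower_is_better` flag, and the rounding used to size the top fraction are assumptions made here, and the AUC is obtained from its pairwise interpretation (the probability that a randomly chosen active is ranked ahead of a randomly chosen decoy).

```python
def enrichment_factor(scores, is_active, fraction, lower_is_better=True):
    """EF = (actives found in the top fraction / all actives) / fraction screened."""
    n_total = len(scores)
    # Assumption: the top fraction is rounded to a whole number of compounds.
    n_top = max(1, round(fraction * n_total))
    order = sorted(range(n_total), key=lambda i: scores[i],
                   reverse=not lower_is_better)
    found = sum(1 for i in order[:n_top] if is_active[i])
    total_actives = sum(1 for flag in is_active if flag)
    return (found / total_actives) / (n_top / n_total)


def roc_auc(scores, is_active, lower_is_better=True):
    """AUC as the fraction of active/decoy pairs ranked correctly (ties = 0.5)."""
    sign = -1.0 if lower_is_better else 1.0
    actives = [sign * s for s, flag in zip(scores, is_active) if flag]
    decoys = [sign * s for s, flag in zip(scores, is_active) if not flag]
    wins = sum((a > d) + 0.5 * (a == d) for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))


# Hypothetical library: 100 compounds scored 0..99 (lower = better),
# with 10 actives at positions 0, 1 and 50..57.
library_scores = [float(i) for i in range(100)]
library_labels = [i in (0, 1) or 50 <= i <= 57 for i in range(100)]
ef_1_percent = enrichment_factor(library_scores, library_labels, 0.01)
ef_2_percent = enrichment_factor(library_scores, library_labels, 0.02)
library_auc = roc_auc(library_scores, library_labels)
```

In this toy library the two top-ranked compounds are both active, so the early enrichment is high even though most actives sit in the middle of the list, which is the situation in which EF1% and EF2% add information beyond the AUC.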
All statistical tests and plotting were done using R (R: a language and environment for statistical computing; R Foundation for Statistical Computing, Vienna, Austria; URL
The −Log experimental binding affinities (pAffinity) for the selected test set of MMP-ligand complexes range from −3.9 to 4, spanning about 8 orders of magnitude, with a mean value of 1.40 and a standard deviation of 1.52 (Supplementary Material). The correlation table of the scoring functions (scores from all 11 scoring functions as well as the two consensus scorings) is shown in Supplementary Material. Table
Correlation coefficients (Pearson’s and Spearman’s correlation coefficients) for 11 individual and two consensus scoring functions with pAffinity.
Scoring function | Pearson’s correlation coefficient with pAffinity | Spearman’s correlation coefficient with pAffinity |
---|---|---|
Consensus (rank-by-rank) | 0.298 | 0.227 |
Consensus (rank-by-number) | −0.303 | −0.211 |
AutoDock4.1 | −0.049 | 0.019 |
ChemScore | −0.253 | −0.216 |
D-Score | −0.090 | −0.048 |
DSX | −0.368 | −0.255 |
F-Score | −0.390 | −0.391 |
G-Score | −0.178 | −0.148 |
PoseScore | −0.321 | −0.227 |
RankScore | −0.311 | −0.285 |
PMF-Score | −0.148 | −0.147 |
Vina | −0.078 | −0.036 |
X-Score | −0.209 | −0.109 |
The scoring functions are ranked from the best (1) to the worst (5) according to the correlation with experimental data.
Based on Rp | F-Score (1) | DSX (2) | PoseScore (2) | RankScore (2) | ChemScore (3) | X-Score (4) | G-Score (4) | PMF-Score (4) | D-Score (5) | Vina (5) | AutoDock4.1 (5) |
Based on Rs | F-Score (1) | RankScore (2) | DSX (2) | PoseScore (2) | ChemScore (3) | G-Score (4) | PMF-Score (4) | X-Score (4) | D-Score (5) | Vina (5) | AutoDock4.1 (5) |
Scatter plot for the best performed scoring functions. Correlation of each scoring function relative to other scoring functions as well as experimental binding affinity (pAffinity) is shown.
Because PoseScore and RankScore are accessed through online interfaces, they were excluded from the rescoring assessment.
A ROC curve plots sensitivity against 1 − specificity at different cutoff values (in this case, different scores). The enrichment ability of the scoring functions was assessed on a set of docked compounds including known inhibitors and decoys. Table
The performance characteristics of scoring functions in discrimination of true binders after docking.
Docking protocol | Scoring method | AUC of ROC curve | EF20% | EF10% | EF2% | EF1% |
---|---|---|---|---|---|---|
Glide (HTS) | Glide | 0.653348 | 2.142857 | 2.5 | 1.785714 | 3.571429 |
Glide (HTS) | F-Score | 0.242776 | 0 | 0 | 0 | 0 |
Glide (HTS) | PMF-Score | 0.501041 | 0.535714 | 0.714286 | 0 | 0 |
Glide (HTS) | G-Score | 0.551684 | 1.428571 | 1.428571 | 5.357143 | 10.71429 |
Glide (HTS) | D-Score | 0.515996 | 1.428571 | 1.785714 | 7.142857 | 7.142857 |
Glide (HTS) | ChemScore | 0.648363 | 2.678571 | 3.571429 | 7.142857 | 14.28571 |
Glide (HTS) | X-Score | 0.56054 | 1.25 | 1.785714 | 1.785714 | 3.571429 |
Glide (HTS) | DSX | 0.632341 | 2.142857 | 2.857143 | 1.785714 | 0 |
Glide (HTS) | Autodock | 0.646775 | 1.964286 | 2.142857 | 7.142857 | 10.71429 |
Glide (HTS) | Vina | 0.560097 | 1.25 | 1.071429 | 0 | 0 |
Glide (SP) | Glide | 0.730062 | 2.758621 | 3.448276 | 8.62069 | 13.7931 |
Glide (SP) | F-Score | 0.409975 | 0.172414 | 0.344828 | 0 | 0 |
Glide (SP) | PMF-Score | 0.496598 | 0.689655 | 0.689655 | 0 | 0 |
Glide (SP) | G-Score | 0.56524 | 1.724138 | 1.724138 | 1.724138 | 3.448276 |
Glide (SP) | D-Score | 0.549838 | 1.206897 | 1.724138 | 1.724138 | 0 |
Glide (SP) | ChemScore | 0.757174 | 2.758621 | 4.827586 | 12.72414 | 20.68966 |
Glide (SP) | X-Score | 0.605001 | 1.896552 | 1.724138 | 1.724138 | 3.448276 |
Glide (SP) | DSX | 0.683476 | 2.586207 | 3.793103 | 6.896552 | 10.34483 |
Glide (SP) | Autodock | 0.690234 | 1.896552 | 3.793103 | 12.06897 | 13.7931 |
Glide (SP) | Vina | 0.592455 | 1.551724 | 1.37931 | 1.724138 | 3.448276 |
Glide (SP) | PC1 | 0.79963 | 3.448276 | 5.862069 | 18.96552 | 34.48276 |
ROC curve of (a) Glide-Score, (b) DSX, (c) Autodock, and (d) ChemScore for Glide (HTS) virtual screening results.
ROC curve of (a) Glide-Score, (b) DSX, (c) Autodock, (d) ChemScore, and (e) PC1 for Glide (SP) virtual screening results.
Plot of PC2 against PC1 for Glide virtual screening results (SP). ▲: actives; ○: decoys.
MMPs clearly remain interesting targets for pharmaceutical studies. On the other hand, scoring functions perform differently on different targets [
The overall performance of the scoring functions in predicting the experimental binding affinities of MMP-inhibitor 3D structures was not satisfactory in comparison with that reported in some previous studies on other targets [
However, the scoring functions with top correlation coefficients (DSX and ChemScore) were associated with the best ROC curves and EFs in rescoring the virtual screening results for MMP-12. This supports the applicability of the results of the predictive power study to MMPs. As the PCA potential for improving virtual screening results was demonstrated in previous reports [
The ultimate goal of this study was to determine which of the scoring functions, or which combinations of them, would yield the best enrichment when used against MMPs in a virtual screening study. Our study was retrospective, and virtual screening was performed only for MMP-12. However, because of the high similarity in active site structure and sequence across the MMP family, similar results are expected for other members.
The author declares that there is no conflict of interests in his work.
This work is financially supported by Mashhad University of Medical Sciences.