Sulfonamide Based β-Carbonic Anhydrase Inhibitors: 2D QSAR Study

The carbonic anhydrases (CAs) (or carbonate dehydratases) form a family of metalloenzymes that catalyze the rapid interconversion of carbon dioxide and water to bicarbonate and protons (or vice versa), a reversible reaction that occurs rather slowly in the absence of a catalyst. The β-CAs have been characterized in a large number of human pathogens, such as the fungi/yeasts Candida albicans, Candida glabrata, Cryptococcus neoformans, and Saccharomyces cerevisiae and the bacteria Helicobacter pylori, Mycobacterium tuberculosis, Haemophilus influenzae, Brucella suis, and Streptococcus pneumoniae. The β-CAs in microorganisms provide the physiological concentrations of carbon dioxide and bicarbonate (CO2/HCO3−) required for their growth. Inhibition of β-CAs from pathogenic microorganisms is therefore being explored as a novel pharmacological target for treating the infections caused by these organisms. The present study aimed to establish, in quantitative terms, a relationship between the β-CAs inhibitory activity of structurally related sulphonamide derivatives and their physicochemical descriptors. A statistically validated two-dimensional quantitative structure-activity relationship (2D QSAR) model was obtained through multiple linear regression (MLR) analysis using the VLife Molecular Design Suite (MDS). Five descriptors showing positive and negative correlation with the β-CAs inhibitory activity have been included in the model. This validated 2D QSAR model may be used to design sulfonamide derivatives with better inhibitory properties.


Introduction
The CAs belong to the family of metalloenzymes that catalyze the rapid interconversion of carbon dioxide and water to bicarbonate and protons (or vice versa), a reversible reaction that occurs rather slowly in the absence of a catalyst. The active site of most carbonic anhydrases contains a zinc ion [1]. Genetically, five different classes of CA enzymes (α, β, γ, δ, and ζ) are known to date. The α-CAs are present in vertebrates, protozoa, algae, and some bacteria and also in the cytoplasm of green plants [2]. While the β-CAs are predominantly found in bacteria, algae, chloroplasts of both mono- and dicotyledons, and some fungi and archaea [3], the γ-CAs are found in archaea and some bacteria [4]. The δ- and ζ-CA forms are present only in marine diatoms [5].
These enzymes, which catalyze the interconversion between carbon dioxide and bicarbonate with release of a proton, are involved not only in pH homeostasis and regulation but also in biosynthetic reactions such as gluconeogenesis and ureagenesis in animals, in CO2 fixation (in plants and algae), and in electrolyte secretion in a variety of tissues/organs. Many of the 16 mammalian CA isozymes are established drug targets for the design of diuretics, antiglaucoma, antiepileptic, antiobesity, and/or anticancer agents [6][7][8][9][10].
The β-CAs in a microorganism provide the physiological concentrations of carbon dioxide and bicarbonate (CO2/HCO3−) required for its growth. Thus, inhibition of β-CAs from pathogenic microorganisms is emerging as a novel pharmacological target for treating the infections they cause [19].
The quantitative structure-activity relationship (QSAR) approach is a computerised statistical method that tries to explain the observed variance in the biological effect of compounds as a function of the molecular changes caused by the nature of their substituents. It has become very useful and widespread for the prediction of biological activities, particularly in drug design. The approach is based on the assumption that variations in the properties of compounds can be correlated with changes in their molecular features [20].
Several agents belonging to different chemical classes, such as sulfonamides, sulfamates, aromatic and aliphatic carboxylates, boronic acids, and dithiocarbamates, have been reported to inhibit β-CAs in in vitro inhibition studies [21][22][23][24][25][26][27][28][29][30][31]. Amongst these, sulphonamide derivatives have shown especially good β-CAs inhibitory activity and thus display potential for development as effective antimicrobial agents [29]. While there is a report that correlates the structures of some sulphonamide derivatives with their α-CAs inhibitory activity [32], no attempts have so far been made to correlate the structures of reported β-CA inhibitors (β-CAIs) with their inhibitory activity. Hence, it was thought appropriate to perform a QSAR study to understand the correlation between the physicochemical parameters and the β-carbonic anhydrase inhibitory (β-CAI) activity of the sulphonamide derivatives reported in the literature. It is expected that such 2D QSAR studies will provide better tools for the rational design of promising β-CAIs.

Methodology
2.1. Material and Methods. All molecular modeling studies were performed using VLife MDS [33]. The studies were carried out on a Dell PC with a Pentium IV processor and the Windows XP operating system. The structures of all compounds were sketched and cleaned in ChemDraw Ultra version 8.0 [34]. Energy minimization and geometry optimization were conducted using the Merck molecular force field (MMFF) method with the root mean square gradient set to 0.01 kcal/mol Å and the iteration limit set to 10,000.

Model Development.
The 2D QSAR model was generated by the MLR method using VLife MDS. The model expresses the relation between the biological activity (dependent variable) and the molecular descriptors (independent variables) as a linear equation. This method of regression estimates the values of the regression coefficients by least-squares curve fitting. MLR is the traditional and standard approach for multivariate data analysis, that is, the analysis of multidimensional data matrices by statistical methods. Such data matrices can involve dependent and/or independent variables. To obtain reliable results, parameters were set such that the number of independent variables (descriptors) in the regression equation was at most one-fifth of the number of compounds.
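The least-squares fit described above can be sketched in a few lines of NumPy. This is only an illustration and not the VLife MDS implementation: the function names `fit_mlr` and `predict_mlr` are hypothetical, and the demonstration data are synthetic rather than taken from the present data set. The descriptor-to-compound ratio check mirrors the one-in-five rule stated in the text.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit of y = b0 + X @ b.

    X: (n_compounds, n_descriptors) descriptor matrix.
    y: (n_compounds,) activity values.
    Returns (intercept, coefficients).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_compounds, n_descriptors = X.shape
    # Rule from the text: at most one descriptor per five compounds.
    if n_descriptors * 5 > n_compounds:
        raise ValueError("too many descriptors for the number of compounds")
    A = np.column_stack([np.ones(n_compounds), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]

def predict_mlr(intercept, coefs, X):
    """Predicted activity for each row of X."""
    return intercept + np.asarray(X, dtype=float) @ coefs

# Synthetic, noise-free demonstration data (NOT from the paper):
# 12 hypothetical compounds described by 2 hypothetical descriptors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 0.5 * X_demo[:, 1]
b0, b = fit_mlr(X_demo, y_demo)
print(b0, b)
```

Because the synthetic activities are an exact linear function of the descriptors, the fit should recover the generating intercept and coefficients up to rounding error; real activity data would leave nonzero residuals.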

Statistical Analysis.
The statistical quality of the generated model was judged on parameters such as the squared correlation coefficient r², which is a relative measure of quality of fit; the cross-validated q²; Fisher's F-test value, which represents the ratio between the variance of the calculated activity and the residual variance; and pred_r² [36]. The best way to evaluate the quality of a regression model is internal validation, for which the leave-one-out (LOO) cross-validation method is most commonly used. In the LOO method, one object (one biological activity value) is eliminated from the training set, a model is built on the remaining data points, and the dependent variable value of the excluded data point is predicted from that model. The same procedure is repeated after elimination of another object until all objects have been eliminated once, so that the number of subsets equals the number of data points. The mean activity is the same for r² and LOO q² (the cross-validated correlation coefficient), since all the data points are sequentially treated as predicted in the LOO subsets. To calculate q², the following equation was used:

$$q^2 = 1 - \frac{\sum (y_{\mathrm{pred}} - y_{\mathrm{act}})^2}{\sum (y_{\mathrm{act}} - y_{\mathrm{mean}})^2} \tag{1}$$

where y_pred, y_act, and y_mean are the predicted, actual, and mean values of the pKi, respectively, and Σ(y_pred − y_act)² is the predictive residual error sum of squares (PRESS). The definitive validity of the model is also examined by means of external validation, which evaluates how well the equation generalizes. To calculate pred_r², the following equation was used:

$$\mathrm{pred\_}r^2 = 1 - \frac{\sum (y_{\mathrm{pred(Test)}} - y_{\mathrm{Test}})^2}{\sum (y_{\mathrm{Test}} - y_{\mathrm{Training}})^2} \tag{2}$$

where y_pred(Test) and y_Test are the predicted and observed activity values, respectively, of the test set compounds, and y_Training is the mean activity value of the training set. The statistical significance of these models was further supported by the "fitness plot" obtained for each model; this is a plot of experimental versus predicted activity of the training and test set compounds, and it indicates how well the model was trained and how well it predicts the activity of the external test set (Figure 2) [37][38][39].
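As a minimal sketch of the LOO cross-validation and external validation described above, the following NumPy code computes the cross-validated and predictive correlation coefficients for an ordinary least-squares model. The helper names (`loo_q2`, `pred_r2`) and the synthetic data are assumptions made for illustration, and the mean in the LOO denominator is taken over the full training set, which is one common convention and may differ from the software actually used.

```python
import numpy as np

def _fit(X, y):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + X @ b."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def _predict(beta, X):
    return beta[0] + X @ beta[1:]

def loo_q2(X, y):
    """Leave-one-out cross-validated coefficient: 1 - PRESS / SS_total."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i            # drop compound i
        beta = _fit(X[keep], y[keep])       # refit on the remaining n - 1
        y_hat = _predict(beta, X[i:i + 1])[0]
        press += (y_hat - y[i]) ** 2        # accumulate PRESS
    ss_total = np.sum((y - y.mean()) ** 2)  # assumption: full training mean
    return 1.0 - press / ss_total

def pred_r2(X_train, y_train, X_test, y_test):
    """External predictive coefficient, referenced to the training-set mean."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_test = np.asarray(y_test, dtype=float)
    beta = _fit(X_train, y_train)
    y_hat = _predict(beta, X_test)
    ss_res = np.sum((y_hat - y_test) ** 2)
    ss_tot = np.sum((y_test - y_train.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic demonstration data (NOT from the paper).
rng = np.random.default_rng(1)
X_tr = rng.normal(size=(20, 2))
X_te = rng.normal(size=(5, 2))
true_model = lambda X: 0.5 + 1.5 * X[:, 0] - 0.7 * X[:, 1]
y_tr, y_te = true_model(X_tr), true_model(X_te)
y_tr_noisy = y_tr + rng.normal(scale=0.3, size=len(y_tr))

q2_exact = loo_q2(X_tr, y_tr)
q2_noisy = loo_q2(X_tr, y_tr_noisy)
pr2_exact = pred_r2(X_tr, y_tr, X_te, y_te)
print(q2_exact, q2_noisy, pr2_exact)
```

With noise-free data both statistics approach 1; adding noise to the training activities makes PRESS strictly positive and pulls the cross-validated value below 1.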
For selection of variables, a two-way stepping algorithm was used. The following MLR QSAR model-1 (3) was generated:

pKi = 0.4356 + 1.8779 (±0.1786) × chi3cluster ...

The above model-1 (3) was not found to be satisfactory. Thus, to improve the quality of the model, outliers (u21, u25, and u26) were identified in the data set by calculating the residual values (observed activity − predicted activity) and were removed.
Compounds that have unexpected biological activity and are unable to fit in a QSAR model are known as outliers [40].
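Residual-based outlier screening of the kind described above can be sketched as follows. The text does not state the cutoff that was applied, so the rule used here (absolute residual greater than two sample standard deviations of the residuals) is purely an assumption, as are the function name `flag_outliers` and the made-up activity values.

```python
import numpy as np

def flag_outliers(observed, predicted, n_sd=2.0):
    """Return residuals (observed - predicted) and indices of flagged compounds.

    Assumption: a compound is flagged when |residual| exceeds n_sd times the
    sample standard deviation of all residuals. The paper's actual criterion
    is not given.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    sd = residuals.std(ddof=1)
    flagged = np.flatnonzero(np.abs(residuals) > n_sd * sd)
    return residuals, flagged

# Made-up observed and predicted activities (NOT from the paper).
obs = np.array([5.1, 6.0, 5.5, 7.2, 6.3, 5.8, 6.9, 4.0])
pred = np.array([5.0, 6.1, 5.6, 7.0, 6.2, 5.9, 7.0, 6.0])
residuals, flagged = flag_outliers(obs, pred)
print(flagged)
```

In this toy example only the last compound, whose observed activity is two log units below the prediction, exceeds the cutoff. After flagged compounds are removed, the model would be refitted on the reduced data set.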
Using data set (model- ... From model-2 (4) we can conclude that, using the two-way stepping algorithm, the most significant descriptors contributing to the model are chi3cluster, SsssCHCount, Polar surface area including P and S, K3alpha, and SsssOCount. The description of the descriptors used in the model is given in Table 4. The generated QSAR model shows a high squared correlation coefficient, r² = 0.90, between these descriptors and the β-CAI activity against Candida albicans Nce103; that is, the model explains 90% of the variance in biological activity. Cross-validation of the model was performed by the LOO method, and the resulting q² = 0.84 supports the validity of the model. The contribution graph for the 2D QSAR model (Figure 1) reveals that the descriptors chi3cluster and K3alpha contribute 44.0% and 8.0%, respectively. The three remaining descriptors, SsssCHCount, Polar surface area including P and S, and SsssOCount, contribute inversely (30.30%, 10.0%, and 2.0%, respectively) to the biological activity.
Figure 2 shows the data fitness plot of model (4). The plot gives an idea of the linearity of fit between the observed and predicted activity.
The correlation matrix given in Table 5 supports the conclusion that no two descriptors used in the model are correlated with each other.
The predicted activity for the molecules used in generating and testing the 2D QSAR model using (4) is presented in Table 6.
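A descriptor intercorrelation check of this kind can be sketched with NumPy's Pearson correlation routine. The function name `max_descriptor_correlation` and the small descriptor matrix below are illustrative assumptions; they are not the values behind Table 5.

```python
import numpy as np

def max_descriptor_correlation(X):
    """Pearson correlation matrix of descriptor columns and the most
    strongly correlated pair of distinct descriptors.

    Returns (corr, (i, j), abs_r), where i < j index the descriptor columns.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)   # columns are descriptors
    upper = np.triu(np.abs(corr), k=1)    # each pair once, diagonal excluded
    i, j = np.unravel_index(np.argmax(upper), upper.shape)
    return corr, (int(i), int(j)), float(upper[i, j])

# Made-up descriptor values for five hypothetical compounds (NOT from the paper).
# Columns 0 and 1 are deliberately almost collinear.
X_desc = np.array([
    [1.0,  2.0, 5.0],
    [2.0,  4.0, 1.0],
    [3.0,  6.0, 4.0],
    [4.0,  8.0, 2.0],
    [5.0, 10.5, 3.0],
])
corr, pair, abs_r = max_descriptor_correlation(X_desc)
print(pair, round(abs_r, 3))
```

A pair with an absolute correlation close to 1, as for columns 0 and 1 here, would indicate redundant descriptors; in such a case one of the two would normally be dropped before fitting an MLR model.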

Conclusion
A set of 65 sulphonamide derivatives was subjected to 2D QSAR analysis using MLR to understand the correlation between the physicochemical parameters and the β-CAI activity. A valid QSAR model for designing and predicting the β-CAI activity of newer sulphonamide derivatives has been successfully generated using the MLR method. The 2D QSAR model (4) generated indicates that chi3cluster

Figure 1: Contribution graph for various descriptors used in the model.

Table 1: Structure and β-CAs inhibitory activity of benzenesulfonamides.
Calculation. A 2D QSAR study requires the calculation of molecular descriptors. A large number of theoretical 2D descriptors were computed: individual descriptors such as Mol. Wt., Volume, XlogP, and smr; physicochemical descriptors such as Estate Numbers, Estate contributions, Polar Surface Area, Element Count, Dipole moment, Hydrophobicity XlogpA, and Hydrophobicity SlogpA; and topological descriptors such as the T_2_Cl_6, T_C_Cl_6, T_T_S_7, and T_T_Cl_7 types. A total of 736

Table 4: Description of 2D QSAR model descriptors.

Table 6: Observed and predicted biological activity of β-CAIs by 2D QSAR model-2.