Prediction of Antifungal Activity of Gemini Imidazolium Compounds

The progress of antimicrobial therapy contributes to the development of fungal strains resistant to antimicrobial drugs. Since cationic surfactants have been described as good antifungals, we present a structure-activity relationship (SAR) study of a novel homologous series of 140 bis-quaternary imidazolium chlorides and analyze their biological activity against Candida albicans, one of the major opportunistic pathogens causing a wide spectrum of diseases in humans. We characterize these compounds by a set of features concerning their structure, molecular descriptors, and surface active properties. The SAR study was conducted with the Dominance-Based Rough Set Approach (DRSA), which identifies relevant features and combinations of features that are strongly related to high antifungal activity of the compounds. The study shows that antifungal activity depends on the type of substituents and their position at the chloride moiety, as well as on the surface active properties of the compounds. We also show that the molecular descriptors MlogP, HOMO-LUMO gap, total structure connectivity index, and Wiener index may be useful in predicting the antifungal activity of new chemical compounds.


Introduction
In recent years the number of applications of quaternary ammonium compounds (QACs) has increased considerably. Gemini QACs are a group of cationic surfactants containing two head groups and two aliphatic chains linked by a spacer group.
Practical implementation of gemini QACs is a result of their surface active, antielectrostatic, and antimicrobial properties.
It has been demonstrated that gemini QACs exhibit properties superior to those of mono QACs, such as better solubility, higher adsorption efficiency, and better wetting and foaming [1-4]. Gemini QACs are more efficient in lowering surface tension and have a much lower critical micelle concentration (CMC) [5]. Due to their higher surface activity they have excellent dispersion stabilization and soil clean-up properties [6,7]. It has also been demonstrated that gemini QACs have good antifungal activity [8-10], which is higher than that of mono QACs [11,12]. It is therefore worth developing new, more effective compounds of this type.
Because of the increasing resistance of microorganisms to commonly used disinfectants, the synthesis of new types of microbicides is a very important topic [13]. Formation of resistant strains of fungi is not as common as formation of resistant strains of bacteria [14]. Nevertheless, knowledge of the properties of chemical compounds that influence the antifungal activity of gemini QACs enables the design and synthesis of new, active chemical entities.
The main goal of our study was to investigate relationships between antifungal activity, expressed as the minimal fungicidal concentration (MFC), and selected molecular parameters and features describing chemical structure and surface active properties. The Candida albicans ATCC 90028 strain was used in the MFC study. In the structure-activity relationship (SAR) study, a modified method based on rough set theory was employed.
Candida albicans is one of the major opportunistic pathogens causing a wide spectrum of diseases in humans. It can cause infections that range from superficial infections of the skin to life-threatening systemic infections [15]. Given the limited number of suitable and effective antifungal agents, together with the increasing drug resistance of the pathogens, it is important that new classes of antifungals are discovered [16]. Moreover, a better understanding of which features of chemical compounds determine high antifungal activity may provide further information useful for the improvement of antifungal action.
Data that describe the analyzed series of gemini imidazolium chlorides can be seen as classification data, where parameters characterizing structure and surface active properties, as well as molecular parameters, are condition attributes (independent variables) and antifungal activity is represented by class labels assigned to chlorides by a decision attribute (dependent variable). Structure-activity relationships can be discovered from these data by explaining the class assignment in terms of condition attributes. To this end, we applied the rough set concept [17] and its particular extension called the Dominance-Based Rough Set Approach (DRSA) [18-21].

Gemini Imidazolium Chlorides.
We analyzed 10 homologous series of gemini imidazolium chlorides with the hydrophobic chain ranging from CH3 to C16H33 and with the length of the spacer from C2 to C12. The synthesis, surface active properties, and antimicrobial activity of some of the 140 3,3′-(α,ω-dioxaalkyl)bis(1-alkylimidazolium) chlorides were described earlier [22]. Moreover, we determined molecular descriptors for the synthesized structures. The antifungal activity was determined by the MFC values. The final stage of our study was an analysis of structure-activity relationships using DRSA [21].

Chemical Structure.
The chemical structure of the chlorides was described by the following parameters (see Figure 1 and Table 1): (i) the number of carbon atoms in the spacer, (ii) the number of carbon atoms in the substituent.

Surface Active Properties.
Surface active properties of analyzed chlorides were described by the following parameters: (v) ΔG_ads: free energy of adsorption of molecule (kJ/mol).

Molecular Parameters.
We also considered molecular parameters of the analyzed compounds, which were calculated with Dragon and Gaussian software. A molecular descriptor is the final result of a logical and mathematical procedure that transforms chemical information encoded within a symbolic representation of a molecule into a useful number, or the result of a standardized experiment [23]. Those parameters were

Antifungal Activity.
Candida albicans ATCC 90028 was used to evaluate the antifungal activity of the compounds, expressed as the minimal fungicidal concentration (MFC). The MFC determination method was presented in [22]. According to the MFC value, objects were sorted into three decision classes: good, medium, and weak. The MFC values delimiting the activity classes were determined on the basis of the antimicrobial activity of benzalkonium chloride and didecyldimethylammonium chloride, used as reference antifungals.

SAR Analysis Based on DRSA: Description of the Method.
DRSA assumes that the value sets of condition attributes are ordered and monotonically dependent on the order of decision classes. DRSA has proved to be an effective tool in the analysis of classification data that are partially inconsistent [24,25]. In the context of this study, inconsistency means that, for a pair of chlorides, the first has surface active and molecular properties no worse than those of the second, yet it is assigned to a worse class of antifungal activity. The rough set analysis of consistent and inconsistent chlorides prepares the ground for induction of decision rules. The rules derived from data structured using DRSA are monotonic, which means that they have the following syntax:
"if at_i(chloride) ≥ val_i and at_j(chloride) ≥ val_j and ... and at_p(chloride) ≥ val_p, then chloride is assigned to at least a given class,"
"if at_k(chloride) ≤ val_k and at_l(chloride) ≤ val_l and ... and at_s(chloride) ≤ val_s, then chloride is assigned to at most a given class,"
where at_h is the h-th condition attribute and val_h is a threshold value of this attribute. Each elementary condition at_h(chloride) ≥ val_h or at_h(chloride) ≤ val_h belongs to the condition part of a rule indicating assignment of a chloride to at least (or at most) a given class (weak, medium, or good), respectively. In the above syntax of the rules, it is assumed that value sets of all condition attributes are numerical and ordered such that the greater the value, the more likely it is that the chloride has good antifungal activity; analogously, the smaller the value, the more likely it is that the chloride has weak antifungal activity. Attributes ordered in this way are called gain-type. Cost-type attributes have value sets ordered in the opposite direction, so that elementary conditions on these attributes have opposite relation signs.
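The rule syntax described above can be illustrated with a minimal Python sketch. It is not the jMAF or jRS implementation; the class names `Condition` and `Rule`, the attribute names, and the threshold values are hypothetical and serve only to show how a conjunction of "≥" or "≤" elementary conditions is checked against one chloride.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """One elementary condition: at_h(chloride) >= val_h or at_h(chloride) <= val_h."""
    attribute: str
    relation: str  # ">=" or "<="
    threshold: float

    def __post_init__(self):
        if self.relation not in (">=", "<="):
            raise ValueError("relation must be '>=' or '<='")

    def holds(self, chloride: dict) -> bool:
        value = chloride[self.attribute]
        if self.relation == ">=":
            return value >= self.threshold
        return value <= self.threshold


@dataclass(frozen=True)
class Rule:
    """A monotonic rule: a conjunction of elementary conditions and a conclusion."""
    conditions: tuple
    conclusion: str  # e.g. "at least good" or "at most weak"

    def covers(self, chloride: dict) -> bool:
        # The premise is satisfied only if every elementary condition holds.
        return all(condition.holds(chloride) for condition in self.conditions)


# Hypothetical rule and objects, for illustration only (not taken from Table 3).
example_rule = Rule(
    conditions=(
        Condition("spacer_length", ">=", 6),
        Condition("mlogp", ">=", 4.0),
    ),
    conclusion="at least good",
)

covered = {"spacer_length": 8, "mlogp": 5.1}
not_covered = {"spacer_length": 4, "mlogp": 5.1}
print(example_rule.covers(covered), example_rule.covers(not_covered))
```

A rule with an "at most" conclusion would be written the same way with "<=" relations on gain-type attributes.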
In the case of the gemini imidazolium chlorides data, it is not known a priori whether condition attributes are gain-type or cost-type. Therefore, we proceeded as described in [26]: each original attribute is considered in two copies, the first assumed to be gain-type and the second cost-type. This transformation of data is noninvasive; that is, it does not bias the relationships identified between condition attributes and the decision attribute. An induction algorithm then constructs decision rules involving elementary conditions on one or both copies of particular attributes. For example, for a rule indicating the assignment of a chloride to class good (at least good), the following elementary conditions concerning attribute at_i may appear: ↑at_i(chloride) ≥ val_i, or ↓at_i(chloride) ≤ val'_i, or both together, where ↑at_i and ↓at_i are the gain-type and cost-type copies of attribute at_i, respectively, and val_i and val'_i are threshold values. Note that this transformation of attributes allows both global and local monotonic relationships to be discovered between condition attributes and class assignment. A monotonic relationship is global when it can be expressed by a single elementary condition concerning a gain-type or cost-type attribute. A local monotonic relationship requires a conjunction of two elementary conditions of different types. In the case of assignment of a chloride to class good we can have such a local monotonic relationship; for example, when the value of a surface active property is below a certain point, the greater the value the better the assignment, but beyond that point a further increase may have a negative effect (i.e., the lower the value the better the assignment).
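As a rough illustration of the attribute duplication described above, the following Python sketch copies every attribute into a gain-type and a cost-type version and evaluates a premise for class "at least good". The prefixes `up:` and `down:`, the function names, and the attribute name `substituent_length` are assumptions made for this example; the interval 7 to 11 merely echoes the substituent length discussed later in the paper and is not a rule taken from Table 3.

```python
def duplicate_attributes(row: dict) -> dict:
    """Return a row in which each attribute appears as a gain and a cost copy."""
    copies = {}
    for name, value in row.items():
        copies["up:" + name] = value    # gain-type copy: larger values favor "good"
        copies["down:" + name] = value  # cost-type copy: smaller values favor "good"
    return copies


def satisfies_at_least(row_copies: dict, conditions: list) -> bool:
    """Check a premise for an 'at least' class.

    conditions is a list of (copy_name, threshold) pairs. Gain copies are
    tested with >=, cost copies with <= (opposite relation sign).
    """
    for copy_name, threshold in conditions:
        value = row_copies[copy_name]
        if copy_name.startswith("up:"):
            if not value >= threshold:
                return False
        else:
            if not value <= threshold:
                return False
    return True


# A local monotonic relationship: conditions on both copies of the same
# attribute bound its value from below and from above.
local_premise = [("up:substituent_length", 7), ("down:substituent_length", 11)]

for length in (5, 9, 12):
    row = duplicate_attributes({"substituent_length": length})
    print(length, satisfies_at_least(row, local_premise))
```

A global monotonic relationship would use only one of the two pairs in `local_premise`.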

Results and Discussion
Information System.
The information system is the basis of SAR analysis of the chemical compounds. It includes a set of objects (in rows) described by a set of attributes (in columns). The set of attributes is composed of condition and decision attributes. In our case, condition attributes describe surface active properties, molecular descriptors, and structure (the length of the spacer and the length of the alkyl chain) of the analyzed chlorides. The decision attribute concerns the antifungal properties of bis-quaternary imidazolium chlorides, represented by limit values of MFC for Candida albicans ATCC 90028. A part of the information system is presented in Table 2. Table 3 includes strong and relevant decision rules obtained for the good and weak classes of chlorides presented in Table 2. These rules were selected from the set of all minimal decision rules induced from the information table processed by DRSA. We did not induce rules for class "medium" since these rules are not interesting from the viewpoint of SAR analysis (it is more important to know the features of chlorides with definitely good or weak antimicrobial properties). However, the presence of chlorides from the "medium" class is important in the rule induction process. The rules with conclusion "good" discriminate chlorides with "good" antimicrobial properties from those chlorides which have "medium" or "weak" properties (analogously for rules with conclusion "weak").

Decision Rules.
The decision rules provide guidelines for the synthesis of new compounds with better antifungal properties. The rules are characterized by various parameters, such as examples (i.e., the number of objects covered by a given rule), strength (i.e., the proportion of objects covered by the premise that are also covered by the conclusion), or confirmation (i.e., a measure quantifying the degree to which the premise provides evidence for the conclusion).
In Table 3 only attributes that were present in decision rules are included.
Rules are characterized by their strength, defined as the ratio of the number of chlorides matching the condition part of the rule to the total number of chlorides in the sample. The sets of decision rules that are essential for the analysis presented in this work were induced from the gemini imidazolium chlorides data collected in an information system. A part of the system can be seen in Table 2. These data were transformed as described above and structured according to DRSA. The induction algorithm applied to construct rules is called VC-DomLEM [27]. The algorithm was implemented as part of the software package jMAF (http://idss.cs.put.poznan.pl/site/139.html), based on the Java Rough Set (jRS) library. The sets of induced rules were used to construct component classifiers in variable consistency bagging (VC-bagging) [28,29], which was applied to increase the accuracy of results produced by VC-DomLEM.
Both rule relevance and the relevance of attributes present in the condition parts of rules were estimated by measuring Bayesian confirmation, as described in [30]. In this process, decision rules were constructed repeatedly on bootstrap samples and tested on chlorides that were not included in the samples.
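The repeated construction of rules on bootstrap samples and their testing on the remaining chlorides can be pictured with the following Python sketch. It shows only a plain bootstrap split (sampling with replacement, with the unsampled objects kept for testing); it does not attempt to reproduce the consistency-related aspects of VC-bagging, and the function name `bootstrap_split` and the seed are assumptions of this example.

```python
import random


def bootstrap_split(n_objects: int, seed: int = 0):
    """Draw one bootstrap sample of object indices and its out-of-bag complement."""
    rng = random.Random(seed)
    # Sample n_objects indices with replacement: some repeat, some are never drawn.
    in_bag = [rng.randrange(n_objects) for _ in range(n_objects)]
    drawn = set(in_bag)
    # Objects never drawn are used to test the rules induced from the sample.
    out_of_bag = [index for index in range(n_objects) if index not in drawn]
    return in_bag, out_of_bag


# 140 chlorides, as in the analyzed data set.
in_bag, out_of_bag = bootstrap_split(140, seed=1)
print(len(in_bag), len(set(in_bag)), len(out_of_bag))
```

Repeating the split with different seeds yields the series of training and test samples over which confirmation values can be averaged.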
In the "good" class of antifungal activity, strong rules supported by a large number of objects were obtained. The most interesting rules are characterized by high confirmation measures. In decision rules covering chlorides with good activity against Candida albicans, chlorides with a spacer of 6 or more carbon atoms predominate. We can also observe that the optimal length of the substituent is from 7 to 11 carbon atoms in the chain. Moreover, these rules emphasize that CMC is important for assigning new compounds to the good class of activity. As mentioned before, we included molecular descriptors in our SAR analysis. The results are as follows: the Moriguchi octanol-water partition coefficient should be in the range [3.836; 6.94], the energy difference between the HOMO and LUMO should be less than or equal to −0.17314, the Balaban index should be greater than or equal to 1.242, the Narumi topological index should be greater than or equal to 21, and the total structure connectivity index should be less than or equal to 0.218.
When we consider assigning new chlorides to the weak decision class, the spacer in the compound's moiety should have 6 or fewer carbon atoms. We can also observe that values of surface tension at the critical micelle concentration greater than or equal to 50.1, values of surface excess greater than or equal to 2.48, and values of free energy of adsorption of the molecule less than or equal to 23.2 are important when considering weak activity against Candida albicans strains. Decision rules for the weak class of chlorides include only one molecular descriptor, the Moriguchi octanol-water partition coefficient, in contrast to the good activity class, which included all molecular descriptors except the Wiener index.

Attribute Relevance.
Results of the estimation of predictive confirmation of all attributes (structure, surface active, and molecular ones) in rules induced for the good and weak classes are presented in Figures 2 and 3.
Let us interpret a rule as a consequence relation "if E, then H," where E denotes the rule premise and H the rule conclusion. For rule relevance, the Bayesian confirmation measure quantifies the contribution of the rule premise to correct classification of unseen individuals. Many Bayesian confirmation measures have been described in the literature, of which we used the measure s(H, E). This measure allows a clear interpretation in terms of a difference of conditional probabilities involving H and E; that is, s(H, E) = Pr(H | E) − Pr(H | ¬E), where probability Pr(⋅) is estimated from the test samples of chlorides. For the relevance of single attributes, the Bayesian confirmation measure quantifies the degree to which the presence of attribute at_i in premise E, denoted by at_i ⊳ E, provides evidence for or against conclusion H of the rule. Here, we again used measure s, which, in this case, is defined as follows: s(H, at_i ⊳ E) = Pr(H | at_i ⊳ E) − Pr(H | at_i ¬⊳ E). Consequently, attributes present in the premise of a rule that assigns chlorides correctly, or attributes absent from the condition part of a rule that assigns chlorides incorrectly, are considered more relevant.
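A difference of conditional probabilities of the form Pr(H | E) − Pr(H | ¬E) can be estimated from counts, as in the short Python sketch below. The function name `confirmation_s` and the toy records are assumptions made for illustration; each record is a pair of booleans stating whether the premise E held for a test object and whether the conclusion H held.

```python
def confirmation_s(records) -> float:
    """Estimate Pr(H | E) - Pr(H | not E) from (e, h) boolean pairs."""
    records = list(records)
    with_e = [h for e, h in records if e]
    without_e = [h for e, h in records if not e]
    if not with_e or not without_e:
        # Either conditional probability would be undefined.
        raise ValueError("both E and not-E must occur at least once")
    pr_h_given_e = sum(with_e) / len(with_e)
    pr_h_given_not_e = sum(without_e) / len(without_e)
    return pr_h_given_e - pr_h_given_not_e


# Toy data: H holds for 3 of 4 objects matching E and for 1 of 4 objects not matching E.
toy_records = (
    [(True, True)] * 3 + [(True, False)]
    + [(False, True)] + [(False, False)] * 3
)
print(confirmation_s(toy_records))  # 0.75 - 0.25 = 0.5
```

The value lies between −1 and 1: positive values mean that E supports H, negative values that E speaks against H, and 0 that E is neutral. The same function can be applied to attribute relevance by letting the first element of each pair state whether the attribute occurs in the premise of the rule.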
We can observe that the Moriguchi octanol-water partition coefficient, the length of the substituent, and the HOMO-LUMO gap are the most relevant attributes when the good class of activity is considered. On the other hand, the most relevant attributes for the weak decision class are the length of the spacer, the Balaban index, and the LUMO parameter. These results show that all three types of parameters (structure, surface active, and molecular) might be helpful in assigning new chemical entities to a specific class of antifungal activity.
The chemical structure of gemini surfactants influences not only their surface properties but also their antimicrobial activity. It has been widely accepted that high antimicrobial activity is obtained with 10 to 18 carbon atoms in the aliphatic chain, with an optimum at 12 to 16 carbon atoms, depending on the bacterial strain [31]. Elongation of the hydrophobic chain increases antimicrobial activity, but only up to a given limit, after which activity decreases. It was also observed that the lowest MFC values are specific for medium-length hydrophobic substituents attached to a quaternary nitrogen atom [32]. Similar observations can be found in [33]. Specific properties of gemini compounds with the above-mentioned length of hydrophobic substituents are related to their ability to form small spherical micelles coexisting with large aggregates. Below this range only micelles are found, while above this range only aggregates are observed [34].
In this paper, it was found that good antifungal activity for the group of analyzed gemini chlorides is related to a spacer of 6 or more carbon atoms. Moreover, we discovered more features that are strongly related to good antifungal activity against Candida albicans strains. These are not only the length of substituents in the moiety but also logCMC and CMC, the Moriguchi octanol-water partition coefficient, the energy difference between the HOMO and LUMO, the Balaban index, the Narumi topological index, and the total structure connectivity index. These parameters should be taken into consideration when planning the synthesis of new gemini chlorides with high anti-Candida albicans activity.

Results of Stratified Cross-Validation.
The model constructed by VC-bagging with VC-DomLEM component classifiers showed good classification performance in 5-fold stratified cross-validation, which was repeated 100 times for better reproducibility of results. First, we considered the accuracy of the distinction between chlorides that have good and not good (i.e., medium or weak) antifungal activity. In this case, on average, 77.3% of chlorides were correctly classified (81.9% were correctly classified as having good properties, and 70.7% were correctly classified as having not good properties). Second, we checked the distinction between chlorides having weak and not weak (i.e., medium or good) antifungal activity. On average, 86.2% of chlorides were correctly classified in this case (80.9% were correctly classified as having weak properties and 88.1% were correctly classified as having not weak properties).
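The evaluation scheme can be sketched in Python as follows. This is an illustrative outline, not the procedure implemented in jMAF: `stratified_folds` distributes the objects of each class evenly over k folds, and `binary_rates` computes the three kinds of percentages reported above (overall accuracy, the rate for the target class, and the rate for the remaining classes). Function names, the seed, and the toy labels are assumptions of this example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign object indices to k folds so that each class is spread evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled objects of one class to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def binary_rates(true_labels, predicted_labels, target):
    """Return (overall accuracy, rate for target class, rate for non-target classes)."""
    pairs = list(zip(true_labels, predicted_labels))
    target_pairs = [(t, p) for t, p in pairs if t == target]
    other_pairs = [(t, p) for t, p in pairs if t != target]
    overall = sum((t == target) == (p == target) for t, p in pairs) / len(pairs)
    target_rate = sum(p == target for _, p in target_pairs) / len(target_pairs)
    other_rate = sum(p != target for _, p in other_pairs) / len(other_pairs)
    return overall, target_rate, other_rate


# Toy labels, for illustration only.
toy_labels = ["good"] * 10 + ["medium"] * 5 + ["weak"] * 5
toy_folds = stratified_folds(toy_labels, k=5, seed=0)
print([len(fold) for fold in toy_folds])

true = ["good", "good", "medium", "weak"]
predicted = ["good", "medium", "weak", "good"]
print(binary_rates(true, predicted, target="good"))
```

In a repeated cross-validation, the fold assignment would be redrawn with a different seed for each repetition and the rates averaged over all repetitions.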

Conclusions
The decision rules presented in this study show that the number of carbon atoms in the spacer, the number of carbon atoms in the substituent, MlogP, the HOMO-LUMO gap, the total structure connectivity index, and the Wiener index have the most influence on the increase of antifungal activity of 3,3′-(α,ω-dioxaalkyl)bis(1-alkylimidazolium) chlorides. On the other hand, the number of carbon atoms in the spacer, the value of surface excess, and the Wiener index were associated with decreased antifungal activity of the studied gemini imidazolium chlorides. The results obtained provide directions for the synthesis of new active gemini imidazolium chlorides with strong antifungal action. DRSA is a valuable tool for SAR analysis of chemical compounds.