Characterization of Selective Antibacterial Peptides by Polarity Index

In recent decades, antibacterial peptides have occupied a strategic position in pharmaceutical drug applications and have become the subject of intense research, since they strengthen the immune system of all living organisms by protecting them from pathogenic bacteria. This work proposes a simple statistical/computational method, based on a peptide polarity index, that efficiently identifies a subgroup of antibacterial peptides characterized by high toxicity to bacterial membranes but low toxicity to mammalian cells. These peptides also do not adopt an alpha-helicoidal structure in aqueous solution. A double-blind test carried out on the whole Antimicrobial Peptide Database (November 2011) showed an accuracy of 90% for the polarity index method in identifying this antibacterial peptide group.


Introduction
The increasing resistance of pathogen agents towards multiple drugs has oriented parts of the investigation in bioinformatics to fast and efficient techniques that can predict the remarkable impact of antibacterial peptide action. These techniques can help to enhance the sometimes cumbersome chemical synthetic approach as well as the subsequent trial and error experiments to identify the peptide performance.
Among the various classifications proposed for peptides, one refers to the alpha-helicoidal versus beta-sheet conformation that the peptides can adopt in aqueous solution. This classification refers to the predominance of certain amino acids in the linear sequence of the peptides such as proline-arginine, cathelicidin, or cysteine. It is important to note that this classification appears to have no influence on the toxicity or selectivity of the peptide once it comes into contact with the target membrane [1,2].
Although nature was the main source of peptides with antibacterial properties in the past [3], part of the research effort is now directed towards synthetic strategies. One of these synthetic approaches generates peptides by replacing and/or removing constitutive amino acids from a natural peptide known for its antibacterial action [4], thus trying to reduce its size while keeping or increasing its toxicity [5]. Another technique consists of joining two peptides that individually do not exhibit antibacterial properties but that turn out to be highly toxic when combined [6].
Obtaining efficient antibacterial peptides by measuring the potential action of each peptide altered with the above-described methods would yield a number of possible combinations that far exceeds the capacity of the known laboratory verification methods. For instance, the number of possible peptides with a length of 8 amino acids is 20^8 = 25,600,000,000. This is why contemporary approaches to constructing antibacterial peptides rely on joint computational and/or mathematical methods that simulate peptide variations and then evaluate and qualify these variations to determine whether the peptide complies with the required purposes. However, methods that aim to simulate the properties of the peptides and to evaluate their performance over all possible combinations are highly complex in their mathematical/computational model design.
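The count quoted above follows from 20 independent choices at each of the 8 positions. A minimal Python check is sketched below; the helper name `count_peptides` is illustrative and not part of the original work.

```python
# Number of distinct linear peptides of a given length built from the
# 20 proteinogenic amino acids: 20 independent choices per position.
ALPHABET_SIZE = 20


def count_peptides(length: int) -> int:
    """Return the number of possible sequences with `length` residues."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return ALPHABET_SIZE ** length


print(f"{count_peptides(8):,}")  # 25,600,000,000
```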
In this paper, we present a statistical method that relies on a single physicochemical property, is easy to implement computationally, and efficiently identifies a subgroup of antibacterial peptides with highly selective toxicity to bacteria, hereinafter referred to as "Selective Cationic Amphipathic Antibacterial Peptides" (SCAAPs). A SCAAP is characterized by being less than 60 amino acids in length, not adopting an alpha-helicoidal structure in neutral aqueous solution, and showing a therapeutic index higher than 75 [7]. The therapeutic index of a peptide is defined as the ratio between the minimum inhibitory concentration observed against mammalian and bacterial cells [7,8]; that is, the higher the value, the more specific the peptide is for bacterial-like membranes. Hence SCAAPs display strong lytic activity against bacteria but exhibit no toxicity against normal eukaryotic cells such as erythrocytes [9].
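The therapeutic index and the two numeric SCAAP criteria stated above (length below 60 residues, therapeutic index above 75) can be written as a short Python sketch. The function names and the example concentrations are hypothetical, and the structural criterion (no alpha-helicoidal structure in neutral aqueous solution) is deliberately not checked here.

```python
def therapeutic_index(mic_mammalian: float, mic_bacterial: float) -> float:
    """Ratio of the minimum inhibitory concentration against mammalian
    cells to that against bacterial cells (same units for both)."""
    if mic_bacterial <= 0:
        raise ValueError("mic_bacterial must be positive")
    return mic_mammalian / mic_bacterial


def meets_length_and_ti(sequence: str, mic_mammalian: float,
                        mic_bacterial: float) -> bool:
    """Check only the length (< 60) and therapeutic index (> 75) criteria."""
    ti = therapeutic_index(mic_mammalian, mic_bacterial)
    return len(sequence) < 60 and ti > 75


# Hypothetical values, for illustration only.
print(therapeutic_index(800.0, 4.0))            # 200.0
print(meets_length_and_ti("KKLG", 800.0, 4.0))  # True
```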
Our method determines an index, which we call the polarity index, that uses the existing classification of the 20 proteinogenic amino acids by their side chain R, which divides them into four types and three categories [10]. The three general categories of side chains are nonpolar, polar but uncharged, and charged polar. The nonpolar residues include those with aliphatic hydrocarbon side chains: Gly, Ala, Val, Leu, Ile, Pro, one aromatic group, Phe, and one "pseudo-hydrocarbon," Met. The polar but neutral category contains two hydroxyl-containing residues, Ser and Thr; two amides, Asn and Gln; two with aromatic rings, Tyr and Trp; and one with a sulfhydryl group, Cys. In the charged polar class there are two amino acids with acidic groups, Asp and Glu, and three bases, His, Lys, and Arg (Table 1). The polarity index only makes use of that classification to obtain the characteristic SCAAP blueprint, which showed a very high efficiency in a double-blind test applied to all known peptides registered in the APD database (November 2011) [11].
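The classification above can be expressed as a lookup table. The sketch below assumes that the "four types" are obtained by splitting the charged polar category into its acidic and basic members; the group labels `P+`, `P-`, `N`, and `NP` are illustrative names and not taken from the text.

```python
# Polarity groups as listed in the text, using one-letter residue codes.
# Assumption: the charged polar category is split into basic ("P+") and
# acidic ("P-") residues, giving four groups in total.
POLARITY_GROUPS = {
    "P+": "HKR",       # charged polar, basic: His, Lys, Arg
    "P-": "DE",        # charged polar, acidic: Asp, Glu
    "N": "STNQYWC",    # polar but uncharged
    "NP": "GAVLIPFM",  # nonpolar
}

# Reverse lookup: residue -> group label.
GROUP_OF = {aa: group
            for group, residues in POLARITY_GROUPS.items()
            for aa in residues}


def polarity_profile(sequence: str) -> list[str]:
    """Map each residue of a peptide sequence to its polarity group."""
    return [GROUP_OF[aa] for aa in sequence.upper()]


print(polarity_profile("KDSG"))  # ['P+', 'P-', 'N', 'NP']
```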

Physicochemical Properties.
Peptides can be expressed linearly as an amino acid sequence [12]. Such a representation gives the peptide a unique blueprint. From this sequence, mathematical/computational algorithms of different complexity levels have been designed to measure a variety of physicochemical properties [13]. Among these properties, two define whether a peptide falls into the SCAAP category [7]; that is, a peptide qualifies when it simultaneously meets the parameters established for the following physicochemical properties: (i) isoelectric point [14] (IP) from 9.65 to 11.80, (ii) hydrophobic moment [14] (HM) from 0.16 to 0.57.
Note that the original parameter values [7] have been extended. For this work, it was decided to take these two properties at a maximum range without considering the so-called AGADIR property, which is the tendency for not adopting an alpha-helicoidal structure in aqueous solution.
As we have already verified [13], this property is not of significance for peptides shorter than 22 amino acids. A statistical-computational method was designed based on only one physicochemical property, polarity, which quickly and efficiently discerns whether or not a peptide falls into the SCAAP category. The verification was carried out by evaluating the IP and HM physicochemical properties.
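The IP and HM verification can be sketched as a simple range check in Python. The function name is hypothetical, the interval bounds are treated as inclusive (an assumption, since the text does not say), and the IP and HM values themselves are assumed to have been computed elsewhere.

```python
# Ranges stated in this section for the two verification properties.
IP_RANGE = (9.65, 11.80)  # isoelectric point
HM_RANGE = (0.16, 0.57)   # hydrophobic moment


def in_scaap_ranges(ip: float, hm: float) -> bool:
    """True when both properties fall inside their stated ranges.
    Bounds are treated as inclusive (assumption)."""
    ip_ok = IP_RANGE[0] <= ip <= IP_RANGE[1]
    hm_ok = HM_RANGE[0] <= hm <= HM_RANGE[1]
    return ip_ok and hm_ok


# Hypothetical property values, for illustration only.
print(in_scaap_ranges(10.5, 0.30))  # True
print(in_scaap_ranges(8.0, 0.30))   # False
```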

P[i, j] Incidence Matrix from a Subject Peptide.
The P[i, j] incidence matrix is built by reading the subject peptide sequence from left to right in windows of two amino acids, moving one amino acid to the right at a time until the end of the peptide is reached. Each amino acid of the pair is related to its polarity group. From that association, row i and column j are identified, and 1 is added to the corresponding matrix element, that is, P[i, j] = P[i, j] + 1. Finally, the relative frequency distribution of the P[i, j] incidence matrix is normalized and weighted by a factor of 0.30. This last step helps to enhance the distinctive characteristics of the peptide by increasing the effect of the relative frequency position of the amino acid pairs in the incidence matrix P[i, j].
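The counting step described above can be sketched in Python as follows. Several points are assumptions rather than details from the text: the four-group split of the polarity classes and their order, the convention that the first residue of a pair selects the row and the second the column, and the helper names. The 0.30 weighting is not reproduced because the text does not specify how it is applied.

```python
# Assumed group order for matrix rows/columns: basic, acidic,
# polar uncharged, nonpolar.
GROUPS = ("P+", "P-", "N", "NP")
RESIDUES = {"P+": "HKR", "P-": "DE", "N": "STNQYWC", "NP": "GAVLIPFM"}
INDEX_OF = {aa: GROUPS.index(group)
            for group, residues in RESIDUES.items()
            for aa in residues}


def incidence_matrix(sequence: str) -> list[list[int]]:
    """Count polarity-group pairs over a sliding window of two residues."""
    counts = [[0] * len(GROUPS) for _ in GROUPS]
    seq = sequence.upper()
    for first, second in zip(seq, seq[1:]):
        # Assumption: first residue -> row i, second residue -> column j.
        counts[INDEX_OF[first]][INDEX_OF[second]] += 1
    return counts


def relative_frequencies(counts: list[list[int]]) -> list[list[float]]:
    """Divide every count by the total number of pairs."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * len(row) for row in counts]
    return [[c / total for c in row] for row in counts]


# "KKLG" contains the pairs KK, KL, and LG.
P = incidence_matrix("KKLG")
print(P[0][0], P[0][3], P[3][3])  # 1 1 1
```

A sequence of length n contributes n - 1 pairs, so the entries of the relative frequency matrix sum to 1 for any peptide with at least two residues.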

Q[i, j] Incidence Matrix from a SCAAP Set.
The Q[i, j] incidence matrix is determined following the same procedure as for the P[i, j] incidence matrix. The input used here is the set of peptide sequences described in Table 2.
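One way to extend the pair counting to a set of sequences is sketched below in Python. It assumes that pairs are counted within each template separately and then pooled, so that no pair spans two peptides; the text does not state this explicitly. The group labels and the two example sequences are hypothetical and are not the Table 2 templates.

```python
from collections import Counter

# Assumed four-group polarity lookup (labels are illustrative).
GROUP_OF = {aa: group
            for group, residues in {"P+": "HKR", "P-": "DE",
                                    "N": "STNQYWC", "NP": "GAVLIPFM"}.items()
            for aa in residues}


def pooled_pair_counts(sequences: list[str]) -> Counter:
    """Pool polarity-group pair counts over a set of peptide sequences."""
    counts: Counter = Counter()
    for seq in sequences:
        groups = [GROUP_OF[aa] for aa in seq.upper()]
        # Pairs are formed inside one sequence only.
        counts.update(zip(groups, groups[1:]))
    return counts


# Hypothetical template sequences, for illustration only.
Q = pooled_pair_counts(["KKLG", "RWSK"])
print(sum(Q.values()))  # 6
print(Q[("P+", "P+")])  # 1
```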
The peptides used here as SCAAP templates were reported as SCAAP subjects by Del Rio et al. [7]. From the 7 peptides submitted, only those with a therapeutic index higher than or equal to 1000 were chosen (Table 2, entries 3 and 7).

Trial Data Preparation.
A total of 1894 peptides registered in the Antimicrobial Peptide Database (APD) [11] (November 2011) were analyzed and classified by their single and multiple action against fungi, viruses, mammalian cells, Gram+/Gram− bacteria, cancer cells, insects, parasites, and sperm. Peptides with more than one action were not included. The single-action database only includes peptides with confirmed experimental action on a single pathogen agent, in contrast to multiple-action databases, which contain peptides with action on two or more pathogen agents. On this basis, the figures in multiple-action databases are overrepresented.
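The separation into single-action and multiple-action groups can be illustrated with a short Python sketch. The record layout (a mapping from a peptide identifier to its list of reported targets), the identifiers, and the function name are all hypothetical and do not reflect the actual APD export format.

```python
def split_by_action_count(records: dict[str, list[str]]):
    """Split peptides into single-action and multiple-action groups."""
    single: dict[str, str] = {}
    multiple: dict[str, set[str]] = {}
    for name, actions in records.items():
        targets = set(actions)  # ignore duplicated annotations
        if len(targets) == 1:
            single[name] = next(iter(targets))
        elif len(targets) > 1:
            multiple[name] = targets
        # Peptides without any annotated action are skipped.
    return single, multiple


# Hypothetical records, for illustration only.
records = {
    "pep_A": ["bacteria"],
    "pep_B": ["bacteria", "fungi"],
    "pep_C": ["virus"],
}
single, multiple = split_by_action_count(records)
print(sorted(single))    # ['pep_A', 'pep_C']
print(sorted(multiple))  # ['pep_B']
```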

P[i, j] and Q[i, j] Matrices
The verification of peptides found in the single-action database on Gram+/Gram− bacteria was carried out by validating both the isoelectric point (IP) and the hydrophobic moment (HM) against the ranges stated (see Section 2.1). The integrity of the APD database information was verified by checking the peptides identified by their action across the whole extent of the database itself.

Results
Due to the importance of detecting possible peptide pathogenic action, the use of computer programs that evaluate peptide sequences to predict their action on different pathogen agents such as fungi, viruses, mammalian cells, and Gram+/Gram− bacteria has become standard practice among different research groups. The polarity index method is one of these computer programs, but it differs in measuring exclusively one physicochemical property to identify a SCAAP.
The incidence matrix delivered by the polarity index method to identify a SCAAP used two peptides known for their toxic activity on Gram+/Gram− bacteria (Table 2, Table 3).
Note that the polarity index method identified SCAAP subjects essentially only in the bacterial group, whereas the SCAAP subjects identified from the multiple pathogenic action peptide database were fungi (62/638), viruses (Table 3). Among the 743 peptides with a single action on Gram+/Gram− bacteria, the polarity index method identified 51 SCAAP subjects (Table 4). Their IP and HM parameters were calculated, and 46 of them are in the ranges previously mentioned in Section 2.1; that is, IP = 9.65 to 11.80 and HM = 0.16 to 0.57. The verification of the APD database information integrity [11] showed 14 peptides not yet classified. When their activity as SCAAP was double-checked by the polarity index method, there was a mismatch. The APD database margin of error did not exceed 8%.
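As a quick arithmetic check in Python, the share of identified SCAAP subjects whose IP and HM fall within the stated ranges can be computed from the two counts given above. The result is consistent with the roughly 90% efficiency reported for the method, although the text does not state that the figure was derived in exactly this way.

```python
identified = 51     # SCAAP subjects flagged by the polarity index method
within_ranges = 46  # of these, subjects with IP and HM inside the ranges

fraction = within_ranges / identified
print(f"{fraction:.1%}")  # 90.2%
```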

Discussion
All different peptide classifications achieved over the decades seem to be directed to validate the peptide action and toxicity. However, it appears that these two characteristics are intrinsically related to the space where the peptide interacts as well as to the structural form of the subject membrane. Missing peptide specificity in the studied isolated peptides indicates that nature avoids peptide specificity in order not to favor certain pathogen agents in their blocking action.
Most peptides found experimentally show multiple actions on pathogen agents. Thus it appears that the detection and prediction of antibacterial peptides (in our case, SCAAPs) is more related to general, nonspecific peptide profiles that are well known for their antibacterial action. For that reason, and as in the present case, more efficient algorithms should rather evaluate fundamental characteristics of such peptides and search for small differences among them.
The design of bioinformatical algorithms to detect antimicrobial peptides is basically of two types.
(i) Based on a system of differential equations [15] that characterizes the peptide properties with an exponentially growing complexity.
(ii) The inclusion of multiple peptide characteristics without affecting its complexity [16] where the efficiency greatly depends on a skillful peptide set selection.
Our polarity index method falls in the latter category and is characterized by the following.
(i) Effectively excluding multiple-action peptides, with a margin of error of less than 10%, and single-action peptides, with a margin of error of less than 6%.
(ii) Its efficiency in identifying SCAAP subjects, which is higher than 90%.
(iii) The simplicity of the computational method, which is easy to implement for massively parallel processing on GPUs [17].
(iv) Its straightforwardness, since it measures the peptide polarity exclusively and from this information effectively classifies the pathogenic action of the peptide.
The algorithm involved in this method allows simple modifications to identify peptide groups by their pathogenic action at a general level and, at a more specific level, to refine the peptide search and identification, as in the group used here. The polarity index method uses the amino acid polarity classification; however, there are other types of classifications [18,19] that use the chemical properties of the amino acid side chain, such as the charge at neutral pH, the type of chemical structure, the reactivity, the elements present, or the ability to form hydrogen bonds. These classifications can be used to generate a more specific peptide blueprint for the search, with features that would not be considered otherwise.
As this method is a simple mathematical and computational algorithm, it does not demand heavy computational resources such as processing memory or speed; therefore it can be used to explore peptide regions. These peptide regions can be worked out by massively evaluating all possible peptide combinations of the same length [20], thus taking advantage of the simplicity of the polarity index method to determine their activity.
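The exhaustive exploration mentioned above can be sketched in Python with a lazy generator, so that sequences are produced one at a time rather than stored. The function names are hypothetical, and the predicate passed to `screen` is a placeholder standing in for a polarity-index classifier, which is not implemented here.

```python
from itertools import islice, product
from typing import Callable, Iterator

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes, 20 residues


def all_peptides(length: int) -> Iterator[str]:
    """Lazily yield every sequence of the given length."""
    for residues in product(AMINO_ACIDS, repeat=length):
        yield "".join(residues)


def screen(length: int, predicate: Callable[[str], bool],
           limit: int) -> list[str]:
    """Return up to `limit` sequences that satisfy `predicate`."""
    return list(islice(filter(predicate, all_peptides(length)), limit))


print(list(islice(all_peptides(2), 3)))            # ['AA', 'AC', 'AD']
# Placeholder predicate: keep sequences starting with lysine.
print(screen(2, lambda p: p.startswith("K"), 2))   # ['KA', 'KC']
```

Because each candidate is evaluated independently, the loop over `all_peptides` is the part that would be distributed in a parallel implementation.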
International Journal of Peptides

Conclusion
The statistical/computational polarity index method is an effective algorithm to find potential antibacterial peptides from a public domain database. These peptides have been denominated "Selective Cationic Amphipathic Antibacterial Peptides" (SCAAPs). The method features a high efficiency in excluding peptides that exhibit single pathogenic action on pathogens other than bacteria, and it is equally efficient in excluding multiple-action peptides. In summary, the polarity index method is an adaptable and efficient method to detect and predict SCAAPs, and it is a useful analysis and modeling tool for biological sequences using a single physicochemical property.

Conflicts of Interest
We declare that we do not have any financial and personal relationship with other people or organizations that could inappropriately influence (bias) our work.

Authors' Contribution
The experiments were conceived and designed by C. Polanco and J. L. Samaniego. The experiments were performed by C. Polanco. The data analysis was carried out by T. Buhse. The results were discussed by T. Buhse, F. G. Mosqueira, A. Negron-Mendoza, S. Ramos-Bernal, and J. A. Castanon-Gonzalez.