A new hybrid Multiphase Simulated Annealing Algorithm using Boltzmann and Bose-Einstein distributions (MPSABBE) is proposed. MPSABBE was designed for solving Protein Folding Problem (PFP) instances. This new approach has four phases: (i) Multiquenching Phase (MQP), (ii) Boltzmann Annealing Phase (BAP), (iii) Bose-Einstein Annealing Phase (BEAP), and (iv) Dynamical Equilibrium Phase (DEP). BAP and BEAP are simulated annealing search procedures based on the Boltzmann and Bose-Einstein distributions, respectively. DEP is also a simulated annealing search procedure, applied at the final temperature of BEAP, and can be seen as a second Bose-Einstein phase. MQP is a search process that ranges from extremely high to high temperatures, applies a very fast cooling schedule, and is not very restrictive in accepting new solutions. In contrast, BAP and BEAP range from high to low and from low to very low temperatures, respectively, and are more restrictive in accepting new solutions. DEP uses a particular heuristic to detect the stochastic equilibrium by applying a least squares method during its execution. MPSABBE parameters are tuned with an analytical method that considers the maximal and minimal deterioration of problem instances. MPSABBE was tested with several PFP instances, showing that using both distributions is better than using only the Boltzmann distribution in the classical SA.
1. Introduction
In genetics, DNA, RNA, and proteins are the basic elements of much research. DNA is a molecule that contains the genetic instructions involved in the protein synthesis process [1]. This molecule represents the complete set of hereditary information of any organism. DNA has four different nucleotides: adenine, cytosine, guanine, and thymine. This molecule is divided into genes, and a gene is a sequence of nucleotides that expresses a protein. A functional protein folds into an approximate geometrical model of the global minimum energy [2, 3]; this structure is usually named the Native Structure (NS). This is a dynamic process in which the lowest free energy of the protein plus the solvent can be reasonably approximated by the minimum free energy found by Monte Carlo, conformational space annealing, genetic algorithms, and some deterministic methods [3, 4]. In fact, there are some examples, such as insulin and alpha-lytic protease [5, 6], with natural conformations whose energy is not minimal. In addition, the free energy of an NS conformation depends on the interaction among the atoms and their relative positions.
The Protein Folding Problem (PFP) is an enormous challenge and an important problem in bioinformatics, medicine, and other areas [7]. The function of a protein is directly related to its three-dimensional structure, and misfolded proteins can cause a variety of diseases. The aim of this problem is to find the natural tertiary structure of a protein using only a target sequence. A protein can take a high number of different conformational structures from its primary structure to its NS. The computational problem of finding the NS is known as the Protein Folding Problem. Because PFP is NP-hard [8], heuristic methods that avoid generating all possible states of the protein are commonly used. In order to find an NS, computational methods search structures in a huge space of possible solutions. These methods can obtain several structures very close to the NS. A particular class of these methods, known as ab initio, looks for the NS using only the protein's amino acid sequence.
As a consequence, new metaheuristics have been applied to solve PFP, among which simulated annealing (SA) [9, 10] is one of the most successful [11–13]. Classical SA applies a Boltzmann distribution in order to accept bad solutions and escape from local minima. However, to generate high-quality solutions for PFP, new and more efficient SA variants have been designed; one of them, named Chaotic Multiquenching Annealing Algorithm (CMQA), has obtained very good results for proteins such as Met5-enkephalin, proinsulin, T0549, T0335, and T0281 (1PLXW, 1T0C, 2K5E, SR384, and 1A19 in PDB format, respectively). This algorithm has three central phases [14]: (i) Multiquenching Phase (MQP), (ii) Annealing Phase (AP), and (iii) Dynamical Equilibrium Phase (DEP). All of these phases are explained in the paper; for this introduction, it suffices to know that each phase applies an annealing approach that searches for a better configuration than the one delivered by the previous phase. At the beginning of the process, MQP improves a random configuration through an annealing procedure executed at extremely high temperatures; AP searches for a better solution than that of MQP with an annealing search applied at high temperatures; and, finally, DEP is applied at low temperatures looking for a better solution than that obtained by AP. As in classical SA, all of these phases apply the Boltzmann distribution to accept bad solutions. However, the Bose-Einstein distribution can also be used to escape from local minima [15]. Nevertheless, algorithms using these two distributions in different temperature ranges have not been published for PFP.
In this paper, a new SA algorithm named MPSABBE (Multiphase Simulated Annealing based on Boltzmann and Bose-Einstein distributions) is introduced. MPSABBE applies the Boltzmann and Bose-Einstein distributions at high and low temperatures, respectively. The paper shows that using both distributions improves the solution quality. This paper is organized as follows. In Section 2, PFP is described. In Section 3, the classical SA algorithm, its Bose-Einstein variant, and its application to PFP are explained. In Section 4, all four MPSABBE phases are presented. In Section 5, the analytical tuning methods for SA and MPSABBE are described. In Section 6, experimental results are shown. Finally, in Section 7, the conclusions of this research are discussed.
2. Protein Folding Problem
PFP is related to the questions of how and why a protein folds into its NS. Proteins can adopt an extremely large number of possible conformations [16], which depends on the number of amino acids and the number of conformations of each amino acid. The essential concept introduced by Levinthal is that, if PFP were a pure random search problem, meaning that all conformations of a protein (except the native state) were equally likely, then finding the native state by random search would be infeasible. PFP is an interdisciplinary problem that involves molecular biology, biophysics, computational biology, and computer science. In the ab initio case, NS prediction requires different mechanisms that lead the searching process to a biological three-dimensional structure. As was previously mentioned, this process requires only the amino acid sequence. PFP is an enormous challenge, and it is very hard to find the NS of a protein because the space of possible conformations is in general extremely large. For all practical purposes, PFP can be defined as follows.
Given
a sequence of n amino acids a1,a2,…,an that represents the primary structure of a protein,
an energy function f∗(σ1,σ2,…,σn), where the variables σ1,σ2,…,σn represent n dihedral angles,
find the following:
the Native Structure such that f∗(σ1,σ2,…,σn) represents the lowest energy value, where
the solution σ∗ = (σ1,σ2,…,σn) defines the best three-dimensional configuration.
Force fields are used to represent the energy of a protein; some of the most common are AMBER [17], CHARMM [18], ECEPP/2 [19–21], ECEPP/3 [22], and GROMACS [23]. These force fields compute energy components, for instance, the electrostatic energy, the torsion energy, the hydrogen bond energy, and the Lennard-Jones energy. In this paper ECEPP/2 force field is used.
The atoms of a protein are represented in three-dimensional Cartesian coordinates. There are four types of torsion or dihedral angles, as follows:
The angle between the amino group and the alpha carbon is referred to as Phi (ϕ). This angle lies between the amino group (NH2) of amino acid i and the alpha carbon αCi in the sequence; specifically, it is the rotation about the bond between the N atom of the amino group and the central carbon (αCi).
The dihedral angle between the alpha carbon and the carboxyl group is referred to as Psi (ψ). Psi lies between the carboxyl group (COOHi) of amino acid i and the central carbon of the same amino acid; in particular, Psi measures the rotation about the covalent bond between the carbon Ci of the carboxyl group and the central carbon (αCi).
For every amino acid sequence, an omega angle (ω) is defined for each pair of consecutive amino acids i−1, i; specifically, it is the angle about the covalent bond between the N atom of amino acid i and the carbon Ci−1 of the carboxyl group of amino acid i−1.
Finally, each Chi angle (χ) is defined between the two planes formed by consecutive carbon atoms in the radical (side chain) group.
All four angles are variables of the problem, with values in the range [0, 360] degrees. In the simulations conducted in this research, these angles take discrete values. Some variables have well-defined ranges, as is the case for the Phi and Psi angles, whose ranges are defined by the Ramachandran plot [24]. The Phi angle is defined in the ranges [180, 300] and [45, 60]. The Psi angle is defined in three ranges: [20, 180], [300, 330], and [180, 205]. Finally, the omega angle is fixed at 180 degrees.
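As an illustration, the discrete, Ramachandran-restricted sampling described above can be sketched in Python. The range lists come from the text; the 5-degree step and the helper names are assumptions for the sketch, not part of the paper:

```python
import random

# Allowed ranges (degrees), following the Ramachandran-plot restrictions
# stated in the text; the step size below is an illustrative choice.
PHI_RANGES = [(180, 300), (45, 60)]
PSI_RANGES = [(20, 180), (300, 330), (180, 205)]
OMEGA = 180.0  # omega is fixed at 180 degrees

def sample_angle(ranges, step=5.0):
    """Pick a discrete angle value from one of the allowed ranges."""
    lo, hi = random.choice(ranges)
    n_steps = int((hi - lo) / step)
    return lo + step * random.randint(0, n_steps)

def random_backbone(n_residues):
    """One random (phi, psi, omega) triple per residue."""
    return [(sample_angle(PHI_RANGES), sample_angle(PSI_RANGES), OMEGA)
            for _ in range(n_residues)]
```

A conformation produced this way is exactly the kind of discrete-angle solution the annealing algorithms in the next sections perturb and evaluate.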
3. Simulated Annealing Algorithm
3.1. Simulated Annealing Based on Boltzmann Distribution
The Simulated Annealing (SA) Algorithm is a probabilistic method proposed by Kirkpatrick et al. [9] and Černý [10] and is an adaptation of the Metropolis algorithm, a Monte Carlo method [25]. SA is based on the gradual cooling of metal for crystallization. The algorithm emulates the physical process in which a metal is heated to a very high temperature and then cooled very slowly until it reaches its frozen state. When this happens, the metal crystallizes in the lowest-energy configuration. SA has been used for finding the optimal solution, or one close to it, for different NP-hard problems, including biological problems such as sequence alignment [26–28], phylogenetic trees [29], and PFP [30]. From a theoretical point of view, SA converges to the optimal solution or close to the lowest free energy [31]. However, in practice classical SA is often unable to find the lowest energy because some energy barriers are too high for it to escape from local minima. As a consequence, variants of this method have been proposed [14, 30].
Simulated annealing usually starts at a very high initial temperature (Tinitial). Through a cooling function, the temperature value is gradually reduced from Tinitial to Tfinal, which usually is very close to zero [9, 10]. There are several cooling functions used in SA [31–36], for example,

(1) Tk+1 = α·Tk
(2) Tk+1 = e^(−α)·Tk
(3) Tk+1 = Tk/(1 + η·Tk).
The most common function is (1). This function reduces the temperature by a factor α, which is commonly in the range 0.7 ≤ α < 1.0. Slow cooling is applied when α is very close to 1, while fast cooling is applied when α is around 0.70.
The classical SA has two nested cycles, as shown in Algorithm 1. The first one, named the temperature cycle, decreases the value of the temperature with a specific cooling function. The second, named the metropolis cycle, generates, accepts, or rejects solutions of the problem being optimized. The initial and final temperature values are set (see lines (1)-(2)). These values are obtained in an analytical (see Section 5) or experimental way: Tinitial should be as high as possible, while Tfinal should be close to zero; in practice, SA can be stopped when the probability of accepting a new solution is negligible. An initial solution (Sinitial) is required by SA; this solution is generated (see line (3)) and assigned to Scurrent. At the beginning of the process, the parameter T is set to the initial temperature (see line (4)). The temperature cycle is executed from Tinitial to Tfinal (see lines (5)–(19)). Then the metropolis cycle is repeated (see lines (6)–(17)) a certain number of times until a stop condition, which is explained later in this paper. A new solution (Snew) is generated within the metropolis cycle by applying a small perturbation to the current solution Scurrent (see line (7)). The energy difference between these two solutions (Snew and Scurrent) is calculated (see line (8)). For a minimization problem, if this difference is less than or equal to zero (see line (9)), the new solution is accepted (see line (10)). When this difference is greater than zero, the Boltzmann distribution is applied: a Boltzmann probability is calculated using (4) in line (12). If this probability is higher than a random value between 0 and 1 (see line (13)), then the new solution Snew is accepted (see line (14)):

(4) P(Snew) = e^(−ΔE/T),

where ΔE is the difference computed in line (8).
<bold>Algorithm 1: </bold>Pseudocode of classical simulated annealing.
(1) Setting initial temperature (Tinitial)
(2) Setting final temperature (Tfinal)
(3) Generate Scurrent from Initial Solution (Sinitial)
(4) T=Tinitial
(5) While (T>Tfinal) do //Temperature Cycle
(6) While (stop condition) //Metropolis Cycle
(7) Generate Snew by applying a perturbation to Scurrent
(8) Obtain difference between Snew and Scurrent
(9) If (difference ≤ 0) then
(10) Accept Snew
(11) else
(12) Boltzmann Probability = exp(-difference/T)
(13) If (Boltzmann Probability > random(0, 1)) then
(14) Accept Snew
(15) end if
(16) end if
(17) end while
(18) Decrease T by a cooling function
(19) end while
(20) Show best solution (Sbetter)
After the metropolis cycle is completed, the temperature value is reduced by a cooling function (see line (18)). For a maximization problem, if the difference Snew − Scurrent is greater than zero, the new solution Snew is accepted; otherwise Snew can be rejected or accepted depending on the Boltzmann probability value.
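Algorithm 1 can be turned into a short, runnable sketch. Since the force fields of Section 2 are not available here, a toy quadratic energy stands in for the objective, and all parameter values are illustrative defaults, not values from the paper:

```python
import math
import random

def simulated_annealing(energy, perturb, s_initial,
                        t_initial=1000.0, t_final=1e-3,
                        alpha=0.95, metropolis_len=50):
    """Classical SA of Algorithm 1: Boltzmann acceptance, cooling (1)."""
    s_current = s_initial
    e_current = energy(s_current)
    s_best, e_best = s_current, e_current
    t = t_initial
    while t > t_final:                       # temperature cycle
        for _ in range(metropolis_len):      # metropolis cycle
            s_new = perturb(s_current)
            e_new = energy(s_new)
            diff = e_new - e_current
            # accept improvements, or worse moves with Boltzmann probability
            if diff <= 0 or math.exp(-diff / t) > random.random():
                s_current, e_current = s_new, e_new
            if e_current < e_best:           # track the best solution seen
                s_best, e_best = s_current, e_current
        t *= alpha                           # cooling function (1)
    return s_best, e_best

# Toy usage: minimize f(x) = x^2 starting far from the optimum.
best, e_min = simulated_annealing(lambda x: x * x,
                                  lambda x: x + random.uniform(-1.0, 1.0),
                                  s_initial=10.0)
```

Keeping a separate best-so-far (s_best) mirrors the Sbetter bookkeeping of Algorithm 2, since the current solution may drift uphill at high temperatures.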
3.2. Simulated Annealing Based on Bose-Einstein Distribution
Statistical Mechanics (SM) studies the overall behavior of a system consisting of a large number of particles whose individual behavior is unpredictable. SM uses statistics, probability theory, and thermodynamic principles. According to SM, the occurrence of each future result is determined by a probability distribution such as the Boltzmann and Bose-Einstein distributions. In addition, only the most probable behavior of the system in thermal equilibrium at a given temperature is observed [37]. The Bose-Einstein distribution is obtained by finding the most probable distribution, that is, by maximizing the number of microstates subject to the following constraints: (h1) the number of particles (the summation of particles over the microstates) is constant, and (h2) the total energy (the summation of the individual energies of the microstates) is constant. The problem is solved using Lagrange multipliers, where the parameters λ and β are the Lagrange multipliers of h1 and h2, respectively [38]. Then the Bose-Einstein distribution, applied for low and very low temperatures, is defined by

(5) h(εi) = 1/(e^(λ+β·εi) − 1).
The particle behavior can then be modeled by the Bose-Einstein distribution written as (6), which defines the acceptance probability of a new configuration of particles:

(6) h(ΔE) = 1/(e^λ·e^(ΔE/KT) − 1),

where T is the temperature parameter, λ is related to the constraint on the total number of particles in the system, and K is the Boltzmann constant. At very high temperatures, the Bose-Einstein distribution practically becomes the Boltzmann distribution. Nevertheless, at low and very low temperatures the particles behave differently and tend to congregate in the same lowest energy state; the result is known as a Bose-Einstein condensate [39]. As a consequence, the system at these temperatures can be modeled by the Bose-Einstein distribution. Section 4 presents a new SA applying both the Boltzmann and Bose-Einstein distributions to accept bad solutions at high and low temperatures, respectively.
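A quick numerical check makes the relationship between (4) and (6) concrete. Here K = 1 and λ = 0 are illustrative choices: whenever e^λ·e^(ΔE/T) is much larger than one, the −1 in the denominator of (6) is negligible and the two acceptance probabilities coincide to several digits:

```python
import math

def p_boltzmann(delta_e, t):
    """Boltzmann acceptance probability, Eq. (4)."""
    return math.exp(-delta_e / t)

def p_bose_einstein(delta_e, t, lam=0.0, k=1.0):
    """Bose-Einstein acceptance probability, Eq. (6)."""
    return 1.0 / (math.exp(lam) * math.exp(delta_e / (k * t)) - 1.0)

# Low temperature relative to the deterioration: the probabilities agree.
pb = p_boltzmann(1.0, 0.1)
pbe = p_bose_einstein(1.0, 0.1)
```

Note that (6) can exceed one when e^λ·e^(ΔE/T) is close to one; since the annealing phases only ever compare it against a uniform random number, values above one simply mean the move is always accepted.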
3.3. Simulated Annealing Applied to Solve Protein Folding Problem
The classical Simulated Annealing Algorithm can be implemented to solve the Protein Folding Problem [40], as shown in the pseudocode of Algorithm 2. The initial and final temperatures (see lines (1)-(2)) can be calculated for each instance of the problem by applying the analytical tuning method of Section 5; this means that the protein must be preprocessed.
<bold>Algorithm 2: </bold>Pseudocode SA applied to protein folding problem.
(1) Tune initial temperature (Tinitial) by analytical method
(2) Tune final temperature (Tfinal) by analytical method
(3) Setting cooling factor (α)
(4) Scurrent is created by modifying the internal angles of protein
(5) Sbetter=Scurrent
(6) Calculate energy of protein applying a Force Field
(7) T=Tinitial
(8) While (T>Tfinal) do //Temperature Cycle
(9) While (stop condition) //Metropolis Cycle
(10) Create new solution (Snew) by modifying internal angles of the protein
(11) Calculate Energy of the protein using a force field function
(12) Obtain difference of energies between these two proteins
(13) If (difference ≤ 0) then
(14) Scurrent=Snew
(15) If energy(Snew) < energy(Sbetter) then
(16) Sbetter=Scurrent
(17) end if
(18) else
(19) Boltzmann Probability = exp(-difference/T)
(20) If (Boltzmann Probability > random(0, 1)) then
(21) Scurrent=Snew
(22) end if
(23) end if
(24) end while
(25) Decrease T by cooling function (Tk+1=α∗Tk)
(26) end while
(27) Show the best solution of PFP (Sbetter)
Applying the cooling function (1) requires the cooling factor α. To reduce the temperature very slowly, α must be set very close to 1 (see line (3)); to reduce it very fast, α is set close to 0.70. An initial solution of PFP is created and assigned to the current solution Scurrent (see line (4)); the internal angles of this initial solution are set at random. At this point, the best solution Sbetter is Scurrent (see line (5)). The energy of Scurrent is calculated by applying a force field function (see line (6)). Before starting the temperature cycle, the initial temperature is loaded into the variable T (see line (7)). Then the temperature cycle starts (see lines (8)–(26)) with the logical condition T greater than Tfinal (see line (8)). Inside the temperature cycle, the metropolis cycle is executed (see lines (9)–(24)). After this cycle is completed, the value of the temperature is decreased (see line (25)).
Inside the metropolis cycle, a new solution of the Protein Folding Problem, Snew, is generated by modifying the previous solution Scurrent. This is done by modifying the internal angles of the protein (see line (10)). The energy of the protein is calculated (see line (11)), and the difference of energies between Snew and Scurrent is determined (see line (12)); this difference is denoted by ΔE = energy(Snew) − energy(Scurrent). The new solution is accepted when it is better than the previous one; in that case the current solution Scurrent is replaced by Snew (see line (14)). When the new solution is worse than the current one, it can still be accepted using the Boltzmann distribution (see line (21)). The acceptance probability of this distribution is directly related to the current temperature and the energy difference between Snew and Scurrent; it is calculated by (4). As the temperature value is reduced, the acceptance probability P(Snew) decreases.
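The move in line (10), modifying the internal angles of the protein, can be sketched as follows. The choice of perturbing a single randomly chosen angle by at most max_delta degrees is a hypothetical move, since the paper does not fix the perturbation size:

```python
import random

def perturb_angles(angles, max_delta=10.0):
    """Build S_new from S_current by nudging one dihedral angle.

    `angles` is a flat list of dihedral angles in degrees; the result
    differs from the input in at most one position, wrapped to [0, 360).
    The single-angle move and max_delta are illustrative assumptions."""
    s_new = list(angles)
    i = random.randrange(len(s_new))
    s_new[i] = (s_new[i] + random.uniform(-max_delta, max_delta)) % 360.0
    return s_new
```

In a full implementation the perturbed angles would additionally be clamped to the Ramachandran ranges of Section 2 before the force field is evaluated.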
4. MPSABBE Algorithm
4.1. General Description
MPSABBE is a hybrid algorithm, which has four phases (see Figure 1). These phases are (i) Multiquenching Phase (MQP) applied from extremely high to high temperatures, (ii) Boltzmann Annealing Phase (BAP), which is executed from high to low temperatures, (iii) Bose-Einstein Annealing Phase (BEAP) from low to very low temperatures, and finally (iv) Dynamical Equilibrium Phase (DEP) which applies an annealing process at extremely low temperatures using Bose-Einstein distribution.
MPSABBE phases.
In order to accept worse solutions, BAP and BEAP apply the Boltzmann and Bose-Einstein distributions, respectively. This is done with the aim of escaping from local minima. DEP is an extension of BEAP in which the stochastic equilibrium is dynamically detected. This is done by applying a regression method inside the metropolis cycle; the iteration number is taken as the independent variable and the energy value of each iteration as the dependent variable. The equilibrium detection criterion is the slope of the energy function inside the metropolis cycle. The four phases MQP, BAP, BEAP, and DEP are executed in the temperature ranges shown in Table 1. The initial and final temperatures Tinitial and Tf are determined using the analytical tuning method of Section 5. The intermediate threshold temperatures are determined using a variability criterion: the variability is larger where the temperature is higher.
Table 1: Temperature ranges of MPSABBE.

Phase                                                Initial temperature   Final temperature
MQP  (from very high to high temperatures)           Tinitial              TfMQP
BAP  (from high to low temperatures)                 TfMQP                 TfBAP
BEAP (from low to very low temperatures)             TfBAP                 TfBEAP
DEP  (from very low to extremely low temperatures)   TfBEAP                Tf
4.2. MQP Phase of MPSABBE
MQP has several subphases. It starts at an extremely high initial temperature (Tinitial), which is obtained by an analytical method [41]. This phase finishes when a threshold temperature (TfMQP) is reached. MQP uses the cooling function given by

(7) Tk+1 = αQuenching·γk·Tk,

where αQuenching is a decrement factor of the temperature parameter, in the range [0.7, 1.0], which defines how fast each MQP subphase decreases the temperature. A very low αQuenching value decreases the temperature very fast. Besides, γk is defined as

(8) γk = 1 − τk.
The τ parameter is defined by (9), where 0 < τ < 1; it produces a quadratic decrement of the temperature. Notice that τk converges to zero, and then (7) becomes equivalent to (10):

(9) τk = (τk−1)²
(10) Tk+1 = αQuenching·Tk.
The transition between two subphases is based on the τ parameter; it occurs when τ converges to zero (τ ≈ 0). When τ is very close to zero, a new MQP subphase starts and τ is reset to its initial value. This process continues until the temperature TfMQP is reached. Figure 2 shows the MQP phase and its subphases; at the start of each subphase, τ takes its initial value again.
MQP phase of MPSABBE algorithm.
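The cooling behavior of (7)–(10) can be simulated directly. The subphase structure appears as the factor γk = 1 − τk climbing back toward αQuenching after every reset; the numeric values below are illustrative, not taken from the paper:

```python
def mqp_temperatures(t_initial, t_f_mqp, alpha_q=0.85, tau0=0.5, eps=1e-6):
    """Temperatures visited by the MQP cooling function (7).

    tau is squared at every step (Eq. (9)); once tau is nearly zero,
    a new subphase starts and tau is reset to its initial value."""
    temps, t, tau = [t_initial], t_initial, tau0
    while t > t_f_mqp:
        t = alpha_q * (1.0 - tau) * t   # Eq. (7) with gamma_k = 1 - tau_k
        tau = tau * tau                 # Eq. (9): quadratic decrement
        if tau < eps:                   # subphase transition
            tau = tau0
        temps.append(t)
    return temps
```

Because γk starts well below one in every subphase, the early steps of each subphase cool much faster than plain geometric cooling (10), which is exactly the quenching effect MQP relies on.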
In Algorithm 3, the MQP pseudocode of MPSABBE is shown. In the setting section (see lines (4)–(6)), the initial temperature is calculated by an analytical method, and the final temperature of this phase (TfMQP) is set to a value determined experimentally. In line (5), the variable T is set to the initial temperature; the factors αQuenching and τ are set to their initial values in line (6). The initial solution Scurrent is generated (see line (8)); its energy Energy(Scurrent) is calculated, and E(Scurrent) and Smin are set to Energy(Scurrent) and Scurrent, respectively.
<bold>Algorithm 3: </bold>MQP pseudocode of MPSABBE algorithm.
(1) MQP Procedure( )
(2) Begin
(3) //Setting section
(4) Tinitial = Initial Temperature calculated by analytical method
(5) TfMQP = Initial value, T=Tinitial
(6) αQuenching = initial value, τ = Initial value
(7) //Creation of initial solution
(8) Scurrent = Create the initial solution, E(Scurrent)=Energy(Scurrent)
(9) Smin=Scurrent, E(Smin)=E(Scurrent)
(10) Repeat //External Cycle
(11) Repeat //Metropolis Cycle
(12) Snew = Perturbation(Scurrent)
(13) Difference = E(Snew)−E(Scurrent)
(14) If (Difference < 0) then
(15) Scurrent=Snew
(16) E(Scurrent)=E(Snew)
(17) elseif exp(-Difference/T)>random[0,1] Then //Boltzmann Probability
(18) Scurrent=Snew
(19) E(Scurrent)=E(Snew)
(20) end if
(21) If E(Scurrent)<E(Smin) then //save Smin
(22) Smin=Scurrent
(23) E(Smin)=E(Scurrent)
(24) end if
(25) Until Metropolis Cycle is Finished
(26) τ = τ∗τ
(27) If τ very close to 0 Then
(28) τ = initial value
(29) end if
(30) T=αQuenching∗(1-τ)∗T
(31) Until T ≤ TfMQP //External Cycle
(32) End procedure
The external cycle starts at line (10) and finishes at line (31). Inside it, the metropolis cycle generates solutions of PFP and accepts or rejects them using the Boltzmann distribution. The temperature parameter is decreased in the external cycle by applying the cooling function (7) (see line (30)). In this cycle, τ is updated by (9) (see line (26)); when τ is very close to zero, it is reset to its initial value (see line (28)).
Once the external cycle has started, the metropolis cycle starts too. This cycle generates new solutions of PFP. A new solution Snew is obtained by applying a small perturbation to the current solution Scurrent (see line (12)). The difference between the energies of Snew and Scurrent is calculated (see line (13)). If this difference is less than zero (see line (14)), then the new solution Snew is accepted: Scurrent is replaced by Snew (see line (15)) and E(Scurrent) by E(Snew) (see line (16)). If the difference of energies is larger than zero, then the Boltzmann probability is applied (see line (17)); if this probability is larger than a random number between 0 and 1, the new solution Snew is accepted and Scurrent and E(Scurrent) are updated (see lines (18)-(19)). If E(Scurrent) is less than E(Smin) (see line (21)), then Smin is set to Scurrent (see line (22)) and E(Smin) to E(Scurrent) (see line (23)).
4.3. BAP Phase of MPSABBE
In Algorithm 4, the pseudocode of BAP is shown. BAP is based on simulated annealing. The temperature parameter is decreased by Tk+1 = αAnnealing·Tk or Tk+1 = e^(−αAnnealing)·Tk; correspondingly, the number of metropolis cycles is determined by (21) or (27). In the internal cycle of BAP, new solutions for the instance are generated. In this cycle, a solution better than the previous one is always accepted, while worse solutions are accepted or rejected by applying the Boltzmann distribution (4). The length of the Markov chain (i.e., the internal cycle length) grows according to (18), where the increment coefficient β is calculated with (22). The initial temperature of BAP is the final temperature of the MQP phase, and the final temperature of BAP is very close to zero.
<bold>Algorithm 4: </bold>Pseudocode of BAP phase of MPSABBE.
(1) BAP Phase( )
(2) Begin
(3) T=TfMQP (Final temperature of MQP Phase)
(4) TfBAP = Final Temperature of this phase
(5) α = initial value (very close to one)
(6) β = Value calculated by analytical method
(7) CM = Initial value
(8) While (T>TfBAP) do
(9) k=1
(10) while (k≤CM) do
(11) Snew = perturbation system(Scurrent)
(12) Difference = Enew-Ecurrent
(13) If (Difference ≤ 0) then
(14) Scurrent=Snew
(15) E(Scurrent)=E(Snew)
(16) ElseIf (exp(-Difference/T))>random[0,1]) then
(17) Scurrent=Snew
(18) E(Scurrent)=E(Snew)
(19) End if
(20) If E(Scurrent)<E(Smin) then //save Smin
(21) Smin=Scurrent
(22) E(Smin)=E(Scurrent)
(23) end if
(24) k=k+1
(25) end while
(26) T=α∗T or T=(exp(-α))∗T
(27) CM=β∗CM
(28) End while
(29) End
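The interplay between lines (26) and (27), geometric cooling paired with a geometrically growing metropolis cycle, can be sketched as follows; the α, β, and temperature values are illustrative choices:

```python
def bap_schedule(t_start, t_final, alpha=0.95, cm0=10, beta=1.05):
    """Yield (temperature, metropolis_length) pairs for the BAP phase:
    T is multiplied by alpha (line (26)) while CM is multiplied by
    beta (line (27)), so late, cold cycles explore much longer."""
    t, cm = float(t_start), float(cm0)
    schedule = []
    while t > t_final:
        schedule.append((t, int(cm)))
        t *= alpha          # line (26): T = alpha * T
        cm *= beta          # line (27): CM = beta * CM
    return schedule

sched = bap_schedule(100.0, 1.0)
```

This is the behavior the analytical tuning of Section 5 formalizes: with β from (22), the last chain length reaches Lmax exactly when the temperature reaches Tfinal.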
4.4. BEAP Phase of MPSABBE
In Algorithm 5, the pseudocode of BEAP is shown. Again, the external cycle decreases the temperature value according to cooling function (1) or (2). This time, the metropolis cycle length is constant, equal to the maximum length of the last metropolis cycle of the BAP phase. In this internal cycle, the Bose-Einstein distribution is applied for accepting worse solutions.
<bold>Algorithm 5: </bold>Pseudocode of BEAP phase.
(1) BEAP Phase( )
(2) Begin
(3) T = Threshold value; determine exp(λ)
(4) Tfinal = value very close to zero
(5) α = initial value
(6) β = Value calculated by analytical method
(7) CM = Initial value
(8) While (T>Tfinal) do
(9) k=1
(10) while (k≤CM) do
(11) Sj = perturbation system(Si)
(12) ΔE=Ej-Ei
(13) If (ΔE≤0) then
(14) Si=Sj
(15) ElseIf ((1/(exp(λ)∗exp(ΔE/T)−1))>random[0,1]) then
(16) Si=Sj
(17) End if
(18) k=k+1
(19) end while
(20) T=α∗T or T=(exp(-α))∗T
(21) End while
(22) End
4.5. DEP Phase of MPSABBE
In Algorithm 6, the DEP goal is to detect the stochastic equilibrium by determining the iteration where the slope of the energy function remains very close to zero. In order to do that, let us define the following variables: (a) xi, the iteration number in the metropolis cycle, xi = 1, 2, …, n, and (b) Ei, the energy found by the algorithm at iteration xi. Using a standard least squares method, the slope for n iterations is defined by

(11) m = (n·Σ_{i=1..n} xi·Ei − (Σ_{i=1..n} xi)·(Σ_{i=1..n} Ei)) / (n·Σ_{i=1..n} xi² − (Σ_{i=1..n} xi)²),

which, substituting xi = i, becomes

(12) m = k1·Σ_{i=1..n} i·Ei − k2·Σ_{i=1..n} Ei,

where

(13) k1 = 12/(n³ − n), k2 = 6/(n² − n).
<bold>Algorithm 6: </bold>Pseudocode of DEP phase.
(1) DEP Phase( )
(2) Begin
(3) Repeat //equilibrium detection cycle
(4) n=0
(5) Esummary=0, Summary=0
(6) while (n<CM) do
(7) Sj = perturbation_system(Si)
(8) If E(Sj) reaches the target energy then stop( )
(9) ΔE=Ej−Ei
(10) If (ΔE≤0) then
(11) Si=Sj
(12) ElseIf ((1/(exp(λ)∗exp(ΔE/T)−1))>random[0,1]) then
(13) Si=Sj
(14) End if
(15) n=n+1
(16) Summary=Summary+n∗Ei
(17) Esummary=Esummary+Ei
(18) end while
(19) T=α∗T or T=(exp(−α))∗T
(20) k1=12/(n∗n∗n−n)
(21) k2=6/(n∗n−n)
(22) m=k1∗Summary−k2∗Esummary
(23) Until (m very close to 0) //stochastic equilibrium
(24) End
Notice that the computation of (12) has complexity O(n). This equation contains only summations; thus, it is less costly to evaluate than (11). The summations are computed with simple accumulators, and k1 and k2 are constants for a particular value of n.
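A minimal check of the O(n) slope computation: reducing Eq. (11) with xi = i gives k1 = 12/(n³ − n) and k2 = 6/(n² − n) in m = k1·Σ i·Ei − k2·Σ Ei, which the assertions below confirm on simple energy traces:

```python
def metropolis_slope(energies):
    """Least-squares slope of energy vs. iteration number in O(n).

    Implements Eq. (12): m = k1 * sum(i * E_i) - k2 * sum(E_i),
    with k1 = 12/(n^3 - n) and k2 = 6/(n^2 - n) from Eq. (13)."""
    n = len(energies)
    sum_ie = sum(i * e for i, e in enumerate(energies, start=1))
    sum_e = sum(energies)
    k1 = 12.0 / (n ** 3 - n)
    k2 = 6.0 / (n ** 2 - n)
    return k1 * sum_ie - k2 * sum_e

# A perfectly linear energy trace recovers its step as the slope,
# and a flat trace (stochastic equilibrium) gives slope zero.
```

In DEP the two sums are accumulated incrementally inside the metropolis cycle, so the slope test at the end of each cycle costs only a few multiplications.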
5. Analytical Tuning Method
5.1. Parameters Setting Based on Boltzmann Distribution
Parameters of MPSABBE are tuned by the analytical method [42]. The initial temperature is defined by the maximum difference, named the maximum decrement ΔZmax, which is calculated using a sample of random protein structures at the highest temperature range. In this sample, the energy difference of two consecutive protein structures defines a simple decrement of energy ΔZi,j, and ΔZmax is the maximum such difference in the sample. On the other hand, the final temperature is calculated by applying the minimum deterioration (i.e., minimum decrement) ΔZmin of a sample of protein structures taken at low temperatures. Analytical tuning based on the Boltzmann distribution can be used to set the initial temperature: at high temperatures, the probability of accepting any new solution Snew is near one (P(Snew) ≈ 1) and the admitted decrement of the cost function is maximal. The initial temperature (Tinitial) is therefore associated with the maximum admitted deterioration and the chosen acceptance probability P(Snew).
Let Scurrent be the current solution and Snew a new proposed one, and let Z(Scurrent) and Z(Snew) be the costs associated with Scurrent and Snew, respectively. The maximum and minimum deteriorations are ΔZmax and ΔZmin; then the probability P(ΔZmax) of accepting a new solution Snew with the maximum deterioration is obtained from

(14) P(ΔZ) = exp(−ΔZ/T).

This equation is basically the Boltzmann distribution, which is applied for calculating Tinitial. This temperature value is defined by

(15) Tinitial = −ΔZmax / ln(P(ΔZmax)).

Similarly, the final temperature (Tfinal) is established according to the probability of accepting a new solution Snew with the minimum deterioration:

(16) Tfinal = −ΔZmin / ln(P(ΔZmin)).

Other parameters of MPSABBE are calculated by assuming a particular cooling function; for example, the metropolis cycle length is derived assuming

(17) Tk+1 = αTk.
The analytical method determines the metropolis cycle length Lk with a simple Markov model [42]. At high temperatures, only a few iterations are required because, in this condition, the stochastic equilibrium is reached very fast. Nevertheless, at low temperatures, a more exhaustive exploration is needed and Lk should be as large as possible. Let L1 be Lk at the temperature Tinitial and let Lmax be the maximum metropolis cycle length. Let the temperature Tk be decreased by the cooling function (17) and let Lk+1 be calculated by

(18) Lk+1 = βLk,

where β is the increment coefficient of the metropolis cycle (β > 1), so Lk+1 > Lk and L1 is the initial value. The Markov chain length of the last metropolis cycle is equal to Lmax. Functions (17) and (18) are applied consecutively in simulated annealing from Tinitial to Tfinal; consequently, Tn and Lmax are obtained by (19) and (20), respectively:

(19) Tn = α^n·Tinitial
(20) Lmax = β^n·L1,

where n is the number of steps from Tinitial to Tfinal.
Notice that the increment coefficient β can be calculated once the initial length L1 and the maximum length Lmax are available. The former can simply be set close to one, while the latter depends on the exploration level established in the algorithm, as follows.
Thus, the number of times the Metropolis cycle is executed is obtained from (21); once n is determined, the increment coefficient of the Metropolis cycle length follows from (22):

n = (ln Tfinal − ln Tinitial) / ln α,    (21)
β = exp((ln Lmax − ln L1) / n).    (22)
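The schedule relations (21), (22), and (20) can be sketched as follows (the temperatures, lengths, and α below are illustrative values, not taken from the paper's experiments); the final check confirms that growing L1 by β over n steps recovers Lmax:

```python
import math

def cooling_steps(t_initial, t_final, alpha):
    """Eq. (21): n = (ln T_final - ln T_initial) / ln(alpha)."""
    return (math.log(t_final) - math.log(t_initial)) / math.log(alpha)

def length_increment(l1, l_max, n):
    """Eq. (22): beta = exp((ln L_max - ln L1) / n)."""
    return math.exp((math.log(l_max) - math.log(l1)) / n)

# Illustrative schedule parameters.
t_initial, t_final, alpha = 2000.0, 0.1, 0.95
l1, l_max = 1.0, 5000.0

n = cooling_steps(t_initial, t_final, alpha)
beta = length_increment(l1, l_max, n)
# Eq. (20): after n cooling steps, L1 * beta**n equals Lmax.
recovered_l_max = l1 * beta ** n
```

Because α < 1 and Tfinal < Tinitial, both logarithms in (21) are negative and n comes out positive.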
5.2. Parameter Setting Based on the Bose-Einstein Distribution
The initial and final temperatures can also be calculated by applying the Bose-Einstein distribution. In this case, the probability of accepting a new solution with deterioration ΔZ is given by (23), and the initial and final temperatures are calculated with (24) and (25), respectively:

P(ΔZ) = 1 / (e^(ΔZ/T) − 1),    (23)
Tinitial = ΔZmax / ln((P(ΔZmax) + 1) / P(ΔZmax)),    (24)
Tfinal = ΔZmin / ln((P(ΔZmin) + 1) / P(ΔZmin)).    (25)
Let Tk be decreased by the cooling function (2); thus, Tn is calculated by

Tn = e^(−nα) · Tinitial.    (26)

As a consequence, n and β are calculated by

n = (ln Tfinal − ln Tinitial) / (−α),    (27)
β = exp((ln Lmax − ln L1) / n).    (28)
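Equations (24) and (25) simply invert the Bose-Einstein distribution (23) for T; a minimal sketch (with hypothetical ΔZ values and acceptance probabilities) is shown below, including a round-trip check that substituting the temperature back into (23) recovers the acceptance probability:

```python
import math

def bose_einstein_temperature(dz, p_accept):
    """Eqs. (24)-(25): T = dZ / ln((P + 1) / P)."""
    return dz / math.log((p_accept + 1.0) / p_accept)

# Hypothetical deteriorations, as in the Boltzmann case.
dz_max, dz_min = 120.0, 0.5
t_initial = bose_einstein_temperature(dz_max, 0.95)
t_final = bose_einstein_temperature(dz_min, 0.01)

# Round trip through eq. (23): P(dZ) = 1 / (e^(dZ/T) - 1).
p_check = 1.0 / (math.exp(dz_max / t_initial) - 1.0)
```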
Notice that the increment coefficient β can be calculated once the initial and maximum Metropolis lengths L1 and Lmax are available [42]. The former can simply be set close to one, while the latter depends on the exploration level established in the algorithm. For any solution Si, the value of Lmax depends on the size of its neighborhood VSi; thus, Lmax = C·|VSi| with C = −ln(PrSi), where PrSi is the rejection probability for a solution Si. The parameter C ranges from 1 to 4.6; a larger value of C ensures a better exploration of the neighborhood of Si at the final temperature. Hence, different exploration levels can be applied: exploration levels of 63%, 86%, 95%, and 99% correspond to C = 1, 2, 3, and 4.6, respectively. Because Lmax can be very large for PFP instances, it is important to detect the stochastic equilibrium explicitly; this is done efficiently in the DEP phase of MPSABBE. The next section presents the experimental performance of MPSABBE using the Boltzmann and Bose-Einstein distributions.
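The rule Lmax = C·|VSi| with C = −ln(PrSi) can be sketched as follows (the neighborhood size is a hypothetical value; the C = 1 and C ≈ 4.6 endpoints follow the range stated above):

```python
import math

def max_metropolis_length(neighborhood_size, rejection_probability):
    """Lmax = C * |V_Si| with C = -ln(Pr_Si); C ranges over [1, 4.6]."""
    c = -math.log(rejection_probability)
    return c * neighborhood_size

# Hypothetical neighborhood of 1000 conformations.
l_low = max_metropolis_length(1000, math.exp(-1))  # Pr = e^-1 gives C = 1
l_high = max_metropolis_length(1000, 0.01)         # Pr = 0.01 gives C = 4.6
```

For a realistic PFP neighborhood the resulting Lmax is far larger, which is why the DEP phase's equilibrium detection matters.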
6. Experimental Results
MPSABBE is tested with five PFP instances: Met5-enkephalin, proinsulin, T0549, T0335, and T0281. These instances have different sequence lengths and different numbers of variables (dihedral angles). The smallest sequence, Met5-enkephalin, has five amino acids and 19 variables. The largest, a hypothetical protein (CASP T0281), has 90 amino acids and 458 variables. The proinsulin instance has 31 amino acids and 132 variables; 2K5E (CASP T0549) has 73 amino acids and 343 variables; and the Bacillus subtilis instance (CASP T0335) has 85 amino acids and 450 variables. The dihedral angles used in the simulations were phi (Φ), psi (Ψ), omega (ω), and chi (χ). The initial and final temperatures are tuned analytically. In MQP, the parameters αQuenching and τ are set to 0.85 and 0.999, respectively, and in each subphase of MQP the final value of τ is set to 0.001.
In Table 2, the results obtained for Met5-enkephalin with the MPSABBE algorithm are shown: the average energy, the processing time in minutes, and the average RMSD (Root-Mean-Square Deviation) [43], calculated with TM-Align [44]. The best average solution for Met5-enkephalin is −5.0634 kcal/mol with 0.8427 minutes of processing time and an average RMSD of 0.361 Å (angstroms). The RMSD measures the structural alignment between two proteins (target and solution); the target used in this paper was taken from the Protein Data Bank (PDB). An RMSD near zero indicates a perfect structural alignment between both proteins, so the RMSD is commonly used in protein folding to show how structurally similar a solution obtained by simulation is to the target. Figure 3 plots the energy and RMSD of every Met5-enkephalin solution calculated by MPSABBE. The best energy found by MPSABBE was −7.2787 kcal/mol, which is of poor quality compared with the literature: using ECEPP/2, with ω fixed at 180 or with ω variable, the reported energies are −10.72 [20] and −12.90 kcal/mol [43, 45], respectively. Hence, the exploration ability of MPSABBE is not good enough for this instance and the algorithm requires improvement. Figure 4 shows the energy landscape of Met5-enkephalin. The curve enveloping the solutions in Figure 3 is only a descriptive tool to illustrate that the optimal solution is reached when the RMSD is very small; however, this is not a good stop condition by itself.
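The RMSD measure used above can be illustrated with a minimal sketch (toy coordinates; no superposition step is performed here, whereas TM-Align aligns the structures before the deviation is measured):

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate sets (Å)."""
    assert len(coords_a) == len(coords_b)
    sq_sum = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                 for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(sq_sum / len(coords_a))

# Toy 3-atom example: each atom of the model is displaced 0.3 Å along x.
target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.3, 0.0, 0.0), (1.8, 0.0, 0.0), (3.3, 0.0, 0.0)]
deviation = rmsd(target, model)
```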
Notice that the best result reported in the literature for the classical simulated annealing using the Boltzmann distribution was only −5 kcal/mol [43], while the best result obtained here by MPSABBE using the Bose-Einstein distribution was −7.2787 kcal/mol.
Table 2. Average results of Met5-enkephalin with the MPSABBE algorithm.

αAnnealing | Average energy (kcal/mol) | Processing time (minutes) | Average RMSD (Å)
0.75 | −3.0836 | 0.1252 | 0.4517
0.80 | −4.3025 | 0.1701 | 0.4327
0.85 | −4.4093 | 0.2023 | 0.3510
0.90 | −4.6493 | 0.3384 | 0.5097
0.95 | −5.0634 | 0.8427 | 0.3610
Figure 3: Energy and RMSD for Met5-enkephalin.
Figure 4: Landscape of energy, RMSD, and processing time for Met5-enkephalin.
In Table 3, the results of proinsulin obtained with the MPSABBE algorithm are shown. The best average solution for this instance is −122.4350 kcal/mol with 20.7302 minutes of processing time and an average RMSD of 3.127 Å; it was obtained with αAnnealing = 0.95. Figure 5 plots the energy and RMSD of each proinsulin solution calculated by MPSABBE. The best solution found by MPSABBE was −142.7586 kcal/mol. Figure 6 shows the landscape of proinsulin.
Table 3. Average results of proinsulin with the MPSABBE algorithm.

αAnnealing | Average energy (kcal/mol) | Processing time (minutes) | Average RMSD (Å)
0.75 | −94.2520 | 3.0279 | 3.1370
0.80 | −102.5484 | 3.8918 | 3.1153
0.85 | −102.1247 | 5.1319 | 3.1253
0.90 | −108.1093 | 7.8184 | 3.3083
0.95 | −122.4350 | 20.7302 | 3.1273
Figure 5: Energy and RMSD of proinsulin.
Figure 6: Landscape of energy, RMSD, and processing time for proinsulin.
In Table 4, the results of the T0549 instance obtained with the MPSABBE algorithm are shown. The best average solution for this instance is −257.0625 kcal/mol with 106.6151 minutes of processing time and an average RMSD of 4.30 Å; it was obtained with αAnnealing = 0.95. Figure 7 plots the energy and RMSD of each T0549 solution calculated by MPSABBE. The best solution found was −317.2117 kcal/mol. Figure 8 shows the landscape of T0549.
Table 4. Average results of T0549 with the MPSABBE algorithm.

αAnnealing | Average energy (kcal/mol) | Processing time (minutes) | Average RMSD (Å)
0.75 | −183.6351 | 19.4805 | 4.3933
0.80 | −190.2890 | 24.9117 | 4.4180
0.85 | −208.0338 | 31.1958 | 4.2933
0.90 | −231.2849 | 48.6717 | 4.2887
0.95 | −257.0625 | 106.6151 | 4.3037
Figure 7: Energy and RMSD for T0549.
Figure 8: Landscape of energy, RMSD, and processing time for T0549.
In Table 5, the results of the T0335 instance obtained with the MPSABBE algorithm are shown. The best average solution for this instance is −378.6827 kcal/mol with 202.2453 minutes of processing time and an average RMSD of 3.5793 Å; it was obtained with αAnnealing = 0.95. Figure 9 plots the energy and RMSD of each T0335 solution calculated by MPSABBE. The best solution was −427.2939 kcal/mol. Figure 10 shows the landscape of T0335.
Table 5. Average results of T0335 with the MPSABBE algorithm.

αAnnealing | Average energy (kcal/mol) | Processing time (minutes) | Average RMSD (Å)
0.75 | −249.4399 | 32.9611 | 3.7413
0.80 | −267.4245 | 40.4676 | 3.6750
0.85 | −293.0409 | 52.2383 | 3.6160
0.90 | −335.0567 | 78.9619 | 3.5828
0.95 | −378.6827 | 202.2453 | 3.5793
Figure 9: Energy and RMSD of T0335.
Figure 10: Landscape of energy, RMSD, and processing time of T0335.
In Table 6, the results of the T0281 instance obtained with the MPSABBE algorithm are shown. The best average solution for this instance is −322.3821 kcal/mol with 187.5070 minutes of processing time and an average RMSD of 4.5 Å; it was obtained with αAnnealing = 0.95. Figure 11 plots the energy and RMSD of each T0281 solution calculated by MPSABBE. The best solution found was −380.1765 kcal/mol. Figure 12 shows the landscape of T0281.
Table 6. Average results of T0281 with the MPSABBE algorithm.

αAnnealing | Average energy (kcal/mol) | Processing time (minutes) | Average RMSD (Å)
0.75 | −188.9717 | 32.7761 | 4.6160
0.80 | −193.9981 | 40.4018 | 4.6347
0.85 | −236.3011 | 53.3635 | 4.5507
0.90 | −263.1571 | 79.3565 | 4.4467
0.95 | −322.3821 | 187.5070 | 4.5515
Figure 11: Energy and RMSD for T0281.
Figure 12: Landscape of energy, RMSD, and processing time for T0281.
Figures 13–15 show the energy traces obtained from consecutive solutions in the Metropolis cycle during specific executions. These figures correspond to the energies obtained by the MPSABBE algorithm with the Met5-enkephalin, proinsulin, and T0281 instances, respectively.
Figure 13: Energy of MPSABBE with the Met5-enkephalin instance.
Figure 14: Energy of MPSABBE with the proinsulin instance.
Figure 15: Energy of MPSABBE with the T0281 instance.
6.1. Hypothesis Testing
In Table 7, the average and standard deviation of the energy and processing time of MPSABBE for each instance are shown; Table 8 shows the same statistics for CMQA [14]. For energy, the null hypothesis is defined as H0: μQMPSABBE ≤ μQCMQA, which means that the average energy of MPSABBE (μQMPSABBE) for each instance is less than or equal to that of CMQA (μQCMQA); the alternative hypothesis is H1: μQMPSABBE > μQCMQA. For processing time, the null hypothesis is H0: μTMPSABBE ≤ μTCMQA, which means that the average processing time of MPSABBE (μTMPSABBE) is less than or equal to that of CMQA (μTCMQA); the alternative hypothesis is H1: μTMPSABBE > μTCMQA. In Table 9, the Student's t values are shown; they were calculated from the averages and standard deviations of energy and execution time in Tables 7 and 8.
Table 7. Average energy and standard deviation of MPSABBE.

Instance | Energy average (kcal/mol) | Energy standard deviation | Time average (minutes) | Time standard deviation
Met5-enkephalin | −4.3016 | 0.7410 | 0.3357 | 0.2943
Proinsulin | −105.8938 | 10.4815 | 8.8670 | 8.1788
T0549 | −214.0610 | 30.3024 | 46.1749 | 35.5258
T0335 | −304.7289 | 52.3785 | 91.6016 | 76.1357
T0281 | −240.9620 | 54.8912 | 85.0103 | 71.3112
Table 8. Average energy and standard deviation of CMQA.

Instance | Energy average (kcal/mol) | Energy standard deviation | Time average (minutes) | Time standard deviation
Met5-enkephalin | −3.7820 | 0.7848 | 0.5719 | 0.4509
Proinsulin | −104.7165 | 10.8593 | 18.9617 | 16.2658
T0549 | −217.1220 | 36.7019 | 121.9018 | 96.2037
T0335 | −311.3921 | 39.3025 | 204.2191 | 154.3906
T0281 | −254.3024 | 42.6025 | 231.8738 | 185.2004
Table 9. Student's t value for each instance.

Instance | Student's t (energy) | Student's t (execution time)
Met5-enkephalin | −2.6363 | −2.4022
Proinsulin | −0.4272 | −3.0368
T0549 | 0.3522 | −4.0444
T0335 | 0.5573 | −3.5832
T0281 | 1.0515 | −4.0533
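The comparison in Table 9 can be sketched with a two-sample t-statistic computed from summary statistics (Welch's form; the number of runs per instance is not stated in this section, so n = 30 below is an assumption for illustration):

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample (Welch) t-statistic from means and standard deviations."""
    return (mean_a - mean_b) / math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

# Met5-enkephalin energy statistics from Tables 7 and 8; n = 30 is assumed.
t_energy = welch_t(-4.3016, 0.7410, 30, -3.7820, 0.7848, 30)
# A statistic below the critical value 1.645 means H0 cannot be rejected.
```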
For the Met5-enkephalin instance, the Student's t value for energy is −2.6363 (Table 9) and the critical value is 1.645. Since the statistic does not exceed the critical value, the null hypothesis H0: μQMPSABBE ≤ μQCMQA is accepted: the average energy of MPSABBE (μQMPSABBE) for Met5-enkephalin is less than or equal to that of CMQA (μQCMQA), so MPSABBE generates better-quality solutions than CMQA for this instance. For processing time, the statistic is −2.4022; thus, MPSABBE also uses less processing time than CMQA on Met5-enkephalin.
The same reasoning applies to the remaining instances. For proinsulin, the Student's t values are −0.4272 for energy and −3.0368 for processing time; for T0549, 0.3522 and −4.0444; for T0335, 0.5573 and −3.5832; and for T0281, 1.0515 and −4.0533. In every case the statistic is below the critical value of 1.645, so both null hypotheses are accepted: statistically, MPSABBE generates solutions of quality at least as good as those of CMQA and uses less processing execution time than CMQA for all instances.
Notice that the improvement obtained by using the two distributions is larger when the protein is smaller. For Met5-enkephalin and proinsulin (five and thirty-one amino acids), MPSABBE surpasses CMQA by 13.73% and 1.1243%, respectively, whereas for T0549, T0335, and T0281 (73, 85, and 90 amino acids) these figures are −1.12%, −2.13%, and −3.75%. Thus, the new algorithm obtains better results for small proteins than the classical SA.
7. Conclusions
In this paper, a new simulated annealing algorithm for the Protein Folding Problem, named MPSABBE, is presented. This algorithm uses both the Bose-Einstein and Boltzmann distributions in SA, whereas traditionally SA for PFP uses only the Boltzmann distribution as the acceptance probability of bad solutions. MPSABBE was compared with a classical SA for protein folding that applies only the Boltzmann distribution. According to the experiments, the use of the two distributions makes the new algorithm more efficient when the proteins are small. The quality of the solutions obtained by the new approach is not always the best, although the difference in solution quality is only 2 to 5% in the worst cases. Moreover, the new approach can exceed the solution quality of the classical SA by one to ten percent, while its execution time is generally lower.
Competing Interests
The authors declare that they have no competing interests.
References

[1] Lewin B.
[2] Anfinsen C. B. Principles that govern the folding of protein chains.
[3] Liwo A., Czaplewski C., Ołdziej S., Scheraga H. A. Computational techniques for efficient conformational sampling of proteins.
[4] Lee J., Scheraga H. A., Rackovsky S. New optimization method for conformational energy calculations on polypeptides: conformational space annealing.
[5] Sohl J. L., Jaswal S. S., Agard D. A. Unfolded conformations of α-lytic protease are more stable than its native state.
[6] Wang Z., Mottonen J., Goldsmith E. J. Kinetically controlled folding of the serpin plasminogen activator inhibitor 1.
[7] Ngo J., Marks J., Karplus M. Computational complexity, protein structure prediction, and the Levinthal paradox.
[8] Ngo J. T., Marks J. Computational complexity of a problem in molecular structure prediction.
[9] Kirkpatrick S., Gelatt C. D., Vecchi M. P. Optimization by simulated annealing.
[10] Černý V. Thermodynamical approach to the traveling salesman problem: an efficient simulation algorithm.
[11] Simons K. T., Kooperberg C., Huang E., Baker D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions.
[12] Kaufmann K. W., Lemmon G. H., Deluca S. L., Sheehan J. H., Meiler J. Practically useful: what the Rosetta protein modeling suite can do for you.
[13] Simoncini D., Zhang K. Y. J. Efficient sampling in fragment-based protein structure prediction using an estimation of distribution algorithm.
[14] Frausto-Solís J., Liñán-García E., Sánchez-Pérez M., Sánchez-Hernández J. P. Chaotic multiquenching annealing applied to the protein folding problem.
[15] Cole E. A. B. Integral evaluation in semiconductor device modelling using simulated annealing with Bose-Einstein statistics.
[16] Levinthal C. Are there pathways for protein folding?
[17] Ponder J. W., Case D. A. Force fields for protein simulations.
[18] Brooks B. R., Bruccoleri R. E., Olafson B. D., States D. J., Swaminathan S., Karplus M. CHARMM: a program for macromolecular energy, minimization, and dynamics calculations.
[19] Momany F. A., McGuire R. F., Burgess A. W., Scheraga H. A. Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids.
[20] Eisenmenger F., Hansmann U. H. E. Variation of the energy landscape of a small peptide under a change from the ECEPP/2 force field to ECEPP/3.
[21] Eisenmenger F., Hansmann U. H. E., Hayryan S., Hu C.-K. [SMMP] A modern package for simulation of proteins.
[22] Némethy G., Gibson K. D., Palmer K. A., Yoon C. N., Paterlini G., Zagari A., Rumsey S., Scheraga H. A. Energy parameters in polypeptides. 10. Improved geometrical parameters and nonbonded interactions for use in the ECEPP/3 algorithm, with application to proline-containing peptides.
[23] Berendsen H. J. C., van der Spoel D., van Drunen R. GROMACS: a message-passing parallel molecular dynamics implementation.
[24] Ramachandran G. N., Venkatachalam C. M., Krimm S. Stereochemical criteria for polypeptide and protein chain conformations. III. Helical and hydrogen-bonded polypeptide chains.
[25] Metropolis N., Rosenbluth A. W., Rosenbluth M. N., Teller A. H., Teller E. Equation of state calculations by fast computing machines.
[26] Liñán-García E., Gallegos-Araiza L. M. Simulated annealing with previous solutions applied to DNA sequence alignment.
[27] Kim J., Pramanik S., Chung M. J. Multiple sequence alignment using simulated annealing.
[28] Shyi-Ming C. H. Multiple DNA sequence alignment based on genetic simulated annealing techniques.
[29] Richer J.-M., Rodriguez-Tello E., Vazquez-Ortiz K. E. Maximum parsimony phylogenetic inference using simulated annealing. In: Schütze O., Coello Coello C. A., Tantar A.-A. (Eds.).
[30] Wales D. J., Scheraga H. A. Global optimization of clusters, crystals, and biomolecules.
[31] Aarts E., Korst J.
[32] Ingber L. Simulated annealing: practice versus theory.
[33] Kjærulff U. Optimal decomposition of probabilistic networks by simulated annealing.
[34] Van Laarhoven P. J. M., Aarts E. H. L.
[35] Nourani Y., Andresen B. A comparison of simulated annealing cooling strategies.
[36] Shen Y., Kiatsupaibul S., Zabinsky Z. B., Smith R. L. An analytically derived cooling schedule for simulated annealing.
[37] Eisberg R., Resnick R., Sullivan J. D. Quantum physics of atoms, molecules, solids, nuclei and particles.
[38] Wall F. T. Alternative derivations of the statistical mechanical distribution laws.
[39] Cornell E. A., Wieman C. E. Bose-Einstein condensation in a dilute gas: the first 70 years and some recent experiments. In: Nieh H.-T. (Ed.), Proceedings of the International Symposium on Frontiers of Science in Celebration of the 80th Birthday of C. N. Yang, World Scientific, Singapore, 2003, pp. 183–222.
[40] Frausto-Solís J., Sánchez-Pérez M., Liñán-García E., Sánchez-Hernández J. P. Threshold temperature tuning simulated annealing for protein folding problem in small peptides.
[41] Frausto-Solís J., Sanvicente-Sánchez H., Imperial-Valenzuela F. ANDYMARK: an analytical method to establish dynamically the length of the Markov chain in simulated annealing for the satisfiability problem. In: Wang T.-D., Li X., Chen S.-H. (Eds.).
[42] Sanvicente-Sánchez H., Frausto-Solís J. A method to establish the cooling scheme in simulated annealing like algorithms. In: Proceedings of the International Conference on Computational Science and Its Applications (ICCSA '04), vol. 3045, Springer, Assisi, Italy, 2004, pp. 755–763. doi:10.1007/978-3-540-24767-8_80.
[43] Nayeem A., Vila J., Scheraga H. A. A comparative study of the simulated-annealing and Monte Carlo-with-minimization approaches to the minimum-energy structures of polypeptides: [Met]-enkephalin.
[44] Zhang Y., Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score.
[45] Li Z., Scheraga H. A. Monte Carlo-minimization approach to the multiple-minima problem in protein folding.