
Multiphase Simulated Annealing Based on Boltzmann and Bose-Einstein Distribution Applied to Protein Folding Problem

Juan Frausto-Solis ¹ (http://orcid.org/0000-0001-9307-0734), Ernesto Liñán-García ², Juan Paulo Sánchez-Hernández ³, J. Javier González-Barbosa ¹, Carlos González-Flores, and Guadalupe Castilla-Valdez ¹ (http://orcid.org/0000-0002-3439-9975)

¹ Instituto Tecnológico de Ciudad Madero, Tecnológico Nacional de México, Avenida Sor Juana Inés de la Cruz s/n, Colonia los Mangos, 89440 Ciudad Madero, TAMPS, Mexico (tecnm.mx)

² Universidad Autónoma de Coahuila, Ciudad Universitaria, 25280 Arteaga, COAH, Mexico (uadec.mx)

³ UPEMOR, Boulevard Cuauhnáhuac 566, Jiutepec, 62550 Mor, Mexico (upemor.edu.mx)

Academic Editor: Fdez-Riverola

Research Article. Advances in Bioinformatics (ABI), ISSN 1687-8035, 1687-8027, Hindawi Publishing Corporation, 2016, Article ID 7357123, doi:10.1155/2016/7357123

Received 24 November 2015; Revised 5 April 2016; Accepted 19 April 2016; Published 20 June 2016

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

A new hybrid Multiphase Simulated Annealing Algorithm using Boltzmann and Bose-Einstein distributions (MPSABBE) is proposed. MPSABBE was designed for solving instances of the Protein Folding Problem (PFP). This new approach has four phases: (i) Multiquenching Phase (MQP), (ii) Boltzmann Annealing Phase (BAP), (iii) Bose-Einstein Annealing Phase (BEAP), and (iv) Dynamical Equilibrium Phase (DEP). BAP and BEAP are simulated annealing search procedures based on the Boltzmann and Bose-Einstein distributions, respectively. DEP, the fourth phase, is also a simulated annealing search procedure; it is applied at the final temperatures and can be seen as a second Bose-Einstein phase. MQP is a search process that ranges from extremely high to high temperatures, applies a very fast cooling process, and is not very restrictive in accepting new solutions. In contrast, BAP and BEAP range from high to low and from low to very low temperatures, respectively, and are more restrictive in accepting new solutions. DEP uses a particular heuristic to detect the stochastic equilibrium by applying a least squares method during its execution. MPSABBE parameters are tuned with an analytical method, which considers the maximal and minimal deterioration of problem instances. MPSABBE was tested with several instances of PFP, showing that the use of both distributions is better than using only the Boltzmann distribution, as classical SA does.

1. Introduction

In genetics, DNA, RNA, and proteins are the basic elements of much research. DNA is a molecule that contains genetic instructions, which are involved in the protein synthesis process [1]. This molecule represents the complete set of hereditary information of an organism. DNA has four different nucleotides: adenine, cytosine, guanine, and thymine. It is divided into genes, and a gene is a sequence of nucleotides that expresses a protein. A functional protein adopts a conformation that approximates the geometrical model of the global minimum energy [2, 3]; this structure is usually named the Native Structure (NS). Folding is a dynamic process in which the lowest free energy of the protein plus the solvent can be reasonably approximated by the minimum free energy found by Monte Carlo, conformational space annealing, genetic algorithms, and some deterministic methods [3, 4]. There are, however, some examples, such as insulin and alpha-lytic protease [5, 6], whose natural conformations do not have minimal energy. In addition, the free energy of an NS conformation depends on the interactions among the atoms and their relative positions.

The Protein Folding Problem (PFP) is an enormous challenge and an important problem in bioinformatics, medicine, and other areas [7]. The function of a protein is directly related to its three-dimensional structure, and misfolded proteins can cause a variety of diseases. The aim of the problem is to find the natural tertiary structure of a protein using only a target sequence. A protein can take a very large number of different conformations between its primary structure and its NS, and the computational problem of finding the NS is what is known as PFP. Because PFP is an NP-hard problem [8], heuristic methods that avoid generating all possible states of the protein are commonly used. To find an NS, computational methods search a huge space of possible solutions and can obtain several structures very close to the NS. A particular class of these methods, known as ab initio, looks for the NS using only the protein's amino acid sequence.

As a consequence, metaheuristics are applied to solve PFP, and simulated annealing (SA) [9, 10] is one of the most successful [11–13]. Classical SA applies a Boltzmann distribution in order to accept bad solutions and escape from local minima. However, to generate high-quality solutions for PFP, new and more efficient SA variants have been designed; one of them, named Chaotic Multiquenching Annealing Algorithm (CMQA), has obtained very good results for proteins such as Met5-enkephalin, proinsulin, T0549, T0335, and T0281 (1PLXW, 1T0C, 2K5E, SR384, and 1A19 in PDB format, respectively). This algorithm has three central phases [14]: (i) Multiquenching Phase (MQP), (ii) Annealing Phase (AP), and (iii) Dynamical Equilibrium Phase (DEP). These phases are explained later in the paper; for this introduction, it is enough to know that each phase is designed with an annealing approach that looks for a better configuration than the one found by the previous phase. At the beginning of the process, MQP improves a random configuration through an annealing procedure executed at extremely high temperatures; AP searches for a better solution than that of MQP with an annealing search applied at high temperatures; and, finally, DEP is applied at low temperatures, looking for a better solution than that obtained by AP. As in classical SA, all of these phases apply the Boltzmann distribution to accept bad solutions. However, the Bose-Einstein distribution can also be used to escape from local minima [15]. Nevertheless, algorithms using these two distributions in different temperature ranges have not been published for PFP.

In this paper, a new SA algorithm named MPSABBE (Multiphase Simulated Annealing based on Boltzmann and Bose-Einstein distributions) is introduced. MPSABBE applies the Boltzmann and Bose-Einstein distributions at high and low temperatures, respectively. The paper shows that using both distributions improves the solution quality. This paper is organized as follows. In Section 2, PFP is described. In Section 3, the classical SA, the SA based on the Bose-Einstein distribution, and the SA applied for solving PFP are explained. In Section 4, the four phases of MPSABBE are presented. In Section 5, the analytical tuning methods for SA and MPSABBE are described. In Section 6, experimental results are shown. Finally, in Section 7, the conclusions of this research are discussed.

2. Protein Folding Problem

PFP is related to the questions of how and why a protein folds into its NS. A protein can adopt an extremely large number of possible conformations [16], which depends on the number of amino acids and the number of conformations available to each amino acid. The essential observation introduced by Levinthal concerns treating PFP as a random search problem: if all conformations of a protein (except the native state) were equally likely, finding the native state by a purely random search would take an impractically long time, so the search has to be guided. PFP is an interdisciplinary problem that involves molecular biology, biophysics, computational biology, and computer science. In the ab initio case, NS prediction requires different mechanisms that lead the search process to a biological three-dimensional structure. As previously mentioned, this process requires only the amino acid sequence. Finding the NS of a protein is very hard because the space of possible conformations is, in general, extremely large. For all practical purposes, PFP can be defined as follows.

Given

(i) a sequence of $n$ amino acids $a_1, a_2, \ldots, a_n$ that represents the primary structure of a protein, and

(ii) an energy function $f^*(\sigma_1, \sigma_2, \ldots, \sigma_n)$, where the variables $\sigma_1, \sigma_2, \ldots, \sigma_n$ represent $n$ dihedral angles,

find

(i) the Native Structure such that $f^*(\sigma_1, \sigma_2, \ldots, \sigma_n)$ represents the lowest energy value, where

(ii) the solution $\sigma^* = (\sigma_1, \sigma_2, \ldots, \sigma_n)$ defines the best three-dimensional configuration.

Force fields are used to represent the energy of a protein; some of the most common are AMBER [17], CHARMM [18], ECEPP/2 [19–21], ECEPP/3 [22], and GROMACS [23]. These force fields compute energy components, for instance, the electrostatic energy, the torsion energy, the hydrogen bond energy, and the Lennard-Jones energy. In this paper ECEPP/2 force field is used.

The atoms of a protein are represented in three-dimensional Cartesian coordinates. There are four types of torsion (dihedral) angles:

(i) Phi ($\phi$): the angle between the amino group and the alpha carbon. It represents the angle between the amino group (NH₂) of amino acid $i$ and its alpha carbon; specifically, it is the angle of the bond between the atom $N_i$ of the amino group and the central carbon ($C_{\alpha,i}$).

(ii) Psi ($\psi$): the dihedral angle between the alpha carbon and the carboxyl group. It represents the angle between the carboxyl group (COOH) of amino acid $i$ and the central carbon of the same amino acid; in particular, it measures the angle of the covalent bond between the carbon $C_i$ of the carboxyl group and the central carbon ($C_{\alpha,i}$).

(iii) Omega ($\omega$): defined for each pair of consecutive amino acids $i-1$, $i$ in the sequence; specifically, it is the angle of the covalent bond between the atom $N_i$ of amino acid $i$ and the carbon $C_{i-1}$ of the carboxyl group of amino acid $i-1$.

(iv) Chi ($\chi$): each Chi angle is defined between the two planes formed by two consecutive carbon atoms in the radical group.

The variables of the problem are all of these four angles, which lie in the range $[0, 360]$ degrees. In the simulations conducted in this research work, these angles take discrete values. Some variables have well-defined ranges, as is the case for the Psi and Phi angles, whose ranges are defined by the Ramachandran plot [24]. The Phi angle is defined in the ranges $[180, 300]$ and $[45, 60]$. The Psi angle is defined in three ranges: $[20, 180]$, $[300, 330]$, and $[180, 205]$. Finally, the omega angle is fixed at 180 degrees.
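As an illustration of the discrete angle ranges listed above, the following minimal sketch draws a random backbone conformation. The one-degree grid, the uniform choice among ranges, the omission of the Chi angles, and all function names are assumptions made for this example; they are not taken from the implementation used in this work.

```python
import random

# Ranges quoted in the text (degrees); omega is fixed.
PHI_RANGES = [(180, 300), (45, 60)]
PSI_RANGES = [(20, 180), (300, 330), (180, 205)]
OMEGA = 180


def sample_angle(ranges, rng):
    """Pick one allowed range uniformly, then an integer angle inside it."""
    low, high = rng.choice(ranges)
    return rng.randint(low, high)  # inclusive on both ends


def random_backbone(n_residues, rng):
    """Return one (phi, psi, omega) tuple per residue (Chi angles omitted)."""
    return [
        (sample_angle(PHI_RANGES, rng), sample_angle(PSI_RANGES, rng), OMEGA)
        for _ in range(n_residues)
    ]


def in_ranges(angle, ranges):
    """True when the angle falls inside at least one allowed range."""
    return any(low <= angle <= high for low, high in ranges)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
backbone = random_backbone(5, rng)
```

A perturbation step for SA can reuse `sample_angle` to resample the angles of a single residue, which keeps every candidate inside the allowed ranges.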

3. Simulated Annealing Algorithm

3.1. Simulated Annealing Based on Boltzmann Distribution

The Simulated Annealing (SA) Algorithm is a probabilistic method proposed by Kirkpatrick et al. [9] and Černý [10]; it is an adaptation of the Metropolis algorithm, which is a Monte Carlo method [25]. SA is based on the gradual cooling of a metal for crystallization. The algorithm emulates the physical process in which a metal is heated to a very high temperature and then cooled very slowly until it reaches its frozen state. When this happens, the metal crystallizes in its lowest energy configuration. SA has been used for finding the optimal solution, or one close to it, for different NP-hard problems, including biological problems such as sequence alignment [26–28], phylogenetic trees [29], and PFP [30]. From a theoretical point of view, SA converges to the optimal solution or close to the lowest free energy [31]. In practice, however, classical SA may be unable to find the lowest energy because the energy barriers are too high for it to escape from local minima. As a consequence, variants of this method have been proposed [14, 30].

Simulated annealing usually starts at a very high initial temperature ($T_{initial}$). Through a cooling function, the temperature is gradually reduced from $T_{initial}$ to $T_{final}$, which is usually very close to zero [9, 10]. Several cooling functions are used in SA [31–36], for example,

$$T_{k+1} = \alpha T_k \tag{1}$$

$$T_{k+1} = e^{-\alpha} T_k \tag{2}$$

$$T_{k+1} = \frac{T_k}{1 + \eta T_k} \tag{3}$$

The most common function is (1). This function reduces the temperature by a factor $\alpha$, which is commonly in the range $0.7 \le \alpha < 1.0$. Cooling is slow when $\alpha$ is very close to 1 and fast when $\alpha$ is around 0.70.
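The effect of the cooling factor can be made concrete with a short sketch of the three cooling functions (1)-(3). The function names and the numeric values in the usage lines are illustrative assumptions, not settings used in the experiments.

```python
import math


def geometric_cooling(t, alpha):
    """Cooling function (1): T_{k+1} = alpha * T_k."""
    return alpha * t


def exponential_cooling(t, alpha):
    """Cooling function (2): T_{k+1} = exp(-alpha) * T_k."""
    return math.exp(-alpha) * t


def hyperbolic_cooling(t, eta):
    """Cooling function (3): T_{k+1} = T_k / (1 + eta * T_k)."""
    return t / (1.0 + eta * t)


def steps_to_reach(t_initial, t_final, cool, param):
    """Count the cooling steps needed to bring T from t_initial to at most t_final."""
    t, steps = t_initial, 0
    while t > t_final:
        t = cool(t, param)
        steps += 1
    return steps


# Hypothetical temperatures: fast (alpha = 0.70) versus slow (alpha = 0.95) cooling.
fast_steps = steps_to_reach(1000.0, 1.0, geometric_cooling, 0.70)
slow_steps = steps_to_reach(1000.0, 1.0, geometric_cooling, 0.95)
```

With these example values the slow schedule needs several times more temperature levels than the fast one, which is the trade-off between search effort and cooling speed described above.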

The classical SA has two cycles, as shown in Algorithm 1. The first one, named the temperature cycle, decreases the temperature with a specific cooling function. The second one, named the metropolis cycle, generates, accepts, or rejects solutions of the problem to be optimized. The initial and final temperature values are set first (see lines (1)-(2)). These values are obtained analytically (see Section 5) or experimentally: $T_{initial}$ should be as high as possible, while $T_{final}$ should be close to zero. SA requires an initial solution ($S_{initial}$); this solution is generated (see line (3)) and assigned to $S_{current}$. At the beginning of the process, the parameter $T$ is set to the initial temperature (see line (4)). The temperature cycle is executed from $T_{initial}$ to $T_{final}$ (see lines (5)–(19)). Within it, the metropolis cycle is repeated (see lines (6)–(17)) until a stop condition, explained later in this paper, is met. A new solution ($S_{new}$) is generated within the metropolis cycle by applying a small perturbation to the current solution $S_{current}$ (see line (7)). The difference between these two solutions ($S_{new}$ and $S_{current}$) is calculated (see line (8)). For a minimization problem, if this difference is less than or equal to zero (see line (9)), the new solution is accepted (see line (10)). When the difference is greater than zero, the Boltzmann distribution is applied: a Boltzmann probability is calculated using (4) in line (12), and if this probability is higher than a random value between 0 and 1 (see line (13)), the new solution $S_{new}$ is accepted (see line (14)):

$$P(S_{new}) = e^{-\Delta S / T} \tag{4}$$

where $\Delta S$ is the difference calculated in line (8). In practice, SA can be stopped when the probability of accepting a new solution is negligible.

Algorithm 1: Pseudocode of classical simulated annealing.

(1) Set initial temperature (T_initial)
(2) Set final temperature (T_final)
(3) Generate S_current from initial solution (S_initial)
(4) T = T_initial
(5) While (T > T_final) do //Temperature Cycle
(6)     While (stop condition not met) do //Metropolis Cycle
(7)         Generate S_new by applying a perturbation to S_current
(8)         Obtain difference between S_new and S_current
(9)         If (difference ≤ 0) then
(10)            Accept S_new
(11)        else
(12)            Boltzmann Probability = exp(-difference / T)
(13)            If (Boltzmann Probability > random(0, 1)) then
(14)                Accept S_new
(15)            end if
(16)        end if
(17)    end while
(18)    Decrease T by a cooling function
(19) end while
(20) Show the best solution (S_better)

After the metropolis cycle is completed, the temperature is reduced by a cooling function (see line (18)). For a maximization problem, the test is reversed: if the difference $S_{new} - S_{current}$ is greater than zero, the new solution $S_{new}$ is accepted; otherwise, $S_{new}$ is accepted or rejected depending on the Boltzmann probability.
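The two nested cycles of classical SA for a minimization problem can be sketched as follows. This is a generic illustration, not the implementation used in this work: the fixed metropolis cycle length, the explicit tracking of the best solution, the toy one-dimensional energy, and all parameter values are assumptions chosen to keep the example small.

```python
import math
import random


def simulated_annealing(energy, perturb, s_initial, t_initial, t_final,
                        alpha=0.95, metropolis_len=100, rng=None):
    """Minimize `energy` with classical SA and return (best solution, best energy)."""
    rng = rng or random.Random(0)
    s_current, e_current = s_initial, energy(s_initial)
    s_best, e_best = s_current, e_current
    t = t_initial
    while t > t_final:                      # temperature cycle
        for _ in range(metropolis_len):     # metropolis cycle (fixed length: assumption)
            s_new = perturb(s_current, rng)
            e_new = energy(s_new)
            diff = e_new - e_current
            # Improvements are always accepted; worse solutions are accepted
            # with the Boltzmann probability (4).
            if diff <= 0 or math.exp(-diff / t) > rng.random():
                s_current, e_current = s_new, e_new
                if e_current < e_best:
                    s_best, e_best = s_current, e_current
        t *= alpha                          # cooling function (1)
    return s_best, e_best


def toy_energy(x):
    """Hypothetical one-dimensional energy with its minimum at x = 3."""
    return (x - 3.0) ** 2


def toy_perturb(x, rng):
    """Small uniform perturbation of the current solution."""
    return x + rng.uniform(-1.0, 1.0)


best_x, best_e = simulated_annealing(toy_energy, toy_perturb, s_initial=10.0,
                                     t_initial=10.0, t_final=1e-3, alpha=0.9)
```

For PFP, `energy` would be a force field evaluation and `perturb` a modification of the dihedral angles; both are replaced here by toy functions.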

3.2. Simulated Annealing Based on Bose-Einstein Distribution

Statistical Mechanics (SM) studies the overall behavior of a system consisting of a large number of particles whose individual behavior is unpredictable. SM uses statistics, probability theory, and thermodynamic principles. According to SM, the occurrence of each future result is determined by a probabilistic function such as the Boltzmann or the Bose-Einstein distribution. In addition, only the most probable behavior of the system in thermal equilibrium at a given temperature is observed [37]. The Bose-Einstein distribution is obtained by finding the most probable distribution, that is, by solving a maximization problem subject to the following constraints: ($h_1$) the number of particles (defined by the summation of particles in each microstate) is constant and ($h_2$) the total energy (defined by the summation of the individual energies of each microstate) is constant. The problem is solved using Lagrange multipliers. The parameters $\lambda$ and $\beta$ are defined as the Lagrange multipliers of $h_1$ and $h_2$, respectively [38]. Then the Bose-Einstein distribution applied for low and very low temperatures is defined by

$$h(\Delta E) = \frac{1}{e^{\lambda + \beta e_i} - 1} \tag{5}$$

Particle behavior can then be modeled by the Bose-Einstein distribution in the form (6), which defines the acceptance probability of a new configuration of particles:

$$h(\Delta E) = \frac{1}{e^{\lambda} e^{\Delta E / (K T)} - 1} \tag{6}$$

where $T$ is the temperature parameter, $\lambda$ is related to the constraint on the total number of particles in the system, and $K$ is the Boltzmann constant. At very high temperatures the Bose-Einstein distribution practically becomes the Boltzmann distribution. At low and very low temperatures, however, the particles behave differently and tend to congregate in the same lowest energy state; the result is known as a Bose-Einstein condensate [39], and the system can then be modeled by the Bose-Einstein distribution. Section 4 presents a new SA that applies the Boltzmann and Bose-Einstein distributions for accepting bad solutions at high and low temperatures, respectively.
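A minimal sketch of the two acceptance rules, (4) and (6), is given below. Setting $K = 1$ by default, clipping the Bose-Einstein value to 1 so that it can be compared with a uniform random number, and guarding against overflow are assumptions of this example; the value of $\lambda$ in the usage lines is purely illustrative.

```python
import math


def boltzmann_acceptance(delta_e, t):
    """Equation (4): exp(-delta_e / T), intended for delta_e > 0."""
    return math.exp(-delta_e / t)


def bose_einstein_acceptance(delta_e, t, lam, k=1.0):
    """Equation (6): 1 / (exp(lam) * exp(delta_e / (k*T)) - 1), clipped to [0, 1]."""
    x = lam + delta_e / (k * t)
    if x <= 0.0:
        raise ValueError("lam + delta_e / (k * t) must be positive")
    if x > 700.0:          # exp(x) would overflow a float; the value is effectively 0
        return 0.0
    # expm1(x) = exp(x) - 1, accurate for small x as well
    return min(1.0, 1.0 / math.expm1(x))


# Illustrative values: a deterioration of 5 energy units at T = 1 with lam = 1.
p_be = bose_einstein_acceptance(5.0, 1.0, 1.0)
p_b = boltzmann_acceptance(5.0, 1.0)
```

When $e^{\lambda} e^{\Delta E / T}$ is much larger than 1, the subtraction of 1 in (6) is negligible and the expression reduces to $e^{-\lambda}$ times the Boltzmann factor, so the two rules differ mainly when that product is close to 1.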

3.3. Simulated Annealing Applied to Solve Protein Folding Problem

The classical Simulated Annealing Algorithm can be implemented to solve the Protein Folding Problem [40], as shown in the pseudocode of Algorithm 2. The initial and final temperatures (see lines (1)-(2)) can be calculated for each problem instance by applying the analytical tuning method of Section 5; this means that the protein should be preprocessed.

Algorithm 2: Pseudocode of SA applied to the protein folding problem.

(1) Tune initial temperature (T_initial) by analytical method
(2) Tune final temperature (T_final) by analytical method
(3) Set cooling factor (α)
(4) S_current is created by modifying the internal angles of the protein
(5) S_better = S_current
(6) Calculate energy of the protein by applying a force field
(7) T = T_initial
(8) While (T > T_final) do //Temperature Cycle
(9)     While (stop condition not met) do //Metropolis Cycle
(10)        Create new solution (S_new) by modifying internal angles of the protein
(11)        Calculate energy of S_new using a force field function
(12)        difference = energy(S_new) - energy(S_current)
(13)        If (difference ≤ 0) then
(14)            S_current = S_new
(15)            If energy(S_new) < energy(S_better) then
(16)                S_better = S_current
(17)            end if
(18)        else
(19)            Boltzmann Probability = exp(-difference / T)
(20)            If (Boltzmann Probability > random(0, 1)) then
(21)                S_current = S_new
(22)            end if
(23)        end if
(24)    end while
(25)    Decrease T by cooling function (T_{k+1} = α * T_k)
(26) end while
(27) Show the best solution of PFP (S_better)

Applying the cooling function (1) requires a value for the cooling factor $\alpha$ (see line (3)). To reduce the temperature very slowly, $\alpha$ must be very close to 1; to reduce it very fast, $\alpha$ is set close to 0.70. An initial solution of PFP is created by modifying the internal angles of the protein at random, and it is assigned to the current solution $S_{current}$ (see line (4)). At this point, the best solution $S_{better}$ is $S_{current}$ (see line (5)). The energy of $S_{current}$ is calculated by applying a force field function (see line (6)). Before the temperature cycle starts, the initial temperature is loaded into the variable $T$ (see line (7)). Then the temperature cycle starts (see lines (8)–(26)) and runs while $T$ is greater than $T_{final}$ (see line (8)). Inside the temperature cycle, the metropolis cycle is executed (see lines (9)–(24)). After this cycle is completed, the temperature is decreased (see line (25)).

Inside the metropolis cycle, a new solution $S_{new}$ of the Protein Folding Problem is generated by modifying the internal angles of the current solution $S_{current}$ (see line (10)). The energy of the new structure is calculated (see line (11)), and the difference of energies between $S_{new}$ and $S_{current}$ is determined (see line (12)). This difference is denoted by $\Delta S = E(S_{new}) - E(S_{current})$, so that $\Delta S \le 0$ means that the new solution is at least as good as the current one. In that case the new solution is accepted, and $S_{current}$ is replaced by $S_{new}$ (see line (14)). When the new solution is worse than the current one, it can still be accepted using the Boltzmann distribution (see line (21)). The acceptance probability is directly related to the current temperature and to the energy difference between $S_{new}$ and $S_{current}$, and it is calculated by (4). As the temperature is reduced, the acceptance probability $P(S_{new})$ decreases.

4. MPSABBE Algorithm

4.1. General Description

MPSABBE is a hybrid algorithm, which has four phases (see Figure 1). These phases are (i) Multiquenching Phase (MQP) applied from extremely high to high temperatures, (ii) Boltzmann Annealing Phase (BAP), which is executed from high to low temperatures, (iii) Bose-Einstein Annealing Phase (BEAP) from low to very low temperatures, and finally (iv) Dynamical Equilibrium Phase (DEP) which applies an annealing process at extremely low temperatures using Bose-Einstein distribution.

Figure 1: MPSABBE phases.

In order to accept worse solutions, BAP and BEAP apply the Boltzmann and Bose-Einstein distributions, respectively, with the aim of escaping from local minima. DEP is an extension of BEAP in which the stochastic equilibrium is dynamically detected. This is done by applying a regression method within the metropolis cycle; the iteration number is the independent variable and the energy value at each iteration is the dependent variable. The equilibrium detection criterion is the slope of the energy function within the metropolis cycle. The four phases MQP, BAP, BEAP, and DEP are executed in the temperature ranges shown in Table 1. The initial and final temperatures $T_{initial}$ and $T_f$ are determined using the analytical tuning method of Section 5. The other temperatures are determined using a variability criterion, such that the variability is larger where the temperature is higher.

Table 1: Temperature ranges of MPSABBE.

Phase	Initial temperature	Final temperature
MQP (from very high to high temperatures)	$T_{initial}$	$T_{fMQP}$
BAP (from high to low temperatures)	$T_{fMQP}$	$T_{fBAP}$
BEAP (from low to very low temperatures)	$T_{fBAP}$	$T_{fBEAP}$
DEP (from very low to extremely low temperatures)	$T_{fBEAP}$	$T_f$

4.2. MQP Phase of MPSABBE

MQP has several subphases. It starts at an extremely high initial temperature ($T_{initial}$), which is obtained by an analytical method [41]. This phase finishes when a threshold temperature ($T_{fMQP}$) is reached. MQP uses the cooling function

$$T_{k+1} = \alpha_{Quenching} \, \gamma_k \, T_k \tag{7}$$

where $\alpha_{Quenching}$ is a decrement factor of the temperature parameter, in the range $[0.7, 1.0]$, that defines how fast the temperature decreases in each MQP subphase. A low $\alpha_{Quenching}$ value decreases the temperature very fast. In addition, $\gamma_k$ is defined as

$$\gamma_k = 1 - \tau_k \tag{8}$$

The $\tau$ parameter is defined by (9), where $0 < \tau < 1$, and it defines a quadratic decrement of the temperature. Notice that as $\tau_k$ converges to zero, $\gamma_k$ approaches 1 and (7) becomes equivalent to (10):

$$\tau_k = \tau_{k-1}^2 \tag{9}$$

$$T_{k+1} = \alpha_{Quenching} T_k \tag{10}$$

The transition between two subphases is based on the $\tau$ parameter: when $\tau$ is very close to zero ($\tau \approx 0$), a new MQP subphase is started and $\tau$ is reset to its initial value. This process continues until the temperature $T_{fMQP}$ is reached. Figure 2 shows the MQP phase and its subphases.

Figure 2: MQP phase of MPSABBE algorithm.
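The temperature sequence produced by (7)-(9), including the reset of $\tau$ at the start of each subphase, can be sketched as follows. The threshold `tau_eps` used to decide that $\tau$ is "very close to zero" and all numeric values are assumptions of this example.

```python
def mqp_schedule(t_initial, t_f_mqp, alpha_quenching, tau_initial, tau_eps=1e-6):
    """Return the list of temperatures visited by the MQP cooling function."""
    t, tau = t_initial, tau_initial
    temperatures = [t]
    while t > t_f_mqp:
        tau = tau * tau                        # equation (9)
        if tau < tau_eps:                      # tau ~ 0: a new subphase starts
            tau = tau_initial
        t = alpha_quenching * (1.0 - tau) * t  # equations (7) and (8)
        temperatures.append(t)
    return temperatures


# Hypothetical settings, not values from the experiments.
temps = mqp_schedule(t_initial=1e6, t_f_mqp=1e3,
                     alpha_quenching=0.85, tau_initial=0.9)
```

Each reset of $\tau$ produces one very large temperature drop (the factor $1 - \tau$ is small), after which the decrement relaxes towards the plain geometric rule (10).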

Algorithm 3 shows the MQP pseudocode of MPSABBE. In the setting section (see lines (4)–(6)), the initial temperature is calculated by an analytical method. The final temperature of this phase ($T_{fMQP}$) is set to an initial value determined experimentally. In line (5), the variable $T$ is set to the initial temperature. The factors $\alpha_{Quenching}$ and $\tau$ are set to their initial values. The initial solution $S_{current}$ is generated and its energy is calculated (see line (8)); $E(S_{current})$ and $S_{min}$ are set to $Energy(S_{current})$ and $S_{current}$, respectively (see lines (8)-(9)).

Algorithm 3: MQP pseudocode of MPSABBE algorithm.

(1) MQP Procedure( )
(2) Begin
(3)     //Setting section
(4)     T_initial = Initial temperature calculated by analytical method
(5)     T_fMQP = Initial value, T = T_initial
(6)     α_Quenching = Initial value, τ = Initial value
(7)     //Creation of initial solution
(8)     S_current = Create the initial solution, E(S_current) = Energy(S_current)
(9)     S_min = S_current, E(S_min) = E(S_current)
(10)    Repeat //External Cycle
(11)        Repeat //Internal Cycle (Metropolis Cycle)
(12)            S_new = Perturbation(S_current) //Uniform perturbation
(13)            Difference = E(S_new) - E(S_current)
(14)            If Difference ≤ 0 Then
(15)                S_current = S_new
(16)                E(S_current) = E(S_new)
(17)            elseif exp(-Difference / T) > random[0, 1] Then //Boltzmann Probability
(18)                S_current = S_new
(19)                E(S_current) = E(S_new)
(20)            end if
(21)            If E(S_current) < E(S_min) then //save S_min
(22)                S_min = S_current
(23)                E(S_min) = E(S_current)
(24)            end if
(25)        Until Metropolis Cycle is Finished
(26)        τ = τ * τ
(27)        If τ very close to 0 Then
(28)            τ = Initial value
(29)        end if
(30)        T = α_Quenching * (1 - τ) * T
(31)    Until T ≤ T_fMQP //External Cycle
(32) End procedure

The external cycle starts at line (10) and finishes at line (31). The temperature parameter is decreased within this cycle by applying the cooling function (7) (see line (30)). Before that, $\tau$ is updated by (9) (see line (26)); when $\tau$ is very close to zero, it is reset to its initial value (see line (28)).

Inside the external cycle, the metropolis cycle (see lines (11)–(25)) generates new solutions of PFP and accepts or rejects them using the Boltzmann distribution. A new solution $S_{new}$ is obtained by applying a small perturbation to the current solution $S_{current}$ (see line (12)). The difference between the energies of $S_{new}$ and $S_{current}$ is calculated (see line (13)). If this difference is less than or equal to zero (see line (14)), the new solution $S_{new}$ is accepted: $S_{current}$ is replaced by $S_{new}$ (see line (15)) and $E(S_{current})$ by $E(S_{new})$ (see line (16)). If the difference is larger than zero, the Boltzmann probability is applied: if this probability is larger than a random number between 0 and 1 (see line (17)), the new solution $S_{new}$ is accepted, $S_{current}$ is replaced by $S_{new}$ (see line (18)), and $E(S_{current})$ is replaced by $E(S_{new})$ (see line (19)). Finally, if $E(S_{current})$ is less than $E(S_{min})$ (see line (21)), then $S_{min}$ is set to $S_{current}$ (see line (22)) and $E(S_{min})$ is replaced by $E(S_{current})$ (see line (23)).

4.3. BAP Phase of MPSABBE

Algorithm 4 shows the pseudocode of BAP, which is based on simulated annealing. The temperature parameter is decreased by $T_{k+1} = \alpha_{Annealing} T_k$ or $T_{k+1} = e^{-\alpha_{Annealing}} T_k$; the length of the metropolis cycle is determined by (21) or (27), respectively. In the internal cycle of BAP, new solutions for the instance are generated. A solution better than the previous one is always accepted, whereas worse solutions are accepted or rejected by applying the Boltzmann distribution (4). The length of the Markov chain (i.e., the internal cycle length) is determined by (21), where the increment $\beta$ is calculated with (22). The initial temperature is set to a threshold value, which is the final temperature of the MQP phase. The final temperature of the BAP phase is very close to zero.

Algorithm 4: Pseudocode of BAP phase of MPSABBE.

(1) BAP Phase( )
(2) Begin
(3)     T = T_fMQP (Final temperature of MQP Phase)
(4)     T_fBAP = Final temperature of this phase
(5)     α = Initial value (very close to one)
(6)     β = Value calculated by analytical method
(7)     CM = Initial value
(8)     While (T > T_fBAP) do
(9)         k = 1
(10)        while (k ≤ CM) do
(11)            S_new = perturbation_system(S_current)
(12)            Difference = E(S_new) - E(S_current)
(13)            If (Difference ≤ 0) then
(14)                S_current = S_new
(15)                E(S_current) = E(S_new)
(16)            ElseIf (exp(-Difference / T) > random[0, 1]) then
(17)                S_current = S_new
(18)                E(S_current) = E(S_new)
(19)            End if
(20)            If E(S_current) < E(S_min) then //save S_min
(21)                S_min = S_current
(22)                E(S_min) = E(S_current)
(23)            end if
(24)            k = k + 1
(25)        end while
(26)        T = α * T or T = (exp(-α)) * T
(27)        CM = β * CM
(28)    End while
(29) End
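The interplay between cooling and the growth of the metropolis cycle in BAP (lines (26)-(27) of Algorithm 4) can be sketched as a schedule of (temperature, cycle length) pairs. The numeric values, including $\beta = 2$, are hypothetical and only serve to make the growth visible; rounding the cycle length to an integer is also an assumption.

```python
def bap_schedule(t_start, t_final, alpha, beta, cm_initial):
    """Yield (temperature, metropolis cycle length) for each BAP temperature level."""
    t, cm = t_start, float(cm_initial)
    while t > t_final:
        yield t, int(round(cm))
        t *= alpha      # cooling function (1)
        cm *= beta      # the Markov chain grows as the temperature drops


levels = list(bap_schedule(t_start=100.0, t_final=1.0,
                           alpha=0.5, beta=2.0, cm_initial=10))
total_iterations = sum(cm for _, cm in levels)
```

Summing the cycle lengths, as in `total_iterations`, gives a quick estimate of how many energy evaluations a given choice of $\alpha$, $\beta$, and initial cycle length will cost.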

4.4. BEAP Phase of MPSABBE

Algorithm 5 shows the pseudocode of BEAP. Again, the external cycle decreases the temperature according to cooling function (1) or (2). This time, the metropolis cycle length is constant and equal to the length of the last metropolis cycle in the BAP phase. In this cycle, the Bose-Einstein distribution is applied for accepting worse solutions.

Algorithm 5: Pseudocode of BEAP phase.

(1) BEAP Phase( )
(2) Begin
(3)     T = Threshold value; Determine exp(λ)
(4)     T_final = Value very close to zero
(5)     α = Initial value
(6)     β = Value calculated by analytical method
(7)     CM = Initial value
(8)     While (T > T_final) do
(9)         k = 1
(10)        while (k ≤ CM) do
(11)            S_j = perturbation_system(S_i)
(12)            ΔE = E_j - E_i
(13)            If (ΔE ≤ 0) then
(14)                S_i = S_j
(15)            ElseIf ((1 / (exp(λ) * exp(ΔE / T) - 1)) > random[0, 1]) then
(16)                S_i = S_j
(17)            End if
(18)            k = k + 1
(19)        end while
(20)        T = α * T or T = (exp(-α)) * T
(21)    End while
(22) End

4.5. DEP Phase of MPSABBE

In Algorithm 6, the DEP goal is to detect the stochastic equilibrium by determining the iteration where the slope of the energy function remains very close to zero. In order to do that, let us define the next variables: (a) x i the number of the iterations in the metropolis cycle 1,2 , … , n and (b) E i the energy found for the algorithm in iteration x i . Using a standard least squares method, the slope for n iterations is defined by (11) m = n ∑ i = 1 n x i E i - ∑ i = 1 n x i ∑ i = 1 n E i n ∑ i = 1 n x i 2 - ∑ i = 1 n x i 2 , which becomes (12) m = k 1 ∑ i = 1 n i E i - k 2 ∑ i = 1 n E i , where (13) k 1 = 12 n 3 - n , k 2 = 6 n 2 + n .

<bold>Algorithm 6: </bold>Pseudocode of DEP phase.

(1) DEP Phase( )

(2) Begin

(3) While ( m > 0 ) do

(4) n = 0; k = 1

(5) E_summary = 0, Summary = 0

(6) while (k ≤ CM) do

(7) S_j = perturbation_system(S_i)

(8) If E(S_j) = total_clausule then stop()

(9) ΔE = E_j − E_i

(10) If (ΔE ≤ 0) then

(11) S_i = S_j

(12) ElseIf (1 / (exp(λ) ∗ exp(ΔE / T) − 1) > random[0,1]) then

(13) S_i = S_j

(14) End if

(15) n = n + 1; k = k + 1

(16) Summary = Summary + n ∗ E_i

(17) E_summary = E_summary + E_i

(18) end while

(19) T = α ∗ T or T = exp(−α) ∗ T

(20) k1 = 12 / (n ∗ n ∗ n − n)

(21) k2 = 6 / (n ∗ n − n)

(22) m = k1 ∗ Summary − k2 ∗ E_summary

(23) End while

(24) End

Notice that the complexity of the computation of (12) is $O(n)$. This equation contains only summations; thus, it is less complex than (11). These summations are computed using simple data structures. $k_1$ and $k_2$ are constants for a particular value of $n$.
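The slope computation can be checked numerically. The sketch below is an illustration under the assumption that iterations are indexed $x_i = 1, \ldots, n$; with that convention the least squares algebra gives $k_1 = 12/(n^3 - n)$ and $k_2 = 6/(n^2 - n)$ (the form $6/(n^2 + n)$ corresponds to indexing from $0$ to $n - 1$). The function names are hypothetical, and the general formula (11) is included only to cross-check the closed form.

```python
def slope_general(xs, es):
    """Least squares slope from the general formula (11)."""
    n = len(xs)
    sx = sum(xs)
    se = sum(es)
    sxe = sum(x * e for x, e in zip(xs, es))
    sxx = sum(x * x for x in xs)
    return (n * sxe - sx * se) / (n * sxx - sx * sx)


def slope_closed_form(es):
    """Slope from the two running sums, with iterations indexed 1..n."""
    n = len(es)
    if n < 2:
        raise ValueError("at least two energies are needed")
    summary = sum(i * e for i, e in enumerate(es, start=1))   # sum of i * E_i
    e_summary = sum(es)                                       # sum of E_i
    k1 = 12.0 / (n ** 3 - n)
    k2 = 6.0 / (n ** 2 - n)
    return k1 * summary - k2 * e_summary


# Illustrative energies that flatten out, as expected near equilibrium.
energies = [5.0, 3.0, 2.5, 2.4, 2.4]
m_closed = slope_closed_form(energies)
m_general = slope_general(list(range(1, len(energies) + 1)), energies)
```

Because only the two sums and the counter change per iteration, the slope can be updated in constant time inside the metropolis cycle and evaluated once the cycle ends.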

5. Analytical Tuning Method

5.1. Parameters Setting Based on Boltzmann Distribution

Parameters of MPSABBE are tuned by the analytical method [42]. The initial temperature is defined by the maximum energy difference, named the maximum decrement $\Delta Z_{max}$, which is calculated from a sample of random protein structures at the highest temperature range. In this sample, the energies of two consecutive protein structures define a single decrement of energy $\Delta Z_{i,j}$, and $\Delta Z_{max}$ is the maximum difference in the sample. The final temperature, on the other hand, is calculated from the minimum deterioration (i.e., minimum decrement) $\Delta Z_{min}$ of a sample of protein structures taken at low temperatures. Analytical tuning based on the Boltzmann distribution can be helpful for setting up the initial temperature. At high temperatures, the probability of accepting any new solution $S_{new}$ is near one ($P(S_{new}) \approx 1$), so even the maximal deterioration of the cost function is accepted. The initial temperature ($T_{initial}$) is therefore associated with the maximum deterioration admitted and the defined acceptance probability $P(S_{new})$.

Let $S_{current}$ be the current solution and $S_{new}$ a newly proposed one, and let $Z(S_{current})$ and $Z(S_{new})$ be the costs associated with $S_{current}$ and $S_{new}$, respectively. The maximum and minimum deteriorations are $\Delta Z_{max}$ and $\Delta Z_{min}$, respectively; then the probability $P(\Delta Z_{max})$ of accepting a new solution $S_{new}$ with the maximum deterioration is defined by

$$P(\Delta Z) = \exp\left(-\frac{\Delta Z}{T}\right). \tag{14}$$

This equation is basically the Boltzmann distribution, which is applied for calculating $T_{initial}$. This temperature value is defined by

$$T_{initial} = \frac{-\Delta Z_{max}}{\ln P(\Delta Z_{max})}. \tag{15}$$

Similarly, the final temperature ($T_{final}$) is established according to the probability of accepting a new solution $S_{new}$ with the minimum deterioration. The equation for calculating the final temperature is

$$T_{final} = \frac{-\Delta Z_{min}}{\ln P(\Delta Z_{min})}. \tag{16}$$

Other parameters of MPSABBE are calculated by applying a particular cooling function; for example, the metropolis cycle length is calculated by applying

$$T_{k+1} = \alpha T_k. \tag{17}$$

The analytical method determines the metropolis cycle length $L_k$ with a simple Markov model [42]. At high temperatures, only a few iterations are required because the stochastic equilibrium is reached very fast. At low temperatures, however, a more exhaustive exploration is needed and $L_k$ should be as large as possible. Let $L_1$ be $L_k$ at the temperature $T_{initial}$ and let $L_{max}$ be the maximum metropolis cycle length. Let the temperature $T_k$ be decreased by the cooling function (17) and let $L_{k+1}$ be calculated by

$$L_{k+1} = \beta L_k, \tag{18}$$

where $\beta$ is the increment coefficient of the metropolis cycle ($\beta > 1$), so $L_{k+1} > L_k$, and $L_1$ is the initial value. The Markov chain length of the last metropolis cycle is equal to $L_{max}$. Functions (17) and (18) are applied consecutively in simulated annealing from $T_{initial}$ to $T_{final}$; consequently, $T_n$ and $L_{max}$ are obtained by (19) and (20), respectively,

$$T_n = \alpha^n T_{initial}, \tag{19}$$

$$L_{max} = \beta^n L_1, \tag{20}$$

where $n$ is the number of steps from $T_{initial}$ to $T_{final}$.

Notice that the increment coefficient $\beta$ can be calculated if the initial length $L_1$ and the maximum length $L_{max}$ are available. As is well known, the former can simply be set close to one, while the latter depends on the exploration level established in the algorithm (see Section 5.2).

Thus, the number of times that the metropolis cycle is executed can be obtained by using (21). Once $n$ is determined, the increment of the metropolis cycle length can be calculated by (22):

$$n = \frac{\ln T_{final} - \ln T_{initial}}{\ln \alpha}, \tag{21}$$

$$\beta = \exp\left(\frac{\ln L_{max} - \ln L_1}{n}\right). \tag{22}$$
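Equations (15), (16), (21), and (22) chain together into a short tuning routine. The sketch below is an illustration only: the function names are hypothetical, and the numerical inputs (deteriorations, acceptance probabilities, α, and cycle lengths) are made-up values chosen to exercise the formulas, not values from the experiments.

```python
import math


def boltzmann_temperature(delta_z, p_accept):
    """Equations (15)/(16): T = -dZ / ln P for 0 < P < 1."""
    return -delta_z / math.log(p_accept)


def cooling_steps(t_initial, t_final, alpha):
    """Equation (21): number of applications of T <- alpha * T."""
    return (math.log(t_final) - math.log(t_initial)) / math.log(alpha)


def cycle_growth(l_1, l_max, n):
    """Equation (22): increment coefficient beta of the metropolis cycle."""
    return math.exp((math.log(l_max) - math.log(l_1)) / n)


# Illustrative inputs (assumptions, not data from the paper).
alpha = 0.95
l_1, l_max = 1.0, 1000.0
t_initial = boltzmann_temperature(delta_z=100.0, p_accept=0.95)
t_final = boltzmann_temperature(delta_z=0.01, p_accept=0.05)
n = cooling_steps(t_initial, t_final, alpha)
beta = cycle_growth(l_1, l_max, n)
```

In general $n$ is not an integer; an implementation would typically round it up before computing $\beta$, which is a design choice not specified in the text.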

5.2. Parameters Setting Based on Bose-Einstein Distribution

The initial and final temperatures can also be calculated by applying the Bose-Einstein distribution. In this case, the probability of accepting a new solution with the maximum deterioration $P(\Delta Z_{max})$ is defined by (23). Consequently, the initial and final temperatures are calculated with (24) and (25), respectively,

$$P(\Delta Z) = \frac{1}{e^{\Delta Z / T} - 1}, \tag{23}$$

$$T_{initial} = \frac{\Delta Z_{max}}{\ln\left(\dfrac{P(\Delta Z_{max}) + 1}{P(\Delta Z_{max})}\right)}, \tag{24}$$

$$T_{final} = \frac{\Delta Z_{min}}{\ln\left(\dfrac{P(\Delta Z_{min}) + 1}{P(\Delta Z_{min})}\right)}. \tag{25}$$
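The temperature formulas follow from inverting the Bose-Einstein acceptance probability for $T$. The short derivation below is a worked sketch that assumes the acceptance probability has the form $1/(e^{\Delta Z/T} - 1)$ with $\Delta Z > 0$ and $P > 0$.

```latex
\begin{align*}
P(\Delta Z) = \frac{1}{e^{\Delta Z/T}-1}
\;\Longrightarrow\;
e^{\Delta Z/T} = 1 + \frac{1}{P(\Delta Z)} = \frac{P(\Delta Z)+1}{P(\Delta Z)}
\;\Longrightarrow\;
T = \frac{\Delta Z}{\ln\!\bigl((P(\Delta Z)+1)/P(\Delta Z)\bigr)} .
\end{align*}
```

Since $(P+1)/P > 1$ for any $P > 0$, the logarithm is positive and the resulting temperature is positive for a positive deterioration.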

Let $T_k$ be decreased by the cooling function (2). Thus, $T_n$ is calculated by

$$T_n = e^{-n\alpha} T_{initial}. \tag{26}$$

As a consequence, $n$ and $\beta$ are calculated by

$$n = \frac{\ln T_{final} - \ln T_{initial}}{-\alpha}, \tag{27}$$

$$\beta = \exp\left(\frac{\ln L_{max} - \ln L_1}{n}\right). \tag{28}$$
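The Bose-Einstein counterpart of the tuning step can be sketched in the same way. The function names and all numerical inputs below are assumptions made for illustration; the temperature function inverts the acceptance probability $1/(e^{\Delta Z/T} - 1)$, and the step count assumes exponential cooling $T \leftarrow e^{-\alpha} T$.

```python
import math


def bose_einstein_temperature(delta_z, p_accept):
    """Temperature at which a deterioration delta_z is accepted with p_accept."""
    return delta_z / math.log((p_accept + 1.0) / p_accept)


def exponential_cooling_steps(t_initial, t_final, alpha):
    """Number of applications of T <- exp(-alpha) * T from t_initial to t_final."""
    return (math.log(t_final) - math.log(t_initial)) / (-alpha)


# Illustrative inputs (assumptions, not data from the paper).
be_alpha = 0.05
be_t_initial = bose_einstein_temperature(delta_z=10.0, p_accept=0.9)
be_t_final = bose_einstein_temperature(delta_z=0.01, p_accept=0.05)
be_n = exponential_cooling_steps(be_t_initial, be_t_final, be_alpha)
be_beta = math.exp((math.log(1000.0) - math.log(1.0)) / be_n)
```

Substituting the computed temperature back into the acceptance probability recovers the requested probability, which is a convenient sanity check when tuning.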

Notice that the increment coefficient $\beta$ can be calculated if the initial and maximum metropolis lengths $L_1$ and $L_{max}$ are available [42]. As is well known, the former can simply be set close to one, while the latter depends on the exploration level established in the algorithm. For any solution $S_i$, the value of $L_{max}$ depends on the size of its neighborhood $V_{S_i}$: $L_{max} = C\,|V_{S_i}|$ with $C = -\ln P_r(S_i)$, where $P_r(S_i)$ is the rejection probability for a solution $S_i$. The parameter $C$ ranges from 1 to 4.6; a larger value of $C$ assures a better exploration level in the neighborhood of $S_i$ at the final temperature. Hence, different exploration levels can be applied. Exploring 63%, 86%, 95%, or 99% of the neighborhood (i.e., $P_r(S_i)$ of 37%, 14%, 5%, or 1%) corresponds to $C = 1$, $2$, $3$, or $4.6$, respectively. Because $L_{max}$ can be very large for PFP instances, it is important to apply a particular process for detecting the stochastic equilibrium; this is done efficiently in the DEP phase of MPSABBE. The next section presents the experimental performance of MPSABBE using the Boltzmann and Bose-Einstein distributions.
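The relation between the exploration level and $C$ can be checked with a few lines of Python. This sketch reads the quoted percentages as the explored fraction of the neighborhood, so that the rejection probability is one minus that fraction; this reading is an assumption, adopted because it reproduces the quoted values of $C$. The function names and the neighborhood size are hypothetical.

```python
import math


def exploration_constant(coverage):
    """C = -ln(P_r) with rejection probability P_r = 1 - coverage."""
    return -math.log(1.0 - coverage)


def max_cycle_length(neighborhood_size, coverage):
    """L_max = C * |V|, rounded up to a whole number of iterations."""
    return math.ceil(exploration_constant(coverage) * neighborhood_size)


levels = {c: exploration_constant(c) for c in (0.63, 0.86, 0.95, 0.99)}
l_max_example = max_cycle_length(neighborhood_size=100, coverage=0.99)
```

Rounding up is a design choice made here so that the requested exploration level is not undershot.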

6. Experimental Results

MPSABBE is tested with five instances of PFP: Met⁵-enkephalin, proinsulin, T0549, T0335, and T0281. These instances have different sequence lengths and different numbers of variables (dihedral angles). The smallest sequence is Met⁵-enkephalin, which has five amino acids and 19 variables. The largest sequence is a hypothetical protein (CASP T0281), which has 90 amino acids and 458 variables. The proinsulin instance has 31 amino acids and 132 variables; the 2K5E instance (CASP T0549) has 73 amino acids and 343 variables. The Bacillus subtilis instance (CASP T0335) has 85 amino acids and 450 variables. The dihedral angles used in the simulations were phi ($\Phi$), psi ($\Psi$), omega ($\omega$), and chi ($\chi$). The initial and final temperatures are tuned analytically. In MQP, the parameters $\alpha_{Quenching}$ and $\tau$ are set to 0.85 and 0.999, respectively. In each subphase of MQP, the final value of $\tau$ is set to 0.001.

Table 2 shows the results for Met⁵-enkephalin obtained with the MPSABBE algorithm: the average energy, the processing time in minutes, and the average RMSD (Root-Mean-Square Deviation) [43]. The RMSD was calculated using TM-Align [44]. The best average solution for Met⁵-enkephalin is −5.0634 kcal/mol with 0.8427 minutes of processing time, and the average RMSD obtained was 0.361 Å (angstroms). The RMSD is a measure of the structural alignment between two proteins (target and solution). The target used in this paper was taken from the Protein Data Bank (PDB). An RMSD near zero is taken as a perfect structural alignment between both proteins. The RMSD is commonly used in protein folding to represent how structurally similar a solution obtained by simulation is to the target solution. Figure 3 plots the energy and RMSD of every solution of Met⁵-enkephalin calculated by MPSABBE. The best energy found by MPSABBE was −7.2787 kcal/mol, which is a solution of poor quality because better solutions are reported in the literature. Figure 4 shows the landscape of Met⁵-enkephalin. The results reported in the literature for this case using ECEPP/2, with ω fixed at 180° or ω variable, were −10.72 [20] and −12.90 [43, 45] kcal/mol, respectively. Examining the features of MPSABBE, the exploration ability is not good enough; thus, the algorithm requires improvement. The curve enveloping the solutions in Figure 3 is only a descriptive tool to illustrate that the optimal solution is reached when the RMSD is very small; however, this is not really a very good stop condition.
Notice that the best result obtained in the literature with the classical simulated annealing using the Boltzmann distribution was only −5 kcal/mol [43], while the best result obtained in this case by MPSABBE using the Bose-Einstein distribution was −7.2787 kcal/mol.
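For readers unfamiliar with the measure, the RMSD between two conformations can be sketched as below. This is a simplified illustration and not what TM-Align does internally: it assumes the two structures have the same number of atoms, listed in the same order, and that they have already been superimposed; the structural alignment and superposition steps are omitted. The function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Illustrative coordinates in angstroms (made up for the example).
target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for (x, y, z) in target]
rmsd_same = rmsd(target, target)
rmsd_shifted = rmsd(target, shifted)
```

A rigid shift of every atom by 1 Å yields an RMSD of 1 Å here only because no superposition is performed; an alignment tool would remove that shift first.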

Table 2

Average results of Met⁵-enkephalin with MPSABBE algorithm.

$\alpha_{Annealing}$	Average energy (kcal/mol)	Processing time (minutes)	Average RMSD (Å)
0.75	−3.0836	0.1252	0.4517
0.80	−4.3025	0.1701	0.4327
0.85	−4.4093	0.2023	0.3510
0.90	−4.6493	0.3384	0.5097
0.95	−5.0634	0.8427	0.3610

Figure 3

Energy and RMSD for Met⁵-enkephalin.

Figure 4

Landscape of energy, RMSD, and processing time for Met⁵-enkephalin.

Table 3 shows the results for proinsulin obtained with the MPSABBE algorithm. The best average solution for this instance is −122.4350 kcal/mol with 20.7302 minutes of processing time; the average RMSD is 3.127 Å. This solution was obtained with $\alpha_{Annealing} = 0.95$. Figure 5 shows the energy and RMSD for each solution; in this figure, some energies of proinsulin calculated by MPSABBE are plotted. The best solution found by MPSABBE was −142.7586 kcal/mol. Figure 6 shows the landscape of proinsulin.

Table 3

Average results of proinsulin with MPSABBE algorithm.

$\alpha_{Annealing}$	Average energy (kcal/mol)	Processing time (minutes)	Average RMSD (Å)
0.75	−94.2520	3.0279	3.1370
0.80	−102.5484	3.8918	3.1153
0.85	−102.1247	5.1319	3.1253
0.90	−108.1093	7.8184	3.3083
0.95	−122.4350	20.7302	3.1273

Figure 5

Energy and RMSD of proinsulin.

Figure 6

Landscape of energy, RMSD, and processing time for proinsulin.

Table 4 shows the results for the T0549 instance obtained with the MPSABBE algorithm. The best average solution for this instance is −257.0625 kcal/mol with 106.6151 minutes of processing time; the average RMSD is 4.30 Å. This solution was obtained with $\alpha_{Annealing} = 0.95$. Figure 7 shows the energy and RMSD for each solution; in this figure, some energies of the T0549 instance calculated by MPSABBE are plotted. The best solution found was −317.2117 kcal/mol. Figure 8 shows the landscape of T0549.

Table 4

Average results of T0549 with MPSABBE algorithm.

$\alpha_{Annealing}$	Average energy (kcal/mol)	Processing time (minutes)	Average RMSD (Å)
0.75	−183.6351	19.4805	4.3933
0.80	−190.2890	24.9117	4.4180
0.85	−208.0338	31.1958	4.2933
0.90	−231.2849	48.6717	4.2887
0.95	−257.0625	106.6151	4.3037

Figure 7

Energy and RMSD for T0549.

Figure 8

Landscape of energy, RMSD, and processing time for T0549.

Table 5 shows the results for the T0335 instance obtained with the MPSABBE algorithm. The best average solution for this instance is −378.6827 kcal/mol with 202.2453 minutes of processing time; the average RMSD is 3.5793 Å. This solution was obtained with $\alpha_{Annealing} = 0.95$. Figure 9 shows the energy and RMSD for each solution; in this figure, some energies of the T0335 instance calculated by MPSABBE are plotted. The best solution found was −427.2939 kcal/mol. Figure 10 shows the landscape of T0335.

Table 5

Average results of T0335 with MPSABBE algorithm.

$\alpha_{Annealing}$	Average energy (kcal/mol)	Processing time (minutes)	Average RMSD (Å)
0.75	−249.4399	32.9611	3.7413
0.80	−267.4245	40.4676	3.6750
0.85	−293.0409	52.2383	3.6160
0.90	−335.0567	78.9619	3.5828
0.95	−378.6827	202.2453	3.5793

Figure 9

Energy and RMSD of T0335.

Figure 10

Landscape of energy, RMSD, and processing time of T0335.

Table 6 shows the results for the T0281 instance obtained with the MPSABBE algorithm. The best average solution for this instance is −322.3821 kcal/mol with 187.5070 minutes of processing time; the average RMSD is 4.5 Å. This solution was obtained with $\alpha_{Annealing} = 0.95$. Figure 11 shows the energy and RMSD for each solution; in this figure, some energies of the T0281 instance calculated by MPSABBE are plotted. The best solution found was −380.1765 kcal/mol. Figure 12 shows the landscape of T0281.

Table 6

Average results of T0281 with MPSABBE algorithm.

$\alpha_{Annealing}$	Average energy (kcal/mol)	Processing time (minutes)	Average RMSD (Å)
0.75	−188.9717	32.7761	4.6160
0.80	−193.9981	40.4018	4.6347
0.85	−236.3011	53.3635	4.5507
0.90	−263.1571	79.3565	4.4467
0.95	−322.3821	187.5070	4.5515

Figure 11

Energy and RMSD for T0281.

Figure 12

Landscape of energy, RMSD, and processing time for T0281.

Figures 13–15 show the energy of consecutive solutions in the metropolis cycle for specific executions. These figures correspond to the energies obtained by the MPSABBE algorithm with the Met⁵-enkephalin, proinsulin, and T0281 instances, respectively.

Figure 13

Energy of MPSABBE with Met⁵-enkephalin instance.

Figure 14

Energy of MPSABBE with proinsulin instance.

Figure 15

Energy of MPSABBE with T0281 instance.

6.1. Test Hypothesis

Table 7 shows the average and standard deviation of energy and time for each instance when applying the MPSABBE algorithm. For solution quality, the null hypothesis is defined as $H_0: \mu_{Q_{MPSABBE}} \le \mu_{Q_{CMQA}}$, which means that the average energy of MPSABBE ($\mu_{Q_{MPSABBE}}$) for each instance is less than or equal to that of CMQA ($\mu_{Q_{CMQA}}$) [14]. The alternative hypothesis is defined as $H_1: \mu_{Q_{MPSABBE}} > \mu_{Q_{CMQA}}$. Table 8 shows the average and standard deviation of energy and time for each instance when applying CMQA. The average processing times are used for testing a second null hypothesis, defined as $H_0: \mu_{T_{MPSABBE}} \le \mu_{T_{CMQA}}$, which means that the average processing time of MPSABBE ($\mu_{T_{MPSABBE}}$) is less than or equal to the average processing time of CMQA ($\mu_{T_{CMQA}}$). The alternative hypothesis is defined as $H_1: \mu_{T_{MPSABBE}} > \mu_{T_{CMQA}}$. Table 9 shows the values obtained for the $t$-student statistic; these values were calculated from the averages and standard deviations of energy and execution time in Tables 7 and 8.

Table 7

Average of energy and standard deviation of MPSABBE.

Instance	Energy average (kcal/mol)	Energy standard deviation	Time average (minutes)	Time standard deviation
Met⁵-enkephalin	−4.3016	0.7410	0.3357	0.2943
Proinsulin	−105.8938	10.4815	8.8670	8.1788
T0549	−214.0610	30.3024	46.1749	35.5258
T0335	−304.7289	52.3785	91.6016	76.1357
T0281	−240.9620	54.8912	85.0103	71.3112

Table 8

Average of energy and standard deviation of CMQA.

Instance	Energy average (kcal/mol)	Energy standard deviation	Time average (minutes)	Time standard deviation
Met⁵-enkephalin	−3.7820	0.7848	0.5719	0.4509
Proinsulin	−104.7165	10.8593	18.9617	16.2658
T0549	−217.1220	36.7019	121.9018	96.2037
T0335	−311.3921	39.3025	204.2191	154.3906
T0281	−254.3024	42.6025	231.8738	185.2004

Table 9

t-student for each instance.

Instance	t-student (for energy)	t-student (for execution time)
Met⁵-enkephalin	−2.6363	−2.4022
Proinsulin	−0.4272	−3.0368
T0549	0.3522	−4.0444
T0335	0.5573	−3.5832
T0281	1.0515	−4.0533

For the Met⁵-enkephalin instance, the value of the $t$-student statistic for energy is −2.6363 (Table 9), and the critical value is 1.645. Because the statistic is below the critical value, the null hypothesis $H_0: \mu_{Q_{MPSABBE}} \le \mu_{Q_{CMQA}}$ is not rejected; that is, the average energy of MPSABBE ($\mu_{Q_{MPSABBE}}$) for the Met⁵-enkephalin instance is less than or equal to that of CMQA ($\mu_{Q_{CMQA}}$), so MPSABBE generates solutions of better quality than CMQA for this instance. For processing time, the value of the statistic is −2.4022. Thus, MPSABBE (applied to the Met⁵-enkephalin instance) uses less processing time than CMQA.

For the proinsulin instance, the value of the $t$-student statistic for energy is −0.4272, which is below the critical value; the null hypothesis is not rejected, and the solution quality of MPSABBE is at least as good as that of CMQA. For processing time, the statistic is −3.0368, so MPSABBE (applied to the proinsulin instance) uses less processing time than CMQA. For the T0549 instance, the statistic for energy is 0.3522, still below the critical value, so the null hypothesis is not rejected and the solution quality of MPSABBE is not significantly worse than that of CMQA. For processing time, the statistic is −4.0444; MPSABBE (applied to the T0549 instance) uses less processing time than CMQA. For the T0335 instance, the statistic for energy is 0.5573, so the null hypothesis is again not rejected. For processing time, the statistic is −3.5832; MPSABBE (applied to the T0335 instance) uses less processing time than CMQA. For the T0281 instance, the statistic for energy is 1.0515, so the null hypothesis is not rejected. For processing time, the statistic is −4.0533; MPSABBE (applied to the T0281 instance) uses less processing time than CMQA. Therefore, in all instances MPSABBE uses less processing time than CMQA, and the hypothesis that its solution quality is at least as good as that of CMQA is not rejected.
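The statistics in Table 9 can be approximately reproduced from the means and standard deviations in Tables 7 and 8 with a two-sample statistic using unpooled variances. The sketch below is a reconstruction under an assumption: the number of runs per algorithm is not stated in this section, and 30 runs per algorithm is assumed here because it yields values close to those in Table 9. The function name is hypothetical.

```python
import math

N_RUNS = 30  # assumption: sample size per algorithm, not stated in this section


def t_statistic(mean_a, sd_a, mean_b, sd_b, n_a=N_RUNS, n_b=N_RUNS):
    """Two-sample statistic (mean_a - mean_b) / sqrt(sd_a^2/n_a + sd_b^2/n_b)."""
    return (mean_a - mean_b) / math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)


# Met5-enkephalin rows of Tables 7 (MPSABBE) and 8 (CMQA).
t_energy = t_statistic(-4.3016, 0.7410, -3.7820, 0.7848)
t_time = t_statistic(0.3357, 0.2943, 0.5719, 0.4509)

CRITICAL_VALUE = 1.645
reject_energy_h0 = t_energy > CRITICAL_VALUE
reject_time_h0 = t_time > CRITICAL_VALUE
```

With a one-sided test of this form, the null hypothesis is rejected only when the statistic exceeds the critical value, which does not happen for any entry of Table 9.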

Notice that the improvement obtained when the two distributions are used is larger when the protein is smaller. For instance, for Met⁵-enkephalin and proinsulin (with five and thirty-one amino acids), MPSABBE surpasses CMQA by 13.73% and 1.1243%, respectively, whereas for T0549, T0335, and T0281 (with 73, 85, and 90 amino acids), these figures were −1.12%, −2.13%, and −3.75%, respectively. Thus, the new algorithm obtains better results than the classical SA for small proteins.

7. Conclusions

In this paper, a new Simulated Annealing Algorithm named MPSABBE for the Protein Folding Problem is presented. This algorithm includes the Bose-Einstein and Boltzmann distributions in SA. Traditionally, for PFP, SA uses only the Boltzmann distribution function as the acceptance probability of bad solutions. MPSABBE was compared to a classical SA for protein folding which applies only the Boltzmann distribution. According to the experimentation, the new algorithm is more efficient, through the use of the two distributions, when the proteins are small. The quality of the solutions obtained by the new approach is not always the best, although the difference in solution quality is only 2 to 5% in the worst cases. Besides, the new approach can surpass the solution quality of the classical SA by one to ten percent, while its execution time is in general lower.

Competing Interests

The authors declare that they have no competing interests.

References

[1] Lewin, Genes VIII, Pearson Prentice Hall, 2004.

[2] Anfinsen C. B., "Principles that govern the folding of protein chains," Science, vol. 181, no. 4096, pp. 223–230, 1973. doi:10.1126/science.181.4096.223

[3] Liwo, Czaplewski, Ołdziej, Scheraga H. A., "Computational techniques for efficient conformational sampling of proteins," Current Opinion in Structural Biology, vol. 18, no. 2, pp. 134–139, 2008. doi:10.1016/j.sbi.2007.12.001

[4] Lee, Scheraga H. A., Rackovsky, "New optimization method for conformational energy calculations on polypeptides: conformational space annealing," Journal of Computational Chemistry, vol. 18, no. 9, pp. 1222–1232, 1997. doi:10.1002/(sici)1096-987x(19970715)18:9<1222::aid-jcc10>3.0.co;2-7

[5] Sohl J. L., Jaswal S. S., Agard D. A., "Unfolded conformations of α-lytic protease are more stable than its native state," Nature, vol. 395, no. 6704, pp. 817–819, 1998. doi:10.1038/27470

[6] Wang, Mottonen, Goldsmith E. J., "Kinetically controlled folding of the serpin plasminogen activator inhibitor 1," Biochemistry, vol. 35, no. 51, pp. 16443–16448, 1996. doi:10.1021/bi961214p

[7] Ngo, Marks, Karplus, "Computational complexity, protein structure prediction, and the Levinthal paradox," in The Protein Folding Problem and Tertiary Structure Prediction, pp. 433–506, Springer, Berlin, Germany, 1994. doi:10.1007/978-1-4684-6831-1_14

[8] Ngo J. T., Marks, "Computational complexity of a problem in molecular structure prediction," Protein Engineering, vol. 5, no. 4, pp. 313–321, 1992. doi:10.1093/protein/5.4.313

[9] Kirkpatrick, Gelatt, Vecchi M. P., "Optimization by simulated annealing," Science, vol. 220, no. 4598, pp. 671–680, 1983. doi:10.1126/science.220.4598.671

[10] Černý, "Thermodynamical approach to the traveling salesman problem: an efficient simulation algorithm," Journal of Optimization Theory and Applications, vol. 45, no. 1, pp. 41–51, 1985. doi:10.1007/bf00940812

[11] Simons K. T., Kooperberg, Huang, Baker, "Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions," Journal of Molecular Biology, vol. 268, no. 1, pp. 209–225, 1997. doi:10.1006/jmbi.1997.0959

[12] Kaufmann K. W., Lemmon G. H., Deluca S. L., Sheehan J. H., Meiler, "Practically useful: what the Rosetta protein modeling suite can do for you," Biochemistry, vol. 49, no. 14, pp. 2987–2998, 2010. doi:10.1021/bi902153g

[13] Simoncini, Zhang K. Y. J., "Efficient sampling in fragment-based protein structure prediction using an estimation of distribution algorithm," PLoS ONE, vol. 8, no. 7, Article ID e68954, 2013. doi:10.1371/journal.pone.0068954

[14] Frausto-Solís, Liñán-García, Sánchez-Pérez, Sánchez-Hernández J. P., "Chaotic multiquenching annealing applied to the protein folding problem," The Scientific World Journal, vol. 2014, Article ID 364352, 11 pages, 2014. doi:10.1155/2014/364352

[15] Cole E. A. B., "Integral evaluation in semiconductor device modelling using simulated annealing with Bose-Einstein statistics," International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, vol. 20, no. 4, pp. 197–215, 2007. doi:10.1002/jnm.649

[16] Levinthal, "Are there pathways for protein folding?" Journal de Chimie Physique, vol. 65, no. 1, pp. 44–45, 1968.

[17] Ponder J. W., Case D. A., "Force fields for protein simulations," Advances in Protein Chemistry, vol. 66, pp. 27–85, 2003. doi:10.1016/S0065-3233(03)66002-X

[18] Brooks B. R., Bruccoleri R. E., Olafson B. D., States D. J., Swaminathan, Karplus, "CHARMM: a program for macromolecular energy, minimization, and dynamics calculations," Journal of Computational Chemistry, vol. 4, no. 2, pp. 187–217, 1983. doi:10.1002/jcc.540040211

[19] Momany F. A., McGuire R. F., Burgess A. W., Scheraga H. A., "Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids," The Journal of Physical Chemistry, vol. 79, no. 22, pp. 2361–2381, 1975. doi:10.1021/j100589a006

[20] Eisenmenger, Hansmann U. H. E., "Variation of the energy landscape of a small peptide under a change from the ECEPP/2 force field to ECEPP/3," The Journal of Physical Chemistry B, vol. 101, no. 16, pp. 3304–3310, 1997. doi:10.1021/jp963014t

[21] Eisenmenger, Hansmann U. H. E., Hayryan, C.-K., "[SMMP] A modern package for simulation of proteins," Computer Physics Communications, vol. 138, no. 2, pp. 192–212, 2001. doi:10.1016/s0010-4655(01)00197-7

[22] Némethy, Gibson K. D., Palmer K. A., Yoon C. N., Paterlini, Zagari, Rumsey, Scheraga H. A., "Energy parameters in polypeptides. 10. Improved geometrical parameters and nonbonded interactions for use in the ECEPP/3 algorithm, with application to proline-containing peptides," Journal of Physical Chemistry, vol. 96, no. 15, pp. 6472–6484, 1992. doi:10.1021/j100194a068

[23] Berendsen H. J. C., van der Spoel, van Drunen, "GROMACS: a message-passing parallel molecular dynamics implementation," Computer Physics Communications, vol. 91, no. 1–3, pp. 43–56, 1995. doi:10.1016/0010-4655(95)00042-e

[24] Ramachandran G. N., Venkatachalam C. M., Krimm, "Stereochemical criteria for polypeptide and protein chain conformations. III. Helical and hydrogen-bonded polypeptide chains," Biophysical Journal, vol. 6, no. 6, pp. 849–872, 1966. doi:10.1016/s0006-3495(66)86699-7

[25] Metropolis, Rosenbluth A. W., Rosenbluth M. N., Teller A. H., Teller, "Equation of state calculations by fast computing machines," The Journal of Chemical Physics, vol. 21, no. 6, pp. 1087–1092, 1953. doi:10.1063/1.1699114

[26] Liñán-García, Gallegos-Araiza L. M., "Simulated annealing with previous solutions applied to DNA sequence alignment," ISRN Artificial Intelligence, vol. 2012, Article ID 178658, 6 pages, 2012. doi:10.5402/2012/178658

[27] Kim, Pramanik, Chung M. J., "Multiple sequence alignment using simulated annealing," Computer Applications in the Biosciences: CABIOS, vol. 10, no. 4, pp. 419–426, 1994. doi:10.1093/bioinformatics/10.4.419

[28] Shyi-Ming, C. H., "Multiple DNA sequence alignment based on genetic simulated annealing techniques," Information and Management Sciences, vol. 18, no. 2, pp. 97–111, 2007.

[29] Richer J.-M., Rodriguez-Tello, Vazquez-Ortiz K. E., "Maximum parsimony phylogenetic inference using simulated annealing," in EVOLVE—A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation II, Schütze, Coello Coello C. A., Tantar A.-A., Eds., vol. 175 of Advances in Intelligent Systems and Computing, pp. 189–203, Springer, Berlin, Germany, 2013. doi:10.1007/978-3-642-31519-0_12

[30] Wales D. J., Scheraga H. A., "Global optimization of clusters, crystals, and biomolecules," Science, vol. 285, no. 5432, pp. 1368–1372, 1999. doi:10.1126/science.285.5432.1368

[31] Aarts, Korst, Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing, vol. 37, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley and Sons, 1989.

[32] Ingber, "Simulated annealing: practice versus theory," Mathematical and Computer Modelling, vol. 18, no. 11, pp. 29–57, 1993. doi:10.1016/0895-7177(93)90204-C

[33] Kjærulff, "Optimal decomposition of probabilistic networks by simulated annealing," Statistics and Computing, vol. 2, no. 1, pp. 7–17, 1992. doi:10.1007/BF01890544

[34] Van Laarhoven P. J. M., Aarts E. H. L., Simulated Annealing: Theory and Applications, vol. 37, Springer Netherlands, 1987.

[35] Nourani, Andresen, "A comparison of simulated annealing cooling strategies," Journal of Physics A: Mathematical and General, vol. 31, no. 41, pp. 8373–8385, 1998. doi:10.1088/0305-4470/31/41/011

[36] Shen, Kiatsupaibul, Zabinsky Z. B., Smith R. L., "An analytically derived cooling schedule for simulated annealing," Journal of Global Optimization, vol. 38, no. 3, pp. 333–365, 2007. doi:10.1007/s10898-006-9068-2

[37] Eisberg, Resnick, Sullivan J. D., "Quantum physics of atoms, molecules, solids, nuclei and particles," Physics Today, vol. 28, no. 12, p. 51, 1975. doi:10.1063/1.3069243

[38] Wall F. T., "Alternative derivations of the statistical mechanical distribution laws," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 8, pp. 1720–1724, 1971. doi:10.1073/pnas.68.8.1720

[39] Cornell E. A., Wieman C. E., "Bose-Einstein condensation in a dilute gas: the first 70 years and some recent experiments," in Proceedings of the International Symposium on Frontiers of Science in Celebration of the 80th Birthday of C. N. Yang, Nieh H.-T., Ed., pp. 183–222, World Scientific, Singapore, 2003.

[40] Frausto-Solís, Sánchez-Pérez, Liñán-García, Sánchez-Hernández J. P., "Threshold temperature tuning simulated annealing for protein folding problem in small peptides," Computational & Applied Mathematics, vol. 32, no. 3, pp. 471–482, 2013. doi:10.1007/s40314-013-0027-5

[41] Frausto-Solís, Sanvicente-Sánchez, Imperial-Valenzuela, "ANDYMARK: an analytical method to establish dynamically the length of the Markov chain in simulated annealing for the satisfiability problem," in Simulated Evolution and Learning, Wang T.-D., Chen S.-H., Eds., vol. 4247 of Lecture Notes in Computer Science, pp. 269–276, Springer, Berlin, Germany, 2006. doi:10.1007/11903697_35

[42] Sanvicente-Sánchez, Frausto-Solís, "A method to establish the cooling scheme in simulated annealing like algorithms," in Proceedings of the International Conference on Computational Science and Its Applications (ICCSA '04), vol. 3045, pp. 755–763, Springer, Assisi, Italy, 2004. doi:10.1007/978-3-540-24767-8_80

[43] Nayeem, Vila, Scheraga H. A., "A comparative study of the simulated-annealing and Monte Carlo-with-minimization approaches to the minimum-energy structures of polypeptides: [Met]-enkephalin," Journal of Computational Chemistry, vol. 12, no. 5, pp. 594–605, 1991. doi:10.1002/jcc.540120509

[44] Zhang, Skolnick, "TM-align: a protein structure alignment algorithm based on the TM-score," Nucleic Acids Research, vol. 33, no. 7, pp. 2302–2309, 2005. doi:10.1093/nar/gki524

[45] Scheraga H. A., "Monte Carlo-minimization approach to the multiple-minima problem in protein folding," Proceedings of the National Academy of Sciences of the United States of America, vol. 84, no. 19, pp. 6611–6615, 1987. doi:10.1073/pnas.84.19.6611