The Effect of Nonnative Interactions on the Energy Landscapes of Frustrated Model Proteins

The 46and 69-residue BLN model proteins both exhibit frustrated folding to β-barrel structures. We study the effect of varying the strength of nonnative interactions on the corresponding energy landscapes by introducing a parameter λ, which scales the potential between the BLN (λ = 1) and Gō-like (λ = 0) limits. We study the effect of varying λ on the efficiency of global optimisation using basin-hopping and genetic algorithms. We also construct disconnectivity graphs for these proteins at selected values of λ. Both methods indicate that the potential energy surface is frustrated for the original BLN potential but rapidly becomes less frustrated as λ decreases. For values of λ ≤ 0.9, the energy landscape is funnelled. The fastest mean first encounter time for the global minimum does not correspond to the Gō model: instead, we observe a minimum when the favourable nonnative interactions are still present to a small degree.


Introduction
Proteins are biopolymers constructed from a sequence of amino acid residues.The potential energy landscapes of proteins have many degrees of freedom and include important contributions between pairs of residues that are distant in sequence, but close to each other in space.Despite this complexity, many globular proteins fold to a well-defined the native state.According to the thermodynamic hypothesis, this structure is the global free energy minimum for a given sequence [1].Frustration occurs when there are low-lying structures separated by high barriers [2].All the favourable interactions between pairs of residues cannot be accommodated at the same time, which can lead to energetic frustration, where there are several low-lying structures with different patterns of contacts.Geometric frustration occurs when the interconversion of two low-lying structures requires the breaking of several favourable contacts.
A systematic way to simplify the potential energy surface for a protein is to include only attractive interactions between pairs of residues that are in contact in the native state, which constitutes a Gō model [3].Various on-and off-lattice Gō models have been investigated by different authors to study a range of different proteins.In spite of the simplified potential, these models have proved capable of reproducing certain aspects of protein dynamics and thermodynamics [4][5][6][7][8][9][10][11].Using a Gō model tends to lead to funnelled energy landscapes [12], with very little frustration.For some proteins, neglecting nonnative interactions can have a significant influence on the energy landscape [13].
Intermediate potentials can be generated using a parameter, λ, which scales the strength of the nonnative interactions between the Gō (λ = 0) and BLN (λ = 1) limits.The folding thermodynamics of the 46-residue BLN protein have been investigated using this scaled BLN potential [23,32,33], showing that most of the frustration is only present for values of λ ≥ 0.9.The introduction of salt bridges (gatekeepers) to the 46-residue protein also produces energy landscapes of intermediate character [27,28].
In the present work, we study the effect of varying λ on the ease of global optimisation of the 46-and 69residue BLN proteins using a basin-hopping algorithm and a genetic algorithm.We also construct disconnectivity graphs to compare the energy landscapes of the proteins for different values of λ.

Computational Methods
The protein structures were modelled using the following BLN potential [12,21,26,28]:  where R i j is the distance between two beads i and j.The first term is a harmonic bond restraint with K r = 231.Native contacts are defined as all pairs of residues where R i j is less than a fixed cut-off distance in the native state (global minimum) of the protein.When λ / = 1, the value of this cut-off radius will influence the energy landscape.Here, we use 1.167σ for consistency with previous work [12,28,38].
Global optimisation was performed using the basinhopping approach [42][43][44] and a Lamarckian genetic algorithm [38,45], which are both implemented in the GMIN program [46].Each algorithm involves local energy minimisation after each structural perturbation.This minimisation transforms the potential energy surface into the basins of attraction of local minima [47] and removes downhill barriers.The search parameters for both algorithms were optimised in previous work for BLN proteins [38], and these parameters were used without adjustment for all searches presented here (Table 1).The GMIN input files used for these searches are included in the supplementary data (see Supplementary Material available online at doi:10.1155/2012/192613).
The genetic algorithm represents each structure with a genome consisting of the torsion angles in the backbone of the protein.Offspring structures are generated by onepoint crossover from two parent structures.Mutants are generated by making a copy of an existing structure (parent or offspring) and replacing one of the torsion angles.To prevent stagnation of the genetic algorithm searches, a restart operator was used.If an entire generation of offspring contains no solutions that are fitter than any of the parent structures, a new epoch is started with a new random population.For the 69-residue protein, the fittest structure from each epoch survives into the next epoch.
All conformational searches were run until the global minimum structure was found.We report the mean time taken to encounter this structure in conformational searches from randomised starting points to compare the exploration of the energy landscape as a a function of λ.Searches were performed for values of λ between 0 and 1 in steps of 0.1, with additional points at λ = 0.95 and λ = 0.99.The initial structures for this benchmarking were generated using two alternative methods: either random placement of the residues inside a sphere of radius 3σ, or random assignment of the backbone dihedral angles.Full details of all of the global optimisation runs are available as supplementary data.
The disconnectivity graphs for the model proteins were constructed from databases of stationary points generated using the PATHSAMPLE program, [48] which organises independent pathway searches using OPTIM [49].All the transition state searches in OPTIM were conducted in Cartesian coordinates [50] using a quasicontinuous interpolation scheme to avoid chain crossings, with local maxima accurately refined to transition states by hybrid eigenvector-following [51][52][53].Successive pairs of local minima were selected for connection attempts within OPTIM using the missing connection algorithm [54].Disconnectivity graphs [39] will be illustrated for both the 46-and 69-residue scaled BLN proteins with λ values of 0, 0.5, 0.9, and 1.
We also study the effect of λ on key structures of the BLN proteins.These structures were reminimised using values of λ between 0 and 1 in steps of 0.1.Pathways between pairs of interesting minima were studied by Dijkstra analysis [55] in PATHSAMPLE [48], with the discrete paths [56] that make the largest contribution to the steady-state rate constant [56,57] presented here.
With a few exceptions, all of the stationary points of the BLN model proteins are chiral.However, the BLN potential includes no chiral terms, so each structure has an enantiomer with the same energy.When evaluating the optimisation algorithms, we accept convergence to either of the enantiomers of the global minimum.When looking at the pathways, it is important to use the same chirality for both structures, otherwise much longer paths result.For some of the trapped structures, pathways to both enantiomers of the global minimum can be viable.

BLN-46.
Searches for λ = 0 (Gō potential) find the global minimum much more rapidly than when λ = 1 (BLN potential), as one would expect for a more funnelled energy landscape [2,[58][59][60][61].However, the number of steps required varies nonlinearly between these two extremes and behaves Figure 5: Disconnectivity graphs [39] showing the most stable minima accessible by transition states lower than 7 from the global minimum of the 46-residue scaled BLN proteins.For λ ≥ 0.9, only the 1000 most stable minima are shown.The structures of selected minima are illustrated close to the bottoms of the corresponding branches.differently for each search algorithm.When optimising with the GA, the mean first encounter time decreases rapidly from λ = 1 to λ = 0.9 and then more slowly to a minimum at λ = 0.5 (Figure 1).After this minimum, there is a small increase in the required time as λ decreases to 0. This result is consistent with previous observations that the introduction of some nonnative interactions can assist the folding of some proteins [62].Below λ = 0.9, almost all searches find the global minimum within the first epoch of the GA.For larger values of λ, several searches require two or more epochs, leading to much more variation in the first encounter time.The choice of the random starting configurations for the initial population of the GA makes little difference to the mean first encounter time.
In basin-hopping searches, the choice of starting structures makes a large difference to the efficiency of the optimisation.When starting from residues randomly distributed inside a sphere, for values of λ < 0.7, 95% of the searches find the global minimum rapidly.The remaining searches become trapped and require several thousand attempted Monte Carlo moves to escape (Figure 2).In this trap, the first, third, and fourth strands are correctly packed, but the second is wrapped around the outside of the protein (Figure 3).Searches with larger values of λ do not become trapped in this basin, which suggests that the nonnative interactions are important in stabilising the intermediates between this structure and the global minimum.
The trap configuration lies 12.4 above the global minimum when λ = 0 and becomes more unfavourable for larger values of λ (Table 2).The fastest escape route from this trap involves unthreading of the N-terminus from the loop made by the second strand (Table 2).The energy of the highest transition state on this pathway relative to the trapped state increases from λ = 0 to λ = 0.9 before levelling off.The highest transition state on this pathway lies above the barrier to interconversion of the two enantiomers of the global minimum.For searches starting from a random set of torsion angles, this trapping is much less frequent and is only seen in 3 of the 700 searches performed where 0 ≤ λ ≤ 0.6.By retaining some notion of connectivity, these initial structures cover less of the configurational space than the entirely random starting points.However, the complete coverage of conformational space comes at the cost of including more unstable structures, such as the trap seen here.
The five lowest minima in the BLN-46 protein span an energy range of less than (Figure 4).The two most stable minima are in the same basin, and both have all of the BB contacts from the native state.Across the range of λ, the relative energies of these minima are within 0.1 of each other, with the second-best minimum becoming slightly more stable as λ decreases and moving below the former global minimum when λ < 0.3 [12].The next three minima are stabilised by some nonnative contacts and become less stable relative to the global minimum as λ decreases.In the region around λ = 0.5, these structures cease to be minima and fall into the basins of attraction [41] of the two lowest energy structures.
The disconnectivity graphs within 7 of the global minimum for λ = 0 and λ = 0.5 are funnelled and almost indistinguishable (Figure 5).When λ = 0.9, some frustration appears in the low-energy regions of the energy landscape, but it is still mostly funnelled.Almost all of the frustration is introduced between λ = 0.9 and λ = 1, where several alternate β-barrel structures are separated by barriers of 4 to 5 .This organisation is consistent with the increase in the mean first encounter times seen for global optimisation with λ > 0.9 and agrees with previous studies of the thermodynamics of the 46-residue protein [32,33],   where λ = 0 and λ = 0.5 were found to be good folders, λ = 0.9 an intermediate folder, and λ = 1 a poor folder.

BLN-69.
The behaviour of the GA for the 69-residue protein is similar to that for the 46-residue protein, with the fastest search time found at λ = 0.5.When optimising with basin-hopping on the 69-residue protein, there are several   slow searches between λ = 0.4 and λ = 0.8 (Figure 6).There are multiple trap structures, and the one that is seen most frequently, which is responsible for the slowest searches, is formed from three strands from the left-handed barrel and three strands from right-handed barrel (Figure 7).This structure is a six-stranded β-barrel similar to the global minimum, but with two sets of interstrand contacts swapped (1-6 and 3-4 in the global minimum compared to 1-4 and 3-6 in the trap).Conversion from the above structure to the global minimum proceeds either by inversion of the three strands at the N-terminus or of the three strands at the C-terminus.The barriers to these two mechanisms are different and vary with λ (Table 3).The barrier for the fastest pathway for inversion at the C-terminus becomes larger with increasing λ.However, the barrier for inversion of the N-terminus varies much less with λ.In the region where 0.5 ≤ λ ≤ 0.7, the barriers to both routes out of the trap are relatively high, which is a possible explanation for the slow basin-hopping optimisation for these values of λ.This is doubtless an oversimplification when we consider that there are multiple trap structures.For the 69-residue BLN protein, the energies of the five lowest minima span less than 0.4 (Figure 8).One structure lies in the same funnel as the global minimum, and its relative energy increases from 0.2 to 1.6 when λ decreases from 1 to 0. The other three structures occupy different funnels from the global minimum, with several nonnative contacts, and their stability decreases steeply with decreasing λ.Unlike the 46-residue protein, the global minimum structure remains the same for all values of λ.The low-energy region of disconnectivity graphs for values of λ between 0 and 0.9 are mostly funnelled (Figure 9).Almost all of the frustration in this region of the potential energy surface appears for λ > 0.9.

Conclusions
Much of the energetic frustration in the BLN proteins is removed once the potential contains a 10% contribution from the Gō function.When looking at geometric frustration in higher-energy traps, the effect of λ is less predictable.The removal of nonnative interactions can stabilise or destabilise the transition states that must be crossed to escape from these traps.Measures of the landscape complexity [30] could provide a useful way to understand the influence of nonnative interactions and will be considered in future work.

Figure 1 :
Figure 1: Mean first encounter times (number of minimisations) for 100 global optimisation runs initiated from random starting points for the 46-residue scaled BLN protein.The searches were run using a genetic algorithm (red), basin-hopping starting from random structures confined to a sphere (green), and basin-hopping starting from chain structures with randomised dihedral angles (blue).The error bars are the uncertainties calculated at the 95% level.

Figure 2 :
Figure 2: Energy of the minima in the Markov chain for a BH run where trapping occurs for the 46-residue scaled BLN protein with λ = 0.

Figure 3 :
Figure3: The most stable misfolded structure, which acts as a trap for global optimisation of the 46-residue BLN protein, illustrated using the VMD program[40] with a colouring scheme for the beads that varies from red to blue (N-terminus to C-terminus).
2 σ −2 and R e = σ.The second term is a bond angle restraint with K θ = 20 rad −2 and θ e = 1.8326 rad.The third term involves torsional angles, φ, defined by four successive beads.If two or more of these beads are N, then A = 0 and B = 0.2.For all other sequences, A = B = 1.2.The final term introduces pairwise nonbonded interactions.If one residue is L and the other is L or B, then C = 2/3 and D = −1.If either of the residues is N, then C = 1 and D = 0.If both residues are B, then C = 1, but the value of D depends on the presence of the contact in the native state of the protein.For native contacts, D = 1.For nonnative contacts, D = λ, where 0 < λ < 1.The case where λ = 1 is the original BLN potential and λ = 0 is the Gō potential.

Figure 4 :
Figure4: The energies of the five most stable BLN-46 structures relative to the global minimum as a function of λ.Also shown (orange) is the energy of the trap structure illustrated in Figure3.The steep decreases mark the points at which structures cease to be local minima and collapse into the basin of attraction[41] of the global minimum.

Figure 6 :
Figure 6: Mean first encounter times (number of minimisations) for 100 global optimisation runs initiated from random starting points for the 69-residue scaled BLN protein.The searches were run using a genetic algorithm (red), basin-hopping starting from random structures confined to a sphere (green), and basin-hopping starting from chain structures with randomised dihedral angles (blue).The error bars are the uncertainties calculated at the 95% level.

Figure 7 :
Figure 7: Side and top views of the global minimum (left) and trapped (right) structures of the 69-residue BLN protein illustrated using the VMD program[40] with a colouring scheme for the beads that varies from red to blue (N-terminus to C-terminus).

Figure 8 :
Figure 8: The energies of the five most stable BLN-69 structures relative to the global minimum as a function of λ.Also shown (orange) is the trap structure from Figure 7.The steep decreases in energy mark the points at which structures cease to be local minima and collapse into the basin of attraction [41] of the global minimum.

Figure 9 :
Figure9: Disconnectivity graphs[39] showing the minima accessible by transition states lower than 8 from the global minimum of the 69-residue scaled BLN proteins.For λ ≥ 0.5, only the 1000 most stable minima are shown.The structures of selected minima are illustrated close to the bottoms of the corresponding branches.

Table 1 :
Parameters used for the two optimisation strategies.

Table 2 :
Energies of the trapped minimum and transition state for escape from the principal kinetic trap in the 46-residue scaled BLN protein.All energies are in units of and measured relative to the global minimum.

Table 3 :
Energies of the trapped minimum and transition states for escape from the principal kinetic trap by inversion of the N-and C-termini in the 69-residue scaled BLN protein.All energies are in units of and measured relative to the global minimum.