We conceptualize protein folding as motion in a high-dimensional dihedral angle space. We use Lagrangian mechanics and introduce an unspecified Lagrangian to study the motion. The reliability of folding leads us to conjecture that the totality of paths forms caustics, which can be recognized by the vanishing of the second variation of the action. There are two types of folding processes: those that are stable against modest perturbations and those that are unstable. We also conjecture that natural selection has picked out stable folds. More importantly, the presence of caustics leads naturally to the application of ideas from catastrophe theory and allows us to consider the question of stability for the folding process from that perspective. Powerful stability theorems from mathematics are then applicable to impose more order on the totality of motions. This leads to an immediate explanation for both the insensitivity of folding to solution perturbations and the fact that folding occurs using very little free energy. The theory of folding, based on the above conjectures, can also be used to explain the behavior of energy landscapes, folding speeds comparable to those of transition state theory, and the fact that random proteins do not fold.

Processes that proceed reliably from a variety of initial conditions to a unique final state, regardless of changing conditions, are of obvious importance in biophysics. Proteins in an appropriate solution fold to unique forms and serve as a flagship example of stable processes in biology.

In this paper, we suggest how the action principle of classical mechanics could be used to analyze the stability of the protein folding process. That stability is of obvious importance in itself, but because the techniques described here follow from fundamental physics, the approach should also be useful in studying the stability of other biophysical processes.

In this introduction, we present a number of technical issues in a descriptive style. Technical details are discussed in a later section.

The action principle is a traditional starting point for classical mechanics. The action is a path integral of the difference between kinetic and potential energy (the Lagrangian), between an initial and final time over a trajectory

When applied to mechanics the vanishing of the first variation of the action immediately yields the (Newtonian) equations of motion. Thus the physical picture described is that of particles moving along trajectories according to the equations of motion.
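The stationarity of the action on a Newtonian trajectory can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it uses a unit-mass particle in uniform gravity rather than a protein, and the function names, step count, and perturbation shape are hypothetical choices. It compares the discretized action on the Newtonian arc with the action on nearby arcs that share the same endpoints.

```python
import math

def discrete_action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum of (kinetic - potential) * dt over path segments."""
    total = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        velocity = (x1 - x0) / dt
        kinetic = 0.5 * m * velocity ** 2
        potential = m * g * 0.5 * (x0 + x1)  # potential at the segment midpoint
        total += (kinetic - potential) * dt
    return total

def perturbed_path(eps, n=200, total_time=1.0, g=9.81):
    """Newtonian free-fall arc with fixed endpoints plus eps * sin(pi t / T)."""
    dt = total_time / n
    path = []
    for k in range(n + 1):
        t = k * dt
        classical = 0.5 * g * t * (total_time - t)  # solves x'' = -g, x(0) = x(T) = 0
        bump = eps * math.sin(math.pi * t / total_time)  # vanishes at both endpoints
        path.append(classical + bump)
    return path, dt

def action_change(eps):
    """Action of the perturbed arc minus action of the Newtonian arc."""
    base_path, dt = perturbed_path(0.0)
    new_path, _ = perturbed_path(eps)
    return discrete_action(new_path, dt) - discrete_action(base_path, dt)

if __name__ == "__main__":
    for eps in (-0.02, -0.01, 0.01, 0.02):
        print(f"eps = {eps:+.2f}  change in action = {action_change(eps):.6e}")
```

In this example the change in the action is the same for perturbations of opposite sign and scales with the square of the perturbation size, so the first-order term is absent, which is what the vanishing of the first variation means. The change is also positive, so this particular arc is a local minimum.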

The most important degrees of freedom in protein molecules are dihedral angles associated in pairs with amino acid residues. In a common protein there might be 500 or more such angles. In folding, the molecule starts in some random assortment of these angles and moves toward a specific native set of angles. We speak of this motion as taking place in the space of dihedral angles.

If one considers protein molecules moving along trajectories in dihedral angle space, then several things are clearly missing from the trajectory picture.

First, the trajectories evidently move toward the common end point or points along the way to the native state but there is nothing explicit in the trajectory itself to define such a convergence; an energy landscape [

Second, trajectories coursing through a rough energy landscape would arrive at the end point over a range of times, that is, by diffusion. In contrast, many molecules have narrow melting curves and fast folding times that seem more appropriate to gas phase chemistry as described by transition state theory (TST). This is currently approached by postulating that the energy landscape [

Third, there is the stability of the folding process. Consider an unfolded molecule in a dilute solution of a suitable denaturing agent. Such an agent interferes with the stability of the native state of the molecule but, curiously, does not deflect the process into alternative folded forms. Similarly, many other perturbations have little or no impact upon the final folded form.

Fourth, there is the problem of initial conditions. In the current view, a folding-ensemble of denatured protein molecules begins at the top of a funnel shaped energy landscape and proceeds down the funnel to the unique native state. The various conformations at the top of the funnel are equivalent in the sense that setting various initial conditions or subjecting the molecules to various perturbations results in conformations that are still in the folding-ensemble. The trajectory picture, per se, does not address ensemble behavior; again [

These issues can all be addressed from a fundamental physics starting point by considering the vanishing of the second variation of the action [

Before we proceed, we need to define some terms. Recall that the action is a function on arcs. If the first variation with fixed endpoints is zero, then we call that arc a critical arc. If the second variation is positive for that arc, then that arc is a local minimum of the action. It is often convenient to regard the arc as part of a longer trajectory. If we fix one endpoint of the arc at a point of the trajectory and move the other along the trajectory away from the fixed endpoint, we may reach a point, and thus determine an arc, for which the second variation is zero. If we move the movable endpoint even further away from the fixed endpoint, the second variation will become indefinite; that is, it can take on both positive and negative values. Typically, when we have a family of trajectories starting from a fixed point or from a fixed initial curve or surface, they will form an envelope, that is, a curve or surface to which all the trajectories are tangent, and the points of tangency will be points along the trajectories at which the second variation vanishes. This envelope is referred to as a “caustic.” If the envelope happens to be a point, then we call that point a “focus.” When we have such an envelope, it dominates the motion in the sense that all of the trajectories meet it or pass through it. (In a later section of this paper we cover this subject again in more mathematical detail.)
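A focus can be seen in the simplest possible setting. The sketch below is an illustration only, assuming a unit-mass particle in a harmonic well (not a folding potential); the function names, the velocity-Verlet integrator, and the parameter values are hypothetical choices. Two trajectories are launched from the same point with different initial velocities, and the code reports the first time at which they meet again.

```python
import math

def harmonic_force(x, omega=1.0):
    """Restoring force per unit mass for an assumed harmonic well."""
    return -omega ** 2 * x

def verlet_step(x, v, dt, force):
    """One velocity-Verlet step for a unit-mass particle."""
    a0 = force(x)
    x_new = x + v * dt + 0.5 * a0 * dt ** 2
    a1 = force(x_new)
    v_new = v + 0.5 * (a0 + a1) * dt
    return x_new, v_new

def first_focus_time(v_a=1.0, v_b=1.1, dt=1e-3, t_max=10.0, force=harmonic_force):
    """First time at which two trajectories launched from x = 0 meet again."""
    xa, va = 0.0, v_a
    xb, vb = 0.0, v_b
    t = 0.0
    prev_gap = 0.0
    while t < t_max:
        xa, va = verlet_step(xa, va, dt, force)
        xb, vb = verlet_step(xb, vb, dt, force)
        t += dt
        gap = xb - xa
        if prev_gap != 0.0 and gap * prev_gap <= 0.0:
            # Linear interpolation of the zero crossing inside the last step.
            return t - dt * gap / (gap - prev_gap)
        prev_gap = gap
    return None  # no refocusing found before t_max

if __name__ == "__main__":
    print("first focus time:", first_focus_time())
    print("half period pi/omega:", math.pi)
```

For the harmonic well every pair of initial velocities gives the same meeting time, half the oscillation period, so the envelope of the whole family degenerates to a single point, which is the case called a focus above. A different `force` function can be passed in to explore cases in which the meeting time depends on the initial velocity and the envelope is an extended caustic.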

Some excellent examples of caustics in classical mechanical systems can be found in [

The concept of convergence is not, as we have just said, contained explicitly in individual trajectories. Rather, the concept of convergence or focusing of mechanical trajectories is best described by considering families of trajectories. If the dynamics of particles entails a caustic, then it is possible in principle to understand how a family of trajectories can behave in a coherent manner.

We next proceed to explain how powerful theorems of R. Thom and V. Arnold can be used to understand this behavior quantitatively.

We shall not attempt to define the stability of a shape in this paper. (We refer the reader to Section

We need two additional technical terms to be used in describing the action: state variable and control parameter. We do not require the mathematical definitions of these terms, but those definitions are readily available in the literature.

A simple way to look at state variables in a mechanical system is to think of the space and time coordinates that are used to describe the motion. At a point in space and time, the description of the physical system will depend upon various control parameters. For example, these may describe the interface with some apparatus. For our purposes, the control parameters in folding are not tightly defined. The shape of the caustic will be defined entirely in terms of control parameters. They are assumed to be constant after folding and may turn out to be measurable distances or angles in the native state. Excellent examples of state variables and control parameters can be found in the literature, for example, [

The mathematics tells us that under appropriate constraints there is a finite set of stable forms of the action near a critical point. Natural selection has evidently picked out these stable forms for biological molecules by choosing dynamics containing critical points. The stability arises because an ensemble of actions can change into one another as a result of a perturbation but the topology is not affected.

A simple example of this is the familiar cusp:
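In one common normalization (assumed here, since conventions differ between texts), the cusp potential is V(x; a, b) = x^4/4 + a x^2/2 + b x, with state variable x and control parameters a and b. The Python sketch below, with hypothetical function names, counts the distinct equilibria of this potential from the sign of the discriminant 4a^3 + 27b^2 of the cubic dV/dx = x^3 + a x + b.

```python
def cusp_potential(x, a, b):
    """Cusp normal form in one common normalization (an assumed convention)."""
    return 0.25 * x ** 4 + 0.5 * a * x ** 2 + b * x

def count_equilibria(a, b):
    """Number of distinct real solutions of dV/dx = x^3 + a*x + b = 0."""
    disc = 4.0 * a ** 3 + 27.0 * b ** 2
    if disc < 0.0:
        return 3  # inside the cusp: two minima separated by a maximum
    if disc > 0.0:
        return 1  # outside the cusp: a single minimum
    # On the bifurcation curve itself: a triple root at the origin,
    # otherwise one double root plus one simple root.
    return 1 if (a == 0.0 and b == 0.0) else 2

if __name__ == "__main__":
    for a, b in [(-3.0, 0.0), (-3.0, 2.0), (-3.0, 5.0), (1.0, 0.0)]:
        print(f"a = {a:+.1f}, b = {b:+.1f}: {count_equilibria(a, b)} equilibria")
```

The count changes only when the control parameters cross the curve 4a^3 + 27b^2 = 0. Away from that curve, small changes in a and b leave the number and type of equilibria unchanged, which is the sense in which the form is stable.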

Finally, we note that the physics and mathematics show that, when critical points are present in a dynamic, the critical point dominates the motion.

We summarize what we have described so far.

We start with the principle of stationary action applied to the dynamics of protein molecules.

To account for folding we turn to a standard formalism for focusing.

Two types of focus appear in the formalism: stable and unstable. We assume that natural selection has eliminated unstable foci.

Thom’s theorems now tell us that there are just seven possible functional forms for the action at a focus. Thom also tells us that these actions are stable against perturbations.

This completes a descriptive introduction to the idea of a critical point in the molecular dynamics. Before continuing this subject in more detail, we next discuss the folding process that is under examination in this paper.

In this section, we set up the folding problem that we wish to address in a subsequent section.

We shall focus our analysis upon two-state folders; in particular, we are interested in the nonequilibrium transitions between the denatured and folded or intermediate states [

The molecules are not under overall tension, so transverse waves and resonances with wavelength comparable to the length of the chain are disfavored. Torsional motions, which might include some long range waves, are favored by the geometry. A plausible picture is that energy is released at various localized points resulting in waves of torsional contraction or expansion which propagate away from the production point, generally with attenuation.

The theory described here does not depend upon the torsional form of the waves.

The details will ultimately depend upon whether the waves scatter off one another. In an earlier work on a continuous backbone model, the present authors showed that solitons are a possibility. Solitons pass through one another without shape change [

As we have said, the action is a scalar which depends upon energy and upon the path taken by a particle. For the two state folders, the action will depend upon the path taken by a molecule from unfolded to folded states. This path may be thought of as occurring in dihedral angle space. The molecule starts with a set of dihedral angles. It changes conformation following a path through dihedral angle space for which the first variation of the action is zero.

At this juncture, we pause to introduce a toy model which is solely intended to illustrate our points (and not to address the hard realities of folding dynamics [

Let us simplify the torsional wave motion to just one axial degree of rotational freedom, that is, an angle,

It is common in simulations of folding to introduce angular spring potentials

If we place the critical point

Fortunately, for our purposes, a static version of this mechanical arrangement is well known in the catastrophe theory literature, where it is known as the Zeeman machine [

We get,

Obviously, at

If this were a valid theory of the torsional response to a wave passing through, then that response would be independent of modest perturbations other than the last two terms.

We could also use this potential to construct probability distributions and to derive statistical moments such as

We emphasize that this is a toy model which illustrates how a wave on a molecule can develop a critical point and be used in some calculations of measurable quantities and shows how the shape can be independent of perturbations, perhaps such as dissipative forces and energy rough spots.
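The kind of calculation meant by constructing probability distributions and moments from a potential can be sketched as follows. This is a hedged illustration, not the toy model's own expression: the quartic form theta^4/4 + a theta^2/2 + b theta is the standard cusp normal form used as a stand-in, and the function names, the thermal energy `kT` in arbitrary units, and the integration range are all assumptions.

```python
import math

def toy_potential(theta, a, b):
    """Stand-in quartic potential in the angle theta (an assumed form)."""
    return 0.25 * theta ** 4 + 0.5 * a * theta ** 2 + b * theta

def boltzmann_moments(a, b, kT=1.0, lo=-6.0, hi=6.0, n=4000):
    """Mean and variance of theta under the weight exp(-V / kT), by the trapezoid rule."""
    step = (hi - lo) / n
    norm = 0.0
    first = 0.0
    second = 0.0
    for i in range(n + 1):
        theta = lo + i * step
        weight = math.exp(-toy_potential(theta, a, b) / kT)
        if i == 0 or i == n:
            weight *= 0.5  # trapezoid end-point weights
        norm += weight
        first += weight * theta
        second += weight * theta * theta
    mean = first / norm
    variance = second / norm - mean ** 2
    return mean, variance

if __name__ == "__main__":
    for a, b in [(-1.0, 0.0), (-1.0, 0.5), (1.0, 0.0)]:
        mean, variance = boltzmann_moments(a, b)
        print(f"a = {a:+.1f}, b = {b:+.1f}: mean = {mean:+.4f}, variance = {variance:.4f}")
```

With b = 0 the potential is symmetric and the mean angle vanishes; a positive b tilts the potential and shifts the mean toward negative angles.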

The toy model has no detailed structure (i.e., no sequence). However, it has unsymmetrical forces that can give rise to critical points and thereby to stability. Note that the spring forces that are often used in simulations do not have these properties.

With our descriptive introduction complete, we can now address the four issues listed above. We start with the assumptions that there is a critical point in the molecular dynamics and that natural selection has picked out stable folds.

The first point, that trajectories converge, is a direct implication of the presence of the critical point.

The explanation of the remaining three issues (the energy landscape is apparently smooth, the folding process is stable under modest perturbations, and the initial conditions in the denatured state do not matter very much) follows from the insensitivity of the action to perturbations. The energy landscape may have many rough spots but if they are not too extreme, then they do not change the multitrajectory action and hence do not change the time to reach the folded state.

The time to the folded state (or to an intermediate state) for a short segment of the protein is a result of two important factors: (i) a Boltzmann factor describing the escape from a potential well into the transition state and (ii) a microscopic local rate factor,

This picture of motions that occur along multiple similar trajectories until contact and bond formation take place is compatible with the observations that the time to folding is roughly proportional to contact order [

As can be seen directly from (

An observation that follows the semiquantitative description that we have presented so far is that some simplifications in folding result from the presence of a critical point in the molecular dynamics. For example, for two-state folders, the denatured and folded forms can both exist in equilibrium. The denaturing agent may impact the entropy but not the degrees of freedom associated with the folding. (This phenomenon is more general than protein folding. Catalysts that change the rate of a reaction by many orders of magnitude, by changing the heat flow to the thermal reservoir, without changing the reaction products, are well known [

Torsion waves on molecules in solution are expected to dissipate energy. The reliability of folding in the presence of agents that change the entropy or viscosity suggests that the degrees of freedom that participate in folding in an essential way are not impacted by dissipation. The theory presented here explains this using a combination of critical points and natural selection.

Another application of our theory is to address the question of why biological proteins fold to unique final forms while random polymers do not. Our theory suggests qualitatively that the former have critical points in the dynamics and fold along specific sets of trajectories while the latter do not have critical points and fold diffusively to various end shapes. Another way to look at this is that the energy landscape may be rough for random polymers and they do not share the immunity to perturbations of biological proteins.

The topomer-sampling theory of Debe and collaborators [

There are several alternative explanations for why trajectories that pass through a caustic continue to the native state. One is that the caustic is small and it sits in a steep part of the energy funnel not far from the minimum. A similar phenomenon is the formation of an alpha helix. There is an initial energy barrier, but once that is passed, the helix quickly falls into place.

Our theory has neither reached the point in development where the sequence dependence can be pinned down nor identified the dynamical relationship between critical points and specific folds. However, some comments are in order. For torsion waves, this theory clearly requires that the sequence influences the mechanical parameters of torsional motion. An important feature is multitrajectory action. A single trajectory theory, like TST for gas phase reactions, will not develop critical points; critical points are in essence due to multiple alternative paths.

A major difference between our theory and others is that here the vanishing of the second variation of the action is utilized to make connections to the existence of envelopes, that is, caustics and hence to catastrophe theory.

This section is a concise treatment of physics and mathematics. We document this with references, especially books, where appropriate.

General references are as follows:

For mathematics, see [

For catastrophe theory, see [

For physics, see [

For completeness we mention that the action principle, including the formation of caustics, can be derived as a short wavelength limit of quantum mechanics [

For protein science we suggest [

Early ideas underlying this work can be found in [

In order to apply Hamilton’s principle we need to introduce variations of a particular arc that will be denoted by

We get,

When the second variation is positive, meaning that

To appreciate the sign of the second variation one wants to consider families of trajectories of realized motions. The simplest situation to consider is that for which the action

The curves defined by

In typical applications of singularity theory, or catastrophe theory, one considers a given function of several variables, which are referred to as state variables and control variables. One focuses on the form of sets determined by setting to zero the first and second derivatives of the given function with respect to state variables and takes advantage of known generic solutions of those equations for certain numbers of control variables. The function

Noticing that geometric optics fails in this situation, Berry and Upstill comment that this failure is “catastrophic” because this is just the point at which catastrophe theory becomes applicable.

Where caustics are present we have a strong focusing of trajectories into a space that is very closely circumscribed by up to five control parameters [

The conditions on the partial derivatives just mentioned are the same as the conditions for catastrophe theory to obtain. There are many texts on catastrophe theory so we will just remark that stability of the catastrophe (caustic) against perturbations of the state variables (time and space) is the major result we have used.

Let us consider the limitations of this conjecture.

An important issue is the degree of sensitivity to perturbations. For example, the strength of the denaturing agents may be so great that our conjecture may not apply. There appears to be no general rule from catastrophe theory to quantify the limits of allowed perturbations; the answer is in the details.

Another limitation is the possible appearance or nonappearance of false minima. Further research will be required to understand this issue. In a special case, however, if there is a single, long-lived false minimum (with the molecules slowly leaking down to the thermodynamic minimum), then our conjecture may apply to the false minimum. We remark that prions may be such a case.

We emphasize that in this paper we are not addressing the issue of protein structure [

The calculations to check these ideas in model molecules are not simpler than traditional folding simulations. However, the results are different.

If the putative caustic appears somewhere along the folding path, then a simulation of the action up to that point can reveal it. The quantitative test is the vanishing (or near vanishing) of the Hessian determinant, as described in the previous section. The number of computations is formidable but, for simple molecules, not beyond supercomputer capacity.
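The character of such a test can be illustrated on a one-dimensional stand-in. The Python sketch below assumes a harmonic oscillator with fixed endpoints (not a protein model), and its function names and discretization are hypothetical. It forms the tridiagonal Hessian of the discretized action and counts its negative pivots, which by Sylvester's law of inertia equals the number of negative eigenvalues; the determinant changes sign each time this count increases.

```python
import math

def hessian_pivots(total_time, n=400, m=1.0, omega=1.0):
    """Pivots of the LDL^T factorization of the discretized-action Hessian.

    The Hessian is taken with respect to the n - 1 interior path points of a
    harmonic oscillator with fixed endpoints; it is tridiagonal with constant
    diagonal and off-diagonal entries.
    """
    dt = total_time / n
    diag = 2.0 * m / dt - m * omega ** 2 * dt
    off = -m / dt
    pivot = diag
    pivots = [pivot]
    for _ in range(n - 2):
        pivot = diag - off * off / pivot
        pivots.append(pivot)
    return pivots

def negative_directions(total_time, **kwargs):
    """Number of negative Hessian eigenvalues, counted as negative pivots."""
    return sum(1 for p in hessian_pivots(total_time, **kwargs) if p < 0.0)

if __name__ == "__main__":
    for factor in (0.9, 1.1, 2.1):
        duration = factor * math.pi
        print(f"T = {factor:.1f} * pi: {negative_directions(duration)} negative direction(s)")
```

For arcs shorter than half a period the Hessian is positive definite and the arc is a local minimum of the action. Once the arc extends past the focus at half a period, one negative direction appears, so the second variation has become indefinite and the critical arc is a saddle point of the action.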

Should a caustic be indicated along the folding path, this can be confirmed by searching for a saddle point in the action as the integration is continued to a point just beyond the caustic. Again, the computations are formidable but not impossible.

Once a caustic is found in a model, it will become possible to tune model parameters to optimize folding.

The phenomenon studied is the motion of protein molecules in a variety of initial conditions, in the presence of various perturbations, terminating in a unique final state in spite of the relatively little free energy available.

The principle of stationary action leads immediately to equations of motion. However, equations of motion describe the propagation of the molecule along trajectories in dihedral angle space and tend to obscure the behavior of groups of trajectories. Moreover, as noted originally by Levinthal, the number of conformations in dihedral angle space is of cosmological proportions.

By treating the problem directly using the principle of stationary action and putting the equations of motion aside, it is possible to treat groups of trajectories that behave in a similar manner, in particular, trajectories that converge either to a focus or to a caustic.

The resulting treatments narrow the number of possible paths through dihedral angle space because all trajectories pass through very narrow caustics located somewhere along the folding path.

The result of direct analysis from the action is that two types of focusing emerge: stable and unstable. We assume that natural selection has eliminated the unstable focusing. This treatment leads immediately to the strong stability of the process of folding. Many features of the folding process emerge directly.

As in most biological processes, protein folding entails a large number of complicated forces and parameters that change with conformation (i.e., with time during folding) and, as just mentioned, it takes place in a space of very high dimension. Yet folding does indeed lead to unique final forms even in the presence of denaturing agents that are chosen to disrupt the final shape.

This complexity might be enough to send any theorist back to his coffee pot. However, when this is approached from the viewpoint of the direct application of the principle of stationary action, an idea takes shape rather naturally. The idea is that the fundamental dynamics of molecular motion contains critical points that dominate the motion and make the motion less vulnerable to disruption by various changing forces and conditions. This dominance of dynamical behavior by critical points is well established in physics.

We have shown how this idea emerges from the action principle and have given semiquantitative explanations for many of the phenomena that have been documented in laboratories and in simulations over the past five or six decades.

For theories that start with the molecule in its native state, unfold it in the lab or in simulation, and then refold it, the acid test is prediction of the native form. By starting instead with the denatured state and applying physics and mathematics to study the stability of the folding process, we have only the germ of a full theory of folding, and we cannot predict structures, not even approximately.

No theory that is consistent with classical mechanics contradicts the principle of least action or the vanishing of the first variation of the action for the dynamics of the molecule during folding. The fundamental departure embodied in this work is the putative vanishing of the second variation of the action, which implies that various trajectories can be treated as a unit, together with the role of natural selection in eliminating unstable folds.

What we have accomplished is to show that these putative critical points provide a level of quantitative understanding of many observed features as follows: rapid rates (TST like behavior), smoothness of the energy landscape, nonfolding of random polymers, insensitivity to many perturbations, and some qualitative insights into other features of folding such as the importance of topology and contact order.

The obvious next steps in the development of this theory are to learn exactly how the critical points emerge in terms of sequence and to learn how the critical points relate to specific structures.