Based on the observation that Ramachandran-type potential energy surfaces of single amino acid units in water are in good agreement with statistical structures of the corresponding amino acid residues in proteins, we recently developed a new all-atom force field called SAAP, in which the total energy function for a polypeptide is expressed basically as a sum of single amino acid potentials
In conformational analysis of short peptides, Monte Carlo (MC) and molecular dynamics (MD) simulation techniques have been widely applied [
On the other hand, we recently discovered the interesting feature of protein structures that Ramachandran-type potential energy surfaces of single amino acid units in water obtained by
The discovery of the similarity between the SAAP in water and the statistical structure of the amino acid residues in folded proteins prompted us to develop a new force field called SAAP for polypeptide molecules [
Met-enkephalin and chignolin without N- and C-terminal protecting groups were employed as model short peptides. Met-enkephalin is a short peptide, which consists of five residues (Tyr–Gly–Gly–Phe–Met) as shown in Figure
Structure of Met-enkephalin.
Chignolin has a unique folded
Structure of chignolin. Three native hydrogen bonds (H-bonds I, II, and III) and a hydrophobic interaction between Tyr2 and Trp9 are indicated by arrows [
The SAAPFF parameters comprise a potential energy, atomic coordinates, atomic charges, and Lennard-Jones potential parameters for all possible conformations of the single amino acid unit (HCO-Xaa-NH2). The conformations are defined by dihedral angles,
First, the parameters for main-chain, Pro, and Val in water, which were obtained in the previous version [
Second, the atomic charges of the SAAPFF parameters, which vary as a function of the structure of an amino acid unit, were switched for all amino acids from Mulliken charges to electrostatic potential (ESP) charges [
Third, the parameters of potential energies for the main-chain unit were further improved as follows. The SAAPFF parameters for most amino acids, except for Gly, Ala, Pro, and Val, are divided into the main-chain and side-chain units according to the side-chain separation approximation method [
Conformational properties of Met-enkephalin and chignolin in water were studied by SAAP-MC and AMBER-MD [
For comparison, ten trajectories of AMBER-MD [
The 10,000 structures of Met-enkephalin, extracted every 10,000 steps from the SAAP-MC simulation trajectory, were classified into twenty structural clusters by using a clustering algorithm called the
For chignolin, the RMSD values with respect to the native structure were also calculated for the structures extracted from the SAAP-MC and AMBER-MD trajectories by using the amber-ptraj program [
Relative energies of the representative structures obtained for chignolin by the SAAP-MC simulation were calculated in water by the
Energetic trajectories of SAAP-MC and AMBER-MD simulations for Met-enkephalin were obtained in an implicit water model at 300 K (see Figure S1). In SAAPFF, the total energy
Histograms of the distance between C
Representative structures of Met-enkephalin obtained by the SAAP-MC simulation at 300 K in water (trajectory 1). The obtained 10,000 structures were analyzed by a structural clustering algorithm using the
Twenty trajectories were obtained for chignolin by SAAP-MC simulation. The structures in each trajectory were classified into ten clusters based on the main-chain RMSD by using the
Existence ratios (%) of representative structures of chignolin obtained by the SAAP-MC simulation at 300 K in water.
Trajectory number | A (A′) | B | C | Other |
---|---|---|---|---|
1 | 58.5 (11.2) | 25.9 | 2.3 | 13.3 |
2 | 54.7 (11.3) | 26.6 | 6.6 | 12.1 |
3 | 50.4 (8.4) | 42.4 | 3.8 | 3.4 |
4 | 49.3 (11.4) | 30.7 | 2.9 | 17.1 |
5 | 45.4 (10.6) | 37.8 | 6.1 | 10.7 |
6 | 44.3 (8.9) | 47.9 | 0.0 | 7.8 |
7 | 37.9 (19.5) | 40.2 | 2.4 | 19.5 |
8 | 37.6 (8.5) | 27.9 | 0.0 | 34.5 |
9 | 36.5 (11.2) | 30.7 | 6.2 | 26.6 |
10 | 34.7 (6.6) | 27.3 | 4.2 | 33.8 |
11 | 32.1 (8.0) | 32.9 | 5.5 | 29.5 |
12 | 31.2 (9.7) | 23.5 | 0.0 | 45.3 |
13 | 19.8 (3.5) | 43.7 | 5.3 | 31.2 |
14 | 14.2 (2.5) | 31.8 | 0.0 | 54.0 |
15 | 13.6 (0.0) | 56.8 | 6.3 | 23.3 |
16 | 12.8 (2.6) | 60.0 | 7.9 | 19.3 |
17 | 10.5 (0.0) | 59.3 | 0.0 | 30.2 |
18 | 1.1 (0.0) | 65.1 | 0.0 | 33.8 |
19 | 0.0 (0.0) | 70.4 | 5.5 | 24.1 |
20 | 0.0 (0.0) | 83.2 | 10.2 | 6.6 |
Average | 29.2 (6.7) | 43.2 | 3.8 | 23.8 |
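The existence ratios in the table are cluster populations expressed as percentages of the extracted structures. A minimal sketch of that bookkeeping is given below; the function name `existence_ratios` and the toy labels are hypothetical, and only the `column_a` values are taken from the table above, where they reproduce the tabulated average for structure A.

```python
from collections import Counter


def existence_ratios(labels):
    """Return the percentage of structures assigned to each cluster label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical toy labels (not data from the paper): one label per structure.
toy_labels = ["A", "A", "B", "Other"]
toy_ratios = existence_ratios(toy_labels)

# Column "A" of the table above, trajectories 1 to 20.
column_a = [58.5, 54.7, 50.4, 49.3, 45.4, 44.3, 37.9, 37.6, 36.5, 34.7,
            32.1, 31.2, 19.8, 14.2, 13.6, 12.8, 10.5, 1.1, 0.0, 0.0]

# Unweighted mean over trajectories, as in the "Average" row.
mean_a = sum(column_a) / len(column_a)
print(round(mean_a, 1))
```

The unweighted mean is appropriate here only because every trajectory contributes the same number of extracted structures; with unequal trajectory lengths a weighted mean would be needed.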
Representative structures of chignolin obtained by the SAAP-MC simulation at 300 K in water. The obtained 20,000 structures were analyzed by a structural clustering algorithm using the
In trajectories 1 to 12, the existence ratio of structure A was high (31.2 to 58.5%), whereas in the remaining trajectories it was much lower than that of the misfolded structure B. The mean ratio for structure A, averaged over all trajectories, was 29.2%. On the other hand, when the clustering analysis was performed based on all-atom RMSD, the native-like structure (A′) was obtained at ratios from 0 to 19.5% (Table
Traces of the main-chain RMSD obtained for chignolin by the SAAP-MC (trajectory 10) (a) and AMBER-MD (trajectory 1) (b) simulations at 300 K in water. The RMSD values were calculated with respect to the native structure.
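For readers unfamiliar with the quantity plotted in these traces, the following is a minimal sketch of an RMSD calculation after optimal rigid-body superposition (the Kabsch algorithm). It is an illustration using NumPy, not the implementation used by the amber-ptraj program; the function name `kabsch_rmsd` and the random test coordinates are hypothetical.

```python
import numpy as np


def kabsch_rmsd(coords, ref):
    """RMSD between two (N, 3) coordinate arrays after optimal superposition."""
    # Remove translation by centering both structures on their centroids.
    p = coords - coords.mean(axis=0)
    q = ref - ref.mean(axis=0)
    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p @ rot.T
    return float(np.sqrt(np.mean(np.sum((p_rot - q) ** 2, axis=1))))


# Hypothetical check: a rotated and translated copy should give RMSD ~ 0.
rng = np.random.default_rng(0)
native = rng.normal(size=(10, 3))
theta = np.pi / 6
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = native @ rot_z.T + np.array([1.0, -2.0, 3.0])
print(kabsch_rmsd(moved, native))
```

Restricting the input arrays to backbone atoms gives a main-chain RMSD, and passing all atoms gives an all-atom RMSD; the atom selection itself is left to the caller in this sketch.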
The convergence of the SAAP-MC simulation is not clear from Table
Relative energies of structures A–C determined by
Structures | SCF energy | Relative energy (kcal/mol) |
---|---|---|
A | −3800.42798 | 5.52 |
A′ | −3800.43678 | 0.00 |
B | −3800.39878 | 23.85 |
C | −3800.43002 | 4.42 |
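The relative energies follow from the SCF energies by subtracting the lowest value and converting units. The sketch below assumes the SCF energies are given in hartree (atomic units); this assumption is not stated in the extracted table, but it reproduces the tabulated relative energies for structures A and B. The function name `relative_energies` is hypothetical.

```python
# Standard conversion factor: 1 hartree is about 627.5095 kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def relative_energies(scf_hartree):
    """Energies relative to the lowest entry, converted to kcal/mol."""
    lowest = min(scf_hartree.values())
    return {name: (energy - lowest) * HARTREE_TO_KCAL_PER_MOL
            for name, energy in scf_hartree.items()}


# SCF energies taken from the table above (assumed to be in hartree).
scf = {"A": -3800.42798, "A'": -3800.43678, "B": -3800.39878}
rel = relative_energies(scf)
for name, value in rel.items():
    print(name, round(value, 2))
```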
The structure with the smallest all-atom RMSD (1.3 Å) obtained from the twenty trajectories is superimposed on the native structure in Figure
The structure of chignolin with the smallest all-atom RMSD (1.3 Å) obtained by the SAAP-MC simulation at 300 K in water (white) superimposed on the reference native structure (gray).
The Ramachandran-type free-energy surfaces obtained for Tyr2–Trp9 residues from all trajectories of the SAAP-MC simulation are shown in Figure
Ramachandran-type free-energy surfaces for Tyr2–Trp9 residues of chignolin obtained from all trajectories of the SAAP-MC simulation at 300 K in water along with the plots of the native structures determined by NMR [
The free-energy surface of chignolin projected on the H-bond I versus III plane and that projected on the main-chain RMSD versus H-bond II plane, both obtained from all trajectories, are shown in Figure
Free-energy surfaces of chignolin projected on the hydrogen bond I versus III plane (a) and on the hydrogen bond II versus main-chain RMSD plane (b) obtained from all trajectories of the SAAP-MC simulation at 300 K in water. Contour lines are drawn at intervals of 1 kcal/mol.
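Free-energy surfaces of this kind are commonly estimated from a two-dimensional histogram of the sampled structures through F = −kT ln(P/P_max). The sketch below assumes that standard relation, which the text does not state explicitly; the function name `free_energy_surface`, the bin widths, and the toy samples are hypothetical.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def free_energy_surface(xs, ys, bin_x, bin_y, temperature=300.0):
    """Map each occupied (i, j) bin to F = -kT ln(P / P_max) in kcal/mol."""
    counts = Counter(
        (math.floor(x / bin_x), math.floor(y / bin_y)) for x, y in zip(xs, ys)
    )
    n_max = max(counts.values())
    kt = K_B * temperature
    # The most populated bin defines the zero of free energy.
    return {bin_id: -kt * math.log(n / n_max) for bin_id, n in counts.items()}


# Hypothetical toy samples: a basin visited 100 times and one visited 10 times.
xs = [2.0] * 100 + [6.0] * 10
ys = [2.0] * 100 + [6.0] * 10
surface = free_energy_surface(xs, ys, bin_x=1.0, bin_y=1.0)
print(surface)
```

With these toy counts the less populated basin lies kT ln 10 above the minimum, roughly 1.4 kcal/mol at 300 K, which gives a sense of scale for contour lines spaced 1 kcal/mol apart. Unvisited bins have no entry in the returned dictionary, since their free energy is undefined in this estimate.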
To evaluate relative stabilities of structures A, A′, B, and C more accurately, single-point
In this study, we have modified the SAAPFF parameters in the following respects.
The clustering of the structures obtained for chignolin by SAAP-MC simulation using the improved parameters showed that structures A and B are dominant in the implicit water model as shown in Table
Meanwhile, randomly fluctuating structures were obtained for Met-enkephalin by using both the previous [
There are a number of reports in the literature on the molecular simulation for chignolin by using conventional force fields to obtain the native structure from the unfolded state [
Previous studies applying an explicit water molecule model [
The efficiency of conformational sampling probably arises from the smaller number of variables used in SAAPFF than in conventional force fields. In SAAPFF, the structure of a peptide is defined only by the dihedral angles of each amino acid unit, not by the Cartesian coordinates of each atom. Acceleration of conformational sampling by reducing the number of structural parameters is a common strategy of coarse-grained force fields, such as united-atom force fields [
Although the accuracy of SAAPFF is not yet satisfactory, its efficiency in conformational sampling would be advantageous for predicting the stable structures of peptides in water. Therefore, we subsequently explored the application of the SAAP-MC method to predict the native structure of chignolin.
Indeed, it was found by single-point
As for prediction of the native structure, another, simpler method would be possible based on the hardness (i.e., the number of inter-amino acid interactions or the depth of the potential well on the free-energy surfaces) of the structures. As seen in Figure
The parameters of SAAPFF, which was previously developed to analyze the structures and folding of polypeptides, have been improved in several respects in this study.
Meanwhile, the efficiency of SAAP-MC simulation for conformational sampling was demonstrated for Met-enkephalin, as the simulation afforded diverse structures. This feature, combined with the structural clustering analysis, was subsequently applied to the structure prediction of chignolin. Among the representative structures obtained by the clustering, structure A′ with a native fold was assigned as the most stable structure according to
This work was supported by Grant-in-Aid for Scientific Research on Innovative Areas (no. 2120005) from the Ministry of Education, Culture, Sports, Science and Technology. The SAAP force field parameters are available for download at