Critical assessment of performance of alternative molecular modeling methods depending on a specific object and goal of the investigation is a question of continuous interest. This prompted us to demonstrate the origin of the guidelines we have used for a rational choice and use of a proper low level calculation method (LLM) for an initial geometry optimization of generated conformers, with the aim of selecting a set for further optimization. What was performed herein was a comparison of LLMs: MM3, MM+, UFF, Dreiding, AM1, PM3, and PM6 on the optimization of conformers’ geometry of α-methoxyphenylacetic acid (MPA) 2-butyl esters as a set of typical diastereomeric esters of a chiral derivatizing agent. This set of esters calculated represents only compounds of this certain type in the current work. The LLM conformer energies were correlated with benchmark energies found by using higher level reference method B3LYP/6-311++G** on the geometries gained previously by optimization with LLMs. In an alternative treatment, the energy range to be covered and corresponding number of LLM optimized conformers obligatory for submitting to further optimization using a high level optimization cascade were considered on the basis of determination of the cut-off conformer (COFC).
1. Introduction
Critical assessment of performance of different alternative molecular modeling methods depending on the particular objects and specific goals is a question of continuous interest [1–7]. The aim of the current work is to demonstrate how the guidelines are derived for a rational choice of a proper low level calculation method (LLM) for the initial geometry optimization of generated conformers in the total conformational analysis of diastereomeric esters. The LLM chosen should allow selecting of a reliable set of conformers for further higher level optimization. Having previously followed such guidelines in the total conformational analysis of several esters with up to 11 dihedrals in their structure, the initial optimization of thousands of conformers all together have afforded reliable sets of conformers. The following higher level optimization of the selected conformers has led to the results that, in turn, have allowed calculation of the molecular shielding models in exceptional accordance with differential shielding effects in the NMR spectra [8].
The properties of organic compounds are dependent on their three-dimensional structures. Knowing the prevailing conformer [1] of certain compounds or population of reactive conformers of molecules of another type is often a key to understanding their spectral properties or stereochemistry of the reactions under various experimental conditions, respectively. The most frequently used approach is to study the conformers’ geometry and relative energy as well as their barriers to the rotational interconversion.
The theoretical conformational analysis enables a reliable estimation of the geometrical structure and energy of each conformer in a set of conformers and from such results various thermochemical parameters and spectral characteristics (NMR coupling constants and chemical shifts, IR vibration modes, etc.) can be derived. Density functional theory (DFT) [9] calculations have proved successful in numerous studies based on theoretical conformational analysis [10]. Although geometrical parameters such as bond lengths, angles and dihedrals can be reasonably accurately predicted on molecular mechanical and semiempirical levels of theory, the electron correlation effects are important for quantitative considerations. These effects significantly affect the conformer energies as well as rotational barriers between various conformers and should therefore certainly be taken into account in the total conformational analysis.
DFT and the hybrid B3LYP functional have allowed accurate computation of the geometries concerning the type of molecules under study [8]. This, in turn, has allowed calculating NMR and IR spectral characteristics in very good agreement with the experimental data [8, 11, 12]. The DF method using the B3LYP hybrid functional has been proven to give a good estimate of the electron correlation for some molecules, in particular, in conformational studies, when it is used with the 6-31G* or 6-311++G** basis set [13, 14].
Semiempirical methods (AM1, PM3) have been shown to overestimate the number of minima on the potential energy surface (PES) and underestimate the internal rotational barriers for substituted hydrazines [13]. Several force fields have been assessed by comparing calculated conformational energies in relation to MP2/aug-cc-pVTZ level benchmark energies for compounds with varying polarity. The quality of the results degraded along with increasing polarity of the molecule. This effect was attributed to insufficient description of the electrostatic term by using fixed partial charges in force fields [15].
The attempts to simplify conformational analysis are always problematic. In particular, limiting the number of conformations or setting certain geometrical constraints on the molecules under study can give a poor description of the PES leading to erroneous results. Therefore, if the subject molecule is conformationally flexible and contains many dihedrals with low potential energy barriers, a systematic search for conformers should be undertaken.
Identification of and involving all low energy conformers is important in a sense that compounds’ certain physicochemical properties, as a rule, depend on the whole conformational ensemble, not just on the conformer corresponding to the global minimum on the PES. However, investigation of only the energetically most favored conformer may give good results in some cases [1]. An example of this approach is reproducing NMR spectral parameters (chemical shifts, coupling constants) of flexible organic compounds at the DFT level using the Boltzmann distribution [16]. A systematic search for conformers and also taking into account the Boltzmann weighting of individual conformers have allowed the calculation of the circular dichroism spectra of macrocyclic lactones (by using the time dependent DFT method) in a very good correlation with experimental spectra [17]. The same approach has allowed an accurate estimation of the yield of hydroxybenzophenones in the Fries rearrangement of hydroxyphenyl benzoates [14].
Usual procedure of conformational analysis involves the generation of conformers by automated means using a conformational search algorithm (stochastic Monte Carlo, simulated annealing, genetic algorithm, etc.), followed by the initial optimization of the geometry of the generated structures with the help of molecular mechanical or a semiempirical method. The further refinement of conformers is carried out on higher theoretical levels, at the same time discarding higher energy conformers. After optimization of all structures at a certain level of theory, a geometrical analogy comparison is used to identify duplicates. This is carried out by RMS values of geometrical deviation of the positions of the atoms of one conformer in relation to other conformers [18]. The cascade of calculations is usually finalized by using a higher level DFT method.
The purpose of the present study is to show how the choice of the initial low level optimization method and the details of its use can influence the outcome of the total conformational analysis consisting of a cascade of optimizations, including relatively high-level calculations at the final stage. More specifically, the aim is to elucidate which initial method permits reducing most efficiently the number of conformers necessary to be optimized at the higher level. Another aim is to provide certain criteria that could be relied on in order to guarantee obtaining reliable results along with an optimized feasibility of calculations. In this regard, the key feature is the estimation of the conformer energy range for a certain initial calculation method marking the conformers to be selected for further calculations. This should guarantee that the analyzer does not miss any important conformer at the initial stage of optimization.
As to the contents, this paper will firstly tackle the question of main correlational parameters such as slope, intercept, and the coefficient of correlation between the results of tested LLMs and the benchmark energies calculated using higher level reference method (RefM1). A satisfactory correlation is obligatory for an LLM to be used in the initial stage of conformational analysis. However, these correlational parameters are still, by themselves, insufficient for a correct evaluation of the prospective quality of calculation results to be obtained just by using the LLM. Secondly, the estimation of the energy ranges for the initially LLM-optimized conformers to be selected for further optimization is considered.
In addition, a critical consideration is given to the approach within the systematic theoretical conformational analysis, according to which decisive conclusions about the relative energies and corresponding distribution of conformers are drawn on the basis of results obtained using merely an LLM. The accuracy of the LLMs for calculation of conformational energies as compared to the benchmark values has been assessed.
For the current research, α-methoxyphenylacetic acid (MPA) 2-butyl esters (Figure 1) were chosen as subject compounds. They are diastereomeric esters of the well-known chiral derivatizing agent for NMR spectroscopic analysis [8, 19–21]. In the current work they are mere model compounds for calculation (Figure 1). The MPA esters are not too elementary in conformational sense; yet the optimization of their conformers is possible by using higher level methods, such as B3LYP/6-311++G** included into the cascade protocol for optimization (RefM2) in the current work.
2-Butyl esters of MPA.
The calculations have been performed in gas phase. This is justified by very good accordance achieved in between the shielding models calculated in this environment and the NMR data measured in the CDCl_{3} [8].
2. Theoretical Approach2.1. Abbreviations
PES: Potential energy surface
MPA: α-Methoxyphenylacetic acid
LLM: Low level calculation method
RefM1: The first reference method is “the single point calculation of conformer energy using B3LYP/6-311++G** method” (Step 3, Figure 2); this is the single point calculation of conformer energy for the conformers’ geometries optimized with LLMs (Steps 1 and 2; Figure 2)
RefM2: The second reference method is “the high level conformational analysis using cascade procedure for optimization of conformers” (Step 4, Figure 2)
RefM3: B97D Grimme’s functional including dispersion; the basis set is the same as RefM2 (6-311++G**)
COFC: “Cut-off conformer”; this is a conformer characterized by the highest conformer energy in a series of LLM optimized conformers selected for further calculation (Steps 1 and 2, Figure 2); these conformers have been selected as corresponding to the lower energy conformers (up to 1.00 kcal/mol) that were found by optimization of all conformers using the high level RefM2 (Step 4, Figure 2). The COFC energy determines an energy range for conformers of an LLM potentially involving all lower energy conformers to be submitted for further optimization.
Flow chart of the calculations (Step 1: generation of conformers followed by the initial optimization using MM3 method; Step 2: optimization of the conformers by using LLMs; Step 3: single-point calculation of the conformer energies by using the reference method DFT B3LYP/6-311++G** (RefM1); Step 4: the sequential optimization cascade of conformational analysis (RefM2); Step 5: optimization of the refined conformers for their identification by using the LLMs) performed.
2.2. Computational Strategy
In order to achieve the goals of the research, the following five-step computational procedure was performed (Figure 2).
(1) The initial task was to find all the local minima over all configurations of the stereocenters of the MPA diastereomeric esters (Figure 1). The conformational landscape of the PES has been explored using the program Tinker (Figure 2) [22] (module SCAN). This program performs a general conformational search for the entire PES by using a basin hopping algorithm [23] which is able to overcome potential energy barriers. The basin hopping algorithm explores the PES along various normal modes from the initial local minimum. When all minima on the search list have been subjected to the normal mode activation without locating additional new minima, the program terminates. Thus, the scanning time for this program is not unlimited and it also minimizes the energy of each conformer after it has been generated. The MM3 force field chosen for this minimization is known to be well parameterized for many organic compounds [24].
(2) All the conformers generated by the Tinker program were optimized with MM3 (Step 1, Figure 2). The conformers obtained were further optimized also with all the other LLMs under test: molecular mechanical, MM+, UFF, Dreiding, and semiempirical methods, AM1, PM3, PM6 (Step 2, Figure 2). The initial optimization of the conformers in this work with MM3 may, in principle, bias the results towards this method. This is because conformations may be lost in this first reminimization step which would be included if separate conformation searches were done with each force field. However, this assumption is not valid. It has been proven by exceptional accordance of experimental NMR shielding data [8] with the results of the theoretical calculations performed ignoring the matter of the above assumption. Consequently, the MM3 conformer energies might be comparable without restrictions with the conformer energies obtained using other LLMs.
(3) The quality of the LLMs was initially evaluated (the conformer energies correlated) on the basis of single point DFT B3LYP/6-311++G** calculation of conformer energies (RefM1) (Step 3, Figure 2) on the previously optimized force field and semiempirical (LLM) geometries.
(4) The conformers resulting from the above conformational search (from Step 1, Figure 2) were further geometry optimized in a sequential treatment (Step 4, Figure 2) with the help of the Gaussian 09 program [25]. The initial computational level used was the Hartree-Fock method with an STO-3G basis set. These ab initio calculated structures were used as input structures for the density functional method using the hybrid B3LYP exchange correlation functional and 3-21G basis set (i.e., B3LYP/3-21G). Similarly, the output of the latter method was used as an input for the B3LYP/6-31G method. The optimization cascade was finalized at the B3LYP/6-311++G** level. This cascade methodology for conformational analysis is referred to in the current text as RefM2.
Every optimization step was followed by a geometrical comparison of conformers with the aim of extracting unique structures based on a geometrical RMS criterion (<2.0 Å). Progressing towards higher levels of calculation, some of the conformers converge on each other resulting in substantially reduced number of conformers to be optimized at the highest level.
Additionally, the frequency analysis was performed at the highest optimization level. This was done in order to verify that every optimized conformer corresponds to a local minimum on the PES, that is, to a stationary point with no imaginary frequencies.
(5) All the conformers obtained by means of the above optimization cascade (Step 4, Figure 2) that fall into energy range of 0–1.00 kcal/mol were subjected to the optimization with all the LLMs under study (Step 5, Figure 2). It should be emphasized that Step 5 is necessary for identification of the conformers optimized in Steps 1 and 2 among geometries obtained finally in Step 4.
(6) In order to explore the accuracy of the methodological treatment presented on scheme (Figure 2) with respect to other high level reference methods, an additional method, such as B97D (Grimme’s functional including dispersion) [26], is added using the same basis set (6-311++G**) as in RefM2. B97D method uses density functional constructed with a long-range dispersion correction. We label the latter reference method as RefM3 in our text.
For low level calculation using semiempirical methods [27] AM1 [28, 29], PM3 [30], and PM6 [31], molecular mechanical methods UFF [32] (UFF uses charges calculated by the method of charge equilibration [33]), and Dreiding [34], the Gaussian 09 program [25] was used while for the MM+ method [35] the Hyperchem program [36] and for the MM3 the Tinker Program [22] were used instead.
3. Results
Steps 1 and 2. The test set of conformers of MPA 2-butyl esters (including both the RR and RS diastereomers) were generated and initially optimized with MM3 method by using the Tinker program (Step 1, Figure 2). The MM3 optimized conformers (geometries) were firstly further subjected to optimization with other LLMs under test (Step 2, Figure 2).
Step 3 and Correlation of the Conformer Energies. LLMs are assessed by correlating their conformer energies (from Steps 1 and 2, Figure 2) with the energy values resulting from single point DFT B3LYP/6-31* calculations (Step 3, Figure 2) on the same LLM optimized geometries (Table 1, Figures 3, 4, 5, 6, 7, and 8). In this case the MM3 method yields the best correlation and also correctly predicts the lowest energy conformer.
Correlation of conformer energies resulted from minimization with LLMs (Steps 1 and 2, Figure 2) with single point energies calculated for the geometries optimized with LLMs by using high level RefM1 (Step 3, Figure 2).
Parameter^{a}
UFF^{b}
Dreiding^{b}
AM1^{b}
PM3^{b}
PM6
MM3^{b}
MM+^{b}
R2
0.76
0.14
0.60
0.04
0.436
0.89
0.65
RMS (kcal/mol)
4.17
5.38
3.80
7.31
3.54
2.78
3.32
Slope
1.17
0.24
0.46
0.04
0.49
1.12
0.56
Intercept
0.38
2.31
0.49
1.64
1.91
−0.15
1.42
aR2-correlation coefficient.
RMS-root mean square deviation of the energy values.
Slope and intercept characterized systematic deviation of the correlated energy values.
Ideally they should be 1.00 and 0.00, respectively.
^{
b}Correlation diagrams are presented in Figures 3–8.
Correlation between energies (kcal/mol) of MM3 optimized conformers (Step 1, Figure 2) and energies calculated with RefM1 (Step 3, Figure 2) for the same geometries.
Correlation between energies (kcal/mol) of MM+ optimized conformers (Step 2, Figure 2) and energies calculated with RefM1 (Step 3, Figure 2) for the same geometries.
Correlation between energies (kcal/mol) of UFF optimized conformers (Step 2, Figure 2) and energies calculated with RefM1 (Step 3, Figure 2) for the same geometries.
Correlation between energies (kcal/mol) of the Dreiding optimized conformers (Step 2, Figure 2) and energies calculated with RefM1 (Step 3, Figure 2) for the same geometries.
Correlation between energies (kcal/mol) of AM1 optimized conformers (Step 2, Figure 2) and energies calculated with RefM1 (Step 3, Figure 2) for the same geometries.
Correlation between energies (kcal/mol) of PM3 optimized conformers (Step 2, Figure 2) and energies calculated with RefM1 (Step 3, Figure 2) for the same geometries.
Step4. The results of the optimization cascade (Step 4, Figure 2) and the energies of the low energy conformers in the energy range of 0 to 1.00 kcal/mol are summarized in Table 2 along with conformer energies calculated for the same conformers by optimization with LLMs (Steps 1 and 2, Figure 2) and the HF/STO-3G method (Step 4, the first level, Figure 2).
Determination of the cut-off conformer (COFC) energy^{a} (given in bold typeface) within a series of optimized with LLMs and HF/STO-3G conformers (Steps 1 and 2 and Step 4 (the first level), resp., Figure 2).
Conformer
RefM2
RefM3
HF
UFF
Dreiding
AM1
PM3
PM6
MM3
MM+
a
0.00
0.40
0.09
0.80
0.99
1.43
0.35
3.37
0.03
0.42
b
0.09
0.00
0.09
0.60
0.64
1.31
1.56
3.24
0.00
0.32
c
0.56
0.41
0.27
1.13
0.95
1.40
0.60
4.29
3.05
0.44
d
0.57
0.57
0.89
2.36
1.87
1.43
1.48
3.27
0.90
1.09
e
0.59
0.73
0.86
2.13
2.03
1.39
1.48
3.16
0.89
0.91
f
0.60
0.52
0.24
1.31
1.44
1.26
1.50
4.18
3.09
0.78
g
0.67
0.91
0.91
2.20
2.16
1.45
1.45
3.16
0.99
0.68
h
0.69
1.00
0.88
2.32
2.23
1.35
1.44
3.27
0.96
0.58
i
0.72
0.74
0.01
0.29
1.06
2.11^{
a}
1.56
4.81
2.99
0.20
j
0.82
0.87
0.00
0.00
0.00
2.00
1.22
4.13
0.29
0.88
k
0.85
0.57
0.03
0.89
0.65
2.00
1.25
4.12
0.32
1.21
RMS^{b}
0.00
0.20
0.46
1.09
1.01
0.94
0.79
3.44
1.30
0.31
aThe COFC energy is the highest conformer energy in the series of conformers optimized with an LLM and selected as corresponding to more populated conformers in the energy range of 0–1.00 kcal/mol found by optimization with the cascade optimization method RefM2. The COFC energy determines the conformer energy range that potentially involves all the set of significant LLM conformers to be subjected to optimization at higher levels of theory.
^{
b}The RMS values indicate deviations of LLM energies from the energies calculated by using the RefM2.
The energies of the low energy conformers from all of the test set of conformers (of MPA 2-butyl esters including both the RR and RS diastereomers) were found by using the cascade of calculations (RefM2; Step 4, Figure 2) and are shown in Figure 9 in comparison with the energies resulting from optimization of just the same conformers with the LLMs under test (Steps 1 and 2, Figure 2) and also with HF/STO-3G (Step 4, the first level, Figure 2). The latter method is included in the RefM2 as the first level of calculations, thus the separated results for this method allow estimating the impact of the higher levels to the final result.
Correlation between energies (kcal/mol) of PM6 optimized conformers (Step 2, Figure 2) and energies calculated with RefM1 (Step 3, Figure 2) for the same geometries.
As noted, several initial conformers with different energies are converging to one certain conformer upon the sequential high level optimization procedure. The total number of conformers resulting from generation and initial optimization by using the Tinker program (Step 1, Figure 2) was 262. The further geometry optimization on subsequent higher levels reduces this number; for example, performing an HF/STO-3G step after the Tinker calculations (Figure 2) reduces it to 187. This tendency is observed for all higher level methods used.
The numbers of conformers obtained by using different LLMs (Steps 1 and 2, Figure 2) and HF/STO-3G (Step 4, first level) belonging to certain energy ranges in comparison with RefM2 are presented in Table 3.
The number of conformers found by optimization with LLMs and HF/STO-3G (Steps 1 and 2, Step 4 (the first level), resp., Figure 2) as well as with the RefM2 (entire Step 4, Figure 2) within certain energy ranges.
Energy range from 0.00 to X[kcal/mol]
RefM2
HF
UFF
Dreiding
AM1
PM3
PM6
MM3
MM+
COFC energy^{a}
11
11
35
97
87
94
76
55
37
2,50
46
54
35
111
109
159
28
39
71
2,00
43
42
23
81
78
132
27
29
64
1,50
33
42
10
49
60
89
14
26
51
1,00
11
11
6
17
32
41
10
11
26
0,75
9
4
4
13
18
26
10
4
16
aThe COFC energy values for LLMs are given in Table 2.
In the case of semiempirical methods (AM1, PM3, and PM6) the number of conformers in the lower energy region is larger than that found using the other methods. For instance, by optimization with MM3, 11 conformers within the energy range from 0 to 1.00 kcal/mol have been identified whereas AM1 and PM3 yielded 32 and 41 conformers, respectively. The semiempirical methods tend to overestimate the number of conformers.
Step 5. After performing the calculations of Step 5 (Figure 2), it is possible to directly compare the conformational energies of identified conformers (Figure 9), that is, of conformers optimized by the LLMs and HF/STO-3G with these minimized by RefM2 (Table 2). The energy level of the lowest energy conformer is set to 0 kcal/mol, as usual. It has been noticed that quite a good similarity between energies of the entire RefM2 and HF/STO-3G occurs (Figure 10).
The LLM and HF conformer energies (Steps 1, 2, and 4 (the first level), resp., Figure 2) compared to the energies refined by the RefM2 (entire Step 4, Figure 2).
4. Discussion
It is an unavoidable problem in the conformational analysis in general that the number of conformers initially generated is too high and its quality too low for justified optimization of all of them on time-consuming higher calculation levels. A significant part of conformers initially generated turn out to be high energy ones or saddle points on a PES. This shortcoming has to be overcome by using an initial optimization procedure with a less accurate but also less resource consuming LLM. The rational choice of a LLM for this purpose is an important question for refining the generated conformers in order to select a reliable set for further optimization using an ab initio or density functional method. The failure at the initial stage of systematic conformational analysis would lead to erroneous results that cannot usually be corrected in subsequent steps by using better methods. A common way to minimize the number of conformers for further optimization is neglecting the high energy conformers. However, this kind of simplification of the process may be unjustified if the choice of an LLM for the initial optimization as well as (or) the choice of the corresponding cut-off conformer is not right. Still, it has been difficult to guarantee that conformers which later upon optimization with RefM2 will turn out to be low energy conformers are not discarded.
In order to investigate the approach, none of the higher energy conformers has been discarded at any of the calculation steps (Figure 2) in the current work.
In sum, there are two requirements to the use of a LLM at the initial stage to guarantee obtaining high computational efficiency along with reliable results. Both of them depend on the quality of the LLM as well as on a rational protocol for performing the selection.
The LLMs that correlate with the RefM1 regarding the conformer energies are suitable for the initial stage of a conformational analysis. The suitable LLM should also be characterized by low (preferably the lowest) cut-off conformer energy.
In addition, the situation where the LLM results in a low energy conformer but a high level method gives a high energy value for the same conformer is not disastrous (i.e., it does not bring upon erroneous results), but rather makes the LLM less selective and therefore the number of the conformers to be optimized still remains too high, which is just unpractical. A preferable situation would be if a low and a high level method would correlate to a satisfactory extent.
The quality of the conformer energies minimized with different LLMs (Steps 1 and 2, Figure 2) is evaluated on the basis of single point DFT B3LYP/6-311++G** calculation (RefM1; Step 3, Figure 2) of energies corresponding to the molecular geometries optimized with LLMs under test (Steps 1 and 2, Figure 2) (correlation diagrams are presented in Figures 3–8). Table 1 presents correlation parameters between these two types of energy values.
The conformer energies are best reproduced with MM3 force field, followed by UFF (R2 values of 0.89 and 0.76, resp.) though both of them tend to slightly overestimate conformational energies, given by the slope value above 1.0 of the least square fit line. The MM+ force field with an RMS value of 3.3 kcal/mol tends to underestimate the conformational energies. In the series of LLMs under test, the Dreiding force field proved to be rather problematic with its very poor correlation and severe underestimation of conformational energies.
AM1 systematically underestimates the energies that are more pronounced in the middle-range DFT energy values. Although the AM1 method gives a fairly good regression fit (R2=0.6), PM3 shows no correlation with DFT relative energy values. Besides, the slope value close to zero indicates an underestimation of conformer energies. However, regarding energy differences, herein the PM3 method has correctly identified the energetically most favorable conformer, that is, a very valuable result. In particular, the use of the semiempirical methods for the conformational analysis of complex compounds may give valuable information. For instance, AM1 has been used for the conformational analysis of prostaglandins providing results (based on identification of the prevailing conformers) that qualitatively correlated with experimental characteristics [28].
Based on the above reasoning, the MM force fields can thus be suggested as quick preliminary methods for energy minimization, yielding satisfactory semiqualitative results regarding the energies of conformers. In addition, the examination of the results of conformational analysis with entire RefM2 indicates that molecular mechanics MM3 force field is able (as PM3 did) to predict correctly the lowest energy conformers (Figure 3; conformers a and b) obtained by this higher level procedure.
The semiempirical methods usually give less accurate results. This is because they are parameterized near the local minimum of the PES. At more prominent distances from these minima, the semiempirical methods are not able to reproduce correct energy values. Thus, optimizing the structure from this region of a PES might result in an incorrect optimized structure.
The COFC energies were found as follows. Table 2 lists the low energy (<1.00 kcal/mol) conformers obtained with a high level method RefM2. These conformer energies are compared with energies found for the same conformers by optimization with various LLMs. For instance, the COFC found with MM3 has energy of 3.09 kcal/mol, which corresponds to a conformer labeled with min this paper (Table 2). In the same way the UFF method yields the COFC, namely, n, with the energy of 2.36 kcal/mol. Repeating this procedure for the rest of the methods yielded all the COFCs for the methods tested herein and the results are presented in Table 2 (the COFC energies are in bold typeface).
The number of significant LLM conformers appeared to be different in case of different LLMs. The question related to computational efficiency in an attempt to find all significant low energy conformers (0.00–1.00 kcal/mol) is the following: which initial LLM gives a minimum number of conformers necessary to be optimized with a high level method? The numbers of conformers corresponding to the energy ranges are given in Table 3. All the conformers in a series within energy range up to COFC’s energy value (which are included in Table 3) have to be considered. On the basis of the results provided in Table 3 it can be concluded that the most efficient LLM is the UFF which permits to limit the amount of conformers to 35. The MM+ and MM3 methods select 37 and 55 conformers, respectively. Semiempirical methods AM1, PM3, and PM6 yield 87, 94, and 76 conformers, respectively, while Dreiding force field selects 97 conformers to be subjected to further calculation with a high level method. Replacing the high level reference method from RefM2 to RefM3 and performing a similar cascade like optimization scheme as performed in case of RefM2 (Figure 2, Step 4), no significant changes occur as can be seen from Table 4. The UFF selects 39 conformers; the other low level methods collect the numbers which are approximately at the same line as compared to RefM2.
The same as Table 3, but the high level reference method is replaced from B3LYP/6-311++G** (RefM2) to B97D/6-311++G** (RefM3).
Energy range from 0.00 to X[kcal/mol]
RefM3
HF
UFF
Dreiding
AM1
PM3
PM6
MM3
MM+
COFC energy
15
36
39
84
73
112
84
44
55
2,50
61
52
42
68
76
110
27
39
76
2,00
47
43
32
47
49
93
26
29
61
1,50
38
43
13
21
39
59
13
26
37
1,00
11
33
10
6
22
23
9
11
16
0,75
9
8
7
3
13
17
9
4
13
Special attention should be paid to the (low level ab initio) Hartree-Fock HF/STO-3G method. It could be positioned in between the low and high level computational methods. In this work HF/STO-3G is included into RefM2 as the first step of the cascade of higher level calculations (Step 4; Figure 2) and therefore comparison of the results of RefM2 with these of HF/STO-3G can only be illustrative. The accordance between the HF/STO-3G method and the entire RefM2 is pretty good; the energy of the corresponding COFC is the lowest found for LLMs, only 1.07 kcal/mol. As mentioned, HF/STO-3G is assumed to belong to LLMs in some respect and corresponding characteristics have been included in the Tables 1 and 2. This method yields only 11 conformers in the energy range up to the COFC energy.
In addition to evaluation of the quality of an LLM for the calculation of conformer energy, it is important to create and follow criteria for the use of an LLM. In this regard it is reasonable to pick up the RefM2 optimized conformers appearing within an energy range, for example, <1.00 kcal/mol (Figure 3). Approximately, this range is justified because of the Boltzmann law, according to which the higher energy conformers are less significant due to low populations. Among these selected conformers one should identify the conformer showing the highest LLM energy (Table 2). Herein, we label this conformer as a cut-off conformer (COFC) for this LLM and its energy defines the energy range involving the LLM conformers that have to be subjected to further consideration.
In order to assess the LLMs, we have to determine the COFCs and corresponding energies for all of the methods under test. The best LLM in this respect is characterized by the lowest COFC energy value. Thus, the LLMs can be ranked according to the COFC energies. However, this ranking should be very critically evaluated taking into account also the conformer energy correlation results (Table 1) and the number of conformers to be submitted to optimization on the higher levels of theory.
In sum, it is assumed that the following guidelines should be taken into account in assessing the suitability of the LLM.
Conformer energies found by using the high and low level method should correlate.
A COFC of an LLM selected for use should be characterized by the low energy in the manifold of LLMs, indicating probably a more preferable method. The COFC energy should be taken as a cut-off value when discarding high energy conformers after initial optimization of conformers generated.
It would be preferable if the lowest energy conformer calculated with a high level method would also appear as the lowest energy conformer obtained by using the chosen LLM.
The selectivity of the LLM should be sufficient, that is, the number of conformers remaining for the optimization on higher levels should be acceptable (as compared with other LLMs).
5. Summary
A comparative demonstration of the LLMs’ performance based on optimization of conformer geometries of the 2-butyl esters of MPA was carried out. Special focus was on the rational selection of the LLM for the initial optimization of generated conformers in total conformational analysis of esters. The methods assessed were MM3, MM+, UFF, Dreiding, AM1, PM3, and PM6. The performance of the low level ab initio calculation method HF/STO-3G was studied in the role of a lower step of the optimization cascade. The single point calculation of the conformer benchmark energies for the LLM optimized geometries with DFT B3LYP/6-311++G** was performed. The LLM conformer energies were correlated with the higher level benchmark single point energies; corresponding diagrams along with the correlation parameters are presented. In this approach, the MM3 force field appeared to yield the best correlation values.
It is also shown that the above correlation parameters, obligatory for the assessment of the quality of a LLM regarding the energy calculation, are neither sufficient to assess fully the quality of a LLM nor do they provide exact criteria for applying of a more proper LLM to the initial optimization of geometry of conformers. For the adequate assessment of an LLM, in addition to the energy correlation quality, the number of conformers that need to be fed to the further optimization in order to obtain reliable results should be determined and critically assessed in comparison with the other methods. This is the number of conformers up to the cut-off conformer (COFC) energy value; the lower these numbers are for a certain LLM, the better its quality is. And in reverse, high number of such conformers is indicative of poor selectivity of the LLM. Various methods have yielded different thresholds for the energy ranges (the COFC energies) and the number of conformers to be considered further.
Taking into account the assessment criteria, the MM3 force field has been chosen for the initial optimization of generated conformers in our investigations of esters [8].
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors thank the Estonian Ministry of Education and Science (Grants no. SF0140133s08, SF0690034s09, ETF8880, and IUT19-32) and EU European Regional Development Fund (3.2.0101.08-0017) for the financial support. The authors are also grateful to Tallinn University of Technology for the financial support through Grant no. B35.
AavR.PehkT.TampS.TammT.KudrjašovaM.ParveO.LoppM.Theoretical prediction and assignment of vicinal1H-1H coupling constants of diastereomeric 3-alkoxy-6,7-epoxy-2-oxabicyclo[3.3.0]octanesSattelmeyerK. W.Tirado-RivesJ.JorgensenW. L.Comparison of SCC-DFTB and NDDO-based semiempirical molecular orbital methods for organic moleculesGkionisK.HillJ. G.OldfieldS. P.PlattsJ. A.Performance of Becke's half-and-half functional for non-covalent interactions: energetics, geometries and electron densitiesPatonR. S.GoodmanJ. M.Hydrogen bonding and π-stacking: how reliable are force fields? A critical evaluation of force field descriptions of nonbonded interactionsWarrenG. L.AndrewsC. W.CapelliA.-M.ClarkeB.LaLondeJ.LambertM. H.LindvallM.NevinsN.SemusS. F.SengerS.TedescoG.WallI. D.WoolvenJ. M.PeishoffC. E.HeadM. S.A critical assessment of docking programs and scoring functionsWangR.LuY.WangS.Comparative evaluation of 11 scoring functions for molecular dockingChengT.LiX.LiY.LiuZ.WangR.Comparative assessment of scoring functions on a diverse test setTampS.DanilasK.KreenM.VaresL.KiirendE.VijaS.PehkT.ParveO.MetsalaA.A total conformational analysis of diastereomeric esters and calculation of their conformational shielding modelsParrR. G.YangW.AlverÖ.KayaM. F.BilgeM.ParlakC.Vibrational spectroscopic investigation and conformational analysis of methacrylamidoantipyrine: a comparative density functional studyWeiqunZ.BaolongL.YangC.YongZ.XujieL. L. Y.The structure and conformation analysis of N-2-fluorobenzoyl-N′-2-methoxyphenylthioureaHillM. W.WibergK. B.CastrejonH.BaileyW. F.OchterskiJ.SeminarioJ. M.TšubrikO.BurkP.PehkT.MäeorgU.Conformational analysis of 1-acetyl-2-methylhydrazineMetsalaA.UsinE.VallikiviI.VilloL.PehkT.ParveO.Quantum chemical evaluation of the yield of hydroxybenzophenones in the fries rearrangement of hydroxyphenyl benzoatesRasmussenT. D.JensenF.Force field modelling of conformational energiesStahlM.SchopferU.FrenkingG.HoffmannR. W.Conformational analysis with carbon-carbon coupling constants. A density functional and molecular mechanics studyVoloshinaE.FleischhauerJ.KraftP.Conformational analysis and CD calculations of methyl-substituted 13-tridecano-13-lactonesLeachA.TrostB. M.BelletireJ. L.GodleskiS.McDougalP. G.BalkovecJ. M.BaldwinJ. J.ChristyM. E.PonticelloG. S.VargaS. L.SpringerJ. P.On the use of the O-methylmandelate ester for establishment of absolute configuration of secondary alcoholsPehkT.LippmaaE.LoppM.PajuA.BorerB. C.TaylorR. J. K.Determination of the absolute configuration of chiral secondary alcohols; new advances using carbon-13 and 2D-NMR spectroscopySecoJ. M.QuiñoáE.RigueraR.The assignment of absolute configuration by NMRPonderJ. W.Program TINKER, (version 5.0), 2007, http://dasher.wustl.edu/tinker/PappuR. V.HartR. K.PonderJ. W.Analysis and application of potential energy smoothing and search methods for global optimizationYoungD. C.FrischM. J.TrucksG. W.SchlegelH. B.ScuseriaG. E.RobM. A.CheesemanJ. R.MontgomeryJ. A.VrevenT.KudinK. N.BurantJ. C.MillamJ. M.IyengarS. S.TomasiJ.BaroneV.MennucciB.CossiM.ScalmaniG.RegaN.PeterssonG. A.NakatsujiH.HadaM.EharaM.ToyotaK.FukudaR.HasegawaJ.IshidaM.NakajimaT.HondaY.KitaoO.NakaiH.KleneM.LiX.KnoxJ. E.HratchianH. P.CrossJ. B.BakkenV.AdamoC.JaramilloJ.GompertsR.StratmannR. E.YazyevO.AustinA. J.CammiR.PomelliC.OchterskiJ. W.AyalaP. Y.MorokumaK.VothG. A.SalvadorP.DannenbergJ. J.ZakrzewskiV. G.DapprichS.DanielsA. D.StrainM. C.FarkasO.MalickD. K.RabuckA. D.RaghavachariK.ForesmanJ. B.OrtizJ. V.CuiQ.BaboulA. G.CliffordS.CioslowskiJ.StefanovB. B.LiuG.LiashenkoA.PiskorzP.KomaromiI.MartinR. L.FoxD. J.KeithT.Al-LahamM. A.PengC. Y.NanayakkaraA.ChallacombeM.GillP. M. W.JohnsonB.ChenW.WongM. V.GonzalezC.Gaussian Inc.Gaussian 03 (Revision E. 01), Gaussian Inc., Wallingford, Conn, USA, 2009GrimmeS.Semiempirical GGA-type density functional constructed with a long-range dispersion correctionStewartJ. J. P.Optimization of parameters for semiempirical methods. II. ApplicationsDewarM. J. S.ZoebischE. G.HealyE. F.Development and use of quantum mechanical molecular models. 76. AM1: a new general purpose quantum mechanical molecular modelDavisL. P.GuidryR. M.WilliamsJ. R.DewarM. J. S.RzepaH. S.MNDO calculations for compounds containing aluminum and boronStewartJ. J. P.Optimization of parameters for semiempirical methods. I. MethodStewartJ. J. P.Optimization of parameters for semiempirical methods V: modification of NDDO approximations and application to 70 elementsRappéA. K.CasewitC. J.ColwellK. S.GoddardW. A.IIISkiffW. M.UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulationsRappéA. K.GoddardW. A.IIICharge equilibration for molecular dynamics simulationsMayoS. L.OlafsonB. D.GoddardW. A.IIIDREIDING: a generic force field for molecular simulationsHyperchem manualHYPERCHEM ver. 7, Hypercube Inc., http://www.hyper.com