On the Hydration State of Amino Acids and Their Derivatives at Different Ionization States: A Comparative Multinuclear NMR and Crystallographic Investigation

2D, 13C, 14N, and 17O NMR and crystallographic data from the literature were critically evaluated in order to provide a coherent hydration model of amino acids and selected derivatives at different ionization states. 17O shielding variations, longitudinal relaxation times (T1) of 2D and 13C, and line widths (Δν1/2) of 14N and 17O may be interpreted with the hypothesis that the cationic form of amino acids is more hydrated, by 1 to 3 molecules of water, than the zwitterionic form. Similar behaviour was also observed for N-acetylated derivatives of amino acids. An exhaustive search in crystal structure databases demonstrates the importance of six-membered hydrogen-bonded conjugated rings formed by both oxygens of the α-carboxylate group with a molecule of water in the vicinity. This hydrogen bonding mode is absent in the case of the carboxylic group. Moreover, a considerable number of structures were identified with the propensity to form an intramolecular hydrogen bond both in the carboxylic acid (NH⋯O=C) and in the carboxylate (NH⋯O−) ionization state. In the presence of bound molecules of water this interaction is significantly reduced in the case of the carboxylate group, whereas it is statistically negligible in the carboxylic group.


Introduction
Water plays a fundamental role in the conformation and activity of every biological macromolecule [1]. Peptide and protein hydration is the dominant factor in the stabilization of spatial molecular structure, in the process of protein folding by gating hydrophobic residues, and in the mechanisms of peptide- and protein-mediated reactions [1][2][3][4]. Water molecules, therefore, can be considered as an integral component of biomolecular systems with dynamic, functional, and structural roles [4][5][6][7]. Investigation of the structural and functional role of water molecules bound to proteins and peptides requires a sufficient understanding of the hydration process of their building blocks [1,2]. The hydration of amino acids and their derivatives at a molecular level, therefore, is of great importance and has been extensively studied with X-ray crystallography [1,3] and a variety of spectroscopic techniques including multinuclear magnetic resonance spectroscopy [2,8-13], IR and Raman spectroscopy [14][15][16], ICR mass spectrometry [17], and laser ablation in combination with microwave spectroscopy [18].
We present here, for the first time in the international literature, a comparative investigation of literature 2 D, 13 C, 14 N, and 17 O NMR and crystallographic data in order to provide a coherent hydration model of amino acids and selected derivatives at different ionization states in aqueous solution and in the crystal state.

Results and Discussion
2.1. 17 O NMR Shieldings.
17 O NMR has received little attention in amino acid and peptide research [2,12,13,19,20]. Of the three naturally occurring oxygen isotopes, only 17 O possesses a nuclear spin (I = 5/2). Owing to its electric quadrupole moment (eQ = −2.6 × 10⁻³⁰ e m²) and, thus, broad line widths, and its low absolute sensitivity compared with that of 1 H (∼1.1 × 10⁻⁵), the 17 O isotope is one of the more difficult nuclei to observe by NMR spectroscopy [12,13,21,22]. 17 O NMR studies of compounds at natural abundance, therefore, require high concentrations (>0.1 M) and extensive signal averaging. Recording of spectra can be greatly facilitated by the use of 17 O-enriched samples [23][24][25][26][27]. Figure 1(a) illustrates the natural abundance 17 O NMR spectrum of glutamic acid, 0.1 M in 17 O-depleted water at 40 °C. Despite the extensive signal averaging (number of scans (NS) = 3 × 10⁶) and the total experimental time of 4.2 hours, the achievable signal-to-noise (S/N) ratio is very poor and practically prohibitive for the accurate determination of chemical shifts and line widths. Figure 1(b) illustrates the clear advantages of working with 17 O-labelled glutamic acid ( 17 O enrichment 1 at.%) [23]. 17 O shieldings of various chemical functional groups are very sensitive probes of hydrogen bonding interactions because of the large chemical shift range of the 17 O nucleus [12,13]. The effect of solvent-induced hydrogen bonding interactions on δ( 17 O) of the carboxyl groups is, however, rather small compared with the sensitivity of over 80 ppm of δ( 17 O) of amide and carbonyl oxygens to hydrogen bonding interactions [12,13].
Only a single 17 O resonance absorption is observed for the carboxylic group since the shifts of the individual resonance absorptions δ(C=O) and δ(OH) are averaged out by rapid intermolecular proton transfer with protic solvents, traces of H 2 O, and/or through hydrogen bonding aggregates of the COOH groups in organic solvents [12,13,23,24,26,28]. From dilution studies of acetic acid in 1,2-dichloroethane, Reuben [29] estimated a deshielding effect of ∼12 ppm due to breaking of a hydrogen bond involving the carbonyl oxygen of the acid and a shielding effect of −6 ppm due to breaking of an OH···O hydrogen bond. Therefore, a net shift of only +6 ppm is expected for monomeric acetic acid in apolar media (dichloroethane) compared with the dimeric form.
Despite the relatively low sensitivity of the 17 O shieldings of the carboxyl group to hydrogen bond interactions, Spisni and collaborators [9] attempted to estimate the solvation state of the α-carboxyl group of amino acids in the different ionization states. Figures 2(a) and 2(b) show the dependence of δ( 17 O) of L-alanine and L-proline on the molar fraction of DMSO in the pH ranges 7-8 and 12-13. Since DMSO, unlike H 2 O, cannot form a hydrogen bond with the carboxylate group, the shielding difference of 10-17 ppm between the two solvents was interpreted with the hypothesis that the carboxylate group of these amino acids is hydrated by two water molecules in aqueous solution, with one hydrogen bond per carboxylate oxygen. In the acidic pH range (Figures 2(c), 2(d)), a nonlinear behaviour of the chemical shift at high DMSO molar fractions was observed. For DMSO molar fractions up to 0.6, a linear dependence of the chemical shift was observed which, on extrapolation to 100% DMSO, results in a shielding of 15-17 ppm, the same as at neutral pH. This was interpreted with the hypothesis that two hydrogen bonds (one to each oxygen) are being ruptured. When the DMSO molar fraction is between 0.6 and 0.8, it was suggested that a third molecule of water, which is hydrogen bonded to the hydroxyl hydrogen, is dissociated due to the interaction with DMSO. This might explain the deviation from linearity and the plateau-like dependence of the 17 O shielding. The protonated form of the carboxyl group of the amino acids is, therefore, more hydrated than the deprotonated form, with an excess of one bound molecule of H 2 O. This conclusion is in qualitative agreement with multinuclear NMR relaxation data (see below).

Multinuclear NMR Relaxation Data.
For quadrupolar nuclei, such as 2 D, 14 N, and 17 O, the longitudinal (T 1 ) and transverse (T 2 ) relaxation times are essentially due to the quadrupolar interaction:

$$\frac{1}{T_1} = \frac{1}{T_2} = \pi \Delta\nu_{1/2} = \frac{3\pi^2}{10}\,\frac{2I+3}{I^2(2I-1)}\,\chi^2\left(1+\frac{\eta^2}{3}\right) f(\omega, D) \tag{1}$$

where I is the nuclear spin and χ is the nuclear quadrupole coupling constant. The asymmetry parameter η varies from 0 to 1 and describes the deviation of the electric field gradient from axial symmetry, and f (ω, D) is the correlation function, which depends on the rotational diffusion constant D and its relative orientation with respect to the principal axes of the field gradient tensor [12,13]. When isotropic reorientation is assumed, f (ω, D) reduces to a single overall correlation time τ mol which is given by the Stokes-Debye formula

$$\tau_{mol} = \frac{V_m \eta_\nu}{k_B T} \tag{2a}$$

where V m is the molecular volume, η ν the viscosity of the solution, k B the Boltzmann constant, and T the absolute temperature. V m can be estimated as

$$V_m = \frac{MW}{\rho N_0} \tag{2b}$$

where N 0 is Avogadro's number and MW and ρ are the molecular weight and the density of the solute (amino acid), respectively.
Paramagnetic impurities shorten the relaxation times and might lead to erroneous results. The removal of these impurities is, therefore, necessary in studies of T 2 and T 1 relaxation times. Figure 3 illustrates the pH dependence of the 17 O line widths of 0.1 M glycine in H 2 O [24]. A broad minimum between pH 4 and 7 was observed. This 17 O line width minimum has been previously explained by a decrease of the molecular tumbling time attributable to a reduction in hydration and, thus, intermolecular association of glycine in the zwitterionic form [26]. In the high pH region, a broad maximum at pH ≈ 11 was observed. Treating the original solution to remove paramagnetic impurities resulted in no line width variation in the neutral and high pH region. It can, therefore, be concluded that this broad maximum at pH ≈ 11 should be attributed to the effect of paramagnetic impurities and not to a hydration change of glycine in the neutral and high pH region [24].
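The relations described above can be sketched numerically. The following Python example is only an order-of-magnitude illustration, assuming the Stokes-Debye form τ mol = V m η ν /(k B T) with V m = MW/(ρ N 0 ) and the standard extreme-narrowing expression for quadrupolar relaxation, in which f (ω, D) is replaced by a single correlation time. The function names and all input values (a glycine-like molecular weight, a solute density of 1.6 g/cm³, the viscosity of water near 298 K, and two bound water molecules) are hypothetical choices for illustration and are not taken from the cited studies.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_0 = 6.02214076e23   # Avogadro's number, 1/mol


def molecular_volume_m3(mw_g_mol, density_g_cm3):
    """V_m = MW / (rho * N_0), converted from cm^3 to m^3 per molecule."""
    return mw_g_mol / (density_g_cm3 * N_0) * 1e-6


def tau_mol_s(mw_g_mol, density_g_cm3, viscosity_pa_s, temperature_k):
    """Stokes-Debye correlation time tau_mol = V_m * eta_v / (k_B * T)."""
    v_m = molecular_volume_m3(mw_g_mol, density_g_cm3)
    return v_m * viscosity_pa_s / (K_B * temperature_k)


def quadrupolar_rate(spin, chi_hz, eta, tau_c_s):
    """1/T1 = 1/T2 in the extreme narrowing limit, with f(omega, D) -> tau_c."""
    spin_factor = (2 * spin + 3) / (spin ** 2 * (2 * spin - 1))
    return (3 * math.pi ** 2 / 10) * spin_factor * chi_hz ** 2 * (1 + eta ** 2 / 3) * tau_c_s


# Hypothetical inputs (assumptions, not source data): glycine-like solute in water.
tau = tau_mol_s(mw_g_mol=75.07, density_g_cm3=1.6,
                viscosity_pa_s=0.89e-3, temperature_k=298.0)
# Same solute with two bound waters added to the effective molecular weight.
tau_hydrated = tau_mol_s(75.07 + 2 * 18.015, 1.6, 0.89e-3, 298.0)
print(f"tau_mol ~ {tau * 1e12:.1f} ps; with two bound waters ~ {tau_hydrated * 1e12:.1f} ps")
```

Because τ mol scales linearly with the effective molecular weight in this model, any bound water that tumbles with the solute lengthens τ mol and, through the quadrupolar rate, shortens T 1 and broadens the line. This is the reasoning that links the relaxation data to hydration in the discussion that follows.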
2 D T 1 relaxation times of C α D 2 of glycine at acidic pH were shown to be shorter relative to those at neutral pH [8]. This shortening in T 1 implies an increase in τ mol and, thus, in the effective molecular weight MW ((1), (2a), and (2b)), which was interpreted with an increase in the hydration state in the cationic form.
Tritt Goc and Fiat [30] investigated in detail the viscosity and temperature dependence of the 17 O NMR line width of glycine, alanine, proline, leucine, histidine, and phenylalanine at pH 2, 7, and 12.5. The experimentally observed viscosity/temperature (η ν /T) dependence of the reorientation correlation time was compared with various hydrodynamic models. A model of the hydration state in the primary solvation sphere of the carboxylic group of amino acids in their cationic state was suggested in which two water molecules are hydrogen bonded to the oxygens and one to the hydrogen of the OH group. In the zwitterionic and anionic states, the hydration model of the carboxylate group can be presented by a structure in which one water molecule is hydrogen bonded to each of the oxygens [30].
The 17 O [10,11] and 14 N NMR [11] line widths of several protein amino acids were measured in aqueous solution to investigate the effect of molecular weight on the line widths (Table 1). The 14 N and 17 O line widths, under composite proton decoupling, increase with the bulk of the amino acid, and increase at low pH. Assuming an isotropic molecular reorientation of a rigid sphere and, thus, a single correlation time from overall molecular reorientation (τ mol ), the line width Δν 1/2 can be expressed in the following form [11]:

$$\Delta\nu_{1/2} = \alpha_0 + \alpha_1\, MW \tag{3}$$

where MW is the molecular weight, α 1 is the contribution to the line width of the quadrupolar coupling constant, density, and temperature, (1), (2a), and (2b), and α 0 is the solvent viscosity-independent contribution to the line width due to the primary hydration sphere of the amino acids. The linear correlation between Δν 1/2 and MW at pH 6 for both 14 N and 17 O nuclei (Figure 4) is in agreement with the hydrodynamic model of (3) [11]. Furthermore, the χ( 17 O) of the amino acid is independent of both the ionization and the degree of hydration of the carboxyl group [10]. The increase in the 17 O line widths at acidic pH (∼100 ± 31 Hz), relative to those at neutral pH, was interpreted by a change in the rotational correlation time and, thus, in the effective MW of the amino acids, (3). This implies that the cationic form of the amino acids is more hydrated than the zwitterionic form, by an excess of 1.3 to 2.5 molecules of water [11], with lifetimes that are longer than the overall molecular rotational correlation time, presumably 2-10 ps [10].
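The step from a line width increase to a number of bound water molecules can be illustrated with a short Python sketch. It assumes the linear model Δν 1/2 = α 0 + α 1 MW, fits the slope by ordinary least squares, and converts a line width increase into an effective change in MW and then into water molecules. The data points, the intercept of 20 Hz, and the slope of 2.9 Hz per g/mol are placeholders invented for this sketch (the slope was chosen so that the quoted 100 ± 31 Hz reproduces the quoted range of 1.3 to 2.5 water molecules); they are not values reported in [11].

```python
WATER_MW = 18.015  # molecular weight of water, g/mol


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a0 + a1 * x; returns (a0, a1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    return mean_y - a1 * mean_x, a1


def excess_waters(delta_linewidth_hz, slope_hz_per_mw):
    """Convert a line width increase into bound waters via the fitted slope."""
    delta_mw = delta_linewidth_hz / slope_hz_per_mw
    return delta_mw / WATER_MW


# Placeholder data lying exactly on a line (hypothetical, not measured values).
mw = [75.0, 89.0, 115.0, 131.0]
linewidths = [20.0 + 2.9 * m for m in mw]

alpha0, alpha1 = fit_line(mw, linewidths)
low = excess_waters(100 - 31, alpha1)
high = excess_waters(100 + 31, alpha1)
print(f"alpha0 = {alpha0:.1f} Hz, alpha1 = {alpha1:.2f} Hz per g/mol")
print(f"excess waters: {low:.1f} to {high:.1f}")
```

The conversion only holds if the quadrupole coupling constant is the same in both ionization states, which is the condition stated for χ( 17 O) in the text; where χ changes on deprotonation, part of the line width difference is not due to hydration.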
In the case of a stochastic diffusion of the amino and carboxyl groups comprising contributions from internal (τ int ) and overall (τ mol ) motions, the correlation time τ c for 14 N or 17 O is given by [31]

$$\tau_c = \tau_{mol}\left[A + (B + C)\,\frac{(12/r)\,\tau_{int}}{\tau_{mol} + (12/r)\,\tau_{int}}\right] \tag{4a}$$

with

$$A = \tfrac{1}{4}\left(3\cos^2\theta - 1\right)^2,\qquad B = 3\sin^2\theta\cos^2\theta,\qquad C = \tfrac{3}{4}\sin^4\theta \tag{4b}$$

where θ is the angle between the rotation axis and the main field gradient (r denotes an r-fold jump mechanism). Since the sum of A, B, and C is equal to 1, (4a) can be rewritten as

$$\tau_c = \tau_{mol}\left[A + (1 - A)\,\frac{\tau_i}{\tau_{mol} + \tau_i}\right] \tag{5a}$$

where

$$\tau_i = \frac{12}{r}\,\tau_{int} \tag{5b}$$

Equations (5a) and (5b) can be rewritten as

$$\tau_c = \tau_{mol}\,\frac{A\,\tau_{mol} + \tau_i}{\tau_{mol} + \tau_i} \tag{6}$$

Since A and τ i can be assumed to be constant for all the amino acids, (4a) and (4b) can be written as an expression for the line width Δν 1/2 as a function of MW, (7), where α 0 to α 3 are constants. The minimization of (7) on the basis of the 17 O experimental data gave a mean difference of 35.8 ± 17.3 in MW between pH 0.5 and 6.0 for three different Δν 1/2 values: 250, 350 (Figure 4(b)), and 500 Hz. This was interpreted by an excess of 1-3 water molecules at pH = 0.5. The difference in the 14 N line widths at the two ionization states (Figure 4(a)) should be attributed to differences in the correlation times and to a decrease in the χ( 14 N) on deprotonation of the carboxyl group. In the case of the linear model, the influence of variations in χ( 14 N) on the line width, Δν 1/2 , is smaller for small molecular weights. Therefore, for Δν 1/2 = 70 Hz (Figure 4(a)), the difference in MW will be a reasonable approximation of the difference in hydration in the two states. The calculated value was found to be 45.2 ± 7.4, which corresponds to an excess of 2-3 water molecules in the cationic form compared to that in the zwitterionic form, in reasonable agreement with the 17 O NMR data [11].
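Two numerical checks may help in following this argument, sketched here in Python. The first evaluates the internal-motion correlation time, assuming the standard Woessner-type geometric coefficients A = (3cos²θ − 1)²/4, B = 3 sin²θ cos²θ, and C = (3/4) sin⁴θ, which sum to 1; these forms, the default r = 3, and the function names are assumptions of this sketch rather than details confirmed by the text. The second converts the quoted molecular weight differences into numbers of water molecules by dividing by the molecular weight of water.

```python
import math

WATER_MW = 18.015  # molecular weight of water, g/mol


def woessner_coefficients(theta_rad):
    """Geometric weights A, B, C for an angle theta (assumed standard forms)."""
    c2 = math.cos(theta_rad) ** 2
    s2 = math.sin(theta_rad) ** 2
    a = 0.25 * (3 * c2 - 1) ** 2
    b = 3 * s2 * c2
    c = 0.75 * s2 ** 2
    return a, b, c


def tau_c(tau_mol, tau_int, theta_rad, r=3):
    """Effective correlation time for overall plus r-fold internal motion."""
    a, b, c = woessner_coefficients(theta_rad)
    tau_i = (12 / r) * tau_int
    return tau_mol * (a + (b + c) * tau_i / (tau_mol + tau_i))


def tau_c_compact(tau_mol, tau_int, theta_rad, r=3):
    """Same quantity after using A + B + C = 1 to eliminate B and C."""
    a, _, _ = woessner_coefficients(theta_rad)
    tau_i = (12 / r) * tau_int
    return tau_mol * (a * tau_mol + tau_i) / (tau_mol + tau_i)


def water_range(delta_mw, uncertainty):
    """Lower and upper number of waters implied by delta_mw +/- uncertainty."""
    return ((delta_mw - uncertainty) / WATER_MW,
            (delta_mw + uncertainty) / WATER_MW)


print(water_range(35.8, 17.3))  # 17 O data quoted in the text
print(water_range(45.2, 7.4))   # 14 N data quoted in the text
```

Rounded to whole molecules, the two ranges correspond to the 1-3 and 2-3 excess water molecules quoted for the 17 O and 14 N data, respectively.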
More recently, Takis et al. [32] investigated the 13 C α longitudinal relaxation times (T 1 ) and 14 N line widths (Δν 1/2 ) of amino acids and acetyl-amino acids in aqueous solutions at acid and neutral pH. Both 13 C α and 14 N values indicate that amino acids and acetyl-amino acids at acid pH interact with an excess of one water molecule with respect to their deprotonated form at neutral pH. In contrast, 13 C α and 14 N values of betaines (R 3 N + CH(R)COO − ) demonstrate no hydration differences between acid and neutral pH values.

Crystallographic Data and Statistics.
Crystal structure databases provide a rich source of information on the architectures and interactions of molecules. Searching them provides the opportunity to examine the formation of intramolecular and intermolecular hydrogen bonds in small molecule crystal structures [33,34]. Propensities for the hydration of the α-carboxylate group of amino acids and their derivatives were derived on the basis of exhaustive searches in the Cambridge Structural Database (CSD). Since intermolecular hydrogen bonds are preferred when five- or six-membered conjugated rings are formed [35], particular attention has been given to the hydrogen bond patterns in the vicinity of the carboxylate group that involve two simultaneous hydrogen acceptors. The concept of five- and six-membered conjugated rings, along with three-center (bifurcated) and four-center (trifurcated) hydrogen bonds, has been widely accepted as an important factor in determining the structure and function of molecules ranging from inorganic to organic and biological molecules [1,35-39]. Furthermore, Port and Pullman [40] studied theoretically the formate ion-water interaction as a prototype of the carboxylate group. Three energetically favourable hydration sites were obtained: two equivalent sites on the carboxylate oxygens at the exterior of the ion and one water bridging the two oxygen atoms. The ConQuest 1.13 program was used for all the statistical analysis described in this paper. Specifically, the CSD version 5.32 (November 2010) for small molecules was searched, with the following general search flags: R > 0.5, "3D coordinates", and "only organic".
In order to extract the number of entries present in the current database that form six-membered conjugated rings between the two oxygens of the α-carboxylate and the carboxylic group with a molecule of water in the vicinity, the following geometric cut-offs were used: upper limits d = 3 Å for (O w )-H···O=C and (O w )-H··· − O-C, and d = 3.5 Å for O w ···O=C and O w ··· − O-C. A total of 44 hits were obtained for the carboxylate state (Figure 5), whereas only one was derived for the protonated form. Therefore, the number of structures of carboxylates is sufficient to give reasonable statistics. Figure 5 demonstrates that the oxygen of water, O w , is reasonably close to the carboxylate oxygens and displays a significant preference for the O 1 -C-O 2 carboxylate plane.
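The geometric criterion can be made concrete with a small Python sketch that applies the two distance cut-offs to a single water molecule and a pair of carboxylate oxygens. This is not the ConQuest query used in the study; the function name, the requirement that each water hydrogen points to a different oxygen, and the test coordinates (in Å) are hypothetical and serve only to illustrate how such a filter could be expressed.

```python
import math

MAX_H_O = 3.0    # upper limit for (Ow)-H...O, in angstroms (cut-off from the text)
MAX_OW_O = 3.5   # upper limit for Ow...O, in angstroms (cut-off from the text)


def bridges_both_oxygens(o1, o2, ow, h1, h2):
    """True if one water satisfies the cut-offs towards both carboxylate oxygens.

    Assumption of this sketch: the six-membered ring requires each water
    hydrogen to lie within MAX_H_O of a different oxygen, and the water
    oxygen to lie within MAX_OW_O of both oxygens.
    """
    ow_close = math.dist(ow, o1) <= MAX_OW_O and math.dist(ow, o2) <= MAX_OW_O
    direct = math.dist(h1, o1) <= MAX_H_O and math.dist(h2, o2) <= MAX_H_O
    swapped = math.dist(h1, o2) <= MAX_H_O and math.dist(h2, o1) <= MAX_H_O
    return ow_close and (direct or swapped)


# Hypothetical planar geometry: carboxylate oxygens and a bridging water.
o1, o2 = (-1.10, 0.65, 0.0), (1.10, 0.65, 0.0)
near = bridges_both_oxygens(o1, o2, ow=(0.0, 3.20, 0.0),
                            h1=(-0.76, 2.62, 0.0), h2=(0.76, 2.62, 0.0))
# The same water moved 4 angstroms further away along y.
far = bridges_both_oxygens(o1, o2, ow=(0.0, 7.20, 0.0),
                           h1=(-0.76, 6.62, 0.0), h2=(0.76, 6.62, 0.0))
print(near, far)
```

A real database search would also have to handle symmetry-related molecules and missing hydrogen positions, which this sketch ignores.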
There is a general correlation between hydrogen bond lengths and hydrogen bond angles (Figure 6) similar to that observed by Jeffrey and Maluszynska [41]. Furthermore, crystallographic database searches were performed to identify the propensity for the formation of an intramolecular hydrogen bond interaction in the carboxylate (NH···O − ) and the carboxylic acid (NH···O=C) state. Interestingly, 946 and 118 hits were retrieved for the carboxylate form, and 621 and 6 hits for the carboxylic form, in the absence and presence of two molecules of water, respectively. It is evident from Figure 7 that in the presence of two bound water molecules there is a significant reduction in the number of structures with an intramolecular hydrogen bond interaction for the carboxylate group and, concurrently, a significant increase in the NH···O − distance. It is important to note that no intramolecular + NH 3 ··· − OOC hydrogen bonds were observed for 82 amino acid carboxylates with sp 3 -hybridized C β atoms [42], in agreement with an early survey of amino acid structures determined by neutron diffraction [43].

Conclusions
17 O shielding changes of amino acids as a function of molar fraction of DMSO/H 2 O, the decrease in the longitudinal relaxation times (T 1 ) of C α D and 13 C α , and the increase in line widths of 14 N and 17 O at acidic pH relative to those at neutral pH may be interpreted with the hypothesis that the cationic form of amino acids is more hydrated, by 1 to 3 molecules of water, than the zwitterionic form. Similar behaviour between the protonated and deprotonated carboxyl group was also observed for acetylated derivatives of amino acids, but not for betaines. Although the precise hydration differences observed for various nuclei deviate somewhat, it may be concluded that these hydrated complexes have lifetimes that are shorter than the NMR chemical shift time scale, but presumably longer than the overall molecular rotational correlation time of 2-10 ps. An exhaustive search in the Cambridge Structural Database (CSD) demonstrates a strong tendency of the two oxygens of the deprotonated carboxylate group to form hydrogen bonds with a single molecule of water. Even though statistical analysis of structural parameters in crystals cannot be used in a straightforward way to derive quantitative structural models in solution, it is of interest to note that this six-membered conjugated ring mode, which is absent in the case of the carboxylic group, might result in a more compact and, thus, less hydrated structure in aqueous solution, in accordance with the NMR data (Figure 8).
Furthermore, it may be concluded that the bound molecules of water weaken the NH···O − interaction, and very probably this effect is even more pronounced in aqueous solution. From the above, it is evident that the reduced hydration of the carboxylate group, relative to the carboxylic group, should be attributed mainly to the strong tendency of the carboxylate group to form a six-membered conjugated ring with a single molecule of water. The NH···O − intramolecular hydrogen bond very probably plays an insignificant role. The tentative models illustrated in Figure 8 should be further validated by in silico and experimental approaches. Computational methods complement the experimental results by providing information on the microscopic and physicochemical details of the interplay between water and the biomolecule of interest [44][45][46][47]. For example, introduction of solvent effects into molecular dynamics can provide an atomic description of the folding and unfolding of a protein [47]. Furthermore, there is an array of theoretical approaches for treating NMR shieldings in solution [48], which can be classified as continuum models [49,50] and molecular dynamics simulations [51]. Experimental approaches could involve 17 O NMR both in powders and in the crystal state [52] with varying degrees of hydration.