Synchrotron IR microspectroscopy for protein structure analysis: Potential and questions

Synchrotron radiation-based Fourier transform infrared microspectroscopy (S-FTIR) has been developed as a rapid, direct, non-destructive bioanalytical technique. It takes advantage of the brightness and small effective source size of synchrotron light and can explore the molecular chemical make-up within microstructures of a biological tissue, without destroying inherent structures, at ultra-spatial resolutions within cellular dimensions. To date there has been very little application of this advanced technique to the study of inherent protein structure at a cellular level in biological tissues. This review introduces a novel approach to show the potential of this newly developed synchrotron-based analytical technology, which can be used to localize relatively “pure” protein in plant tissues and to reveal, on a relative basis, inherent protein structure and protein molecular chemical make-up within intact tissue at cellular and subcellular levels. Several techniques for analysing complex protein IR spectral data (Gaussian and Lorentzian multi-component peak modeling, univariate and multivariate analysis, principal component analysis (PCA), and hierarchical cluster analysis (CLA)) are employed to reveal, on a relative basis, features of inherent protein structure and to distinguish differences in inherent protein structure between varieties/species and treatments in plant tissues. By using a multi-peak modeling procedure, RELATIVE estimates (but not EXACT determinations) of protein secondary structure can be made for comparison purposes. The arguments for and against the multi-peak modeling/fitting procedure for relative estimation of protein structure are discussed. By using PCA and CLA, plant molecular structures can be qualitatively separated, one group from another, statistically, even when the spectral assignments are not known.
The synchrotron-based technology provides a new approach for protein structure research in biological tissues at ultraspatial resolutions.


The concept of synchrotron
A synchrotron is a source of brilliant light that scientists can use to view the structure of materials at the molecular level [4]. In detail, a synchrotron is a giant particle accelerator that turns electrons into light. A synchrotron facility usually consists of six major components: the Electron Gun, Linear Accelerator (also called LINAC), Booster Ring, Storage Ring, Beamlines, and End Experimental Stations [4,27,42]. In short, a synchrotron is a facility that produces extremely bright light, a million times brighter than sunlight. This extremely bright light makes it possible to detect molecular chemical structure within intact biological tissues at ultra-spatial resolution within cellular dimensions [6,7,25-27,29-31].

Advanced synchrotron-based Fourier transform infrared microspectroscopy
When globar-sourced Fourier transform infrared spectroscopy is combined with a microscope, it is called "FTIR microspectroscopy". This technique can identify molecular constituents in biological samples from their vibrational spectra in the mid-IR region [25,26,29,30,43,44]. A drawback of FTIR microspectroscopy is the diffraction effect that results when the aperture is decreased to limit the field of view to a small region of interest. At the same time, less light overall reaches the detector, and hence the signal to noise ratio decreases [1,29,30,33,42,45]. For the study of structural-chemical features of plant tissues at the diffraction limit, or a few microns in each spatial dimension, synchrotrons and free electron lasers can be used. The brightness of conventional bench top IR sources is simply too low by 2-3 orders of magnitude [1,33].
Synchrotron light is extremely bright. The beam is non-divergent, intense and extremely fine [29,30,33,42]. When a synchrotron light (IR) source, FTIR spectroscopy and microscopy are combined, the result is called "synchrotron radiation-based FTIR microspectroscopy" (S-FTIR). Recently, S-FTIR has been developed as a rapid, direct, non-destructive and non-invasive bioanalytical technique. It takes advantage of synchrotron light brightness (usually 100-1000 times brighter than a conventional globar source, with a small effective source size) and is capable of exploring the molecular chemistry within microstructures of biological samples with high signal to noise ratio at ultraspatial resolutions as fine as 3-10 µm [1,6,7,11,26,29-33,39,42,45,50,51,53]. The technique can provide information on the quantity, composition, structure and distribution of chemical constituents and functional groups in a biological tissue and can encompass a wider spectral range so that more detailed structural information can be extracted. It can be used to increase the fundamental understanding of plant structures at the cellular level and bring a new level of understanding of analytical information [2,32,46,58].

Advantages of synchrotron-based Fourier transform infrared microspectroscopy
Using standard globar-sourced (conventional thermal IR) FTIR microspectroscopy, it is not possible to reveal chemical features of micro-biomaterials whose size is of the order of <20 to 50 µm (depending on the type of infrared microspectrometer). The normal plant cell size is around 5-30 µm. With a globar source, a very poor signal to noise ratio is obtained within plant cellular dimensions [6,29,30,42]. Synchrotron-based FTIR microspectroscopy allows a very small area to be explored, provides higher accuracy and precision, allows faster data collection, reaches the diffraction limit of a few µm and provides a very good signal to noise ratio at ultraspatial resolution [6,14,26,29,30]. It can reveal plant structural-chemical features within cellular dimensions [32,42,44,50,51,53-56]. This research also shows that the synchrotron IR source does not damage biological tissue. Detailed comparisons between conventional globar-sourced and synchrotron-sourced FTIR microspectroscopy have been reported [6,14,29,30,42].

Protein secondary structures
Protein is one of the most important nutrients in human and animal diets. An understanding of the structure of the whole protein is often vital to understanding its digestive behavior, nutritive quality, utilization and availability in animals and humans [66]. Studying the secondary structure of proteins leads to an understanding of the components that make up a whole protein. Protein secondary structures include mainly α-helix and β-sheet, with small amounts of β-turn and random coil [3,17]. The protein secondary structure profile may influence protein quality, nutrient utilization, availability or digestive behavior [34,46,56,59]. A high percentage of β-sheet structure may partly cause low access to gastrointestinal digestive enzymes, which may result in a low protein value and low protein availability, as in feather protein.
Research using a relative estimation method, multi-peak modeling, showed that different protein sources had different profiles and ratios of α-helix and β-sheet in their protein secondary structures [56]. This study demonstrated that the differences in protein secondary structure profiles and α-helix to β-sheet ratio may partially explain the differences in protein digestive behavior, nutrient utilization and availability in animals. Other studies found that protein secondary structure profiles influence beef quality, plant desiccation tolerance, long term stability and low temperature tolerance [9,22,37,47,48,60].

Protein structures affected by heat treatments and food processing
Heat processing has been used to improve utilization and availability of protein [49] and inactivate anti-nutrition factors [38] by reducing fermentation and metabolism in the rumen, increasing the amounts of protein entering the small intestine for absorption and digestion [49], and reducing conjugated linoleic acid hydrogenation in the rumen and increasing the amount of conjugated linoleic acid available in the small intestine [36].
The mechanism by which heat processing alters the degradation and digestive behavior of protein [8] is still not clear. It may involve denaturation, which is a disorganization of the overall molecular shape of a protein, unfolding or uncoiling of a coiled or pleated structure, or the separation of the protein into its subunits, which may then unfold or uncoil [15]. Any temperature change in the environment of the protein that influences the non-covalent interactions involved in its structure may lead to an alteration of the protein structure [8], including protein secondary structures. New research [34] shows that heat processing affected protein secondary structure profiles and changed the α-helix to β-sheet ratio. These changes affected nutritive quality. The effects of heat processing on protein nutritive value, utilization, availability and performance in animals are equivocal [57]; the same is true in humans. Part of the reason is that heating conditions may not be optimal, the feed or food being either underheated or overheated [57]. A new approach to checking the effects of heat processing on protein value and nutrient availability is to look at the magnitude of changes in protein secondary structures, in terms of the α-helix to β-sheet ratio, within the intact tissue [59]. Most studies in the literature have focused on how heat processing affects total protein composition or amino acids, using traditional "wet" chemical analysis without consideration of any inherent structural effects.

Protein IR principle
The total energy of a molecule consists of translational, rotational, vibrational and electronic energies [19]. Different radiation in the electromagnetic (EM) spectrum results in different energy transitions in a molecule. Organic molecules possess bonds and functional groups. These functional groups vibrate independently of each other. Without any EM radiation effect, these molecules (functional groups) vibrate independently at their equilibrium position. However, when IR radiation is applied to organic molecules (functional groups), it disturbs the molecule's equilibrium (position) state and promotes transitions between rotational and vibrational energy levels. When such a transition causes a net change in the dipole moment, the molecule absorbs IR. An IR absorption profile is therefore unique to a specific molecular vibration frequency. When IR passes through a sample, some of the IR is absorbed by the sample and some passes through (is transmitted). The resulting spectrum represents the molecular absorption/transmission, which creates a molecular fingerprint of the sample. Identification of molecular functional groups is the major application of IR spectrometry [12,19,21,28].

Protein unique IR fingerprint bands
Each biological component has a unique molecular chemical-structural feature and therefore its own unique IR spectrum. The characteristic feature of protein structure is the peptide bond, which contains C=O, C-N and N-H. The protein IR spectrum has two primary features, the amide I and amide II bands. The protein amide I band is primarily C=O stretching vibration (80%) plus C-N stretching vibration [18]. Protein amide I absorbs at ca. 1650 cm⁻¹ (base region from ca. 1600 to 1700 cm⁻¹). Protein amide II, which absorbs at ca. 1550 cm⁻¹ (base region from ca. 1480-1560 cm⁻¹), consists primarily of N-H bending vibrations (60%) coupled with C-N stretching vibrations (40%) [18]. The vibrational frequency of the protein amide I band is particularly sensitive to protein secondary structure [18,21,25,27,29,30,46] and can be used to predict protein secondary structure. For α-helix, the protein amide I peak is typically in the range of 1648-1660 cm⁻¹. For β-sheet, the peak can be found within the range of ca. 1625-1640 cm⁻¹. The amide II band (predominantly an N-H bending vibration coupled to C-N stretching) is also used to assess protein conformation. However, as it arises from complex vibrations involving multiple functional groups, it is less useful for protein structure prediction than the amide I band [18].
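The band ranges quoted above can be turned into a simple lookup. The following is a minimal sketch, assuming that a component peak centre has already been identified; the function name and dictionary are illustrative and not part of any cited software, and the ranges are only those stated in the text, so peaks outside them are returned as unassigned.

```python
# Amide I ranges (cm^-1) as quoted in the text; real assignments depend on
# the sample, so this table is an illustrative assumption, not a standard.
AMIDE_I_RANGES = {
    "alpha-helix": (1648.0, 1660.0),
    "beta-sheet": (1625.0, 1640.0),
}


def assign_amide_i_component(peak_centre_cm1):
    """Return the structure label whose range contains the peak, else None."""
    for label, (low, high) in AMIDE_I_RANGES.items():
        if low <= peak_centre_cm1 <= high:
            return label
    return None


# Hypothetical component peak centres, e.g. read off a deconvolved spectrum.
labels = [assign_amide_i_component(c) for c in (1655.0, 1630.0, 1680.0)]
```

A peak at 1680 cm⁻¹ falls outside both quoted ranges and is therefore left unassigned rather than forced into a category.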

Plant biological component and heterogeneous protein distribution in plant tissues
Complex plant tissue contains several biological components: protein, lipid, structural and nonstructural carbohydrates, and lignin. The distribution of these biological components, including protein, is very heterogeneous [55,60,61]. With ultraspatially resolved S-FTIR, the structural-chemical features of plant tissues within cellular dimensions can be mapped. The results show that protein distribution and intensity are unequal in plant tissues, even within cellular dimensions [46,55].

Identification of relatively "pure" protein in plant tissue at a cellular level with synchrotron technology
When studying protein structures, including protein secondary structure, we need to use relatively "pure" protein tissues to reduce the effects of other biological components. Because protein distribution and intensity are unequal in plant tissues, even within cellular dimensions, areas with relatively "pure" protein need to be selected for the study of protein secondary structures, so that disturbance by other biological components, such as the carbohydrate scattering effect, is minimized [46].
With the newly advanced synchrotron technology (S-FTIR), localization of relatively pure protein in seeds is achievable. The structural features of relatively pure protein in a plant tissue can be revealed at a cellular level [46,60]. It is not possible to localize the pure protein in the tissue using conventional globar-sourced FTIR spectroscopy or microspectroscopy, because they are not able to detect tissue features within cellular dimensions.

Methodology for protein secondary structure analysis on relative basis
To demonstrate how S-FTIR can be used for RELATIVE analysis of protein secondary structure (not EXACT determinations), the following is a review of the methodology used in applying this technique.

Plant tissue IR window or slide preparation for synchrotron transmission or reflection infrared microspectroscopy
Plant samples were cut into thin cross-sections (ca. 6 µm thick) using a microtome. The unstained cross-sections were mounted onto Low-e IR microscope slides (Kevley Technologies, Chesterland, OH) for S-FTIR in reflectance mode or transferred to BaF₂ windows (size: 13×1 mm disk; Spectral Systems, NY) for transmission mode synchrotron FTIR microspectroscopic work.

Synchrotron light sources and synchrotron-based FTIR microspectroscopy
The spectroscopic images were recorded using a Nicolet Magna 860 FTIR (Thermo Nicolet, USA) equipped with a Continuµm IR microscope (Spectra Tech, USA), mapping stage controller, 32× objective and a mercury cadmium telluride (MCT-A) detector at NSLS (New York). The bench was configured with a collimated synchrotron light beamline serving as an external input to the Nicolet Magna 860. The modulated light was passed through the IR microscope to perform transmission or reflection microscopy. The spectra were collected in the mid-infrared range of 4000-800 cm⁻¹ at a resolution of 4 cm⁻¹ with 64-128 scans co-added and an aperture setting of ca. 10 µm × 10 µm. The reasons for choosing an aperture size of 10 µm × 10 µm were: 1) the size was within cellular dimensions; 2) the 10 µm × 10 µm aperture gave the best signal to noise ratio for spectral mapping of feed tissues. To minimize IR absorption by CO₂ and water vapour in ambient air, the optics were purged using dry N₂. A background spectroscopic image file was collected from an area free of sample. The mapping steps were equal to the aperture size. Stage control, data collection and processing were performed using OMNIC (Thermo-Nicolet, Madison, Wisconsin).

Chemical imaging and protein IR spectrum data analysis
In microspectroscopic area mapping, the spatial information was obtained by translating the tissue along the x- and y-axes and positioning different parts of the designated tissues in the synchrotron IR beam of the microspectroscope. The motorized, computer-controlled stage was programmed to trace the designated areas in the tissues. After measuring the spectra of all parts of interest, the spectral information was related to the visible images. As a result, spectral data sets were formed with the xy surface corresponding to the scanned area of the sample and the z direction containing the spectral information. Functional group images (such as amide I, aromatic compound, carbonyl ester) were generated by plotting the intensity of synchrotron IR absorption bands as a function of xy position [2]. Different coverage of the sample with measurements could also be achieved by varying the step size and the dimensions of the image mask (aperture size).
The data can be displayed either as a series of spectroscopic images collected at individual wavelengths, or as a collection of infrared spectra obtained at each pixel position in the image. Functional group bands in plant tissues (e.g. Fig. 1) were assigned according to published reports [13,18,25,29-31,42,45]. Chemical imaging of functional groups was performed with the OMNIC software (Thermo-Nicolet, Madison, Wisconsin) or Cytospec (2004). Peak ratio images were obtained by dividing the height or area under one functional group band (such as amide I, 1650 cm⁻¹) by the height or area under another functional group band (such as starch, 1025 cm⁻¹) at each pixel (pixel size 10×10 µm); these images represent the ratio intensity and distribution of biological components in the plant tissue. False color maps derived from the area under particular spectral features were used to represent distribution and intensity across the area of interest [30,31,55,62].
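The peak ratio image described above can be sketched as follows. This is a minimal illustration on a synthetic 2 × 2 map, not the OMNIC or Cytospec implementation: the function names, the Gaussian test bands and the starch integration window (975-1075 cm⁻¹) are assumptions made for the example, while the amide I window (1600-1700 cm⁻¹) follows the base region quoted earlier. It uses only the Python standard library.

```python
import math


def band_area(wavenumbers, absorbance, low, high):
    """Trapezoid-rule area under one spectrum between low and high (cm^-1).

    Assumes wavenumbers are sorted in ascending order.
    """
    pts = [(w, a) for w, a in zip(wavenumbers, absorbance) if low <= w <= high]
    return sum(0.5 * (a0 + a1) * (w1 - w0)
               for (w0, a0), (w1, a1) in zip(pts, pts[1:]))


def peak_ratio_image(wavenumbers, cube, numerator_band, denominator_band):
    """cube[row][col] is one spectrum; return a 2D list of band-area ratios."""
    image = []
    for row in cube:
        out_row = []
        for spectrum in row:
            num = band_area(wavenumbers, spectrum, *numerator_band)
            den = band_area(wavenumbers, spectrum, *denominator_band)
            # Pixels with no denominator signal are marked as not-a-number.
            out_row.append(num / den if den > 0 else float("nan"))
        image.append(out_row)
    return image


def gaussian(w, centre, sigma=10.0):
    return math.exp(-0.5 * ((w - centre) / sigma) ** 2)


# Synthetic map: each pixel is a "protein" band at 1650 plus a "starch"
# band at 1025 cm^-1 with pixel-specific intensities (made-up values).
wavenumbers = [float(w) for w in range(900, 1801)]
protein = [[2.0, 1.0], [1.0, 0.5]]
starch = [[1.0, 1.0], [2.0, 1.0]]
cube = [[[p * gaussian(w, 1650.0) + s * gaussian(w, 1025.0) for w in wavenumbers]
         for p, s in zip(p_row, s_row)]
        for p_row, s_row in zip(protein, starch)]

ratio = peak_ratio_image(wavenumbers, cube, (1600.0, 1700.0), (975.0, 1075.0))
```

Pixels with a high ratio are the candidates for relatively "pure" protein spectra; a real map would also need baseline correction before integration, which this sketch omits.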

Gaussian and Lorentzian multi-peak modeling and relative estimation of model-fitted α-helix to β-sheet ratio in protein secondary structures
The chemical mapping of protein and of the protein to starch ratio provided spectral data with relatively "pure" protein for modeling the protein amide I component peaks. The procedure for selecting relatively "pure" protein in plant tissues followed the methodology published by Wetzel et al. [46]. Briefly, within the protein or protein to starch chemical ratio image of plant tissues, high-quality spectra of relatively pure protein were selected for relative quantification of protein secondary structures (α-helix to β-sheet ratio) [46]. This is because plant protein is heterogeneous. In order to eliminate the carbohydrate scattering effect on the protein spectrum, the relatively "pure" protein areas were selected for protein secondary structure analysis on a relative basis.
Because the protein amide I component bands overlapped (Fig. 2), a specific multi-peak fitting or modeling procedure was required [63]. To estimate the RELATIVE model-fitted α-helix to β-sheet ratio in the protein secondary structures, two steps were applied. The first step used Fourier self-deconvolution (FSD) in OMNIC to obtain the FSD spectrum in the protein amide I region and identify the protein amide I component peak frequencies (Fig. 2). It should be mentioned that the peaks in the ca. 1610-1560 cm⁻¹ region are not strictly amide I peaks and could be influenced by other components. For this reason, the FSD spectrum was analyzed only in the ca. 1700-1620 cm⁻¹ region. The detailed concepts and algorithm of FSD (a method for resolving intrinsically overlapped bands) are described in Kauppinen et al. [20] and Griffiths and Pariente [10]. The second step used a multi-peak fitting program with Gaussian and Lorentzian functions (Fig. 3) in the Origin data analysis software to quantify the multi-component peak areas in the protein amide I band. Origin reported the detailed fit results in terms of peak shape, centre, offset, width and area. A comparison of the Gaussian and Lorentzian analytical methods for plant and seed molecular biology and chemistry research was reported in Yu [63]. The relative α-helix to β-sheet ratio based on model-fitted component peak areas was calculated according to the report generated by the software [59]. It should be mentioned that a multi-peak modeling/fitting method can only make a relative estimate (not an exact determination) of the protein secondary structure ratio and of the model-fitted amount of protein secondary structure. High-quality protein amide IR spectrum data are key to a successful multi-peak modeling/fitting procedure. More discussion of the arguments for and against the multi-peak modeling procedure is given in Section 5.
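The second step can be illustrated with a deliberately simplified sketch. It is not the Origin fitting routine: it assumes only two Gaussian components whose centres (1655 and 1632 cm⁻¹) and shared width are already fixed, for example from the FSD step, so that only the amplitudes are fitted by linear least squares. All names and numerical values are hypothetical, and the "measured" spectrum is synthetic. A full multi-peak model would also fit centres and widths and could use Lorentzian shapes.

```python
import math


def gaussian(w, centre, sigma):
    return math.exp(-0.5 * ((w - centre) / sigma) ** 2)


def fit_two_amplitudes(wavenumbers, absorbance, centres, sigma):
    """Least-squares amplitudes of two fixed-position Gaussian components.

    Solves the 2 x 2 normal equations (G^T G) a = G^T y by Cramer's rule,
    where the columns of G are the two unit-height component shapes.
    """
    g1 = [gaussian(w, centres[0], sigma) for w in wavenumbers]
    g2 = [gaussian(w, centres[1], sigma) for w in wavenumbers]
    s11 = sum(a * a for a in g1)
    s12 = sum(a * b for a, b in zip(g1, g2))
    s22 = sum(b * b for b in g2)
    t1 = sum(a * y for a, y in zip(g1, absorbance))
    t2 = sum(b * y for b, y in zip(g2, absorbance))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


# Assumed values for illustration only.
HELIX_CENTRE, SHEET_CENTRE, SIGMA = 1655.0, 1632.0, 8.0

# Synthetic, noise-free amide I band restricted to ca. 1700-1620 cm^-1.
wavenumbers = [float(w) for w in range(1620, 1701)]
absorbance = [0.65 * gaussian(w, HELIX_CENTRE, SIGMA)
              + 0.50 * gaussian(w, SHEET_CENTRE, SIGMA) for w in wavenumbers]

amp_helix, amp_sheet = fit_two_amplitudes(
    wavenumbers, absorbance, (HELIX_CENTRE, SHEET_CENTRE), SIGMA)

# Area of a Gaussian peak = amplitude * sigma * sqrt(2 * pi).
area_helix = amp_helix * SIGMA * math.sqrt(2.0 * math.pi)
area_sheet = amp_sheet * SIGMA * math.sqrt(2.0 * math.pi)
helix_to_sheet = area_helix / area_sheet
```

Because the ratio is built from fitted component areas, it depends on the chosen number, position and shape of the components, which is why the text treats it as a relative estimate only.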

Univariate and multivariate analysis for protein IR spectrum
Statistical approaches to analyzing spectral data collected with S-FTIR usually include univariate and multivariate methods. The univariate methods of analysis consist of various mapping displays of spectral data. Usually the researcher selects band intensities, integrated intensities, band frequencies, band intensity ratios, etc., to construct false color maps of the spectral data [30]. The multivariate methods of data analysis create spectral corrections and maps by including not just one intensity or frequency point of a spectrum, but the entire spectral information. The methods include hierarchical cluster analysis (CLA), which clusters infrared spectra in a map based on their similarity with other spectra in the same map, and principal component analysis (PCA) [64]. The big advantage of CLA and PCA is that the spectral assignments do not need to be known; the aim is only to separate one group from another qualitatively, on a statistical basis.

Hierarchical cluster analysis for protein IR spectrum to detect protein structural difference
An agglomerative hierarchical cluster analysis was performed on the protein IR spectral data sets, and the results of CLA were displayed as images or as dendrograms [5]; this is useful when a biological component band has more or less interference with other bands. First, the method calculates a distance matrix, which contains information on the similarity of spectra. Then the algorithm searches within the distance matrix for the two most similar IR spectra (minimal distance). These spectra are combined into a new object (called a "cluster" or "hierarchical group"). The spectral distances between all remaining spectra and the new cluster are then re-calculated [5]. In this way the technique clusters IR spectra based on their similarity with other spectra. Ward's algorithm was used without any prior parameterization of the spectral data (the original protein amide I FSD spectral data were used) in the protein IR region. The results show that it is possible to discriminate between different varieties or treatments of plant tissues [64,65].
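The agglomerative steps described above can be sketched in a few lines. This is a naive illustration of Ward's criterion, under the assumption that each spectrum is a list of intensities on a common wavenumber grid; it is not the software used in the cited studies, and the four three-point "spectra" are made-up values. At each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares.

```python
def ward_linkage(spectra):
    """Agglomerative clustering with Ward's criterion.

    Returns a list of merges (id_a, id_b, cost, new_size). Original spectra
    have ids 0..n-1; each merged cluster receives the next free id.
    """
    # Each cluster is stored as (centroid, number of member spectra).
    clusters = {i: (list(s), 1) for i, s in enumerate(spectra)}
    merges = []
    next_id = len(spectra)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                (ca, na), (cb, nb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Ward cost: increase in within-cluster sum of squares.
                cost = na * nb / (na + nb) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        (ca, na), (cb, nb) = clusters.pop(a), clusters.pop(b)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[next_id] = (centroid, na + nb)
        merges.append((a, b, cost, na + nb))
        next_id += 1
    return merges


# Four toy "spectra": the first two resemble each other, as do the last two.
toy_spectra = [
    [1.00, 0.20, 0.10],
    [0.90, 0.30, 0.10],
    [0.20, 1.00, 0.10],
    [0.35, 0.90, 0.10],
]
merges = ward_linkage(toy_spectra)
```

The merge list is what a dendrogram draws: the two similar pairs join first at a low cost, and the final merge of the two groups happens at a much larger linkage distance.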

Principal component analysis for protein IR spectrum to detect protein structural difference
The second multivariate analysis is PCA, which is a statistical data reduction method (Jackson, M., personal contact). It transforms the original set of variables into a new set of uncorrelated variables called principal components (PCs). The first few PCs will typically account for >95% of the variance. The purpose of PCA is to derive a small number of independent linear combinations (PCs) of a set of variables that retain as much of the information in the original variables as possible. This analysis allows the relationships between p quantitative characters (e.g. chemical functional groups) observed on n samples (e.g. protein IR spectra) to be studied globally. The basic idea is to extract, in a multiple variable system, one, two or sometimes more PCs that carry maximum information. These components are independent of (orthogonal to) each other, and the first factor generally represents maximum variance. As factors are extracted, they account for less and less variability, and the decision of when to stop basically depends on the point at which there is only very little significant variability left, or merely random noise. Thus, reduction of the data provides a new coordinate system in which the axes (eigenvectors) represent the characteristic structure information of the data, and the spectra may then be described simply as a function of specific properties, and no longer as a function of intensities. The outcome of such an analysis can be presented as either 2D (two PCs) or 3D (three PCs) scatter plots [35]. PCA can be used to identify the main sources of variation in the synchrotron-based protein amide I FSD spectra of different plant tissues at a cellular level and to identify features that differ among varieties or treatments of plant tissues [64].
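The data reduction described above can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the procedure of the cited studies: it builds the covariance matrix of mean-centred spectra and extracts the leading eigenvectors by power iteration with deflation (a numerical library would normally be used instead). The six short "spectra", the function name and the iteration count are all hypothetical.

```python
import math


def pca(spectra, n_components=2, iterations=500):
    """Return (components, explained_variance_ratio, scores)."""
    n, p = len(spectra), len(spectra[0])
    means = [sum(row[j] for row in spectra) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in spectra]
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[i][i] for i in range(p))
    components, explained = [], []
    for _ in range(n_components):
        # Deterministic, non-uniform start vector, normalised to unit length.
        v = [1.0 / (j + 1) for j in range(p)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _step in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue (variance along this PC).
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                         for i in range(p))
        components.append(v)
        explained.append(eigenvalue / total_variance)
        # Deflation: remove the variance already captured by this PC.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(r[j] * c[j] for j in range(p)) for c in components]
              for r in centred]
    return components, explained, scores


# Two hypothetical groups of three-point "spectra" (rows 0-2 and rows 3-5).
spectra = [
    [1.0, 0.50, 0.20], [1.1, 0.50, 0.20], [0.9, 0.50, 0.25],
    [2.0, 0.50, 0.20], [2.1, 0.55, 0.20], [1.9, 0.50, 0.20],
]
components, explained, scores = pca(spectra, n_components=2)
```

Plotting the first two columns of `scores` against each other gives the kind of 2D scatter plot mentioned in the text, and `explained` gives the fraction of variance carried by each PC.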

Application 1: Using synchrotron FTIR microspectroscopy to reveal protein IR spectral detail in a heterogeneous matrix dominated by starch in wheat
Wheat endosperm may be described as a high population of starch granules in a sea of protein. The strategy used by Wetzel et al. [46] to reveal good in situ spectra of the interstitially located protein was to use highly spatially resolved infrared microspectroscopy with synchrotron radiation. It had two elements: exploiting the spatial resolution achievable with synchrotron infrared microspectroscopy, and sorting out protein-dominated spectra among adjacent mapping pixels to minimize the starch absorption and scattering contributions. A numerical estimate of the α-helix population relative to other secondary protein structures was made from the position and shape of the amide I absorption band in both hard and soft classes of wheat. Data treatments included deconvolution and modeling of the resulting bands in the region to enable measurement of the relative amounts of both forms. The reported difference in relative percentage of α-helix between hard and soft wheat classes was in agreement with values recently reported using Raman microspectroscopy.

Application 2: Comparison between yellow- (Brassica rapa) and brown-seeded (Brassica napus) canola in protein composition profile
This study was designed to use synchrotron FTIR microspectroscopy to identify the molecular structural differences of protein between yellow- and brown-seeded Brassica canola. The study showed that the yellow-seeded canola contained a relatively lower ratio of model-fitted α-helix to β-sheet (1.3 vs. 1.9) than the brown-seeded canola, based on multi-peak modeling estimation. The results indicated that the protein quality of the yellow-seeded canola may be lower than that of the brown-seeded canola because of a relatively higher ratio of model-fitted β-sheet to α-helix in protein secondary structure. The CLA and PCA analyses did not show distinct differences between the yellow- and brown-seeded canola tissues in protein amide I structures, indicating that they are related to each other. This new approach can be used to provide information for canola breeding to maintain and improve the quality of protein in canola seeds.

Application 3: Use of synchrotron-based FTIR microspectroscopy to study protein secondary structures of raw and heat treated brown and golden flaxseeds
In this study, Yu et al. [59] used the synchrotron-based technology to reveal ultra-structural chemical features of protein secondary structures of flaxseed tissues affected by variety and heat processing (raw and roasting) within intact tissue, in relation to protein nutritive value, utilization and availability, and to estimate protein secondary structures using multi-component peak modeling methods (Gaussian and Lorentzian). The purpose of this study was not to determine the exact amount of protein secondary structure. A detailed comparison of the two multi-peak modeling techniques in detecting plant tissue inherent structure has also been reported in Yu [63]. The study showed that with the synchrotron-based technology, the protein molecular chemical make-up of the flaxseed tissues could be revealed at ultraspatial resolution within cellular dimensions [59]. The protein secondary structure differed between the golden and brown flaxseed tissues in terms of the ratio of model-fitted α-helix to β-sheet. By using multi-component peak modeling in the protein amide I region, the results show that the golden flaxseed contained a relatively higher ratio of model-fitted α-helix to β-sheet (1.3 vs. 0.8), indicating potentially high protein value and high nutrient utilization and availability in the golden flaxseed. Roasting reduced the relative percentage of model-fitted α-helix, increased the relative percentage of model-fitted β-sheet and reduced the relative α-helix to β-sheet ratio (1.3 to 0.7) of the golden flaxseed. However, roasting did not affect the ratio of model-fitted α-helix to β-sheet in the brown flaxseed tissue. These results indicated different sensitivities of protein secondary structure to heat processing between the flaxseed varieties, and that roasting may affect protein value, nutrient utilization and availability in the golden seeds but not in the brown. In this study, multivariate statistical analysis was also used to discriminate and classify inherent protein structures of the raw and roasted brown and golden flaxseed protein tissues. The cluster analysis showed that the four flaxseed treatments (F1 to F4) could not be fully distinguished. However, when brown and golden flaxseed were combined and the analysis focused on the raw and roasting effects, PCA could fully distinguish between the raw and roasted flaxseeds in the protein amide I FSD spectrum (for the theory of FSD see [20]), and the raw and roasted samples could be grouped in separate ellipses. The first three PCs explain 36, 19, and 17% of the variation in the protein amide FSD spectrum data set [59].

Application 4: Shining light on the difference in molecular structural chemical make-up and the cause of distinct degradation among barley varieties: A novel approach
The NRC chemical approach with conventional "wet" chemical analysis can determine total chemical composition, but fails to detect plant intrinsic structures and the biological component matrix. Plant feed and seed quality, digestive behavior and nutrient availability are closely related not only to total chemical composition, but also to feed molecular structural chemical make-up. Harrington (malting-type) and Valier (feed-type) barley have the same chemical composition but significantly different degradation kinetics and behaviors in the ruminant digestive system [52]. This study (Yu et al., unpublished) used S-FTIR as a novel approach to identify the differences in protein and carbohydrate molecular structural chemical make-up between the two barleys using multivariate analysis (PCA and CLA) and to illustrate the reasons for the significantly different degradation kinetics and behavior. The items assessed in this study included: 1) molecular structural differences in protein amide I and amide II intensities and their ratio within cellular dimensions; 2) molecular structural differences in protein secondary structure and their ratios; 3) molecular structural differences in carbohydrate component peak profile. The hypothesis was that barley molecular structural chemical make-up affects barley quality, fermentation and degradation behavior in animals. The results showed that with the newly developed advanced synchrotron-based analytical technique (S-FTIR), the protein and carbohydrate molecular structural chemical make-up of barley could be revealed and identified as follows: (1) There were structural differences in protein amide I to amide II ratio between the barleys: The protein amide I and II profile depends on protein molecular structural chemical make-up, which can be affected by processing and variety. Recent medical research shows that, compared with normal tissue, diseased tissues (such as Alzheimer's or prion-affected tissue) have a lower ratio of protein amide I to amide II [23,30,40]. In plant structure research, nothing has yet been published on the protein amide I and amide II profile and their ratio, or on how this information relates to nutritive value and availability in animals and humans. Yu et al. (unpublished) compared Valier and Harrington barley and found a significant difference in the protein amide profile. Harrington was lower in protein amide I (53 vs. 72) and amide II (19 vs. 24), and also lower in protein amide I to amide II ratio (2.8 vs. 3.0). These results indicated that the two barleys had different protein molecular structural chemical make-up. The different protein amide I and amide II profile and their ratio may be part of the reason for the different degradation behavior of the two barleys [52].
(2) The protein molecular structural chemical make-up significantly differed between the two barleys, including protein secondary structure profile and α-helix to β-sheet ratio: Yu et al. (unpublished) also found that, within the barley varieties, there were differences in protein secondary structure conformation, indicating differences in protein molecular structural chemical make-up and features. These protein structural differences may have some impact on barley protein utilization and availability. The PCA and CLA analyses were able to discriminate and classify protein inherent structure chemical make-up between Harrington and Valier barley. Figure 4 displays the cluster analysis results in the form of a dendrogram. From this diagram, two classes can be distinguished at linkage distances below 15, with the Valier group forming a separate group. Depending on the aggregation level (horizontal axis), different explanations can be inferred: the Valier group forms one distinct group at lower linkage distances and is grouped together with the Harrington group at a linkage distance of about 16. Figure 4 shows significantly different protein spectral clusters between Harrington and Valier barley, indicating that 1) Valier and Harrington barley can almost be distinguished, and 2) their protein structural chemical make-up was different, except for two Valier protein spectra (Fig. 4). Figure 5 shows results from PCA analysis of the synchrotron FTIR spectrum data obtained from Valier and Harrington (V, H). The first three PCs (obtained after data reduction) were plotted (Fig. 5a, b). These show that PCA could fully distinguish between Valier and Harrington in the protein amide I FSD spectrum: Valier and Harrington can be grouped in separate ellipses (Fig. 5a, b) with no overlap between the groups. The first three PCs explain 36, 23, and 16% of the variation in the protein amide FSD spectrum data set.
(3) There was, however, no difference in carbohydrate molecular structural chemical make-up: Figures 6 and 7 indicate that the carbohydrate molecular chemical make-up and inherent structure were similar between the two barleys. The study (Yu et al., unpublished) may imply that only protein, not carbohydrate, molecular structural chemical make-up plays a major role in causing the different degradation kinetics and behavior between the two barleys.
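The dendrogram reading described under (2), in which groups stay separate below a given linkage distance and merge above it, can be illustrated with a minimal sketch of agglomerative cluster analysis. The four-point "spectra" below are made-up numbers, not data from the study, and the choice of Euclidean distance with average linkage is an assumption made for illustration; the linkage rule actually used for Figure 4 is not stated here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two spectra sampled at the same wavenumbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(spectra, group_a, group_b):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(spectra[i], spectra[j]) for i in group_a for j in group_b)
    return total / (len(group_a) * len(group_b))

def agglomerate(spectra):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [[i] for i in range(len(spectra))]
    merges = []
    while len(clusters) > 1:
        pairs = [(average_linkage(spectra, clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        distance, i, j = min(pairs)
        merges.append((sorted(clusters[i]), sorted(clusters[j]), distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

def names(members, labels):
    """Turn cluster member indices into a label string such as 'HHH'."""
    return "".join(labels[i] for i in members)

# Made-up four-point "spectra" (NOT data from the study): indices 0-2 play
# the role of Harrington pixels, indices 3-5 the role of Valier pixels.
labels = ["H", "H", "H", "V", "V", "V"]
spectra = [
    [1.0, 2.0, 3.0, 1.0], [1.1, 2.1, 2.9, 1.0], [0.9, 1.9, 3.1, 1.1],
    [3.0, 1.0, 1.0, 2.0], [3.1, 0.9, 1.1, 2.0], [2.9, 1.1, 0.9, 2.1],
]

merges = agglomerate(spectra)
for left, right, distance in merges:
    print(f"{names(left, labels)} + {names(right, labels)} "
          f"at linkage distance {distance:.2f}")
```

In this toy case the within-variety merges happen at small linkage distances and the two varieties only join in the final merge, which is the pattern read off a dendrogram when two groups are well separated.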

Several important points, issues and arguments in protein secondary structure analysis using multi-peak modeling procedures
The use of a multi-peak modeling procedure to study protein secondary structure remains controversial (pro and anti) in the scientific research community. From the survey I did, most research scientists support the multi-peak modeling procedure for relative estimation of protein secondary structures. However, some reject this procedure. The following are some important points and arguments that need to be pointed out and discussed.

Important point one: Relative estimates and not exact determinations
It needs to be pointed out that using multi-peak modeling for protein secondary structure analysis is only suitable for making RELATIVE estimates and not EXACT determinations (Dr. Robert Julian, Synchrotron Radiation Center (SRC), Wisconsin, personal contact). As with any mathematical modeling, we have to be careful of all the variables involved. However, as long as everything is treated the same, good relative estimates can be made (Robert Julian, personal contact). In our plant/seed/feed research program, the purpose is to show the relative differences in the ratio and percentage of model-fitted α-helix and β-sheet between or among different plant/seed proteins. Our purpose is not to determine exact protein secondary structure content, but only to make relative comparisons between plants or seeds in terms of the ratio of model-fitted α-helix to β-sheet. We need to remember that it is impossible to determine the exact amount of protein secondary structure by multi-peak modeling/fitting.

Important point two: Different molar absorptivities of α-helix and β-sheet mean that model-fitted amounts of protein secondary structures differ from the real amounts
A leading senior scientist, Dr. M. Jackson (NRC, Canada, personal contact), pointed out that the value of peak modeling/fitting really depends on what you are trying to find out. If you want to find absolute amounts of a particular compound, it can be problematic, as it requires the following assumptions: 1) that you know the correct number of bands to fit; 2) that you know the correct shape of the bands; 3) that all bands have the same shape, as is usually assumed; 4) for complex systems with multiple absorptions coming from the same chromophore, that the molar absorptivity (amount of light absorbed per mole of chromophore) of each group giving rise to a band is the same.
Item number 4 is the most critical for IR studies of proteins, as it assumes that C=O groups in α-helices and β-sheets have the same molar absorptivity. Jackson, Haris and Chapman (1989) have shown that this is not so: α-helices, β-sheets and unordered structures have very different molar absorptivities. So, if there are 10 C=O groups in each of the 3 structures, the bands from each will not have the same intensity but will vary. If you just integrate the area and assume equal molar absorptivities, you will get an overestimate of some features and an underestimate of others. However, if you are more interested in an estimate of the RELATIVE amounts of material present, as in this case of protein secondary structure study, multi-peak modeling/fitting is acceptable, as all you are interested in is the ratio and how it changes. As long as you are not making claims about absolute amounts, this approach is just fine (Dr. M. Jackson, NRC, Canada, personal contact).
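The argument above can be checked with a small numerical sketch. It assumes that band area is simply the number of C=O groups multiplied by a molar absorptivity, and that the absorptivities are the same in both samples being compared. The absorptivity values are hypothetical, chosen only to be unequal; they are not the values measured by Jackson, Haris and Chapman.

```python
def band_areas(n_groups, absorptivity):
    """Band area taken as (number of C=O groups) x (molar absorptivity)."""
    return {s: n_groups[s] * absorptivity[s] for s in n_groups}

def apparent_ratio(n_groups, absorptivity):
    """Helix-to-sheet ratio read directly from band areas,
    i.e. what is obtained when equal absorptivities are assumed."""
    areas = band_areas(n_groups, absorptivity)
    return areas["helix"] / areas["sheet"]

# Hypothetical relative molar absorptivities (illustrative only, not measured).
absorptivity = {"helix": 1.0, "sheet": 1.4}

# Hypothetical numbers of C=O groups in two samples.
sample_a = {"helix": 10, "sheet": 10}   # true helix-to-sheet ratio 1.0
sample_b = {"helix": 5, "sheet": 10}    # true helix-to-sheet ratio 0.5

ratio_a = apparent_ratio(sample_a, absorptivity)
ratio_b = apparent_ratio(sample_b, absorptivity)
print(f"apparent ratios: A = {ratio_a:.3f}, B = {ratio_b:.3f}")
print(f"B relative to A: {ratio_b / ratio_a:.3f}")
```

Both apparent ratios are biased away from the true values, but the bias is the same factor in each sample, so the ratio of B to A is unchanged. This is why relative comparisons survive unequal absorptivities while absolute amounts do not, provided all samples are treated the same.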

Important point three: Protein regions chosen for modeling are different
It needs to be mentioned that the protein IR spectrum regions that researchers use for relative estimation of protein secondary structure differ. There is no standard for the protein amide region chosen for protein secondary structure study. Some scientists use the whole protein amide I and amide II region (ca. 1720-1480 cm⁻¹) for multi-peak modeling to relatively estimate protein secondary structure (Drs. Paul Dumas, Lisa Miller (NSLS), personal contact). The reason for choosing the whole protein amide I and II region is that it is difficult to define a correct baseline after the amide I band, because it does not go to zero before the amide II band appears (Dr. Lisa Miller, personal contact). They think that the best way to account for this is to curve-fit the amide II band along with the amide I band. However, some researchers use the protein amide I region only. Even within the protein amide I region, the ranges chosen differ: some researchers use 1700-1620 cm⁻¹ and some choose 1700-1560 cm⁻¹. Therefore, when researchers choose different protein amide I regions, even for the same protein spectrum, the relative estimates of protein secondary structure will be different. However, as long as everything is treated the same, good relative estimates can be made for comparison of different varieties and treatments in biological tissues. Some researchers think that the peaks in the ca. 1610-1560 cm⁻¹ region are not strictly amide I peaks and could be influenced by other components. More basic research needs to be done in this area.

Important point four: Protein original spectrum vs. protein FSD spectrum for multi-peak modeling
It needs to be mentioned that there is no standard method that all scientists agree to use for relative estimates of protein secondary structure ratios. Some researchers use the original protein IR spectrum for analysis; others use the protein FSD spectrum. The detailed concepts and algorithm of FSD (a method for resolving intrinsically overlapped bands) were described in Kauppinen et al. [20] and Griffiths and Pariente [10]. The disadvantage of using the original protein spectrum for multi-peak modeling is that the correct number and correct shape of the bands to fit must be known. Usually it is assumed that all bands have the same shape [16]. The advantage of using the protein FSD spectrum for relatively estimating protein secondary structure ratio is that the component bands are determined automatically, with no human assumption that all component bands have the same shape and the same regions. Although it is not possible to determine exact amounts, researchers are only interested in differences between treatments, so protein FSD spectrum analysis can be used for protein secondary structure ratio estimation.

Important point five: A signal to noise ratio vs. true protein component peaks
When using FSD in the protein amide I region (ca. 1700-1620 cm⁻¹ or 1700-1580 cm⁻¹) or the protein amide I and II region (ca. 1720-1480 cm⁻¹) to identify protein amide I component peak frequencies, we need to distinguish whether peaks are true protein IR FSD component peaks or just noise. The way to tell a peak from noise is to look at the spectrum around the 2000 cm⁻¹ region. In a biological tissue, there is no peak around 2000 cm⁻¹, so the noise level can be estimated from the spectrum in this region. It is NOT correct to report more significant digits than are permitted by the noise level (Dr. K.M. Gough, personal contact). The quality of the protein spectrum is crucial for multi-peak modeling work. We have to use high-quality protein spectra for protein secondary structure study, using the "cherry pick-up" method reported in Wetzel et al. [46].
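The noise check described above can be sketched as follows. The window limits (1950-2050 cm⁻¹), the use of the standard deviation as the noise measure and the factor of 3 are all assumptions made for illustration, and the baseline values and candidate peak heights are made-up numbers. The sketch also assumes that the noise is measured on a spectrum processed in the same way as the spectrum in which the candidate peaks are found.

```python
import statistics

def noise_level(wavenumbers, absorbance, low=1950.0, high=2050.0):
    """Standard deviation of absorbance in a window assumed to hold no sample bands."""
    window = [a for w, a in zip(wavenumbers, absorbance) if low <= w <= high]
    return statistics.stdev(window)

def is_above_noise(peak_height, noise, factor=3.0):
    """True if a candidate peak is taller than `factor` times the noise level."""
    return peak_height > factor * noise

# Made-up baseline around 2000 cm^-1: flat apart from +/-0.001 absorbance jitter.
wavenumbers = [1940 + 10 * i for i in range(13)]
absorbance = [0.001 if i % 2 == 0 else -0.001 for i in range(13)]

noise = noise_level(wavenumbers, absorbance)

# Hypothetical candidate component peak heights, keyed by position in cm^-1.
candidates = {1656: 0.020, 1608: 0.002}
for position, height in candidates.items():
    verdict = "peak" if is_above_noise(height, noise) else "within noise"
    print(f"{position} cm^-1: height {height:.3f} -> {verdict}")
```

A sloping baseline in the chosen window would inflate this estimate, so in practice a linear trend may need to be removed before the standard deviation is taken.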

Important point six: How many protein component bands?
To relatively estimate protein secondary structure, we need to find how many protein amide I bands there are in a plant tissue. Usually we use the FSD method to determine the protein amide I component bands. It has been found that the protein in a plant tissue contains up to 8 multi-component bands. Protein amide I with 5 multi-component peaks usually has them at ca. 1691, 1676, 1656, 1634, and 1597 cm⁻¹; with 6 multi-component peaks, at ca. 1691, 1677, 1656, 1634, 1608, and 1589 cm⁻¹; with 7 multi-component peaks, at ca. 1689, 1675, 1654, 1636, 1620, 1602, and 1587 cm⁻¹; and with 8 multi-component peaks, at ca. 1688, 1673, 1657, 1640, 1626, 1614, 1602, and 1590 cm⁻¹. The assignments of all the component bands are not very clear. More basic research needs to be done in this area.
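Once the component band positions are known, a multi-peak model can be fitted to the spectrum. The sketch below is a deliberately simplified version of that step: it fixes the band centers at the five-component positions listed above, assumes that every band is a Gaussian with one shared width (8 cm⁻¹, an arbitrary choice), and fits only the band heights by linear least squares. The spectrum is synthetic, and the "true" heights are made up. Full multi-peak modeling also adjusts centers, widths and band shapes (Gaussian or Lorentzian), which requires non-linear optimization and is not shown here.

```python
import math

def gaussian(x, center, width):
    """Unit-height Gaussian band; `width` is the standard deviation in cm^-1."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def solve(a, b):
    """Solve the small linear system a.x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_heights(xs, ys, centers, width):
    """Least-squares heights of Gaussian bands at fixed centers with a shared width."""
    basis = [[gaussian(x, c, width) for c in centers] for x in xs]
    k = len(centers)
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(k)]
    return solve(ata, aty)

centers = [1691, 1676, 1656, 1634, 1597]   # five-component positions from the text
width = 8.0                                # assumed shared band width (cm^-1)
true_heights = [0.2, 0.3, 1.0, 0.8, 0.3]   # made-up values for the synthetic spectrum

xs = list(range(1560, 1721, 2))
ys = [sum(h * gaussian(x, c, width) for h, c in zip(true_heights, centers)) for x in xs]

heights = fit_heights(xs, ys, centers, width)
# With one shared width, band area is proportional to band height.
fractions = [h / sum(heights) for h in heights]
for c, frac in zip(centers, fractions):
    print(f"{c} cm^-1: {100 * frac:.1f}% of fitted area")
```

The printed percentages are relative estimates in the sense discussed above: they depend on the assumed number, shape and width of the bands, so they are only comparable between spectra that are fitted in exactly the same way.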

Important point seven: Issues raised from anti-multi-peak modeling/fitting researchers
Because multi-peak modeling/fitting is very complicated and involves several assumptions, some researchers suggest not using the multi-peak modeling procedure for protein secondary structure study. They believe that the protein amide I band is not a sum of 4 or 5 bands; it is composed of literally millions of bands, all falling in the same spectral region (Dr. K.M. Gough, personal contact). To get a ratio of α-helix to β-sheet, these researchers use the original spectrum data. However, the assignment of the amide I regions to α-helix and β-sheet is not a scientifically fixed process, and the maxima are different in different tissues. Therefore, the relative α-helix to β-sheet ratio will depend hugely on the protein amide baseline and the protein secondary structure (such as α-helix and β-sheet) regions they choose. Using original spectrum analysis to estimate the relative α-helix to β-sheet ratio therefore easily results in human bias and error.

Conclusions, implications, and future research
All the above studies demonstrate the potential of the ultraspatially resolved synchrotron-based technology (S-FTIR) to localize relatively "pure" protein in plant tissues and to reveal protein molecular structure, in terms of amide I and amide II profile and protein secondary structure profile, on a relative basis at a cellular level without destruction of the inherent structure of the seed tissue. This may provide an indication of plant fermentation characteristics and nutritive value.
The multi-peak modeling procedure is one of the methods that can be used to detect relative differences in the ratio of model-fitted α-helix to β-sheet between different varieties and species, and between different treatments (physical, chemical and biological) that affect plant tissues. It is intended not for exact determinations but only for relative comparison between plant/feed tissues. However, arguments still exist about using the multi-peak modeling procedure for protein secondary structure study. The CLA and PCA analyses in protein secondary structure study are used to qualitatively separate one group from another, statistically. The big advantage is that the researchers do not need to know what the spectral assignments are.
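The point that PCA can separate groups without any spectral assignments can be illustrated with a minimal sketch. The four-point "spectra" are made-up numbers, not data from the study, and the first principal component is obtained here by simple power iteration on the covariance matrix, which is one of several ways to compute it and is not necessarily the method used in the studies reviewed.

```python
import math

def center(rows):
    """Subtract the mean spectrum from every spectrum."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of mean-centered spectra."""
    n, k = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(k)]
            for i in range(k)]

def first_component(cov, iterations=200):
    """Leading eigenvalue and eigenvector of `cov` by power iteration."""
    k = len(cov)
    v = [float(i + 1) for i in range(k)]  # arbitrary non-uniform starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(k) for j in range(k))
    return eigenvalue, v

# Made-up spectra (NOT data from the study): rows 0-2 stand for one variety,
# rows 3-5 for the other. No band assignments are supplied to the analysis.
spectra = [
    [1.0, 2.0, 3.0, 1.0], [1.1, 2.1, 2.9, 1.0], [0.9, 1.9, 3.1, 1.1],
    [3.0, 1.0, 1.0, 2.0], [3.1, 0.9, 1.1, 2.0], [2.9, 1.1, 0.9, 2.1],
]

centered = center(spectra)
cov = covariance(centered)
eigenvalue, pc1 = first_component(cov)

total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigenvalue / total_variance
scores = [sum(x * p for x, p in zip(row, pc1)) for row in centered]

print(f"PC1 explains {100 * explained:.1f}% of the total variance")
print("PC1 scores:", [round(s, 2) for s in scores])
```

In this toy case the two groups fall on opposite sides of zero along the first principal component, which is the kind of separation shown by the non-overlapping ellipses of a score plot. The sign of the scores is arbitrary; only the separation between groups is meaningful.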
Further study is needed to understand and quantify the relationship between protein molecular structural chemical make-up and nutritive value and digestive or fermentation behavior, as affected by plant varieties/species and processing. More study is also needed on the assignments of protein amide component bands. It is believed that the advanced synchrotron technology (S-FTIR) will make a significant step and an important contribution to protein molecular structural-chemical research.

Fig. 1. Spectra of pericarp and endosperm of cereal grain seed tissues selected from corresponding areas of the visible images, showing that similar morphological parts exhibit similar spectral characteristics (A: pericarp; B: seed coat; C: aleurone; D: endosperm) (functional group bands: protein amide I at 1650 cm⁻¹; amide II at 1550 cm⁻¹; starch at 1025 cm⁻¹).

Fig. 2. Typical synchrotron FTIR spectrum and Fourier self-deconvolution (FSD) spectrum of cereal grain seed tissues at a cellular level (pixel size 10 µm × 10 µm): a: FSD spectrum of protein amide I with 5 multi-component peaks; b: FSD spectrum of protein amide I with 6 multi-component peaks; c: FSD spectrum of protein amide I with 7 multi-component peaks; d: FSD spectrum of protein amide I with 8 multi-component peaks.

Fig. 3. Gaussian and Lorentzian functions used for multi-peak fitting (or modeling) in protein amide I regions.

Fig. 5. Scatter plot of the 1st principal component vs. the 2nd principal component and the 3rd principal component of PCA analysis of the synchrotron-based protein amide I FSD spectrum (1710 to 1576 cm⁻¹) obtained from Harrington and Valier barley at a cellular level (pixel size 10 µm × 10 µm): The 1st, 2nd and 3rd principal components explain 36.1, 22.6 and 16.01% of the total variance, respectively.

Fig. 7. Scatter plot of the 1st principal component vs. the 2nd principal component and the 3rd principal component of PCA analysis of the synchrotron-based carbohydrate spectrum (1180 to 800 cm⁻¹) obtained from Harrington and Valier barley at a cellular level (pixel size 10 µm × 10 µm): The 1st, 2nd and 3rd principal components explain 85.3, 8.1 and 4.6% of the total variance, respectively.